Clearing counter on SAN switches => peak in graph

vejpuste · January 2018

Hello,
I found the discussion https://forum.xorux.com/discussion/83/clearing-crc-error-counter-on-switch-leads-to-huge-numbers-in-crc-error-graph but it does not contain the result I need.
Clearing the counters affects a lot of the charts, and repairing them is manual work.
I think it would be possible to check for a counter reset before writing data to the rrd, which would resolve the problem.
For that it is necessary to know the expected maximum value and the previous value.
Regards,
Libor Vejpustek

Pavel · January 2018

Hello,

there is no simple way to identify a reset of the counter.
How would you do it when the counter might also overflow its maximum, i.e. the next value might be lower than the previous one?

BTW this is not a problem if you use Brocade Network Advisor as the data source; there we do not save data as counters.

vejpuste · January 2018

If you know the last value, the previous value and the maximum value, you can tell whether the counter was reset.

(max_value - last_value)/max_value > max_value/100 => counter reset => update the rrd with the value U (undefined). This eliminates the peak in an rrd file with the COUNTER type.
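
A rough sketch of that check, as it could run just before `rrdtool update`. It is not taken from any product source. The function name and the 1% threshold are assumptions, and the rule is only one possible reading of the inequality above: the new reading is lower than the previous one while the previous one was still more than 1% below the maximum, so a wrap is unlikely.

```shell
#!/bin/bash
# Hypothetical helper for illustration only.
# Arguments: previous reading, current reading, counter maximum.
# Prints the value to hand to "rrdtool update": the current reading,
# or U (unknown) when the drop looks like a reset rather than a wrap.
# Shell arithmetic is signed 64-bit, so this sketch only suits counters
# whose maximum is far below that, e.g. 32-bit counters.
value_for_update() {
    vfu_prev=$1
    vfu_cur=$2
    vfu_max=$3
    # Assumed rule: reading went down AND previous reading was more
    # than 1% below the maximum.
    if [ "$vfu_cur" -lt "$vfu_prev" ] &&
       [ $(( (vfu_max - vfu_prev) * 100 )) -gt "$vfu_max" ]; then
        echo U
    else
        echo "$vfu_cur"
    fi
}

value_for_update 8000 1000 4294967295        # looks like a reset: prints U
value_for_update 4294967000 120 4294967295   # looks like a wrap: prints 120
value_for_update 1000 2000 4294967295        # normal growth: prints 2000
```

A counter that is cleared while it happens to sit close to its maximum would still be taken for a wrap, so the heuristic reduces the peaks rather than ruling them out.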

The second variant is to use the DERIVE type in the rrd.
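
To illustrate why the two types behave differently after a reset, here is a simplified emulation of the rate calculation. It is a sketch, not rrdtool code: rrdtool works with floating-point rates and exact timestamps, and the function names are made up. It assumes a 32-bit wrap for COUNTER and a data source minimum of 0 for DERIVE.

```shell
#!/bin/bash
# Simplified emulation for illustration only.
# Arguments: previous reading, current reading, seconds between them.

# COUNTER: a negative difference is treated as a 32-bit wrap,
# so 2^32 is added before the rate is computed.
counter_rate() {
    cr_diff=$(( $2 - $1 ))
    if [ "$cr_diff" -lt 0 ]; then
        cr_diff=$(( cr_diff + 4294967296 ))
    fi
    echo $(( cr_diff / $3 ))
}

# DERIVE with minimum 0: a negative rate falls below the minimum
# and is stored as unknown (U).
derive_rate() {
    dr_diff=$(( $2 - $1 ))
    if [ "$dr_diff" -lt 0 ]; then
        echo U
    else
        echo $(( dr_diff / $3 ))
    fi
}

counter_rate 8000 1000 60   # prints 71582671, the peak in the graph
derive_rate  8000 1000 60   # prints U
derive_rate  1000 2000 60   # prints 16 (integer division)
```

The price of DERIVE in this setup is that a genuine wrap is also stored as unknown for that one interval.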

A counter reset always means some loss of data.

It would be possible to test these 2 alternatives in real usage and compare the resulting graphs with the current graph.
Regards,
Libor Vejpustek

Pavel · January 2018

Well, the solution is to use another type, gauge or derive.
We do not use counter types anymore; for new projects we always translate the data to gauge before saving it in rrdtool (based on knowledge of the current and previous value).
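
How that translation is actually implemented is not shown in this thread. As a minimal sketch of the general idea, assuming the collector keeps the previous reading and its timestamp, the rate could be computed like this before it is stored in a GAUGE data source (the function name and argument order are hypothetical):

```shell
#!/bin/bash
# Hypothetical pre-processing step, for illustration only.
# Arguments: previous timestamp, previous reading,
#            current timestamp, current reading.
# Prints a per-second rate, or U when no usable rate exists.
# awk uses double precision, so very large 64-bit readings lose accuracy.
to_gauge() {
    LC_ALL=C awk -v t0="$1" -v v0="$2" -v t1="$3" -v v1="$4" 'BEGIN {
        # Time did not advance, or the counter went down (reset or wrap).
        if (t1 <= t0 || v1 < v0) { print "U"; exit }
        printf "%.2f\n", (v1 - v0) / (t1 - t0)
    }'
}

to_gauge 1516000000 1000 1516000060 2000   # prints 16.67
to_gauge 1516000060 8000 1516000120 1000   # prints U
```

Because the decision is made before the value reaches rrdtool, the stored series never contains the artificial wrap correction.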

By far the biggest problem here is backward compatibility.
Switching to a different data type in rrdtool would mean losing the complete history.

If you use BNA as the data source for SAN data, we already use only gauge there and such a problem cannot occur.

vejpuste · January 2018

Updating with the value U after a counter reset eliminates the peak problem with the COUNTER type.
Can we continue in Czech? It would be easier for me.
Regards,
Libor Vejpustek

vejpuste · January 2018

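Try this script:

#!/bin/bash

# Write one sample to both data sources, then wait for the next step.
# $1 goes to the COUNTER data source, $2 to the DERIVE data source.
function zapis
{
    echo "$(date) $1 # $2"
    rrdtool update datafile.rrd "N:${1}:${2}"
    sleep 60
}

cd /root/ || exit 1
# Keep the file from the previous run, if there is one.
[ -f datafile.rrd ] && mv datafile.rrd datafile_old.rrd
# Same samples into two data sources: packets (COUNTER) and packets2 (DERIVE, minimum 0).
rrdtool create datafile.rrd --step 60 --start N DS:packets:COUNTER:120:0:1000000000 DS:packets2:DERIVE:120:0:1000000000 RRA:AVERAGE:0.5:1:60 RRA:AVERAGE:0.5:4:60 RRA:AVERAGE:0.5:24:60

#sleep 60
zapis 10 10
zapis 12322 12322
zapis 22322 22322
zapis 32322 32322
zapis 35322 42322
zapis 42322 52322
zapis 52322 62322
zapis 62322 72322
# The value drops without writing U: simulated counter reset.
zapis 2322 2322
zapis 99333 99333
zapis 10 10
zapis 1000 1000
zapis 2000 2000
zapis 3000 3000
zapis 4000 4000
zapis 5000 5000
zapis 6000 6000
zapis 7000 7000
zapis 8000 8000
# Uncommenting the next line eliminates the peak in the rrd file.
#zapis U 0
zapis 1000 1000
zapis 2000 2000
zapis 3000 3000
# Reset announced with U on the COUNTER data source.
zapis U 0
zapis 1000 1000
zapis 2000 2000
zapis 3000 3000
zapis U 100
zapis 1000 1000
zapis 2000 2000
zapis 3000 3000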

Pavel · January 2018

Well, that looks nice.

We can definitely at least try it in the places where we still have counters for historical reasons.
We will certainly not get to it right now, though, there are no resources, so I have at least put it on the to-do list.

Thanks!

vejpuste · January 2018

To try it out, it would be good to store the data in parallel in 2 files of the same COUNTER type and, once a false peak occurs, compare which file gives better results.
We tried deploying monitoring on SAN switches, and after the counters were reset the effect showed up on one port but not on another.
Maybe I could find a moment and test it myself, I would just have to dig through the source code.
Have a nice day.
Libor Vejpustek
