XIV volumes data not displayed after Linux host reboot

atifsyed · July 2019

Hi

We are running stor2rrd v2.40 on RHEL 7.5. After applying patches, we do not see all the LUNs in the web GUI.

Looking at the stor2rrd CLI, we see the following error in logs/error.log-XIVNAMEHERE:

WARNING Tue Jul 2 15:10:10 2019 xivperf.pl: Same sample: IBM.2812-7860226.volume.9e814d0006a = 20190629071904.000000+000, skipping

There are hundreds, if not thousands, of lines like this one. I have looked at the code on GitHub; this happens when the previous volume data has no time stamp. Is there any fix, such as deleting the stale files, to kick-start this back to life?

Atif

atifsyed · July 2019

This is the code that executes (ref: https://github.com/dago/stor2rrd/blob/master/dist_storage/bin/xivperf.pl)

Lines 335-339:

&warning("Same sample: ".$cacheKey." = ".$stat->{'StatisticTime'}.", skipping.");
$retry_cnt = 0;
if($debug) { print("=\n"); }
last;
if ( $storage_epoch == $cached_raw_counters->{'storage_timestamp'} ) {

Pavel · July 2019

Hi,

Try removing these files:

stor2rrd/data/XIVNAMEHERE/tmp/cache.file
stor2rrd/data/XIVNAMEHERE/tmp/config.file

Let us know if that helps.

atifsyed · July 2019

Thanks for your reply.

This location had no files: stor2rrd/data/XIVNAMEHERE/tmp/*

Instead, I found the files here:

stor2rrd/data/XIVNAMEHERE/cache.file

stor2rrd/data/XIVNAMEHERE/config.file

I have backed up the original files and deleted them. I re-ran load.sh and will give it a couple of hours to see if that helps. I will keep you posted.
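For anyone repeating this step, here is a minimal sketch of the backup-then-remove sequence. The function name reset_storage_cache and the .bak.<timestamp> suffix are hypothetical and not part of stor2rrd; only the file names cache.file and config.file and the two locations come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with stor2rrd): back up and remove the
# per-storage cache.file and config.file mentioned in this thread.
# $1 = storage data directory, e.g. stor2rrd/data/XIVNAMEHERE
reset_storage_cache() {
    dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    for f in cache.file config.file; do
        # Check both locations reported above: <dir>/tmp and <dir> itself.
        for path in "$dir/tmp/$f" "$dir/$f"; do
            if [ -f "$path" ]; then
                # Remove the original only if the copy succeeded.
                cp -p "$path" "$path.bak.$stamp" && rm -f "$path"
                echo "backed up and removed: $path"
            fi
        done
    done
}
```

Example call, run from the STOR2RRD working directory: reset_storage_cache data/XIVNAMEHERE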

atifsyed · July 2019

Only one host is displaying the correct IO rate and data rate in the stor2rrd interface. The rest of the hosts show "-nan" values in the graphs.

atifsyed · July 2019

We have dug through the timeline. The day data collection stopped was the day of the reboot. Interestingly, this XIV was a target XIV in a mirrored relationship. We switched to this XIV as primary and removed the mirrored relationships.

After doing that, stor2rrd got into a state where it displays -nan in the graphs.

atifsyed · July 2019

The Linux host reboot and the mirror relationship change happened on the same day; stor2rrd never worked properly after the bounce.

Lukas · July 2019

Hello,

Please send us the logs.
Add a short problem description in the text field of the upload form.

cd /home/stor2rrd/stor2rrd # or wherever your STOR2RRD working dir is

tar cvhf logs.tar logs tmp/*txt

gzip -9 logs.tar

Send us logs.tar.gz via https://upload.stor2rrd.com
You can also attach screenshots if they help in understanding the issue.

Thank you

Pavel · July 2019

Hi,

Please send screenshots of the following:

1. host which is working,

2. some other host

3. volume aggregated graph

4. ls -l data/<storage>/HOST

5. cat data/<storage>/HOST/host*txt

https://upload.stor2rrd.com

atifsyed · July 2019

Historical data on this XIV is not important as it was a mirrored target until we changed it to source recently.

Is there any way we can remove this XIV completely from stor2rrd and rediscover it?

Pavel · July 2019

You can remove the data/<storage name> directory; everything will then be rediscovered.
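If you would rather keep the old data around for a while, the directory can be moved aside instead of deleted. The sketch below is only an illustration: the function name retire_storage_data and the destination directory are hypothetical, and it matches the exact storage name so that another storage whose name starts with the same string is left alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of stor2rrd): move one storage's data
# directory out of the way so it is rediscovered on the next run.
# $1 = stor2rrd data dir, $2 = storage name, $3 = where to keep the old copy
retire_storage_data() {
    data_dir=$1
    name=$2
    dest=$3
    mkdir -p "$dest" || return 1
    # Exact name only; a wildcard such as name* could match other storages.
    if [ -d "$data_dir/$name" ]; then
        mv "$data_dir/$name" "$dest/$name.$(date +%Y%m%d%H%M%S)" || return 1
        echo "moved $data_dir/$name to $dest"
    else
        echo "no such storage directory: $data_dir/$name" >&2
        return 1
    fi
}
```

Example call, run from the STOR2RRD working directory: retire_storage_data data XIVNAMEHERE /tmp/stor2rrd-old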

atifsyed · July 2019

I removed the old data using rm -Rf data/xivnamehere*

Rediscovery worked, and now some hosts have values in the charts while others show the following error in the GUI:

Error happened

Check

- $$LPAR2RRD_HOME$$/logs/error-cgi.log

- Web server error log

When I checked logs/error.log-xivname, I saw it flooded with entries like this one:

Warning Tue Jul 9 11:50:13 2019 xivperf.pl: Same sample: IBM.2812-7860226.volume.xxxxxxxxxxxx = 20190629071904.000000+000, skipping

where xxxxxxxxxxxx changes on every line. Interestingly, June 29 is when we stopped the remote mirror relationships, unlocked the volumes in this XIV and made them independent volumes. It looks like it has nothing to do with the reboot but is somehow affected by stopping the relationship and unlocking the volumes on the XIV.

The XIV itself is working fine, but stor2rrd has not worked properly since the replication change.
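One way to check whether every skipped volume is stuck on the same sample time is to count the distinct timestamps in the error log. This is a minimal sketch that assumes the log lines have exactly the "Same sample: <key> = <timestamp>, skipping" format quoted above; the function name summarize_same_sample is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count "Same sample" warnings per statistic timestamp.
# $1 = path to the error log, e.g. logs/error.log-xivname
summarize_same_sample() {
    grep 'Same sample:' "$1" \
        | sed -n 's/.*Same sample: .* = \([^,]*\),.*/\1/p' \
        | sort \
        | uniq -c
}
```

Example call: summarize_same_sample logs/error.log-xivname. If the output is a single line dated 20190629, every warning refers to the day the mirror relationships were stopped.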
