Performance data stopped for ONTAP 9.1P8 system - load_host failed
Hello Jirka,
for one of our ONTAP 9.1 systems, the performance data in Stor2RRD stopped 2 weeks ago. Our other systems are displaying just fine. I can't remember us making any changes to the systems.
Today I upgraded Stor2RRD from 2.01 to 2.20. This did not resolve the issue. But after the upgrade the historical performance data of the affected system is not displaying any more :-(
With 2.01 we received these messages in /home/lpar2rrd/stor2rrd/logs/error.log:
Tue Mar 6 13:40:34 CET 2018: an error in /usr/bin/perl -w /home/lpar2rrd/stor2rrd/bin/storage.pl: 1 /home/lpar2rrd/stor2rrd/bin/storage.pl:186
Tue Mar 6 13:40:34 2018: jaf-aff-01 : load_host failed: ERROR: /home/lpar2rrd/stor2rrd/data/jaf-aff-01/NODE/jaf-aff-01-02.rrd: conversion of '-' to float not complete: tail '-' at /home/lpar2rrd/stor2rrd/bin/LoadDataModule.pm line 16489.
Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, line 146.
With 2.20 we receive these messages in /home/lpar2rrd/stor2rrd/logs/error.log:
Tue Mar 6 14:20:17 2018: jaf-aff-01 : load_host failed: ERROR: /home/lpar2rrd/stor2rrd/data/jaf-aff-01/NODE/jaf-aff-01-02.rrd: conversion of '-' to float not complete: tail '-' at /home/lpar2rrd/stor2rrd/bin/LoadDataModule.pm line 27359. /home/lpar2rrd/stor2rrd/bin/storage.pl:198
Tue Mar 6 14:20:17 CET 2018: an error in /usr/bin/perl -w /home/lpar2rrd/stor2rrd/bin/storage.pl: 1
Comments
-
I have the same problem. In the error log I found:
ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
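Because the same message repeats many times, it can help to reduce the log to the distinct RRD files it names. A minimal sketch, assuming the log lines look like the ones quoted above; `list_bad_rrds` is a hypothetical helper name, not part of STOR2RRD, and the log path has to be adjusted to the local installation.

```shell
#!/bin/sh
# list_bad_rrds: hypothetical helper, not part of STOR2RRD.
# Prints each distinct .rrd path that appears in a
# "conversion of '-' to float not complete" error line.
list_bad_rrds() {
    log_file="$1"
    # Keep only the matching error lines, cut out the path between
    # "ERROR: " and ": conversion of", then drop duplicates.
    grep "conversion of '-' to float not complete" "$log_file" \
        | sed -n 's/.*ERROR: \(.*\.rrd\): conversion of .*/\1/p' \
        | sort -u
}

# Usage (path is an assumption, adjust to your installation):
# list_bad_rrds /home/stor2rrd/logs/error.log
```

If this prints only one or two files, the problem is limited to those RRDs rather than to the whole storage.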
-
Hi again,
Today I uploaded the system logs of our installation.
I forgot to mention: the system with the problem is named JAF-AFF-01
cd /home/stor2rrd/stor2rrd # or wherever your STOR2RRD working dir is
tar cvhf logs.tar logs etc tmp/*txt
gzip -9 logs.tar
Send us logs.tar.gz via https://upload.stor2rrd.com
-
Hi Mario,
your last valid data is from 22nd Feb; after that, corrupted data arrived from the storage.
Have you upgraded the NetApp firmware?
Try removing the old data:
cd /home/stor2rrd/stor2rrd
rm data/jaf-aff-01/*perf*
Let it work for some time ...
Unfortunately, we see another problem in the log that has no solution from our side; read more here: http://www.stor2rrd.com/netapp-ontap_v9.htm
This is in your actual logs:
ERROR Tue Apr 24 08:00:15 2018 naperf.pl: Bad NetApp data detected! Collection ignored... SUBSYS: volume NAME: file_sas_01 METRIC: read_data VALUE: 117911140KB
-
Hi Leszek,
try something similar to what we advised above:
cd /home/stor2rrd/stor2rrd
rm data/mam-clst1/*perf*
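For anyone who would rather not delete the files outright, the same step can be done by moving them aside first. This is only a sketch: `move_perf_files` is a hypothetical helper, not part of STOR2RRD, and the working dir, storage name and backup location are all assumptions to be replaced with local values.

```shell
#!/bin/sh
# move_perf_files: hypothetical helper, not part of STOR2RRD.
# Moves data/<storage>/*perf* into a timestamped backup directory
# instead of removing the files, and prints the backup path.
move_perf_files() {
    workdir="$1"      # STOR2RRD working dir, e.g. /home/stor2rrd/stor2rrd
    storage="$2"      # storage name as used under data/, e.g. mam-clst1
    backup_root="$3"  # where the backup directory is created (assumption)

    if [ -z "$backup_root" ] || [ ! -d "$workdir/data/$storage" ]; then
        echo "usage: move_perf_files WORKDIR STORAGE BACKUP_ROOT" >&2
        return 1
    fi

    backup="$backup_root/perf-$storage-$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup" || return 1

    for f in "$workdir/data/$storage"/*perf*; do
        [ -f "$f" ] || continue   # skip if the glob matched nothing
        mv "$f" "$backup"/ || return 1
    done
    echo "$backup"
}

# Usage (all three arguments are examples, adjust to your installation):
# move_perf_files /home/stor2rrd/stor2rrd mam-clst1 /var/tmp
```

Only regular files matching `*perf*` directly under `data/<storage>` are moved, which mirrors what the `rm` command above would remove.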
-
Hello Pavel,
Many thanks! The data for our System is displayed again :-)
I remember we had an authentication issue because the system time of the NetApp was off by more than 15 minutes. I think the error came from this time issue.
We have no issue with the data display of file_sas_01. Just ignoring the bad NetApp data detected message ;-)
Mario
-
Hi Pavel,
Bingo! Removing old *perf* data helped. Thank you.
Leszek
-
Hi,
I got the exact same issue with 2.41 and ONTAP 9.4P3. Removing the old perf data helped only for a few minutes. Any other hints?
Chris
-
BTW: I got this issue also with Version 2.40. It worked fine for months. The only thing we changed is to put more workload on the Netapp.
-
Hi,
send us this:
cd /home/stor2rrd/stor2rrd
tail -100 logs/error.log-<storage name>
support@stor2rrd.com
-
Surprised by your quick answer, I sent the logs.
-
I've sent you both logs (there are two NetApp systems). One holds the active data, the other one is only a SnapMirror target. The volumes and data on both systems are very similar. So maybe the identical volume names cause this issue!?!
-
Hi Pavel, you did a great job and your response was very quick!
Many thanks!