Performance data stopped for ONTAP 9.1P8 system - load_host failed

Hello Forum,
Hello Jirka,

For one of our ONTAP 9.1 systems, the performance data in Stor2RRD stopped updating 2 weeks ago. Our other systems are displayed just fine. I can't recall that we made any changes to the systems.

Today I upgraded Stor2RRD from 2.01 to 2.20. This did not resolve the issue, and since the upgrade the historical performance data of the affected system is no longer displayed either :-(

With 2.01 we received these messages in /home/lpar2rrd/stor2rrd/logs/error.log:

Tue Mar 6 13:40:34 CET 2018: an error in /usr/bin/perl -w /home/lpar2rrd/stor2rrd/bin/storage.pl: 1 /home/lpar2rrd/stor2rrd/bin/storage.pl:186
Tue Mar 6 13:40:34 2018: jaf-aff-01 : load_host failed: ERROR: /home/lpar2rrd/stor2rrd/data/jaf-aff-01/NODE/jaf-aff-01-02.rrd: conversion of '-' to float not complete: tail '-' at /home/lpar2rrd/stor2rrd/bin/LoadDataModule.pm line 16489.
Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, line 146.

With 2.20 we receive these messages in /home/lpar2rrd/stor2rrd/logs/error.log:

Tue Mar  6 14:20:17 2018: jaf-aff-01 : load_host failed: ERROR: /home/lpar2rrd/stor2rrd/data/jaf-aff-01/NODE/jaf-aff-01-02.rrd: conversion of '-' to float not complete: tail '-' at /home/lpar2rrd/stor2rrd/bin/LoadDataModule.pm line 27359. /home/lpar2rrd/stor2rrd/bin/storage.pl:198
Tue Mar  6 14:20:17 CET 2018: an error in /usr/bin/perl -w /home/lpar2rrd/stor2rrd/bin/storage.pl: 1

Thanks for your help, Mario

Comments

  • I have the same problem. In the error log I found:
    ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
    Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
    ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
    Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
    ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359
    Use of uninitialized value $RRDp::error_mode in string eq at /usr/lib64/perl5/vendor_perl/RRDp.pm line 168, <FH> line 100.
    ERROR: /home/stor2rrd/data/mam-clst1/NODE/mam-clst1-01.rrd: conversion of '-' to float not complete: tail '-' at /home/stor2rrd/bin/LoadDataModule.pm line 27359


  • Hi again,
    Today I uploaded the system logs of our installation.
    I forgot to mention: the system with the problem is named JAF-AFF-01
    cd /home/stor2rrd/stor2rrd # or wherever your STOR2RRD working dir is
    tar cvhf logs.tar logs etc tmp/*txt
    gzip -9 logs.tar
    
    Send us logs.tar.gz via https://upload.stor2rrd.com

  • Hi Mario,

    Your last valid data is from 22nd Feb; after that, corrupted data started arriving from the storage.
    Have you upgraded the NetApp firmware?

    Try removing the old data:
    cd /home/stor2rrd/stor2rrd
    rm data/jaf-aff-01/*perf*

    Let it work for some time ...

    Unfortunately, we also see another problem in the log that we cannot solve from our side; read more here: http://www.stor2rrd.com/netapp-ontap_v9.htm

    This is in your current logs:

    ERROR Tue Apr 24 08:00:15 2018 naperf.pl: Bad NetApp data detected! Collection ignored... SUBSYS: volume  NAME: file_sas_01  METRIC: read_data  VALUE: 117911140KB

  • Hi Leszek,

    Try the same thing we advised above:

    cd /home/stor2rrd/stor2rrd
    rm data/mam-clst1/*perf*
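
A more cautious variant of the step above is to move the *perf* files aside instead of deleting them, so they can still be inspected or sent to support later. The following is only a sketch: the function name `stash_perf_files` and the `perf-backup` directory are hypothetical and not part of STOR2RRD, and it assumes the `data/<storage name>/` layout shown in this thread.

```shell
# Hypothetical helper, not shipped with STOR2RRD.
# Moves a storage's *perf* files into a timestamped backup directory
# instead of deleting them, then prints the backup directory path.
# Arguments: <STOR2RRD working dir> <storage name>
stash_perf_files() {
    workdir=$1
    storage=$2
    # "perf-backup" is an assumed location; any directory outside data/ works.
    backup="$workdir/perf-backup/$storage-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup" || return 1
    for f in "$workdir/data/$storage"/*perf*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv "$f" "$backup"/ || return 1
    done
    echo "$backup"
}

# Usage (as the STOR2RRD user):
#   stash_perf_files /home/stor2rrd/stor2rrd mam-clst1
```

Once the graphs are updating again, the backup directory can be removed by hand.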


  • Hello Pavel,
    Many thanks! The data for our System is displayed again :-)

    I remember we had an authentication issue because the system time of the NetApp was off by more than 15 minutes. I think the error came from this time issue.

    We have no issue with the data display of file_sas_01. We are just ignoring the "Bad NetApp data detected" message ;-)

    Mario
  • Hi Pavel,
    Bingo! Removing old *perf* data helped. Thank you.
    Leszek
  • Hi,

    I got the exact same issue with 2.41 and ONTAP 9.4P3. Removing the old perf data helped only for a few minutes. Any other hints?

    Chris
  • BTW: I also got this issue with version 2.40. It worked fine for months. The only thing we changed was putting more workload on the NetApp.
  • Hi,

    please send us the output of this:
    cd /home/stor2rrd/stor2rrd
    tail -100 logs/error.log-<storage name>

    support@stor2rrd.com


  • I was surprised by your quick answer. I have sent the logs.
  • I've sent you both logs (there are two NetApp systems). One holds the active data, the other one is only a SnapMirror target. The volumes and data on both systems are very similar, so maybe the identical volume names cause this issue!?
  • Resolution of the last problem, reported by user hochstic:

    The problem was duplicate entries for the NetApp agent in crontab.
    The agent was started twice for each storage and corrupted the output file with perf data (two processes wrote to a single file at the same time).
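
For anyone who wants to check their own crontab for this, here is a minimal sketch. The function name `find_duplicate_cron_entries` is hypothetical and not part of STOR2RRD. It only catches entries that are textually identical after whitespace is normalised, so two differently written lines that start the same agent would still need a manual look.

```shell
# Hypothetical helper, not shipped with STOR2RRD.
# Reads crontab text on stdin and prints every entry that occurs more than once.
find_duplicate_cron_entries() {
    # Drop comments and blank lines, collapse whitespace, then report repeats.
    sed -E -e '/^[[:space:]]*(#|$)/d' \
           -e 's/[[:space:]]+/ /g' \
           -e 's/^ //' \
           -e 's/ $//' \
        | sort \
        | uniq -d
}

# Usage (as the STOR2RRD user):
#   crontab -l | find_duplicate_cron_entries
```

If the command prints nothing, no entry is listed twice in that user's crontab.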

  • Hi Pavel, you did a great job and your response was very quick!
    Many thanks!