HPE3PAR Error in gathering statvlun
Hello,
I'm seeing the following in the error.log for my HPE 3PAR 20400:
<snip>
Mon Feb 4 12:36:03 2019 ERROR : hp3parperf.pl: Parent died in SIG ALRM - return code: 102
Mon Feb 4 12:36:09 2019 ERROR : hp3parperf.pl: Child server statvlun died in SIG ALRM - return code: 104
Mon Feb 4 12:36:09 2019 ERROR : hp3parperf.pl: Child server statvlun died in SIG ALRM - return code: 104
<snip>
Months ago I had the same problem, which I fixed by using:
<snip>
hp3parperf.pl:
#$command = $ssh." statvlun -rw -d ".$interval." -iter 1";
$command = $ssh." statvlun -ni -rw -d ".$interval." -iter 1";
<snip>
$command = $ssh." statvlun -ni -rw -d ".$interval." -iter 1";
<snip>
I'm using a quite old version of STOR2RRD (2.01). Might this be the reason for the behaviour?
Cheers,
ku
Comments
-
Hi,
do you have many vluns? Roughly how many?
We do not use -ni.
Definitely upgrade to the latest version, 2.41. I cannot say whether that resolves it; if not, send us logs.
The latest version is the only version we support for free users.
cd /home/stor2rrd/stor2rrd # or wherever your STOR2RRD working dir is
export STORAGE_NAME=<your storage name alias>
tar cvhf logs.tar tmp/$STORAGE_NAME *txt logs/*$STORAGE_NAME
gzip -9 logs.tar
Send us logs.tar.gz via https://upload.stor2rrd.com
-
Hello Pavel,
we have around 26000 vluns in use.
After changing the Perl script as mentioned in my previous post, we exclude the vluns that have no performance data sampled by using the "-ni" switch. After doing this everything went fine.
I will follow your suggestion and upgrade, though I do not think it will help. I suspect the timeout values need to be adjusted accordingly.
We sample HPE 3PAR data every 5 minutes. We have 3 HPE 3PARs in use, and the problem occurs only on the biggest one.
When the commands are executed manually (via the ssh command line), results are returned properly.
Sometimes we receive "unable to alloc 4800 bytes" on execution.
Cheers,
ku
-
How long does that command take when run directly over ssh?
time ssh ... "statvlun ..."
There is an alarm set to 10 minutes if SAMPLE_RATE is 300 (etc/stor2rrd.cfg), which is obviously timing out.
-
Hello Pavel,
I did the upgrade to 2.41; the problem persists.
I uploaded the data via upload.stor2rrd.com.
SAMPLE_RATE=300
Output of the time command while executing "statvlun -ni -rw -d 1 -iter 1":
<snip>
real 0m22.711s
user 0m0.176s
sys 0m0.062s
<snip>
Cheers,
ku
-
Hello Pavel,
since the upgrade to 2.41 it seems to work.
Thanks for your quick support on this.
Cheers,
ku
-
Sorry to bother you, but it works only sporadically.
I am still seeing this in error.log-HPE3PAR20400:
<snip>
Tue Feb 5 08:56:09 2019 ERROR : hp3parperf.pl: Child server statvlun died in SIG ALRM - return code: 104
Tue Feb 5 08:56:09 2019 ERROR : hp3parperf.pl: Child (27066384) exited with code 26624 (6800) = 104
<snip>
This happens even after changing the statvlun command to:
statvlun -ni -rw -d 1 -iter 1
Any ideas?
Cheers,
ku
-
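The three numbers in the "exited with code 26624 (6800) = 104" line are consistent with a raw wait status of the kind Perl reports in `$?`, where the exit code sits in the high byte. A minimal shell sketch of that arithmetic, assuming this interpretation is right (the variable names are illustrative and do not come from hp3parperf.pl):

```shell
#!/bin/sh
# Raw status as printed in error.log (assumed to be a wait status like Perl's $?).
raw_status=26624

# Hexadecimal form, matching the "(6800)" shown in the log line.
hex_status=$(printf '%x' "$raw_status")

# The exit code is the high byte of the 16-bit status.
exit_code=$(( raw_status >> 8 ))

# The low 7 bits would hold a terminating signal number, if there was one.
signal=$(( raw_status & 127 ))

echo "hex=$hex_status exit_code=$exit_code signal=$signal"
```

Read this way, the child exited on its own with code 104 rather than being killed by an unhandled signal, which fits the "died in SIG ALRM - return code: 104" message logged just before it.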
time ssh <3par> "statvlun -ni -rw -d 300 -iter 1 "
-
See below:
real 5m22.021s
user 0m0.248s
sys 0m0.068s
-
Edit bin/hp3parperf.pl and add 3 lines after line 664:
print "DEBUG: statvlun $timeout\n";
$timeout = 600;
It will look like this:
sub server_statvlun {
  if ( not defined $timeout or $timeout eq '' ) {
    $timeout = 360;
  }
  else {
    $timeout = $timeout + 60;
  }
  print "DEBUG: statvlun $timeout\n";
  $timeout = 600;
Let it run and let us know if that helps. Also attach the output of this:
tail logs/output.log-HPE3PAR20400
-
I made the change in bin/hp3parperf.pl:
<snip>
+497 sub server_statvlun {
+498 if ( not defined $timeout or $timeout eq '' ) {
+499 $timeout = 360;
+500 }
+501 else {
+502 $timeout = $timeout + 60;
+503 }
+504
+505 print "DEBUG: statvlun $timeout\n";
+506 $timeout = 600;
<snip>
In my bin/hp3parperf.pl the statement you mentioned starts at line 497.
Cheers,
ku
-
I received this in error.log-HPE3PAR20400:
<snip>
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 605.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
<snip>
-
Use this:
Gunzip it and copy it to /home/stor2rrd/stor2rrd/bin
If your web browser gunzips it automatically, then just rename it: mv hp3parperf.pl.gz hp3parperf.pl
The size must be:
ls -l bin/hp3parperf.pl
-rwxrwxr-x+ 1 stor2rrd stor2rrd 20974 Feb 6 15:43 bin/hp3parperf.pl
-
I reverted the change to the following:
<snip>
sub server_statvlun {
  if ( not defined $timeout or $timeout eq '' ) {
    $timeout = 360;
    $timeout = 600;
  }
  else {
    $timeout = $timeout + 60;
    $timeout = 600;
  }
<snip>
The error log reports nothing now.
-
I really wanted the debug print in there to see how the timeout was set before. Can you put the file in place as per my previous message and send this?
tail logs/output.log-HPE3PAR20400
-
Using your script ends up in:
<snip>
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 605.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 605.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 607.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 607.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
Global symbol "$command" requires explicit package name at /opt/stor2rrd/bin/hp3parperf.pl line 607.
Execution of /opt/stor2rrd/bin/hp3parperf.pl aborted due to compilation errors.
<snip>
I'm going back to my script, which works.
Cheers,
ku
-
You must be using something other than the file at the link I provided; that one does not fail. Moreover, there is no $command variable on the line with the error:
open( my $tmpfile, '>', $filename ) || error( "open file failed: $filename : $!", 111 ) && exit 111;
The script at the link has this size:
ls -l bin/hp3parperf.pl
-rwxr-xr-x 1 stor2rrd stor2rrd 20974 Feb 7 09:36 bin/hp3parperf.pl
Please use it and let us know.
-
By the way, fixing the timeout from 300 to 600 solves the problem. Statistics are now collected correctly.
Thanks for your support.
-
I know, but what I want to see is how $timeout was set before. It should have been set to 360, and with that value it would have worked even before, since the command takes about 320 seconds on your side as per the testing above.
Thanks.
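The point about 360 versus roughly 320 seconds can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: `to_seconds` and `effective_timeout` are hypothetical helpers written for this example, `effective_timeout` mirrors the if/else shown earlier in the thread, and it assumes the incoming $timeout equals the SAMPLE_RATE of 300, which the thread never confirms.

```shell
#!/bin/sh
# Convert a "real" value from time, e.g. "5m22.021s", to whole seconds (rounded up).
to_seconds() {
    minutes=${1%%m*}        # text before the "m"
    rest=${1#*m}            # text after the "m", e.g. 22.021s
    seconds=${rest%s}       # drop the trailing "s"
    whole=${seconds%%.*}    # integer part of the seconds
    total=$(( minutes * 60 + whole ))
    # Round up when there is a fractional part.
    if [ "$whole" != "$seconds" ]; then
        total=$(( total + 1 ))
    fi
    echo "$total"
}

# Mirror of the timeout selection quoted in the thread:
# 360 when nothing is set, otherwise the given value plus 60.
effective_timeout() {
    if [ -z "$1" ]; then
        echo 360
    else
        echo $(( $1 + 60 ))
    fi
}

runtime=$(to_seconds "5m22.021s")   # measured with -d 300 in the thread
limit=$(effective_timeout 300)      # assumption: $timeout arrives as SAMPLE_RATE
headroom=$(( limit - runtime ))

echo "runtime=${runtime}s timeout=${limit}s headroom=${headroom}s"
```

With the values from the thread this prints `runtime=323s timeout=360s headroom=37s`. If $timeout really was 360, the run should have fit inside the limit, which is why the debug print of the actual value was requested.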