Linux on Power Agent Problem
Hi,
We have a problem: the lpar2rrd agent, version 5.05-4, on our SLES12 SP2 (ppc64le) host is not sending any data to the lpar2rrd server.
We collect data through the HMC, but the additional data (SAN IOPS etc.) is not collected, and that would be important information for us.
When I try to change the Data Source in the web UI, I see only "HMC (agentless)".
LPAR2RRD server: 5.05
$ tail -1 /var/tmp/lpar2rrd-agent-192.168.121.16-root.txt | cut -d : -f 1-7
9080-MHE*#######:lnx0a:9:1513843202:Thu Dec 21 09:00:02 2017 version 5.05-4
$ tail -1 /var/tmp/lpar2rrd-agent-192.168.121.16-root-ps_job.txt | cut -d : -f 1-7
408617,root,[kworker/18:0],00:00:00,0,0
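The `cut -d : -f 1-7` above keeps seven colon-separated fields because the human-readable date contains two colons of its own. As a rough illustration, the header of such a data line could be split as in the Python sketch below. The key names (`server`, `lpar`, `third`, `epoch`) are assumptions inferred from the sample line, not taken from the agent's documentation, and the meaning of the third field is not stated anywhere in this thread.

```python
from datetime import datetime, timezone

def parse_agent_header(line):
    """Split the first seven colon-separated fields of an agent data line.

    Key names are guesses based on the sample output in this thread.
    """
    parts = line.split(":", 7)  # an 8th element, if present, holds the rest of the record
    if len(parts) < 7:
        raise ValueError("expected at least 7 colon-separated fields")
    epoch = int(parts[3])
    return {
        "server": parts[0],
        "lpar": parts[1],
        "third": parts[2],  # meaning not documented in the thread
        "epoch": epoch,
        "utc": datetime.fromtimestamp(epoch, tz=timezone.utc),
        # Fields 5-7 are the date string, which was split on its own colons.
        "stamp": ":".join(parts[4:7]),
    }

header = parse_agent_header(
    "9080-MHE*#######:lnx0a:9:1513843202:Thu Dec 21 09:00:02 2017 version 5.05-4"
)
print(header["lpar"], header["epoch"], header["utc"].isoformat())
# prints: lnx0a 1513843202 2017-12-21T08:00:02+00:00
```

The epoch value converts to 08:00:02 UTC, one hour behind the 09:00:02 printed in the line, which is consistent with the agent host writing local time in a UTC+1 zone.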
Telnet works:
$ telnet 192.168.121.16 8162
Trying 192.168.121.16...
Connected to 192.168.121.16.
Escape character is '^]'.
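The same reachability check can be scripted when telnet is not installed. This is a minimal Python sketch using only the standard `socket` module; the function name `port_is_open` is made up for illustration. A successful connection only shows that the port is reachable, not that the daemon accepts the agent's data.

```python
import socket

def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

With the values from the telnet test above, the call would be `port_is_open("192.168.121.16", 8162)`.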
Thank you for your help!
Comments
-
This output looks like something is failing in the agent. Please run:
/usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER>
-
$ s /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl 192.168.121.16 -d
LPAR2RRD agent version:5.05-4
Thu Dec 21 15:57:26 2017
main timeout 600
JOB TOP setting: MAX_JOBS=20, LOAD_LIMIT=10, LOAD_LIMIT_MIN=1, PROCESSES_INCLUDE= --
uname -W 2>/dev/null
false
false
false
cat /proc/meminfo 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
cat /proc/ppc64/lparcfg 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
false
cat /proc/vmstat 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
false 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
false 2>/dev/null
false 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
ifconfig -a 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
false 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
uname -n 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
vmstat 60 2 2>>/var/tmp/lpar2rrd-agent-192.168.121.16-root.err
9080-MH:lnx00:9:1513868246:Thu Dec 21 15:57:26 2017 version 5.05-4:|::::mem:::4290320768:3661468032:6288527363603741888:57726144:pgs:::0:0:24607.9375:0.0:::lan:eth0:192.168.101.10:16962918304963:175760633497:::::lan:eth1:10.1.245.10:7764992448214:582111082689:::::lan:eth2:172.30.61.10:944962155345:8633136337424:::::cpu:::0:0:1:0::
/var/tmp/lpar2rrd-agent-192.168.121.16-root.txt
9080-MHE:lnx0a:9:1513868246:Thu Dec 21 15:57:26 2017 version 5.05-4:|::::mem:::4290320768:3661468032:6288527363603741888:57726144:pgs:::0:0:24607.9375:0.0:::lan:eth0:192.168.101.10:16962918304963:175760633497:::::lan:eth1:10.1.245.10:7764992448214:582111082689:::::lan:eth2:172.30.61.10:944962155345:8633136337424:::::cpu:::0:0:1:0::
Agent send : not sending data this time (act_time=1513868246, last_send_time=1513867802, next_time=1513868102, random=6)
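The "not sending data this time" message carries four counters that are easier to compare once they are turned into differences. The Python sketch below only extracts them and computes two deltas. It does not reproduce the agent's decision logic, which is not shown in this thread, and the derived key names are made up for illustration.

```python
import re

SEND_RE = re.compile(
    r"act_time=(\d+), last_send_time=(\d+), next_time=(\d+), random=(\d+)"
)

def parse_send_message(line):
    """Extract the counters from an 'Agent send' message, or return None if absent."""
    match = SEND_RE.search(line)
    if match is None:
        return None
    act, last, nxt, rnd = (int(group) for group in match.groups())
    return {
        "act_time": act,
        "last_send_time": last,
        "next_time": nxt,
        "random": rnd,
        "next_minus_last": nxt - last,  # seconds between last send and next_time
        "act_minus_last": act - last,   # seconds elapsed since the last send
    }

msg = ("Agent send : not sending data this time (act_time=1513868246, "
       "last_send_time=1513867802, next_time=1513868102, random=6)")
info = parse_send_message(msg)
print(info["next_minus_last"], info["act_minus_last"])
# prints: 300 444
```

In this sample `next_time` is exactly 300 seconds after `last_send_time`.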
-
OK, I did not notice that you had cut the output in the previous example. Here it looks OK. Please also send the output of:
ls -l /var/tmp/lpar2rrd*
tail -2 /var/tmp/lpar2rrd*err
-
$ ls -l /var/tmp/lpar2rrd*
-rw-r--r-- 1 root root 123642 Dec 21 16:30 /var/tmp/lpar2rrd-agent-192.168.121.16-root-ps_job.txt
-rw-r--r-- 1 root root 2169 Nov 30 16:09 /var/tmp/lpar2rrd-agent-192.168.121.16-root.err
-rw-r--r-- 1 root root 0 Dec 21 16:31 /var/tmp/lpar2rrd-agent-192.168.121.16-root.stamp
-rw-r--r-- 1 root root 10 Dec 21 16:31 /var/tmp/lpar2rrd-agent-192.168.121.16-root.stamp-send
-rw-r--r-- 1 root root 0 Dec 21 14:33 /var/tmp/lpar2rrd-agent-192.168.121.16-root.stamp-trimlogs
-rw-r--r-- 1 root root 1745 Dec 21 16:36 /var/tmp/lpar2rrd-agent-192.168.121.16-root.txt
-rw-r--r-- 1 root root 0 Dec 21 16:31 /var/tmp/lpar2rrd-agent-192.168.121.16-root.txt-tmp
-rw-r--r-- 1 root root 6332 Dec 21 16:31 /var/tmp/lpar2rrd-agent-192.168.121.16-root.txtorig
-rw-r--r-- 1 root root 0 Nov 17 14:00 /var/tmp/lpar2rrd-agent-nmon-192.168.121.16-root-time_file.txt
$ tail -2 /var/tmp/lpar2rrd*err
Thu Nov 30 16:09:06 2017: wrong server response: agent_time:1512054001 : recv_time:0 :
Thu Nov 30 16:09:06 2017: Error: Not all data has been sent out, refused line: 9080-MHE*78:lnx0a:9:1512054001:Thu Nov 30 16:00:01 2017 version 5.05-4:|::::mem:::4290320768:3509294592:7810261763453965952:55328640:pgs:::0:0:24607.9375:0.0:::lan:eth0:192.168.101.10:9502665536164:125387129903:::::lan:eth1:10.1.245.10:7755117983677:579608000258:::::lan:eth2:172.30.61.10:679845861905:7290371919012:::::cpu:::0:0:0:0:::CPUTOP:263757:h3eadm:hdbindexserver -port 30040:4265820:1074:3426822976:3431358272::CPUTOPb/log/DB_H31/:683:75:241536:643776::CPUTOP:263759:h3eadm:hdbindexserver -port 30043:219542:65:25269568:101901056::CPUTOP:263629:h3eadm:hdbnameserver:92488:27:19408832:98522880::CPUTOP:47876:sapadm:/usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D:73548:21:183168:1698112::CPUTOP:48827:root:/usr/sap/hostctrl/exe/saposcol -l -w60 pf=/usr/sap/hostctrl/exe/host_profile:50878:15:19648:24640::CPUTOP:263715:h3eadm:hdbpreprocessor:52975:15:1698816:91459520::CPUTOP:264237:h3eadm:hdbwebdispatcher:46682:14:1816640:90378496::CPUTOP:263713:h3eadm:hdbcompileserver:45577:14:1555008:90306944::CPUTOP:120017:daaadm:/u... -hostvm -nodeName=smdagent -file=/usr/sap/DAA/SMDA98/smdagentgroup.properties -jvmFile=/usr/sap/DAA/SMDA98/work/jstart.jvm -traceFile=/usr/sap/DAA/SMDA98/work/dev_smdagent -javaOutFile=/usr/sap/DAA/SMDA98/work/jvm_smdagent.out:34225:11:606592:9144640::CPUTOP:120015:-hostvm -nodeName=smdagent_saph3e00a -file=/usr/sap/DAA/SMDA98/smdagentgroup.properties -jvmFile=/usr/sap/DAA/SMDA98/work/jstart.jvm -traceFile=/usr/sap/DAA/SMDA98/work/dev_smdagent_saph3e00a -javaOutFile=/usr/sap/DAA/SMDA98/work/jvm_smdagent_saph3e00a.out:33635:11:612288:9225152: /opt/lpar2rrd-agent/lpar2rrd-agent.pl:2454
-
It looks fine: data is collected and transferred to the lpar2rrd server.
Are you deleting the serial number from the examples above? I see only the server type, like 9080-MH; the serial should be there too.
On the server side:
cd /home/lpar2rrd/lpar2rrd
ls -l data/*/*/lnx0a/
-
Hi Pavel,
I may have deleted some parts of the output to hide private information.
Here is the ls output:
-rw-r--r-x 1 root root 3825840 Dec 28 08:00 data/9080-.../hmc0001a/lnx0a.rrm
PS: Is there a way to send private information to you only?
Thanks!
-
OK, all data except SAN is collected.
Note that not all FC drivers on Linux provide performance data.
Does this file exist on your Linux host?
ls -l /sys/class/fc_host/*/statistics/tx_frames
Is there any data inside (cat ...)?
BTW: you can reach us at: support at lpar2rrd.com
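The check above (does `tx_frames` exist, and does it contain data) can be wrapped in a small script. This is a Python sketch under stated assumptions: the function name `read_fc_counters` and its parameters are made up for illustration, and only the sysfs path comes from the command above. It makes no claim about how the lpar2rrd agent itself reads these files.

```python
import glob
import os

def read_fc_counters(sysfs_root="/sys/class/fc_host", counter="tx_frames"):
    """Return {host_name: value or None} for every FC host exposing the counter."""
    result = {}
    pattern = os.path.join(sysfs_root, "*", "statistics", counter)
    for path in sorted(glob.glob(pattern)):
        # path looks like <root>/<host>/statistics/<counter>; go up two levels for the host
        host = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as handle:
                text = handle.read().strip()
            # base 0 accepts both "0x..." hex strings and plain decimal strings
            result[host] = int(text, 0)
        except (OSError, ValueError):
            result[host] = None  # file unreadable or content not a number
    return result

counters = read_fc_counters()
if not counters:
    print("no fc_host statistics found")
else:
    for host, value in counters.items():
        print(host, value)
```

An empty result means the driver exposes no `statistics` directory for any FC host on that system.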
-
No, there is no statistics directory:
$ ls -l /sys/class/fc_host/*/statistics/tx_frames
ls: cannot access '/sys/class/fc_host/*/statistics/tx_frames': No such file or directory
$ ls /sys/class/fc_host/*/
/sys/class/fc_host/host0/:
dev_loss_tmo issue_lip port_id port_type subsystem uevent
device maxframe_size port_name power supported_classes
fabric_name node_name port_state speed tgtid_bind_type
/sys/class/fc_host/host1/:
dev_loss_tmo issue_lip port_id port_type subsystem uevent
device maxframe_size port_name power supported_classes
fabric_name node_name port_state speed tgtid_bind_type
So I guess there is no other way for the lpar2rrd-agent to get the SAN data?
Thank you for your help!
-
Hmm, there is no file with stats, as you can see, so there is no way for us to collect them.
If you find any other method to get the perf data on your side, let us know.
-
OK, thank you anyway, and have a nice New Year's Eve.
See you in 2018.
-
I have the same problem with a Solaris LDOM. I see the following messages in the /var/tmp/lpar2rrd-agent.out file:
Agent send : not sending data this time (act_time=1535008380, last_send_time=1535008142, next_time=1535008442, random=9)
On the server I see the data for the host, but it does not show in the GUI:
/home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3
[lpar2rrd@lbdgvlpar2rrd lbdgdbmurex3]$ ls -l
total 20576
-rw-r--r-- 1 lpar2rrd lpar2rrd 7 Aug 9 05:54 agent.cfg
-rw-r--r-- 1 lpar2rrd lpar2rrd 3825840 Aug 9 09:12 cpu.mmm
drwxrwxr-x 2 lpar2rrd lpar2rrd 6 Aug 23 02:13 JOB
-rw-r--r-- 1 lpar2rrd lpar2rrd 14 Aug 22 23:16 lan-net0.cfg
-rw-r--r-- 1 lpar2rrd lpar2rrd 3825840 Aug 23 02:09 lan-net0.mmm
-rw-r--r-- 1 lpar2rrd lpar2rrd 8 Aug 9 05:54 lan-net1.cfg
-rw-r--r-- 1 lpar2rrd lpar2rrd 3825840 Aug 9 09:12 lan-net1.mmm
-rw-r--r-- 1 lpar2rrd lpar2rrd 5738368 Aug 23 02:09 mem.mmm
-rw-r--r-- 1 lpar2rrd lpar2rrd 3825840 Aug 23 02:09 pgs.mmm
-
Hi,
send us this info via support at lpar2rrd.com
agent side:
tail -1 /var/tmp/lpar2rrd*txt
ls -l /var/tmp/lpar2rrd*
lpar2rrd server side:
cd /home/lpar2rrd/lpar2rrd
grep -i lbdgdbmurex3 logs/error.log-daemon| tail
-
root@lbdgdbmurex3:~# tail -1 /var/tmp/lpar2rrd*txt
27960,ora11g,ora_w008_murex1,00:00,4853400,4840800
root@lbdgdbmurex3:~# ls -l /var/tmp/lpar2rrd*
-rw-r--r-- 1 lpar2rrd lpar2rrd 17905 Sep 10 11:00 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd-ps_job.txt
-rw-r--r-- 1 lpar2rrd lpar2rrd 2096 Aug 30 12:45 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.err
-rw-r--r-- 1 lpar2rrd lpar2rrd 0 Sep 10 11:19 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.stamp
-rw-r--r-- 1 lpar2rrd lpar2rrd 10 Sep 10 11:19 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.stamp-send
-rw-r--r-- 1 lpar2rrd lpar2rrd 0 Sep 10 09:40 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.stamp-trimlogs
-rw-r--r-- 1 lpar2rrd lpar2rrd 323 Sep 10 11:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.txt
-rw-r--r-- 1 lpar2rrd lpar2rrd 0 Sep 10 11:19 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.txt-tmp
-rw-r--r-- 1 lpar2rrd lpar2rrd 8792 Sep 10 11:19 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-lpar2rrd.txtorig
-rw-r--r-- 1 root root 0 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.err
-rw-r--r-- 1 root root 0 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.stamp
-rw-r--r-- 1 root root 10 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.stamp-send
-rw-r--r-- 1 root root 0 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.stamp-trimlogs
-rw-r--r-- 1 root root 0 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.txt-tmp
-rw-r--r-- 1 root root 317 Aug 23 01:20 /var/tmp/lpar2rrd-agent-NimserverPRD-bck-root.txtorig
-rw-r--r-- 1 lpar2rrd lpar2rrd 1706 Sep 10 11:20 /var/tmp/lpar2rrd-agent.out
root@lbdgdbmurex3:~#
=======================
[root@lbdgvlpar2rrd lpar2rrd]# grep -i lbdgdbmurex3 logs/error.log-daemon| tail
Sat Jul 28 20:46:20 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803449 when last update time is 1532803449 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803449 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803511 when last update time is 1532803511 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803511 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803575 when last update time is 1532803575 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803575 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803630 when last update time is 1532803630 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803630 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803692 when last update time is 1532803692 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803692 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803760 when last update time is 1532803760 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803760 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803812 when last update time is 1532803812 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803812 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803870 when last update time is 1532803870 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803870 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803926 when last update time is 1532803926 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803926 : :
Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: illegal attempt to update using time 1532803991 when last update time is 1532803991 (minimum one second step) at /home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038. : sending ok time anyway : 1532803991 : :
-
As far as I can tell it is working fine. Do you see data in the GUI --> Unmanaged --> Solaris for that host?
BTW, we are finishing a completely new Solaris implementation which follows Solaris (SPARC/x86) virtualisation (LDOM/Global zone support etc.). It is already being beta tested, with release in Oct 2018.
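In every entry of the server-side log excerpt quoted earlier, the update time equals the last update time, so each line appears to record a repeated sample being rejected rather than new data being lost. A short Python sketch for pulling those pairs out of `logs/error.log-daemon` follows; the regular expression is derived only from the message text quoted in this thread, and the function name is made up for illustration.

```python
import re

UPDATE_RE = re.compile(
    r"ERROR: (\S+): illegal attempt to update using time (\d+) "
    r"when last update time is (\d+)"
)

def rejected_updates(log_text):
    """Yield (file, update_time, last_update_time) for each rejected update."""
    for match in UPDATE_RE.finditer(log_text):
        yield match.group(1), int(match.group(2)), int(match.group(3))

sample = (
    "Sat Jul 28 13:54:42 2018: Client comunication failed - client: (192.168.60.15): "
    "ERROR: /home/lpar2rrd/lpar2rrd/data/Solaris/no_hmc/lbdgdbmurex3/lan-net0.mmm: "
    "illegal attempt to update using time 1532803511 when last update time is "
    "1532803511 (minimum one second step) at "
    "/home/lpar2rrd/lpar2rrd/bin/lpar2rrd-daemon.pl line 1038."
)
for path, new, last in rejected_updates(sample):
    # Equal timestamps mean the same sample arrived twice; a smaller one would be stale data.
    kind = "duplicate timestamp" if new == last else "older timestamp"
    print(path.rsplit("/", 1)[-1], new, kind)
# prints: lan-net0.mmm 1532803511 duplicate timestamp
```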
-
Thanks.