lpar2rrd network monitoring wrong data
We have an AIX LPAR with three network interfaces that carry different traffic, but the lpar2rrd GUI shows the same performance data for all three interfaces, which is not correct.
[MB/sec]
Int | READ/IN Avg | READ/IN Max | WRITE/OUT Avg | WRITE/OUT Max |
---|---|---|---|---|
en10 | 206.63 | 684.22 | 201.58 | 501.72 |
en11 | 206.63 | 684.19 | 201.58 | 501.70 |
en12 | 206.63 | 684.17 | 201.58 | 501.69 |
How can I fix this?
lpar2rrd Agent-version: the latest
lpar2rrd Server-version: 4.70
We are installing the lpar2rrd agent for the first time and want to roll it out to all production machines, but we need to fix this issue first.
I use the following crontab entry:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <lpar2rrd_server>
There have been no entries in the /var/tmp/lpar2rrd*err files since May 11.
Comments
-
Hi
we do not support such an old version for free users.
Please upgrade to the latest 5.00 (server & agent) and let us know if the issue persists.
-
Hi Pavel,
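I have upgraded to the latest lpar2rrd version 5.0 (server and agent) but have the same problem:
[MB/sec]
Int | READ/IN Avg | READ/IN Max | WRITE/OUT Avg | WRITE/OUT Max |
---|---|---|---|---|
en10 | 231.60 | 699.67 | 18.53 | 225.12 |
en11 | 231.60 | 699.67 | 18.53 | 225.11 |
en12 | 231.60 | 699.65 | 18.53 | 225.11 |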
-
I am checking it in our environment and do not see that; everything looks OK on both AIX and Linux.
What is your agent OS?
rpm -qa| grep lpar2rrd
tail -1 /var/tmp/lpar2rrd*txt
-
Hi Pavel,
$ rpm -qa | grep lpar
lpar2rrd-agent-5.00-0
$ oslevel -s
7200-01-01-1642
The agent OS is AIX. Here is the tail -1 output:
==> /var/log/tb017/lpar2rrd-agent-lpar2rrd_server-root.txt <==
9080-MME*1234567:<hostname>:3:1495011360:Wed May 17 10:56:00 2017 version 4.96-0:4024000000|8:<hostname>:0::mem:::1073741824:920092540:153649284:101339108:0:88235776:pgs:::0:0:24576:7:::lan:en10:192.168.203.31:1165993083957646:2335900672640161:136089428177:894643750007:::lan:en11:192.168.113.31:1165993086376987:2335900672896979:136089428232:894643750092:::lan:en12:10.1.236.120:1165993088459607:2335900673179882:136089428278:894643750198:::cpu:::0:11:49:3:::san:fcs0:0x20000090FADC896E:1699758578828377:211911817552992:66670565202:15896992573:::san:fcs1:0x20000090FADC896F:1780582838391467:960022827282432:7942518717:3793802404:::san:fcs2:0x20000090FAE09249:1699759495447486:211928097357920:66657032687:15896959120:::san:fcs3:0x20000090FAE0924A:1800190725660606:973053833142272:8046810280:3833127311:::san_resp:fcs0::0.8:0.4:::::san_resp:fcs1::27.9:4.5:::::san_resp:fcs2::0.8:0.4:::::san_resp:fcs3::27.9:4.5::::
==> /var/log/tb017/lpar2rrd-agent-nmon-lpar2rrd_server-root-time_file.txt <==
SRV_9080-MME*1234567_LPR_<hostname>_TIME_1494507902_XOR_<hostname>_170511_0000.nmon
Maybe it is not displayed correctly because we are using real native hardware network adapters in our LPAR and not VEAs (Virtual Ethernet Adapters), as is usual under AIX?
Thanks!
-
Here is our network setup with physical network adapters. The physical adapters are combined into an EtherChannel, and on the EtherChannel device we created three virtual network interfaces for our three different VLANs:
$ lsdev | grep ent
ent0 Available 02-00 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent1 Available 02-01 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent2 Available 02-02 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent3 Available 02-03 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent4 Available 03-00 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent5 Available 03-01 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent6 Available 03-02 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent7 Available 03-03 PCIe3 4-Port 10GbE SR Adapter (df1020e21410e304)
ent8 Available Virtual I/O Ethernet Adapter (l-lan)
ent9 Available EtherChannel / IEEE 802.3ad Link Aggregation
ent10 Available VLAN
ent11 Available VLAN
ent12 Available VLAN
$ lsattr -El 10
lsattr: 0514-519 The following device was not found in the customized
device configuration database:
10
$ lsattr -El ent9
adapter_names ent0,ent1,ent4,ent5 EtherChannel Adapters True
alt_addr 0xe60086007b02 Alternate EtherChannel Address True
auto_recovery yes Enable automatic recovery after failover True
backup_adapter NONE Adapter used when whole channel fails True
hash_mode default Determines how outgoing adapter is chosen True
interval short Determines interval value for IEEE 802.3ad mode True
mode 8023ad EtherChannel mode of operation True
netaddr 0 Address to ping True
noloss_failover yes Enable lossless failover after ping failure True
num_retries 3 Times to retry ping before failing True
retry_time 1 Wait time (in seconds) between pings True
use_alt_addr yes Enable Alternate EtherChannel Address True
use_jumbo_frame yes Enable Gigabit Ethernet Jumbo Frames True
$ lsattr -El ent11
base_adapter ent9 VLAN Base Adapter True
vlan_priority 0 VLAN Priority True
vlan_tag_id 2113 VLAN Tag ID True
$ lsattr -El ent12
base_adapter ent9 VLAN Base Adapter True
vlan_priority 0 VLAN Priority True
vlan_tag_id 236 VLAN Tag ID True
$ lsattr -El ent10
base_adapter ent9 VLAN Base Adapter True
vlan_priority 0 VLAN Priority True
vlan_tag_id 2203 VLAN Tag ID True
Our service IPs are defined on the network interfaces ent10, ent11 and ent12.
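The lsattr output above can be reduced to a map from each VLAN adapter to its base adapter, which makes it easy to see which interfaces sit on the same EtherChannel. The following is only a sketch in Python: the attribute lines are copied from this thread, the helper names `parse_lsattr` and `group_by_base_adapter` are hypothetical, and it assumes the first two whitespace-separated columns of each line are the attribute name and its value.

```python
from collections import defaultdict

# Attribute lines copied from the `lsattr -El entN` output in this thread.
LSATTR_OUTPUT = {
    "ent10": "base_adapter ent9 VLAN Base Adapter True\n"
             "vlan_priority 0 VLAN Priority True\n"
             "vlan_tag_id 2203 VLAN Tag ID True",
    "ent11": "base_adapter ent9 VLAN Base Adapter True\n"
             "vlan_priority 0 VLAN Priority True\n"
             "vlan_tag_id 2113 VLAN Tag ID True",
    "ent12": "base_adapter ent9 VLAN Base Adapter True\n"
             "vlan_priority 0 VLAN Priority True\n"
             "vlan_tag_id 236 VLAN Tag ID True",
}


def parse_lsattr(text):
    """Return {attribute: value}, taking the first two columns of each line."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            attrs[parts[0]] = parts[1]
    return attrs


def group_by_base_adapter(outputs):
    """Group VLAN devices by the adapter named in their base_adapter attribute."""
    groups = defaultdict(list)
    for device, text in outputs.items():
        attrs = parse_lsattr(text)
        base = attrs.get("base_adapter")
        if base is not None:
            groups[base].append((device, attrs.get("vlan_tag_id")))
    return dict(groups)


groups = group_by_base_adapter(LSATTR_OUTPUT)
for base, vlans in groups.items():
    print(base, "->", vlans)
```

Here all three VLAN adapters end up under ent9, so any statistic taken from the base adapter would be common to en10, en11 and en12.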
-
entstat -d en10| egrep "Bytes"
entstat -d en11| egrep "Bytes"
entstat -d en12| egrep "Bytes"
sleep 10
entstat -d en10| egrep "Bytes"
entstat -d en11| egrep "Bytes"
entstat -d en12| egrep "Bytes"
-
$ entstat -d en10| egrep "Bytes"
Bytes: 1242811655484412 Bytes: 2521087133781420
$ entstat -d en11| egrep "Bytes"
Bytes: 1242811657235642 Bytes: 2521087134061111
$ entstat -d en12| egrep "Bytes"
Bytes: 1242811658715091 Bytes: 2521087134732180
$ sleep 10
$ entstat -d en10| egrep "Bytes"
Bytes: 1242813951892748 Bytes: 2521088268639117
$ entstat -d en11| egrep "Bytes"
Bytes: 1242813952700723 Bytes: 2521088269711420
$ entstat -d en12| egrep "Bytes"
Bytes: 1242814297905151 Bytes: 2521088466227944
$
-
I can see that traffic is very similar on all 3 interfaces, is that possible?
1242813951892748-1242811655484412
2296408336
1242813952700723-1242811657235642
2295465081
1242814297905151-1242811658715091
2639190060
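The subtraction above can be repeated for any pair of samples. The sketch below (Python, not part of the lpar2rrd agent) uses the left-hand "Bytes:" column from the two entstat samples in this thread, which is the transmit column in entstat's layout. It assumes the interval is exactly the 10 seconds of the `sleep`, which ignores the time the commands themselves take, and it uses decimal megabytes. The names `SAMPLES` and `summarize` are made up for the example.

```python
# Left-hand "Bytes:" counters from the two entstat samples in this thread,
# as (before, after) pairs.
SAMPLES = {
    "en10": (1242811655484412, 1242813951892748),
    "en11": (1242811657235642, 1242813952700723),
    "en12": (1242811658715091, 1242814297905151),
}
INTERVAL_S = 10  # assumption: taken from `sleep 10`, command run time ignored


def summarize(samples, interval_s):
    """Return per-interface byte delta and rate in MB/s (1 MB = 10**6 bytes)."""
    summary = {}
    for name, (before, after) in samples.items():
        delta = after - before
        summary[name] = {
            "delta_bytes": delta,
            "mb_per_s": delta / interval_s / 1_000_000,
        }
    return summary


summary = summarize(SAMPLES, INTERVAL_S)

# How far apart the three cumulative counters were in the first sample.
first_values = [before for before, _ in SAMPLES.values()]
first_sample_spread = max(first_values) - min(first_values)

for name, row in summary.items():
    print(f"{name}: {row['delta_bytes']} bytes, {row['mb_per_s']:.2f} MB/s")
print("spread of first-sample counters:", first_sample_spread, "bytes")
```

The cumulative counters of the three interfaces differ by only a few megabytes out of more than a petabyte. Because the commands run one after another, that small difference may simply reflect the time between the calls, which would be consistent with all three interfaces reading one shared counter.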
-
Hi Pavel,
when I look at the current nmon network throughput, the entstat byte counters make no sense to me.
Most of the network traffic is definitely on en12.
-
We just report what entstat reports. If it reports wrong numbers, then it is a problem in entstat.
I remember a similar case. entstat apparently ignores VLAN-tagged networks and shows only the total traffic. Do the interfaces have the same MAC address? Check with: netstat -in
You can upload an nmon file to lpar2rrd to compare.
-
Hi Pavel,
yes, they have the same MAC address, and it looks like this is the problem. Do you remember how to fix that behaviour in AIX, or maybe in lpar2rrd?
-
There is no fix or workaround.
I am not sure whether we (the customer) ever raised a call with IBM.
You can try it; the problem is clear.
-
Hi Pavel,
okay, I will open a call and we will see what IBM says.
thanks anyway.
-
ok, thanks.
Let us know.
-
Hi,
an update from IBM support:
the VLAN adapters get their statistics from the real hardware adapter attached to them.
According to IBM, this is by design.
Maybe there is another way for lpar2rrd with different commands to get the correct results?
What do you think Pavel?
Thanks!
-
Hi,
I do not think we can get such statistics at the VIOS (AIX) level using standard OS commands.
As far as I remember, there is a way to get some VLAN statistics at the HMC level, but it has never had enough priority for us to even look into it.
-
Hello,
I'm seeing the same problem on all LPARs with EtherChannel (on LPARs with a single virtual Ethernet adapter, the network load data is reliable), on any 5.0X version of lpar2rrd and older.
If lpar2rrd simply collects data from the entstat output, could the data be processed incorrectly when several interfaces are interconnected?
The entstat counters on my problem LPARs look reliable, for example:
entstat -r en2
entstat -d en2
sleep 60
entstat -d en2
ETHERNET STATISTICS (en2) :
Device Type: EtherChannel
Elapsed Time: 0 days 0 hours 0 minutes 0 seconds
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 55 Packets: 24
Bytes: 22319 Bytes: 2335
Statistics for every adapter in the EtherChannel:
-------------------------------------------------
Number of adapters: 2
Active channel: primary channel
Operating mode: Network interface backup mode
ETHERNET STATISTICS (ent0) :
Device Type: Host Ethernet Adapter (l-hea)
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 70 Packets: 50
Bytes: 35984 Bytes: 11052
Backup adapter - ent1:
======================
ETHERNET STATISTICS (ent1) :
Device Type: Host Ethernet Adapter (l-hea)
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 0 Packets: 0
Bytes: 0 Bytes: 0
ETHERNET STATISTICS (en2) :
Device Type: EtherChannel
Elapsed Time: 0 days 0 hours 0 minutes 58 seconds
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 223768 Packets: 326280
Bytes: 216985038 Bytes: 115945103
Statistics for every adapter in the EtherChannel:
-------------------------------------------------
Number of adapters: 2
Active channel: primary channel
Operating mode: Network interface backup mode
-------------------------------------------------------------
ETHERNET STATISTICS (ent0) :
Device Type: Host Ethernet Adapter (l-hea)
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 223771 Packets: 326242
Bytes: 216986211 Bytes: 115942447
Backup adapter - ent1:
======================
ETHERNET STATISTICS (ent1) :
Device Type: Host Ethernet Adapter (l-hea)
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 0 Packets: 44
Bytes: 0 Bytes: 3669
P.S. On all lpar2rrd graphs with inaccurate LAN data, the average load is about the same, ~14 Mbps, while the actual load is quite different.
-
Hi,
check what is transferred in the lpar2rrd agent data file below and compare it to the entstat data.
tail -1 /var/tmp/lpar2rrd*txt
Note that this may match 2 files; use the one without "ps" in its name.
You will see whether the data counters are transferred correctly, as in this example:
... lan:en0:192.168.1.9:4069721276:3022293914:17760734:17787938 ...
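To make that comparison less error-prone, the lan records can be pulled out of the agent data line with a small script. This is a sketch in Python and not part of lpar2rrd. It assumes each record has the shape `lan:<interface>:<IPv4 address>:` followed by four numeric counters, as in the example above and in the agent line posted earlier in this thread. The meaning of the four counters is not stated in this thread, so they are kept as an unlabeled tuple. The names `LAN_RECORD` and `parse_lan_records` are hypothetical.

```python
import re

# Assumed record shape: lan:<interface>:<IPv4>:<c1>:<c2>:<c3>:<c4>
LAN_RECORD = re.compile(r"lan:([^:]+):([0-9.]*):(\d+):(\d+):(\d+):(\d+)")


def parse_lan_records(line):
    """Return {interface: {"ip": str, "counters": (c1, c2, c3, c4)}}."""
    records = {}
    for name, ip, c1, c2, c3, c4 in LAN_RECORD.findall(line):
        records[name] = {
            "ip": ip,
            "counters": (int(c1), int(c2), int(c3), int(c4)),
        }
    return records


# The example record quoted above.
EXAMPLE = "... lan:en0:192.168.1.9:4069721276:3022293914:17760734:17787938 ..."

# The lan part of the agent line posted earlier in this thread.
THREAD_FRAGMENT = (
    ":::lan:en10:192.168.203.31:1165993083957646:2335900672640161:136089428177:894643750007"
    ":::lan:en11:192.168.113.31:1165993086376987:2335900672896979:136089428232:894643750092"
    ":::lan:en12:10.1.236.120:1165993088459607:2335900673179882:136089428278:894643750198"
    ":::cpu:::0:11:49:3"
)

example_records = parse_lan_records(EXAMPLE)
thread_records = parse_lan_records(THREAD_FRAGMENT)

for name, record in thread_records.items():
    print(name, record["ip"], record["counters"])
```

Run against the earlier agent line, the first counter of en10, en11 and en12 differs only in the last few digits, which matches what entstat reports for those interfaces.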