Unable to connect after upgrading the IBM i agent to v1.1.1
Yesterday we upgraded the IBM i agent to v1.1.1 and it no longer connects to the server. I followed all the indicated steps, without success. The daemon is running. We downloaded the correct v1.1.1 file; before that we ended RTV (*cntrl) and cleared the history file. I did it twice.
Comments
-
Is there no other message after "IF: C_SNDSTS - protocol 5.1 UTF8 conv", even if you refresh with F5?
It might take about 30 minutes after the agent starts before anything appears there.
-
The last message is this:
IF: C_SNDSTS - protocol 5.1 UTF8 conv
I refresh with F5 and nothing changes.
-
Check whether both agent jobs are running: RTV_SYSSTS and SND_SYSSTS (this one is obviously running).
If not, then check its job log.
-
Yes, both are running. I did the same on another partition and it was OK.
Here is the joblog:
*********************************************************************
CALL PGM(C_RTVSTS) PARM(60 99 X'000100000003FFFFD5404040000A000000030000D5
404040000A00000003FFFFD5404040000A00000003FFFFD5404040000A000000030000D540
4040000A0000' X'0003FFFFD5404040000A000000030000D5404040000A00000003FFFFD5
404040000A00000003FFFFD5404040000A000000030000D5404040000A00000004FFFF' X'
00030000D5404040000A00000003FFFFD5404040000A00000003FFFFD5404040000A000000
030000D5404040000A00000004FFFFD54040400000000C00000004' X'0003FFFFD5404040
000A00000003FFFFD5404040000A000000030000D5404040000A00000004FFFFD540404000
00000C00000004FFFFD54040400000000C0000' X'0003FFFFD5404040000A000000030000
D5404040000A00000004FFFFD54040400000000C00000004FFFFD54040400000000C000000
060000D5404040012C000A000000000012000000070000D54040400000000AC44040404040
4040400000001900000003FFFFD5404040000F00040000D54040400000000C00000001D740
404040C9C4F0F0F0C340404040F1F7F24BF1F94BF1F4F94BF1F7F540404040404040404040
4040404040404040F8F1F6F240D7C86DC4E3C1D8D9C3E55CD3C9C2D34040404040D7C86DC4
E3C1D8E2D5C45CD3C9C2D34040404040D7C86DE4E2D9E2D7C3405CD3C9C2D34040404040D3
D7C1D9F2D9D9C440405CD3C9C2D34040404040D3D7C1D9F2D9D9C440400BB85CD5D6405CE8
C5E2405CE4E3C6F840400001D5404040618896948561D3D7C1D9F2D9D9C440404040404040
40404040404040404040404040404040404040404040404040404040404040404040404040
40404040404000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000' X'00030000D5404040000A00000004FFFFD54040400000000C
00000004FFFFD54040400000000C000000060000D5404040012C000A000000000012000000
070000D54040400000000AC440404040404040400000001900000003FFFFD5404040000F00
*****************************************************************************************
CALL PGM(C_SNDSTS) PARM('172.19.149.175' '8162' X'0001D740404040C9C4F0F0F0
C340404040F1F7F24BF1F94BF1F4F94BF1F7F5' 'PH_DTAQRCV*LIBL' 'PH_DTAQSND*LIBL
' 'PH_USRSPC *LIBL' '*UTF8' 1 'N' '/home/LPAR2RRD')
IF: C_SNDSTS - protocol 5.1 UTF8 conv
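The hexadecimal parameters in the joblog above are EBCDIC-encoded, so they can be made readable off the system. The following is only a minimal sketch: it assumes code page 037 and assumes that the first two bytes (X'0001') of the third C_SNDSTS parameter are a binary prefix rather than text. The function name `decode_ebcdic_hex` is made up for illustration and is not part of the agent.

```python
def decode_ebcdic_hex(hex_string: str, skip_bytes: int = 0) -> str:
    """Decode a hex string of EBCDIC bytes (code page 037 assumed) into text."""
    raw = bytes.fromhex(hex_string)
    # Skip any leading binary bytes that are not meant to be read as text.
    return raw[skip_bytes:].decode("cp037")


# Third parameter of the C_SNDSTS call, copied from the joblog above.
param = (
    "0001D740404040C9C4F0F0F0"
    "C340404040F1F7F24BF1F94BF1F4F94BF1F7F5"
)

# Assumption: the leading X'0001' is a binary prefix, so it is skipped.
print(decode_ebcdic_hex(param, skip_bytes=2))
# → P    ID000C    172.19.149.175
```

The decoded value contains the same IP address that is passed in clear text as the first parameter of the call.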
-
Hi,
To analyze the problem, we need you to start the LPAR2RRD agent in debug mode, and we need two user space dumps.
The path '/home/lpar2rrd' must exist.
1/ Dump the user space with this command (current state)
CMD_DMPUSR USRSPCB(LPAR2RRD/PH_USRSPC) PATHIN('/home/lpar2rrd')
2/ Stop the LPAR2RRD agent
go MENU
opt. 13. END_SYSSTS
3/ Start the LPAR2RRD agent and change the default parameters to CYCLE(1) and DEBUG(*ON)
go MENU
opt.1. Set up parameters to start client as RTV_SYSSTS (opt.10)
opt.10. RTV_SYSSTS
change the default parameters to CYCLE(1) and DEBUG(*ON)
4/ After a few minutes (10 min), stop the LPAR2RRD agent: go MENU, opt. 13. END_SYSSTS
5/ Dump the user space with this command (after)
CMD_DMPUSR USRSPCB(LPAR2RRD/PH_USRSPC) PATHIN('/home/lpar2rrd')
6/ Send us the 4 IFS files from the path '/home/lpar2rrd'
These files are EBCDIC-encoded; please send them in binary mode (FTP .. BIN), with no conversion.
The IFS files look like this:
two DMP_ files, e.g. DMP_2017-02-08-13.41.45.813.log (the date will differ, of course)
one SND_ file, e.g. SND_2016-12-05-22.30.24.773.log (the date will differ, of course)
one RTV_ file, e.g. RTV_2016-12-05-22.30.21.750.log (the date will differ, of course)
Upload them to https://upload.lpar2rrd.com
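Before uploading, it can help to confirm that the expected set of debug files is complete (two DMP_, one SND_, one RTV_). The snippet below is a small sketch under the assumption that the files have already been copied in binary mode to a local directory; the helper name `missing_debug_files` is hypothetical.

```python
from collections import Counter

# Expected number of files per prefix, as listed in the steps above.
EXPECTED = {"DMP_": 2, "SND_": 1, "RTV_": 1}


def missing_debug_files(names):
    """Return {prefix: how many files are still missing} for a list of file names."""
    counts = Counter()
    for name in names:
        for prefix in EXPECTED:
            if name.startswith(prefix) and name.endswith(".log"):
                counts[prefix] += 1
    return {
        prefix: wanted - counts[prefix]
        for prefix, wanted in EXPECTED.items()
        if counts[prefix] < wanted
    }


# Example with the file name patterns quoted above, only one dump present.
names = [
    "DMP_2017-02-08-13.41.45.813.log",
    "SND_2016-12-05-22.30.24.773.log",
    "RTV_2016-12-05-22.30.21.750.log",
]
print(missing_debug_files(names))  # → {'DMP_': 1}
```

On a real copy of the directory, `missing_debug_files(os.listdir(path))` would give the same kind of report; an empty dictionary means all four files are present.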
-
Files sent.
-
Hi,
thanks, we see the problem. It might happen when you have many pools defined in the system.
The last step (5) is a workaround which should make it work.
For further analysis we would need you to start the LPAR2RRD agent in debug mode again (as you did).
1/ Stop the LPAR2RRD agent
go MENU
opt. 13. END_SYSSTS
2/ Start the LPAR2RRD agent and change the default parameters to CYCLE(1) and DEBUG(*ON)
in addition ...
change parameter:
List of ASPs information:
Time period (sec) 60 (default *TIMEPER)
change parameter:
List of Physical Interfaces:
Time period (sec) 60 (default *TIMEPER)
go MENU
opt.1. Set up parameters to start client as RTV_SYSSTS (opt.10)
opt.10. RTV_SYSSTS
change the default parameters to CYCLE(1) and DEBUG(*ON)
and change the default parameter:
List of ASPs information:
Time period (sec) 60
and change the parameter:
List of Physical Interfaces:
Time period (sec) 60
3/ After a few minutes (10 min), stop the LPAR2RRD agent: go MENU, opt. 13. END_SYSSTS
4/ Send us the 2 IFS files from the path '/home/lpar2rrd'
These files are EBCDIC-encoded; please send them in binary mode (FTP .. BIN), with no conversion.
The IFS files look like this:
one SND_ file, e.g. SND_2016-12-05-22.30.24.773.log (the date will differ, of course)
one RTV_ file, e.g. RTV_2016-12-05-22.30.21.750.log (the date will differ, of course)
5/ After that, start the LPAR2RRD agent as in step 2/
change the parameters back to their default values: CYCLE(*MAX) DEBUG(*OFF)
change parameter:
List of ASPs information:
Time period (sec) 60
change parameter:
List of Physical Interfaces:
Time period (sec) 60
The LPAR2RRD agent should start working now.
-
thanks for the data.
Have you tried changing the parameters as per point 5?
Is the agent working now?
-
OK, after analyzing the latest data you sent: our suggested workaround does not fully help.
Everything should work except pool data.
We will let you know as soon as the fix is available (during next week).
-
The LPAR2RRD agent does not work with the suggested changes applied.
Same status, Pavel: no connection.
All right, I will wait for the fix.
-
Can you try the attached code? It should resolve the issue:
Please restore this *PGM C_RTVSTS over the existing program in library LPAR2RRD.
Move the save file to QGPL by FTP (binary mode)
and use the command
RSTOBJ OBJ(C_RTVSTS) SAVLIB(LPAR2RRD) DEV(*SAVF) SAVF(QGPL/C_RTVSTS) MBROPT(*ALL) ALWOBJDIF(*ALL)
to overwrite the existing program. Check authority and ownership (LPAR2RRD user).
Start the LPAR2RRD agent (the same way as on the other partitions).
Let us know .....
Thanks!
-
Morning Pavel.
With the restore it works. It gets connected.
Now I still have trouble with one partition, which cannot reach the server via ping or telnet.
The security and network departments don't have any alerts about it. It stopped working after an IPL.
Could I try the same method there (update to v1.1.1 and restore C_RTVSTS)?
The point is that we don't have a connection to the server.
Is there any way, such as starting the agent in debug mode, or any file that shows what is really happening?
With that solved, we would have the tool working 100%.
-
Great, thanks for confirming that it works.
It will be included in the next version, 1.1.2.
I do not fully understand the problem with the other partition.
Are ping and telnet to the server port working or not?
There is actually no debug mode in the agent.
-
From one specific IBM i LPAR, and only that one, I can't establish a connection with the server.
The rest of the partitions communicate successfully!
In other words, when I ping or telnet to the server IP, I get no answer.
The network and security departments don't have any alerts about it.
Is it possible to view any log or anything else that would help me determine the problem?
-
Hi,
if ping and telnet do not work, there is nothing to debug on the agent side.
There is no network connectivity, so the agent will not work.
Check your routing table and interfaces on the server, firewalls etc ... we cannot advise anything here; the problem is in your infrastructure.
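The telnet test mentioned above only checks whether a TCP connection to the server port can be opened. Where telnet is not handy, the same check can be sketched in a few lines of Python, assuming Python 3 is available on the machine you test from. Port 8162 is the one shown in the C_SNDSTS call in the joblog; the function name `can_connect` and the host name in the usage note are placeholders.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts and DNS failures.
        return False
```

For example, `can_connect("your-lpar2rrd-server", 8162)` returning False from the affected partition, while it returns True from a working one, points at the network path rather than at the agent.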
-
Hi,
The server IP is included in our routing tables. The traffic doesn't pass through a firewall.
The packets don't leave OS/400. They don't reach our core switch.
My question is:
Why does one LPAR work fine and another not, if they are on the same Power system?
Can I check anything on the server? I don't know, a traceroute for example. I don't have much experience with Linux, but my colleagues do.
Any suggestion for a command to run?
Thanks a lot
-
You said that ping did not work.
If ping does not work, then the OS agent does not work either. It uses the same network path, which obviously does not work, i.e. your server is generally not able to reach the lpar2rrd machine.
It is not an "OS agent" issue; it is a system issue that we are not able to fix or troubleshoot. There might be many reasons ...
Make ping and telnet work (fix the network problem) and then let us know if the OS agent still does not work.
-
Hi Pavel, I am still checking the connection issue. I will let you know if I have any questions.
However, I updated the agent on another partition and we have the same trouble: it didn't show a successful connection in the log of the SND_SYSSTS job. I restored the C_RTVSTS program that you sent me and now it is showing this error:
IF: C_SNDSTS - connection Established()
WR: C_SNDSTS frame not sent 201 (
-
Hi,
Are you able to ping/telnet from this one?
Try stopping the agent, clear the historical data (item 15) and start it again.
-
I did that and it's working successfully. Thanks.
Let me check tomorrow the other partition that can't connect to the server.
-
Pavel, we have this error in the SND_SYSSTS joblog:
IF: C_SNDSTS - protocol 5.1 UTF8 conv
IF: C_SNDSTS - connection Established()
WR: C_SNDSTS frame not sent 201 (
WR: C_SNDSTS frame not sent 201 (
WR: C_SNDSTS frame not sent 201 (
WR: C_SNDSTS frame not sent 201 (
WR: C_SNDSTS frame not sent 201 (
The communication is OK.
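When a joblog repeats the same message many times, a quick tally makes it easier to report. This is a rough sketch only: it assumes that the `IF:`, `WR:` and `ER:` prefixes seen in this thread mark informational, warning and error messages, which is an interpretation rather than documented agent behaviour. The function name `summarize_joblog` is hypothetical.

```python
from collections import Counter

# Assumption: these prefixes mean info, warning and error respectively.
KNOWN_PREFIXES = ("IF", "WR", "ER")


def summarize_joblog(lines):
    """Count identical agent messages, keyed by (prefix, message text)."""
    counts = Counter()
    for line in lines:
        prefix, sep, rest = line.strip().partition(":")
        if sep and prefix in KNOWN_PREFIXES:
            counts[(prefix, rest.strip())] += 1
    return counts


# The joblog lines quoted above.
joblog = [
    "IF: C_SNDSTS - protocol 5.1 UTF8 conv",
    "IF: C_SNDSTS - connection Established()",
] + ["WR: C_SNDSTS frame not sent 201 ("] * 5

for (prefix, message), n in summarize_joblog(joblog).items():
    print(f"{n} x {prefix}: {message}")
```

Lines without one of the three prefixes (for example the `CALL PGM(...)` line) are simply ignored.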
-
Hi,
is there anything related on the server side in logs/error.log-daemon?
-
Hi,
I apologize, this is one more issue of ours in the agent; it is still related to the previous issue, but in another place.
I will send you the final version 1.1.2 next week.
-
Here is the fix:
http://www.lpar2rrd.com/download/C_SNDSTS.zip
Please restore this *PGM C_SNDSTS over the existing program in library LPAR2RRD.
Move the save file to QGPL by FTP (binary mode)
and use the command
RSTOBJ OBJ(C_SNDSTS) SAVLIB(LPAR2RRD) DEV(*SAVF) SAVF(QGPL/C_SNDSTS) MBROPT(*ALL) ALWOBJDIF(*ALL)
to overwrite the existing program. Check authority and ownership (LPAR2RRD user).
Clear the history: go MENU, opt.15. Clear RTV_SYSSTS history.
Start the LPAR2RRD agent (the same way as on the other partitions).
Let us know .....
-
Hi Miguel,
have you got time to test it?
Thanks.
-
Yes and it works
-
Pavel, I have a question about the historical report in LPAR2RRD, specifically THREADS at the SHRPOOL level. Can the current number of threads per SHRPOOL be read as the max job entries for the subsystem assigned to that SHRPOOL? Or do you have any way to see the max job entries used by a specific subsystem?
-
Hi,
I do not understand the question.
If there is a limit on jobs per subsystem, then it has nothing to do with threads. Each job can create as many threads as it needs.
We do not have a "jobs per subsystem" metric.
-
Thanks
-
Hello all,
I want to revive this topic, because I think I still have this problem on my system.
Over a year ago I resolved my connection problem by applying the fixes (C_RTVSTS and C_SNDSTS) from this topic. But later the licence agreement expired (option 14 didn't help) and I got this message:
ER:RTV_SYSSTS Expiration date of this client is 2018-07-31.
I guess that the expiration date is hardcoded in these files.
A new LPAR2RRD agent should resolve this problem, but it does not for me. Now I'm using 1.1.6, and the only thing I get in the joblog of SND_SYSSTS is:
IF: C_SNDSTS - protocol 5.4 UTF8 conv
Full joblog:
Display All Messages
System: XYZ
Job . . : SND_SYSSTS User . . : LPAR2RRD Number . . . : 979506
Job 979506/LPAR2RRD/SND_SYSSTS started on 22/05/19 at 18:32:38 in
subsystem LPAR2RRD in LPAR2RRD. Job entered system on 22/05/19 at
18:32:38.
Job 979506/LPAR2RRD/SND_SYSSTS submitted.
>> CALL PGM(C_SNDSTS) PARM('I_REMOVED_IP_ADDRESS' '8162' X'0001D74040404C4F0F0F0C34
0404040F1F04BF5F04BF2F44BF1F1' 'PH_DTAQRCV*LIBL' 'PH_DTAQSND*LIBL' 'PH_USR
SPC *LIBL' '*UTF8' 1 'N' '/home/LPAR2RRD')
IF: C_SNDSTS - protocol 5.4 UTF8 conv
Bottom
Press Enter to continue.
Second, and I think it is related to this topic: I now have 26 system pools (SHRPOOL) on my system.
I have a connection to the LPAR2RRD server; I checked telnet and ping.
Can you help me with this case?