"Out of memory" when using OS agent locally on LPAR2RRD server to import NMON files

For several years I have been copying a number of NMON files to the LPAR2RRD server and then importing them with a local OS agent. This has been working fine.

Now I suddenly get an "Out of memory" error when running the import. I have tried adding more physical memory and importing a smaller number of files ... nothing helps. LPAR2RRD is ver. 7.50.

Does anyone have a hint?

Regards, Carsten

[@XORUX staff: I do NOT have premium level support]

Comments

  • Pavel
    edited October 2022

    Hi,


    what is your ulimit setup for lpar2rrd user?

    ulimit -a


    How is memory usage while it runs? I suppose it takes a few minutes; can you send a zoomed-in memory graph from that period?


    Have you increased the number of monitored LPARs? Was anything else changed before the error appeared?
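For readers who want to check the limits from inside a process rather than from the shell, here is a minimal Python sketch. It is not part of LPAR2RRD (the agent itself is Perl); it only reports what a process started by the same user would inherit. The function names are illustrative. Note that `resource` reports sizes in bytes, whereas `ulimit` prints kbytes.

```python
import resource

# Limits that most often matter for a memory-hungry import job.
# RLIMIT_AS and RLIMIT_RSS are not defined on every platform, hence getattr.
LIMIT_NAMES = ["RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_AS", "RLIMIT_RSS", "RLIMIT_NOFILE"]


def fmt(value):
    """Render RLIM_INFINITY as 'unlimited'; other values are raw numbers."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the limits this platform defines."""
    limits = {}
    for name in LIMIT_NAMES:
        const = getattr(resource, name, None)
        if const is not None:
            soft, hard = resource.getrlimit(const)
            limits[name] = (fmt(soft), fmt(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:<14} soft={soft:<12} hard={hard}")
```

Run it as the lpar2rrd user so that the values match what `ulimit -Sa` and `ulimit -Ha` show for that account.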

  • Hi

    ulimit settings are like this (hard limits first - soft limits after):

    lpar2rrd@lpar2rrd:/home/lpar2rrd> ulimit -Ha

    time(seconds)       unlimited

    file(blocks)        unlimited

    data(kbytes)        unlimited

    stack(kbytes)       4194304

    memory(kbytes)      unlimited

    coredump(blocks)    unlimited

    nofiles(descriptors) unlimited

    threads(per process) unlimited

    processes(per user) unlimited

    lpar2rrd@lpar2rrd:/home/lpar2rrd> ulimit -Sa

    time(seconds)       unlimited

    file(blocks)        unlimited

    data(kbytes)        unlimited

    stack(kbytes)       4194304

    memory(kbytes)      unlimited

    coredump(blocks)    unlimited

    nofiles(descriptors) 65536

    threads(per process) unlimited

    processes(per user) 16384


    It does NOT run for several minutes, only seconds.

    I have tried running a "vmstat" at the same time ... doesn't show much, but here it is:


    No, nothing has changed. Suddenly last week, it just stopped working.

    I should maybe mention that for SEVERAL weeks I have been importing NMON files for a number of Linux servers. This NMON data has never become visible (NMON tabs are not shown in LPAR2RRD). That problem was never solved, but I have continued to let the agent import the data.


    Regards, Carsten

  • Hi,

    can you upload that file

    via https://upload.lpar2rrd.com

    Add a short problem description in the text field of the upload form.

  • I could, but last time I uploaded stuff, I was contacted by someone in your company, who told me that I should not upload stuff, since I haven't got premium level support.

    /Carsten

  • Hi

    do not hesitate and upload

  • Hi

    I don't know whether the last file mentioned is the one that causes the error, so I uploaded the complete set of files that I tried to import.

    The last file mentioned, before the error, was: b4cta11_221021_0000.nmon

    /Carsten
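When it is unclear which file in a batch triggers the error, importing the files one at a time narrows it down. The sketch below assumes a hypothetical `import_one` callable that imports a single file and returns its exit code and output. The thread does not show the actual agent command line, so `make_runner` takes whatever command prefix you normally use.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence, Tuple

ImportFn = Callable[[str], Tuple[int, str]]


def make_runner(command_prefix: Sequence[str]) -> ImportFn:
    """Build an import function that appends the file path to a command.

    `command_prefix` is a placeholder for your own import command; nothing
    here is specific to the LPAR2RRD agent.
    """
    def run(path: str) -> Tuple[int, str]:
        proc = subprocess.run([*command_prefix, path], capture_output=True, text=True)
        return proc.returncode, proc.stdout + proc.stderr
    return run


def find_failing_files(
    files: Iterable[str],
    import_one: ImportFn,
    marker: str = "Out of memory",
) -> List[str]:
    """Import files one by one and return those whose import fails.

    A file counts as failing if the command exits non-zero or its output
    contains `marker`.
    """
    failing = []
    for path in files:
        code, output = import_one(path)
        if code != 0 or marker in output:
            failing.append(path)
    return failing
```

Because every file is tried even after a failure, one run reports all problem files instead of only the first.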

  • Hi,

    There is a problem with the NMON output from partition "fral01".

    Is the latest NMON version installed on this partition?

    If not, can you please upgrade NMON to the latest version and send us a new NMON file?

    In the meantime, do not import NMON files from this partition, as they trigger the "Out of memory" error.

  • Hi

    You are right. I removed NMON files from fral01 and imported all 352 other NMON files. Without problems!

    I don't really need data from fral01, since it is a very special (and unmanaged) server, so from my point of view I will just stop trying to import files from fral01. But if you are interested in digging deeper into this, I can check the NMON version and update it if a newer version is available.

    Regards, Carsten

  • Hi,

    Yes please, check the NMON version and update if there is a newer one.

    In any case, let us know.

  • Hi

    I am using "nmon_power_64le_rhel7 version 16m" and this is, as far as I can see, the newest version.

    I have been running it like this:

    nmon_power_64le_rhel7 -f -d 10240 -g /b4restore/scripts/nmon_grp -D -s 60 -c 1440 -m /var/nmon

    Maybe the "-d 10240" is crazy ... it was an attempt to avoid problems on servers with many disks.

    Please let me know, if you want me to try something with this.

    Regards, Carsten
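As a quick check of the `-s` and `-c` values in the command above: `-s` is the interval between snapshots in seconds and `-c` is the number of snapshots, so together they set how long one NMON file covers. A small sketch of the arithmetic (the function name is illustrative):

```python
def capture_window_hours(interval_seconds: int, snapshots: int) -> float:
    """Hours covered by one nmon run: interval times snapshot count."""
    return interval_seconds * snapshots / 3600


# -s 60 -c 1440 from the command above
print(capture_window_hours(60, 1440))  # 24.0
```

So each file holds 1440 snapshot rows per section, and every disk section row carries one column per disk.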

  • Hi again

    Explanation found - I think.

    Last week the configuration on this server (fral01) changed, so that it now has more than 11000 /dev/sd... devices.

    I will not try anymore on this server - (someone else is responsible for it :-)

    Import from all other servers is working fine (except that NMON tabs are not shown on most Linux LPARs, but that is another case).

    Thank you for your help!

    Regards, Carsten
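A file like this can be spotted before import by counting the device columns in its disk sections. The sketch below assumes the usual nmon CSV layout, where the first field is the section name, the second is either a description (header row) or a `T` tag (data row), and the remaining fields are one column per device. The `DISK` prefix and the threshold are assumptions for illustration, not values taken from LPAR2RRD.

```python
import csv
from typing import Dict, Iterable


def widest_sections(lines: Iterable[str], prefix: str = "DISK") -> Dict[str, int]:
    """Map each section whose name starts with `prefix` to its device-column count."""
    widths: Dict[str, int] = {}
    for row in csv.reader(lines):
        if len(row) < 2 or not row[0].startswith(prefix):
            continue
        devices = len(row) - 2  # drop section name and description/T-tag
        widths[row[0]] = max(widths.get(row[0], 0), devices)
    return widths


def looks_oversized(lines: Iterable[str], threshold: int = 5000) -> bool:
    """True if any matching section has more device columns than `threshold`."""
    return any(count > threshold for count in widest_sections(lines).values())
```

To use it on a real file, pass an open file handle, for example `with open(path, errors="replace") as fh: print(widest_sections(fh))`.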

  • Hi

    We have found that the problem file causes no trouble for an agent running on Linux,

    but on AIX it ends with "Out of memory", which suggests the problem lies with Perl on AIX.

    You are running it on AIX, aren't you?

  • Hi

    Yes, LPAR2RRD server (and import of NMON files) is on AIX

    /Carsten

  • Hi

    We have adapted the agent script for NMON files.

    Download and install the package appropriate for your platform:

    http://www.lpar2rrd.com/download-temp/lpar2rrd-agent-7.50-2.ppc.rpm

    http://www.lpar2rrd.com/download-temp/lpar2rrd-agent_7.50-2_all.deb

    http://www.lpar2rrd.com/download-temp/lpar2rrd-agent-7.50-2.noarch.rpm

    You should now be able to work with the huge file fral..., since we fixed the "Out of memory" bug.

    You should have data in LPAR2RRD for all Linux machines.

    Try it with one machine first.

    Let us know.

  • Hi

    Thank you for this update of the agent script.

    I started with 1 file, and that went well. So I took all files from my archive and started an import of 797 files. That also went well, and all LPARs for which I have imported NMON data now show NMON tabs.

    I have stopped collecting files from fral01. But to test your script, I collected 1 file covering 24 hours. That file I have just imported - also without problems ... no "Out of memory"!

    So yes, the previously discussed problems seem to be fixed.

    The only strange thing left is that I have 2 LPARs that are now shown as "Removed", even though they are not removed and data from the HMC is being updated fine. I am not sure when this problem started, or whether it is related to the problems mentioned above.

    /Carsten

  • Try upgrading to this version; it should resolve it:

    https://www.lpar2rrd.com/download-temp/lpar2rrd-7.51-5.tar

  • cudsen
    edited November 2022

    Thank you for the new version, but it did not solve the problem. I still have 2 LPARs that are shown as "Removed".

    Furthermore: For both of these 2 LPARs, I am importing NMON files, but only one of them shows NMON tabs.

    Screendump shows one of the "removed" LPARs (and as you can see, HMC data is up-to-date):


  • sorry, no further responses from us, you have refused our support offer in the past, then we expect you do not need our services at all

  • Hello

    That is perfectly understandable.

    I didn't expect Xorux to provide support, I just asked on the forum, if anyone had a hint. And please note, that I made it clear on the very first post, that I do not have premium level support.

    /Carsten

  • By the way, I found the reason why those 2 LPARs were shown as removed: they had a "+" in the LPAR name. I have now renamed them, and they appear as normal.

    /Carsten
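The thread does not say why a "+" in the name caused this. One common way such names go wrong in web applications in general (an assumption here, not a confirmed cause in LPAR2RRD) is that an unescaped "+" in a URL query string decodes to a space. The sketch below shows that behaviour and a simple check for affected names; the LPAR names are made up.

```python
from urllib.parse import quote, unquote_plus


def names_with_plus(names):
    """Return the names that contain a literal '+'."""
    return [name for name in names if "+" in name]


# Hypothetical LPAR names, for illustration only.
print(names_with_plus(["lpar+01", "lpar02"]))  # ['lpar+01']

# An unescaped '+' decodes to a space; an escaped one survives.
print(unquote_plus("lpar+01"))                 # lpar 01
print(quote("lpar+01", safe=""))               # lpar%2B01
print(unquote_plus("lpar%2B01"))               # lpar+01
```

Renaming the LPARs, as done above, avoids the question entirely.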
