VMware graphs are dropping often

My Cluster CPU and memory graphs are dropping often. Any reason for this?






Comments

  • send us logs

    Note a short problem description in the text field of the upload form.
    
    cd /home/lpar2rrd/lpar2rrd  # or wherever your LPAR2RRD working dir is
    tar cvhf logs.tar logs etc tmp/*txt 
    gzip -9 logs.tar
    
    Send us logs.tar.gz via https://upload.lpar2rrd.com

  • Did you find the reason? We have the same issue.
  • Hi,

    we have no feedback.
    Basically the problem is that the data load takes longer than an hour.
    It was quite a huge environment.

    To speed up processing you can increase the parallelization of the ESXi data collection.
    Normally there are 9 processes running to get ESXi data from each vCenter.
    Increase it to 20.

    cd /home/lpar2rrd/lpar2rrd
    vi etc/.magic

    VMWARE_PARALLEL_RUN=20
    export VMWARE_PARALLEL_RUN
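
    A quick sanity check (a sketch, assuming etc/.magic is sourced as plain shell by the loader, which the export lines above suggest): source the file and confirm the value shows up in the environment.

    . etc/.magic
    echo $VMWARE_PARALLEL_RUN   # should print 20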

    Let it run.
    Does it help?


  • Hi Pavel,  
    (Version 5.07)

    just found this old thread. 
    I think your comment is not correct!
    I checked the load_vmware.sh script and found that the script is looking for "VCENTER_PARALLEL_RUN" and not for "VMWARE_PARALLEL_RUN".

    So, the correct syntax in etc/.magic should be:

    VCENTER_PARALLEL_RUN=20
    export VCENTER_PARALLEL_RUN

    Furthermore, this variable is not really considered in the scripts, so it makes no sense to use it as it is!
    I've now added the following lines to load_vmware.sh:

    # wait until the number of running vmware_run.sh processes drops below the limit
    if [ $VCENTER_PARALLEL_RUN -gt 1 ]; then
      while [ `ps -e | grep vmware_run.sh | wc -l` -ge $VCENTER_PARALLEL_RUN ]
      do
        echo "waiting ...."
        sleep 1
      done
    fi
    # with no parallelism run the collector in the foreground, otherwise start it in the background
    if [ $VCENTER_PARALLEL_RUN -eq 1 ]; then
      $BINDIR/vmware_run.sh 2>>$ERRLOG | tee -a $INPUTDIR/logs/load_vmware.log
    else
      eval '$BINDIR/vmware_run.sh 2>>$ERRLOG | tee -a $INPUTDIR/logs/load_vmware.log' &
    fi

    That works so far, but the problem is not the time spent fetching the data from the vCenters, it is the creation of the charts. Creating all the charts is so time consuming that it takes more than 1 hour. How can we parallelise this?
     
    BR Juergen


  • Hi,

    VMWARE_PARALLEL_RUN is the correct variable :)

    # grep VMWARE_PARALLEL_RUN bin/*
    bin/vmw2rrd.pl:if ( defined $ENV{VMWARE_PARALLEL_RUN} ) {
    bin/vmw2rrd.pl:    $PARALLELIZATION = $ENV{VMWARE_PARALLEL_RUN};
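
    To verify during a run that a higher limit really takes effect, you can count the collector processes (an approximate check only; adjust the pattern if your process names differ):

    ps -ef | grep vmw2rrd.pl | grep -v grep | wc -l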

    Charts are not created in advance.
    What is your product version?

    Try increasing the parallel run to 20.

    Let us know.



  • Hi Pavel,
    as I mentioned in the first line, we are currently on Version 5.07.

    In this version there is no VMWARE_PARALLEL_RUN!

    $ grep VMWARE_PARALLEL_RUN bin/vmw2rrd.pl
    $ grep PARALLEL bin/vmw2rrd.pl           
    my $PARALLELIZATION = 10;
    #   $PARALLELIZATION = 1;
        if ( !$do_fork || $cycle_count == $PARALLELIZATION ) {

    OK, now I've reworked my changes and set $PARALLELIZATION = 20.
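
    One way to make that change on 5.07 (a sketch only; back the file up first, and note that the spacing in the sed pattern must match your copy of the script):

    cp bin/vmw2rrd.pl bin/vmw2rrd.pl.orig
    sed -i 's/my $PARALLELIZATION = 10;/my $PARALLELIZATION = 20;/' bin/vmw2rrd.pl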

    I'll send you feedback when a few runs are finished.

    BR Juergen
  • Hi,

    also, definitely upgrade to 6.02; there were a lot of improvements and optimisations on the back-end.

  • Hi,
    after one night, it is still not working. All VMware charts are interrupted. All Power charts are OK.
    We got the problem with the migration from AIX to Linux. 

    However, I have to discuss it first, but I think we will upgrade to the most recent version.

    BR Juergen

  • Hi Pavel,
    now we have upgraded to 6.02. The problem is almost solved. We still have a one-hour gap at midnight, but only on the VMware graphs, not on the Power graphs.

    load.sh takes approx. 45 minutes and daily_lpar_check takes approx. 30 minutes, together more than one hour. So the next start of load.sh is blocked with "There is already running another copy of load.sh, exiting ...".

    Just to let you know, we have the following number of systems:
        60 HMCs
       174 Managed Systems (Ps)
      2005 LPARs
        29 vCenters
       449 VMware Hosts
     13835 VMware VMs

    Best regards,
    Juergen

  • Hi,

    wow, it is a pretty big environment!

    Use this as a hot fix, it will resolve the issue


    -rwxrwxr-x 1 lpar2rrd lpar2rrd 22664 May 27 12:32 load.sh

    Gunzip it and copy to /home/lpar2rrd/lpar2rrd/ (755, lpar2rrd owner)

    If your web browser gunzips it automatically then just rename it: mv load.sh.gz load.sh

    Make sure that the file size is the same as in the listing above.

    If you are on Linux then change the interpreter on the first line of the script to /bin/bash.
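
    Putting those steps together (a sketch only, assuming the file was downloaded as load.sh.gz into the working directory; adjust names and paths to your installation):

    cd /home/lpar2rrd/lpar2rrd
    gunzip load.sh.gz              # or: mv load.sh.gz load.sh  if the browser already gunzipped it
    chmod 755 load.sh
    chown lpar2rrd load.sh         # only needed if you copied the file as another user
    ls -l load.sh                  # size should match the listing above
    head -1 load.sh                # on Linux, edit this line to read #!/bin/bash if it does not already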



  • Hi Pavel,

    it works! 
    Sending daily_lpar_check.pl into the background with nohup is an easy solution!
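
    For reference, the backgrounding pattern described above is essentially the following (illustrative only; the exact script path and log file used by the shipped load.sh may differ):

    nohup $BINDIR/daily_lpar_check.pl >> $INPUTDIR/logs/daily_lpar_check.log 2>&1 &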

    Thank you so far!

    BR Juergen