RAIDIX Health event spam

We have RAIDIX 4.7.1 and stor2rrd 7.00.
Every 5 minutes stor2rrd sends an empty HW event mail for RAIDIX.

In the file /home/stor2rrd/stor2rrd/bin/raidixperf.pl I found this line:
my $cmd = "jq '[.[] | select(.end == null) | select(.status != \\\"ok\\\")]' /var/lib/raidix/alerts.json;";

When I execute this command on RAIDIX, I get this response:
raidix-node# jq '[.[] | select(.end == null) | select(.status != "ok")]' /var/lib/raidix/alerts.json
[]

I then changed the line to:
my $cmd = "jq '[.[] | select(.end == null) | select(.status)]' /var/lib/raidix/alerts.json;";

...and now I get normal output in the "Health status" menu.

Is this change correct for the stor2rrd algorithm?
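
To make the difference between the two filters concrete, here is a minimal Perl sketch that emulates them on made-up data. The sample records are hypothetical; only the "end" and "status" fields are taken from the jq command above, and the sketch models jq's null as undef (it does not model JSON false).

```perl
use strict;
use warnings;

# Hypothetical sample data; only the "end" and "status" field names
# come from the original jq command.
my @sample = (
    { status => "ok",    end => undef },          # open alert, status ok
    { status => "error", end => undef },          # open alert, not ok
    { status => "error", end => "2021-01-01" },   # closed alert
);

# Emulates: select(.end == null) | select(.status != "ok")
sub open_not_ok {
    return grep { !defined $_->{end} && ($_->{status} // "") ne "ok" } @_;
}

# Emulates: select(.end == null) | select(.status)
# jq treats every value except null and false as true, so "ok" passes too.
sub open_any_status {
    return grep { !defined $_->{end} && defined $_->{status} } @_;
}

my @not_ok = open_not_ok(@sample);
my @any    = open_any_status(@sample);
printf "status != \"ok\": %d alert(s)\n", scalar @not_ok;
printf "select(.status): %d alert(s)\n", scalar @any;
```

With this sample the first filter keeps one alert and the second keeps two, because select(.status) also lets the open "ok" alert through. If that holds for real alerts.json content, the changed filter would list "ok" alerts as well.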

Comments

  • Hi,

    we present only NOK alerts; that is how we do it for all storage devices we support.
    We do not care about "ok" alerts.
  • When I use != "ok", the shell command output is empty ([]), and stor2rrd does not process this situation correctly: "Health status" is empty and an e-mail is sent every 5 minutes.
  • file /home/stor2rrd/stor2rrd/bin/raidixperf.pl:

    in sub health_status

     my $cmd = "jq '[.[] | select(.end == null) | select(.status != \\\"ok\\\")]' /var/lib/raidix/alerts.json;";
    This returns [].

    ...then :
      my $return = run_ssh_cmd_json($ip,$user,$port,$cmd);
      my $scalar = 0;
      if (defined $return && $return ne "" && ref($return) eq "ARRAY"){
        $scalar = @{$return};
      }
    $scalar stays 0, and we leave sub health_status.

    ...then at line 135:
    alerts();
    #print Dumper %volumes_stats;
    #print "#### volume ip 96/94\n";
    #print Dumper $conf;
    #print "#### volume ip 100\n";
    #print Dumper $conf2;
    exit;

    This goes to "sub alerts":
    sub alerts{
      my $status = "";
    By default $status is an empty string.

     foreach my $host (keys (%alerts)){
        if (defined $alerts{$host} && $alerts{$host} ne ""){
          foreach my $errors (keys (%{$alerts{$host}})){
            if (defined $alerts{$host}{$errors}{'status'} && $alerts{$host}{$errors}{'status'} ne ""){
              if ($alerts{$host}{$errors}{'status'} eq "error" || $status eq "error"){
                $status = "error";
              }else{
                $status = "OK";
              }
            }
            if (defined $alerts->{$host}->{$errors}->{'msg'} && $alerts->{$host}->{$errors}->{'msg'} ne ""){
              $error_message[$errors] = $alerts->{$host}->{$errors}->{'msg'};
            }
          }
        }
      }

    After this code $status remains unchanged, because %alerts is empty.

    And then at line 1275:
    ## WARNING
      if ( $status ne "error" && $status ne "OK" ) {
        if ( exists &Xorux_lib::health_status_alerting ) {
          Xorux_lib::health_status_alerting($component_name,\@error_message);
        }
      }

    ...and an e-mail is sent.

    I changed line 1276 to:
    if ( $status ne "error" && $status ne "OK" && $status ne "" ){ 
    ...
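
The walkthrough above can be reproduced with a small stand-alone Perl sketch. The helper names (overall_status, warns_original, warns_patched) are hypothetical and the status loop is a simplified stand-in for the one in "sub alerts", not the original code; only the two conditions are taken from the thread.

```perl
use strict;
use warnings;

# Simplified stand-in for the status loop in "sub alerts" (not the original code).
sub overall_status {
    my (%alerts) = @_;
    my $status = "";
    foreach my $host (keys %alerts) {
        foreach my $id (keys %{ $alerts{$host} }) {
            my $s = $alerts{$host}{$id}{status};
            next unless defined $s && $s ne "";
            $status = ($s eq "error" || $status eq "error") ? "error" : "OK";
        }
    }
    return $status;
}

# Original condition quoted in the thread.
sub warns_original {
    my ($s) = @_;
    return ($s ne "error" && $s ne "OK") ? 1 : 0;
}

# Condition with the extra empty-string guard.
sub warns_patched {
    my ($s) = @_;
    return ($s ne "error" && $s ne "OK" && $s ne "") ? 1 : 0;
}

my $empty = overall_status();   # no alerts at all, as when jq returns []
printf "status='%s' original=%d patched=%d\n",
    $empty, warns_original($empty), warns_patched($empty);
```

With no alerts the status stays an empty string, so the original condition is true and the alerting branch runs, while the guarded condition is false.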


  • Hi @IaroslavB,

    you are right, there is a bug. Thank you for reporting this issue.

    Here is a fix intended for version 7.00:

    NOTE: the next version, 7.10, which will be released soon, will contain this fix as well, along with some additional changes in that code that will not be compatible with version 7.00. So do not use this file with versions 7.10+.


    Gunzip it and copy it to /home/stor2rrd/stor2rrd/bin (755, stor2rrd owner):
    -rwxrwxr-x 1 stor2rrd stor2rrd 51568 Jan 26 09:48 raidixperf.pl
    If your web browser gunzips it automatically, then just rename it: mv raidixperf.pl.gz raidixperf.pl
    Make sure that the file size is the same as in the example above.