Overall Health Status shows all FC Switches NotOK but not all are individually showing as NotOK
in SAN
We have 4 Cisco MDS 9418S FC switches, 2 at each site, with dual redundant connectivity to IBM SWIZ fs5100(s). I added them to SAN in STOR2RRD (Hyper-V image) and they seem to be collecting the correct info. The 'Overall' Health Status shows all 4 as NotOK,
but going to each of the individual switches (in SAN SWITCH > {SWITCH NAME} > Health status > Switch Status tab), two show as OK and two show as 'warning'.
The two in marginal status are at one site, and the two that are 'status: ok' are at the other. One site is live, processing production data to the SAN across the Fiber; the other site is the failover/role swap site, so it is seeing no 'fiber' read/write (it is HW replication, so only IP). That is my guess as to why the idle site is 'ok'.
P.S. What are (some of) the conditions that you are coding for to change the status from OK to warning, and which statuses can I expect to see? I cannot find any hint as to what criteria you might be using.
Comments
-
Hi,
the main health status of the Cisco SAN switch is based on the following 2 conditions.
1. the switch status
connUnitStatus - 1.3.6.1.3.94.1.6.1.6
possible values: 1-unknown, 2-unused, 3-ok, 4-warning, 5-failed
Red status = warning or failed
2. port statuses
a) ifOperStatus - 1.3.6.1.2.1.2.2.1.8
possible values: 1-up, 2-down, 3-testing, 4-unknown, 5-dormant, 6-notPresent, 7-lowerLayerDown
Red status = dormant or lowerLayerDown
b) fcIfOperStatusCause - 1.3.6.1.4.1.9.9.289.1.1.2.1.7
You can find the possible values, and the statuses which cause the red status of the switch, in the stor2rrd installation here:
cd /home/stor2rrd/stor2rrd # or wherever your STOR2RRD working dir is
head etc/cisco-status.txt
# Cisco status error file
# It is based on http://tools.cisco.com/Support/SNMP/do/BrowseOID.do?local=en&translate=Translate&typeName=FcIfOperStatusReason
#
# You can modify it on your own, when you save it under cisco-status_custom.txt
# then it will be prefered and will not be overwriten by upgrade
#
grey : 1 : other
green : 2 : none
red : 3 : hwFailure
red : 4 : loopbackDiagFailure
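To make the rules above easier to follow, here is a rough Python sketch of how the two conditions could combine into one red/not-red result. It is only an illustration, not the STOR2RRD code: the function names, the sample file text, the "any single red condition makes the switch red" rule, and treating an unlisted cause code as grey are all assumptions on my part.

```python
# Sketch only: mirrors the red conditions described in the reply above.
# All names here are hypothetical, not STOR2RRD internals.

CONN_UNIT_STATUS = {1: "unknown", 2: "unused", 3: "ok", 4: "warning", 5: "failed"}
IF_OPER_STATUS = {1: "up", 2: "down", 3: "testing", 4: "unknown",
                  5: "dormant", 6: "notPresent", 7: "lowerLayerDown"}

# First entries of etc/cisco-status.txt as quoted above (the real file is longer).
SAMPLE_STATUS_FILE = """\
# Cisco status error file
grey : 1 : other
green : 2 : none
red : 3 : hwFailure
red : 4 : loopbackDiagFailure
"""


def parse_status_file(text):
    """Return {code: (colour, name)} from 'colour : code : name' lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        colour, code, name = (part.strip() for part in line.split(":", 2))
        table[int(code)] = (colour, name)
    return table


def switch_is_red(conn_unit_status, ports, cause_table):
    """ports is a list of (ifOperStatus, fcIfOperStatusCause) integer pairs."""
    # Condition 1: the switch status itself is warning or failed.
    if CONN_UNIT_STATUS.get(conn_unit_status) in ("warning", "failed"):
        return True
    for oper, cause in ports:
        # Condition 2a: the port is dormant or lowerLayerDown.
        if IF_OPER_STATUS.get(oper) in ("dormant", "lowerLayerDown"):
            return True
        # Condition 2b: the port's cause code is marked red in the status file.
        # Assumption: a code missing from the file is treated as grey.
        colour, _name = cause_table.get(cause, ("grey", "unmapped"))
        if colour == "red":
            return True
    return False
```

Under these assumptions, a switch whose own connUnitStatus is 3-ok would still come out red if a single port reports a red cause such as hwFailure, which may be why the overall view and the Switch Status tab disagree.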