What Does an Alert Look Like?


Disk-Related Messages

If your disk drive generates a S.M.A.R.T. alert, SMARTMon-UX detects it the next time the program polls the disk. If the email option (-M) is enabled, your system will send out an email similar to: "Device on /dev/hd1 SMART Status:FAILED - Failure imminent".

 

The message header will be "SMARTMon Alert from computer.domain." (e.g., SMARTMon Alert from system.mydomain.com).
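If alert emails are filtered by a script, the header format described above can be matched with a simple prefix check. The following is a minimal sketch, not part of SMARTMon-UX: it assumes the "message header" is the email subject line and that a trailing period, if present, is not part of the host name. The function name alert_host is hypothetical.

```python
# Prefix taken from the header format described in this section.
SUBJECT_PREFIX = "SMARTMon Alert from "


def alert_host(subject):
    """Return the host named in an alert subject, or None if it does not match.

    Assumption: the subject is the prefix followed by computer.domain,
    optionally ending in a period.
    """
    subject = subject.strip()
    if not subject.startswith(SUBJECT_PREFIX):
        return None
    host = subject[len(SUBJECT_PREFIX):].rstrip(".").strip()
    return host or None
```

For example, alert_host("SMARTMon Alert from system.mydomain.com") returns "system.mydomain.com", and any subject without the prefix returns None.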

 

You should take immediate action to minimize the possibility of data loss.

 

In addition, this information is recorded in the Windows Event log or smartmon-ux.log on Windows-family operating systems, or in the standard syslog file on UNIX/Linux. See the -L and -LRemote options to control the names of the log files for your particular operating system. The example below shows what the software reports for a failing SAS disk:

 

# tail /var/log/smartmon-ux

Tue Jun 10 11:11:24 2008: /dev/rdsk/c4t16d0s0 polled at Tue Jun 10 11:20:24 2008 Status:Passed

Tue Jun 10 11:11:24 2008: /dev/rdsk/c4t17d0s0 polled at Tue Jun 10 11:20:24 2008 Status:Passed

Tue Jun 10 11:21:24 2008: /dev/rdsk/c1t1d0s0 polled at Tue Jun 10 11:21:24 2008 Status:Passed

Tue Jun 10 11:21:24 2008: /dev/rdsk/c1t2d0s0 polled at Tue Jun 10 11:21:24 2008 Status:Passed

Tue Jun 10 11:21:24 2008: /dev/rdsk/c4t12d0s0 polled at Tue Jun 10 11:21:24 2008 Status:Passed

Tue Jun 10 11:21:25 2008: /dev/rdsk/c4t13d0s0 polled at Tue Jun 10 11:21:24 2008 Status:Passed

Tue Jun 10 11:21:25 2008: /dev/rdsk/c4t14d0s0 polled at Tue Jun 10 11:21:25 2008 Status:Passed

Tue Jun 10 11:21:25 2008: /dev/rdsk/c4t15d0s0 polled at Tue Jun 10 11:21:25 2008 Status:FAILED - Failure imminent (Predictive Failure Analysis (S.M.A.R.T.) threshold reached)

Tue Jun 10 11:21:25 2008: /dev/rdsk/c4t16d0s0 polled at Tue Jun 10 11:21:25 2008 Status:Passed

Tue Jun 10 11:21:25 2008: /dev/rdsk/c4t17d0s0 polled at Tue Jun 10 11:21:25 2008 Status:Passed
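A log in this form can be scanned for failing disks with a short script. The following is a minimal sketch, not part of SMARTMon-UX: the line layout is inferred only from the sample above and may differ on other platforms or versions, and the names parse_line and failing_devices are hypothetical. It treats a status beginning with "FAILED" as a failure, as in the sample.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the sample log only:
#   <logged time>: <device> polled at <poll time> Status:<status text>
TIMESTAMP = r"\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}"
LINE_RE = re.compile(
    r"^(?P<logged>" + TIMESTAMP + r"): "
    r"(?P<device>\S+) polled at "
    r"(?P<polled>" + TIMESTAMP + r") "
    r"Status:(?P<status>.*)$"
)
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_line(line):
    """Split one log line into device, poll time, and status; None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "polled": datetime.strptime(match.group("polled"), TIME_FORMAT),
        "status": match.group("status").strip(),
    }


def failing_devices(lines):
    """Return parsed records whose status text begins with FAILED."""
    failures = []
    for line in lines:
        record = parse_line(line)
        if record is not None and record["status"].startswith("FAILED"):
            failures.append(record)
    return failures


if __name__ == "__main__":
    # The path is the one used in the example above; adjust for your system.
    with open("/var/log/smartmon-ux") as log:
        for record in failing_devices(log):
            print(record["device"], record["polled"], record["status"])
```

Blank lines and lines that do not fit the assumed layout are skipped rather than reported, so the script does not stop on unrelated log entries.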

 

Enclosure-Related Messages (SES)

If you have a component fail in your SES enclosure, the message text might contain something like:

PSU #0 Critical DC failure [LED ON] XYRATEX SS-1202-FCAL 50-05-0C-C0-00-00-3D-DD

 

The SES code within SMARTMon-UX returns status text messages for all SES pages defined within the specification. Note that not all SES enclosures monitor all components defined in the spec. You should contact your storage vendor to learn which SES components their hardware monitors. Below is a list of components that SMARTMon-UX monitors and reports:

SES Device Status Element (i.e., disk drive status)
SES Array Element (i.e., whether the device is a hot spare, part of a critical array, rebuilding, etc.)
SES Cooling Element (fans, and fan speed)
SES Temperature Element (returns temperature and thermal overtemp/undertemp warnings)
SES Power Element (includes over/under voltage and AC/DC power loss)
SES Door Lock Element (for each device bay)
SES Audible Alarm Status Element (muted, enabled, sounding, etc.)
SES Electronics Status Element
SCC Electronics Status Element
SES Volatile Cache Status Element
SES UPS Status Element (includes battery status, and AC/DC power status)
SES SCSI Port Status Element
SES Language Element Status Element
SES Communication Port Status Element
SES Voltage Sensor Status Element (displays input voltage)
SES Current Sensor Status Element (displays current drawn)
SES SCSI Initiator Port Status Element

 

In addition, if there is an alert, the software will report the make and model of the enclosure along with its world-wide name.

 

Regardless of the message type, SMARTMon-UX will make an entry in either the default system log or, if the program was invoked with the -L option, a log file specific to smartmon-ux.

 

Enclosure-Related Messages (SAF-TE)

If you have a component fail in your SAF-TE enclosure, the message text might contain something like:

Critical - Power Supply #1  Malfunctioning (Commanded on) CNSi JSS122

 

The SAF-TE code within SMARTMon-UX returns status text messages for all SAF-TE devices defined within the specification. Note that not all SAF-TE enclosures monitor all components defined in the spec. You should contact your storage vendor to learn which SAF-TE components their hardware monitors. Below is a list of components that SMARTMon-UX monitors and reports:

Fan Status (Operational; malfunctioning; not installed; unknown)
Power Supply Status (Operational and on; Operational and off; Malfunctioning and commanded on; Malfunctioning and commanded off; Not present; Present; Unknown)
Door Lock Status (Locked; Unlocked; Unknown)
Speaker Status (Off/No Speaker Installed; On)
Temperature (Reports value in degrees Celsius and Fahrenheit)
Device Slot Status, for each device bay (No device inserted in slot; Device inserted in slot; Device power on; Device power off; SCSI ID of device in slot)

 

In addition, if there is an alert, the software will report the make and model of the enclosure along with its world-wide name.

 

Regardless of the message type, SMARTMon-UX will make an entry in either the default system log or, if the program was invoked with the -L option, a log file specific to smartmon-ux.