
Command-Line Invocation and Application Notes
Reasonably current SCSI, FC and SAS disk drives (such as the Seagate 10K.5 family) have a programmable feature that lets the disk scan itself for correctable errors during idle time. If your disk has this firmware capability, you can use the software to enable or disable scanning and to report scan results.
Disable Background Media Scanning
The -bmsd command disables background media scanning.
Usage
Enable Background Media Scanning
The -bmse command enables background media scanning.
Usage
smartmon-ux -bmse n DeviceList
Where: n is the scanning interval in hours. Once the disk is programmed to enable scanning, it automatically begins a new scan each time the interval elapses. If disk power is lost, the timer resets to zero and scanning continues automatically. Send the -bmsd command to stop and disable scanning.
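As an illustration of the usage line above, the following Python sketch assembles the argument list for a -bmse invocation without running it. The helper name build_bmse_command and the rule that n must be a positive whole number are assumptions made for this example; only the command shape (smartmon-ux -bmse n DeviceList) comes from this page.

```python
import shlex


def build_bmse_command(interval_hours, devices, binary="smartmon-ux"):
    """Return the argv list for: smartmon-ux -bmse n DeviceList.

    Hypothetical helper for illustration; it only builds the command.
    """
    # Assumption: the interval is a positive whole number of hours.
    if not isinstance(interval_hours, int) or interval_hours <= 0:
        raise ValueError("interval_hours must be a positive integer")
    if not devices:
        raise ValueError("at least one device path is required")
    return [binary, "-bmse", str(interval_hours), *devices]


argv = build_bmse_command(24, ["/dev/rdsk/c4t12d0s0", "/dev/rdsk/c4t13d0s0"])
# Print the command as it would be typed at a shell prompt.
print(shlex.join(argv))
```

Building the command as a list, rather than a single string, avoids quoting problems if the list is later handed to a process launcher.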
Report Background Media Scan Results
The -bmsr command reports background media scan results.
Usage
The command below was run on a SPARC Solaris 10 system that has 6 SAS disks. We prefixed the command with time so that you can see how quickly it runs. A wildcard selects all disks attached to controller #4.
# time ./smartmon-ux -bmsr /dev/rdsk/c4*s0
SMARTMon-UX [Release 1.36, Build 8-JUN-2008] - Copyright 2001-2008 SANtools(R), Inc. http://www.SANtools.com
Discovered SEAGATE ST3146855SS S/N "3LN23ER0" on /dev/rdsk/c4t12d0s0 (Not Enabling SMART)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:03 2008
Accumulated power-on minutes: 135086 [94 days]
Number of background scans performed: 34
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
0 8 577a4b OK recovered via in-place rewrite Recovered error Recovered data with retries
1 46392 381f8 OK recovered via in-place rewrite Recovered error Recovered data with retries
2 46402 7598a8e OK recovered via in-place rewrite Recovered error Recovered data with retries
3 117139 2cfae2a OK recovered via in-place rewrite Recovered error Recovered data with retries
4 117149 9c9036c OK recovered via in-place rewrite Recovered error Recovered data with retries
5 131136 77b3f4d OK recovered via in-place rewrite Recovered error Recovered data with retries
6 135041 77339d3 OK recovered via in-place rewrite Recovered error Recovered data with retries
Discovered SEAGATE ST3146855SS S/N "3LN2A027" on /dev/rdsk/c4t13d0s0 (Not Enabling SMART)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:03 2008
Accumulated power-on minutes: 134976 [94 days]
Number of background scans performed: 34
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Number of defects reported: 0
Discovered SEAGATE ST3146855SS S/N "3LN29PAS" on /dev/rdsk/c4t14d0s0 (Not Enabling SMART)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:03 2008
Accumulated power-on minutes: 134904 [94 days]
Number of background scans performed: 35
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
0 148 d99d9f7 OK recovered via in-place rewrite Recovered error Recovered data with retries
1 8855 761f75d OK recovered via in-place rewrite Recovered error Recovered data with retries
Discovered SEAGATE ST3146855SS S/N "3LN29ZZ5" on /dev/rdsk/c4t15d0s0 (Not Enabling SMART)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:04 2008
Accumulated power-on minutes: 134325 [93 days]
Number of background scans performed: 35
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
0 133 37fc7 OK recovered via in-place rewrite Recovered error Recovered data with retries
1 117114 2bf620f OK recovered via in-place rewrite Recovered error Recovered data with retries
2 130954 7b ERR waiting for WRITE Controller/drive hardware failed Track following error
3 130954 1c8 ERR waiting for WRITE Controller/drive hardware failed Track following error
4 130954 37fc7 OK recovered via in-place rewrite Recovered error Recovered data with retries
5 131392 37fc8 OK recovered via in-place rewrite Recovered error Recovered data with retries
6 133380 38039 OK recovered via in-place rewrite Recovered error Recovered data with retries
7 133792 d699104 OK recovered via in-place rewrite Recovered error Recovered data with retries
Discovered SEAGATE ST3146855SS S/N "3LN27XJ9" on /dev/rdsk/c4t16d0s0 (Not Enabling SMART)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:04 2008
Accumulated power-on minutes: 134950 [94 days]
Number of background scans performed: 38
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
0 46356 3b46c18 OK recovered via in-place rewrite Recovered error Recovered data with retries
1 133307 80a34 ERR recovered via in-place rewrite Controller/drive hardware failed Track following error
Discovered SEAGATE ST3146855SS S/N "3LN29QG4" on /dev/rdsk/c4t17d0s0 (SMART enabled)(140014 MB)
Background Media Scan Report @ Sun Jun 8 16:33:04 2008
Accumulated power-on minutes: 134993 [94 days]
Number of background scans performed: 35
Background scanning status: medium scan halted, waiting for interval timer expiration
Background scan percentage completed: 0.00
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
0 127 381a8 OK recovered via in-place rewrite Recovered error Recovered data with retries
1 46378 de80f44 OK recovered via in-place rewrite Recovered error Recovered data with retries
2 56468 3a44867 OK recovered via in-place rewrite Recovered error Recovered data with retries
3 86795 a817a7f OK recovered via in-place rewrite Recovered error Recovered data with retries
4 130059 de863e6 OK recovered via in-place rewrite Recovered error Recovered data with retries
5 131031 1e240 ERR waiting for WRITE Controller/drive hardware failed Track following error
6 132850 e01e8c4 OK recovered via in-place rewrite Recovered error Recovered data with retries
7 133350 1f62 ERR waiting for WRITE Controller/drive hardware failed Track following error
8 133350 8034a ERR waiting for WRITE Controller/drive hardware failed Track following error
9 133350 805b4 ERR waiting for WRITE Controller/drive hardware failed Track following error
10 134778 e01e8fa OK recovered via in-place rewrite Recovered error Recovered data with retries
Program Ended.
real 0m1.15s
user 0m0.01s
sys 0m0.02s
#
The PowerOnMins field gives the disk's accumulated power-on time, in minutes, at the moment the defect was logged. The counter is non-volatile and advances only while the disk is powered on. Rows marked ERR correspond to defects that still need repair. These are bad blocks that cannot be read. If the disks are part of a software RAID set, launch a data consistency repair using whatever utility is appropriate for your operating system.
Note that it took a little over one second to report every logged defect, recovered or not, for nearly one terabyte of storage. The blocks it reports were discovered during prior automated background media scans (see the -bmse function in this section).
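If you want to post-process a report like the one above, the following Python sketch shows one way to pull out the defect rows and keep only those marked ERR. It assumes the column layout shown in the sample output (Defect#, PowerOnMins, HexBlockNumber, State, then free text) and that State is always OK or ERR; the function name parse_bmsr and the regular expressions are illustrative, not part of smartmon-ux.

```python
import re

# Assumed patterns, based only on the sample report on this page.
DISCOVERED = re.compile(r"^Discovered .* on (\S+)")
DEFECT_ROW = re.compile(r"^(\d+)\s+(\d+)\s+([0-9a-fA-F]+)\s+(OK|ERR)\s+(.*)$")

SAMPLE = '''\
Discovered SEAGATE ST3146855SS S/N "3LN29ZZ5" on /dev/rdsk/c4t15d0s0 (Not Enabling SMART)(140014 MB)
Defect# PowerOnMins HexBlockNumber State Reassignment Status AdditionalInfo
1 117114 2bf620f OK recovered via in-place rewrite Recovered error Recovered data with retries
2 130954 7b ERR waiting for WRITE Controller/drive hardware failed Track following error
'''


def parse_bmsr(text):
    """Return one dict per defect row, tagged with the device it belongs to."""
    defects = []
    device = None
    for raw in text.splitlines():
        line = raw.strip()
        found = DISCOVERED.match(line)
        if found:
            device = found.group(1)  # rows that follow belong to this device
            continue
        row = DEFECT_ROW.match(line)
        if row and device is not None:
            defects.append({
                "device": device,
                "power_on_mins": int(row.group(2)),
                "block": int(row.group(3), 16),  # HexBlockNumber is hexadecimal
                "state": row.group(4),
                "detail": row.group(5),
            })
    return defects


defects = parse_bmsr(SAMPLE)
unreadable = [d for d in defects if d["state"] == "ERR"]
for d in unreadable:
    print(d["device"], hex(d["block"]), d["detail"])
```

In practice the text would come from a saved copy of the -bmsr output rather than an embedded sample.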
Using Media Scan Results with Software RAID
Background media scanning (BGMS) not only improves data integrity by automatically rewriting failing blocks, but can also provide enough information to construct a script that rebuilds software RAID volumes when the need arises. For example, if you have two disks that mirror each other (RAID-1), and smartmon-ux tells you that block #1234 is bad and unreadable, then you can instruct the operating system to run a consistency repair on the volume to recover. If the -bmsr media scan results show no bad blocks, then there is no need to run a manual check for bad blocks, which could take hours or even days if you have a large storage pool.
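The decision described above can be sketched in a few lines of Python. The example takes (device, state) pairs, as they might be extracted from a -bmsr report, and lists the devices that have at least one ERR entry. The function name devices_needing_repair is hypothetical, and the sketch deliberately stops short of launching any repair, since the right utility depends on your operating system and volume manager.

```python
def devices_needing_repair(records):
    """Return the sorted devices that have at least one ERR defect.

    records: iterable of (device_path, state) pairs, state being "OK" or "ERR".
    """
    return sorted({device for device, state in records if state == "ERR"})


# A few (device, state) pairs taken from the report shown on this page.
records = [
    ("/dev/rdsk/c4t12d0s0", "OK"),
    ("/dev/rdsk/c4t15d0s0", "OK"),
    ("/dev/rdsk/c4t15d0s0", "ERR"),
    ("/dev/rdsk/c4t17d0s0", "ERR"),
    ("/dev/rdsk/c4t17d0s0", "ERR"),
]

to_repair = devices_needing_repair(records)
if to_repair:
    for device in to_repair:
        print("consistency repair suggested for volume containing", device)
else:
    print("no unreadable blocks reported; no manual check needed")
```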
The script FindBadBlocks.sh uses the -bmsr function to enumerate all bad blocks and report them by slice (the equivalent of a partition). The system administrator can then use this output to determine whether a repair is warranted for any particular volume. The script was run against the same Solaris 10 system that supplied the scan results shown above.
./FindBadBlocks.sh
PhysicalDevPath Days:Hrs:Min Offset State
/dev/rdsk/c1t2d0s0 - - OK
/dev/rdsk/c4t12d0s0 0:00:08 577a4b Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 32:05:12 381f8 Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 32:05:22 7598a8e Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 81:08:19 2cfae2a Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 81:08:29 9c9036c Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 91:01:36 77b3f4d Recovered via in-place rewrite
/dev/rdsk/c4t12d0s0 93:18:41 77339d3 Recovered via in-place rewrite
/dev/rdsk/c4t14d0s0 0:02:28 d99d9f7 Recovered via in-place rewrite
/dev/rdsk/c4t14d0s0 6:03:35 761f75d Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 0:02:13 37fc7 Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 81:07:54 2bf620f Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 90:22:34 7b ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t15d0s0 90:22:34 1c8 ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t15d0s0 90:22:34 37fc7 Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 91:05:52 37fc8 Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 92:15:00 38039 Recovered via in-place rewrite
/dev/rdsk/c4t15d0s0 92:21:52 d699104 Recovered via in-place rewrite
/dev/rdsk/c4t16d0s0 32:04:36 3b46c18 Recovered via in-place rewrite
/dev/rdsk/c4t16d0s0 92:13:47 80a34 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 0:02:07 381a8 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 32:04:58 de80f44 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 39:05:08 3a44867 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 60:06:35 a817a7f Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 90:07:39 de863e6 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 90:23:51 1e240 ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t17d0s0 92:06:10 e01e8c4 Recovered via in-place rewrite
/dev/rdsk/c4t17d0s0 92:14:30 1f62 ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t17d0s0 92:14:30 8034a ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t17d0s0 92:14:30 805b4 ERR waiting for WRITE Controller/drive hardware failed Track following error
/dev/rdsk/c4t17d0s0 93:14:18 e01e8fa Recovered via in-place rewrite
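The Days:Hrs:Min column appears to be the PowerOnMins value from the -bmsr report converted to days, hours and minutes. The Python sketch below reproduces the values shown in the listing under that assumption; it is not taken from FindBadBlocks.sh, whose source is not shown here, and the function name format_power_on is illustrative.

```python
def format_power_on(minutes):
    """Convert power-on minutes to a Days:Hrs:Min string such as 32:05:12."""
    days, remainder = divmod(minutes, 24 * 60)   # 1440 minutes per day
    hours, mins = divmod(remainder, 60)
    return f"{days}:{hours:02d}:{mins:02d}"


# PowerOnMins values taken from the -bmsr report for /dev/rdsk/c4t12d0s0.
for power_on_mins in (8, 46392, 135041):
    print(power_on_mins, "->", format_power_on(power_on_mins))
```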