
Detecting Bad Disk Sectors on Linux

 2 years ago
source link: https://owenyk.github.io/2021/11/26/Linux%E6%A3%80%E6%B5%8B%E7%A3%81%E7%9B%98%E5%9D%8F%E9%81%93/#comment-waline

I. Using the badblocks command

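badblocks [-svw] [-b <block size>] [-o <output file>] <device> [last block] [first block]  

Options:

-b <block size>: block size in bytes (badblocks defaults to 1024).  
-o <output file>: write the list of bad blocks to the given file.  
-s: show progress while checking.  
-v: print verbose information.  
-w: perform a write test while checking (this overwrites the data on the device).  
<device>: the disk device to check.  
[last block]: the last block to check; defaults to the last block on the device.  
[first block]: the block to start checking from.  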
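
Because the write test (-w) overwrites whatever is on the device, it may be worth confirming that the target is not mounted before running it. The following is a minimal sketch, assuming a Linux system where /proc/mounts is readable. `is_mounted` is a hypothetical helper and /dev/sdX is a placeholder, not a device from this article. The snippet only prints what it would do and never calls badblocks.

```shell
# Hypothetical helper: succeed if the given device appears as a mount source.
# It matches the exact device name only, so a mounted partition (e.g. /dev/sdX1)
# is not detected when the whole disk (/dev/sdX) is passed.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' /proc/mounts
}

dev=/dev/sdX   # placeholder; replace with a device listed by fdisk -l

if is_mounted "$dev"; then
    echo "$dev is mounted; refusing to run a write test"
else
    # Dry run: print the command instead of executing it.
    echo "would run: badblocks -s -v -w $dev"
fi
```

Printing the command rather than executing it keeps the sketch harmless; replace the echo with the real call only after double-checking the device name.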

Reference: Linux badblocks command

1. List disk information

user@user-pc:~$ sudo fdisk -l
Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: nal USB 3.0
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x627a0fc4
Device Boot Start End Sectors Size Id Type
/dev/sda1 * 2048 41840639 41838592 20G 83 Linux
/dev/sda2 41842686 1953523711 1911681026 911.6G f W95 Ext'd (LBA)
/dev/sda5 222707712 419434495 196726784 93.8G 7 HPFS/NTFS/exFAT
/dev/sda6 419436544 838868991 419432448 200G 7 HPFS/NTFS/exFAT
/dev/sda7 838871040 1187000319 348129280 166G 7 HPFS/NTFS/exFAT
/dev/sda8 1187002368 1396715519 209713152 100G 7 HPFS/NTFS/exFAT
/dev/sda9 1396717568 1953523711 556806144 265.5G 7 HPFS/NTFS/exFAT
/dev/sda10 41842688 56438783 14596096 7G 83 Linux
/dev/sda11 56440832 58439679 1998848 976M 82 Linux swap / Solaris
/dev/sda12 58441728 61046783 2605056 1.2G 83 Linux
/dev/sda13 61048832 222693375 161644544 77.1G 83 Linux

2. Save the bad-block scan results to a file:

user@user-pc:~$ sudo badblocks -v /dev/sda8 > /tmp/badsectors.txt # alternatively: sudo badblocks -s -v /dev/sda8 -o badsectors.txt
Checking blocks 0 to 104856575
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
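
The range "0 to 104856575" in this output can be cross-checked against the fdisk listing: /dev/sda8 has 209713152 sectors of 512 bytes, and badblocks counts in 1024-byte blocks when -b is not given. A small arithmetic sketch of that conversion (the variable names are illustrative only):

```shell
sectors=209713152    # sector count of /dev/sda8 from the fdisk listing
sector_size=512      # logical sector size reported by fdisk
block_size=1024      # badblocks default when -b is not given

blocks=$(( sectors * sector_size / block_size ))
last_block=$(( blocks - 1 ))

echo "blocks=$blocks last_block=$last_block"
# prints: blocks=104856576 last_block=104856575
```

The same conversion applies to any block numbers badblocks reports: they are in units of the block size used for the scan, not in sectors.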

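3. Repair (unverified)

(1) First back up as much of each bad block as possible with dd:
dd if=/dev/sdb bs=4096 skip=<bad block number from the scan> of=/tmp/15435904.dat count=1
(2) Rewrite the bad blocks (warning: the -w write test overwrites data):
badblocks -w -f /dev/sdb <last bad block> <first bad block>
(3) If the backup to /tmp/15435904.dat in step (1) succeeded, write it back:
dd if=/tmp/15435904.dat of=/dev/sdb seek=15435904 bs=4096 count=1
Note: the block numbers only line up if badblocks uses the same block size as dd (pass -b 4096 to badblocks, since its default is 1024).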

Reference: Repairing bad sectors online with badblocks on Linux
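
When the scan reports more than one bad block, the backup step has to be repeated per block. The sketch below prints (but does not run) one dd backup command for each block number in a badblocks output file. `print_backup_cmds` is a hypothetical helper, and it assumes the scan was run with the same block size that is passed to dd.

```shell
# Hypothetical helper: print one dd backup command per listed block.
# Arguments: <file with one block number per line> <device> <block size>
print_backup_cmds() {
    list=$1
    dev=$2
    bs=$3
    while read -r blk; do
        [ -n "$blk" ] || continue   # skip empty lines
        echo "dd if=$dev of=/tmp/$blk.dat bs=$bs skip=$blk count=1"
    done < "$list"
}

# Demo with the block number used in the steps above.
list=$(mktemp)
printf '%s\n' 15435904 > "$list"
print_backup_cmds "$list" /dev/sdb 4096
# prints: dd if=/dev/sdb of=/tmp/15435904.dat bs=4096 skip=15435904 count=1
rm -f "$list"
```

Reviewing the printed commands before running them is a cheap safeguard, since a wrong device name or block size would read the wrong data.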

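You can also tell the file system not to store data on the bad blocks:
# e2fsck -l /tmp/badsectors.txt /dev/sdb

Note: the file system must be unmounted before running e2fsck; checking a mounted file system may damage it. The block numbers in the list must also be based on the file system's block size.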

II. Scanning with smartmontools:

1. Installation

user@user-pc:~$ sudo apt update
Hit:1 http://packages.microsoft.com/repos/code stable InRelease
...
Hit:12 https://mirror.serverion.com/mariadb/repo/10.6/debian bullseye InRelease
Fetched 39.4 kB in 7s (5,607 B/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
user@user-pc:~$ sudo apt install smartmontools
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Suggested packages:
gsmartcontrol smart-notifier mailx | mailutils
The following NEW packages will be installed:
smartmontools
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 565 kB of archives.
After this operation, 2,168 kB of additional disk space will be used.
Get:1 http://mirrors.tuna.tsinghua.edu.cn/debian bullseye/main amd64 smartmontools amd64 7.2-1 [565 kB]
Fetched 565 kB in 0s (1,212 kB/s)
Selecting previously unselected package smartmontools.
(Reading database ... 172754 files and directories currently installed.)
Preparing to unpack .../smartmontools_7.2-1_amd64.deb ...
Unpacking smartmontools (7.2-1) ...
Setting up smartmontools (7.2-1) ...
Created symlink /etc/systemd/system/smartd.service → /lib/systemd/system/smartmontools.service.
Created symlink /etc/systemd/system/multi-user.target.wants/smartmontools.service → /lib/systemd/system/smartmontools.service.
Processing triggers for man-db (2.9.4-2) ...

2. Usage

Use the device paths listed by sudo fdisk -l.
(1) Check a partition of a disk

user@user-pc:~$ sudo smartctl -H /dev/sda8 -x
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-9-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: TO Exter
Product: nal USB 3.0
Revision: 0104
Compliance: SPC-4
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is fully provisioned
Logical Unit id: 0x3020150331000810
Serial number: 2015033100081
Device type: disk
Local Time is: Fri Nov 26 09:43:16 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
Read Cache is: Enabled
Writeback Cache is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C

Error Counter logging not supported

No Self-tests have been logged

Device does not support Background scan results logging


(2) Check an attached disk

user@user-pc:~$ sudo smartctl -H /dev/sda -x
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-9-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD10EZEX-08M2NA0
Serial Number: WD-WCC3F5NUNC7S
LU WWN Device Id: 5 0014ee 20bf6d4ff
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 26 09:44:39 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unknown

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
user@user-pc:~$ sudo smartctl -H /dev/sda -a
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-9-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD10EZEX-08M2NA0
Serial Number: WD-WCC3F5NUNC7S
LU WWN Device Id: 5 0014ee 20bf6d4ff
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 26 12:25:59 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (11580) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 120) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   173   165   021    Pre-fail  Always       -       2308
  4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always       -       6265
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4052
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       857
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       106
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6159
194 Temperature_Celsius     0x0022   106   085   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
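
For bad-sector checks, the attributes in this report that relate most directly to damaged sectors are Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable; all three have a raw value of 0 here. As a rough sketch, they can be pulled out of the attribute table with awk. `bad_sector_attrs` is a hypothetical helper, and the demo input is copied from the output above so the snippet runs without a disk.

```shell
# Hypothetical helper: read a SMART attribute table on stdin and print
# NAME=RAW_VALUE for the sector-related attributes (field 2 is the name,
# field 10 is the first word of RAW_VALUE).
bad_sector_attrs() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $10 }'
}

bad_sector_attrs <<'EOF'
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
9 Power_On_Hours 0x0032 095 095 000 Old_age Always - 4052
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
EOF
# prints:
# Reallocated_Sector_Ct=0
# Current_Pending_Sector=0
# Offline_Uncorrectable=0
```

On a real system the same function could be fed from the tool directly, for example: sudo smartctl -A /dev/sda | bad_sector_attrs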


(3) Check another attached disk

user@user-pc:~$ sudo smartctl -H /dev/sdb -a
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-9-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST500DM002-1BD142
Serial Number: W2AYVPQG
LU WWN Device Id: 5 000c50 06add6856
Firmware Version: KC65
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 26 12:32:11 2021 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 592) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 79) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   099   006    Pre-fail  Always       -       353256
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   098   098   020    Old_age   Always       -       2889
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       611100360
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       7963
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   020    Old_age   Always       -       2870
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   062   044   045    Old_age   Always   In_the_past 38 (0 59 38 13 0)
194 Temperature_Celsius     0x0022   038   056   000    Old_age   Always       -       38 (0 8 0 0 0)
195 Hardware_ECC_Recovered  0x001a   034   026   000    Old_age   Always       -       353256
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       7949h+37m+36.458s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1573025949
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3066682191

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
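
The hint "See vendor-specific Attribute list for marginal Attributes" in this report points at the WHEN_FAILED column: an entry other than "-" means the normalized value is, or once was, at or below its threshold. Here that is Airflow_Temperature_Cel, whose WORST value 044 is below THRESH 045. A minimal sketch for listing such attributes follows; `failed_attrs` is a hypothetical helper, and the demo lines are copied from the output above.

```shell
# Hypothetical helper: print attributes whose WHEN_FAILED column (field 9)
# is not "-". Only rows that start with a numeric ID and have at least
# ten fields are treated as attribute rows.
failed_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $9 != "-" { print $2 ": " $9 }'
}

failed_attrs <<'EOF'
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 062 044 045 Old_age Always In_the_past 38 (0 59 38 13 0)
194 Temperature_Celsius 0x0022 038 056 000 Old_age Always - 38 (0 8 0 0 0)
EOF
# prints: Airflow_Temperature_Cel: In_the_past
```

In this case the flagged attribute concerns past overheating rather than bad sectors; the sector-related attributes (5, 197, 198) all show a raw value of 0 for this disk.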



