Environment: (physical machine, LAN/public network, system/app version 9.29)
Bug description: (After updating to 9.29, the system reported that one disk had dropped out. Two disks did indeed have SMART error logs, but they had not noticeably affected use before. I then chose "repair", which failed with a message that the storage space was damaged and I should power off and reseat the disks. After reseating, a different disk became very slow to respond and could no longer be recognized no matter how I power-cycled the machine. I have since replaced the data backplane with direct data-cable connections and tried the recovery method posted on the forum; the system now shows the storage space as normal, but it cannot be mounted or used.
Disk information:
sudo lsblk -fp
NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
/dev/sda
└─/dev/sda1 linux_raid_member 1.2 HS1581:1 383dc4df-e27f-a8b4-2a65-31ff2ef7d1cd
  └─/dev/md1 LVM2_member LVM2 001 P2a4Ep-Q3cV-bI5C-v3N0-Eukr-BTl2-hcB2BY
    └─/dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0 ext4 1.0 5afaba2f-8b8d-4012-a51c-d3172b6404fb
/dev/sdb
└─/dev/sdb1 linux_raid_member 1.2 HS1581:1 383dc4df-e27f-a8b4-2a65-31ff2ef7d1cd
  └─/dev/md1 LVM2_member LVM2 001 P2a4Ep-Q3cV-bI5C-v3N0-Eukr-BTl2-hcB2BY
    └─/dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0 ext4 1.0 5afaba2f-8b8d-4012-a51c-d3172b6404fb
/dev/sdc
└─/dev/sdc1 linux_raid_member 1.2 HS1581:1 383dc4df-e27f-a8b4-2a65-31ff2ef7d1cd
  └─/dev/md1 LVM2_member LVM2 001 P2a4Ep-Q3cV-bI5C-v3N0-Eukr-BTl2-hcB2BY
    └─/dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0 ext4 1.0 5afaba2f-8b8d-4012-a51c-d3172b6404fb
/dev/sdd
├─/dev/sdd1 vfat FAT32 6871-F86D 84.5M 9% /boot/efi
├─/dev/sdd2 ext4 1.0 ca49fd18-feb1-49f8-89ec-e5f880d15346 48.7G 17% /
└─/dev/sdd3 linux_raid_member 1.2 HS1581:0 8f06b786-c11c-6e73-09bd-20058e7cb110
  └─/dev/md0 LVM2_member LVM2 001 TBqs4Q-CBg9-oxeq-9blv-dxZf-tKm8-GEsk9H
    └─/dev/mapper/trim_f22d4207_7928_4fd5_8755_20495e9bf1ef-0 btrfs 4386d4f6-c9d7-4b61-a33d-32e8bab370cb 40.5G 24% /vol1
/dev/sde
└─/dev/sde1 linux_raid_member 1.2 HS1581:1 383dc4df-e27f-a8b4-2a65-31ff2ef7d1cd
  └─/dev/md1 LVM2_member LVM2 001 P2a4Ep-Q3cV-bI5C-v3N0-Eukr-BTl2-hcB2BY
    └─/dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0 ext4 1.0 5afaba2f-8b8d-4012-a51c-d3172b6404fb
After assembling with the 3 disks that had matching event counts, the result is as above: the storage space cannot be mounted automatically or through the web UI. At that stage I didn't yet know that cat /proc/mdstat could show the details, so after the assembly failed I also tried powering off and retrying; every time I added the last disk, it was reported as a spare with slot -1.
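Before forcing an assembly like the one below, the per-member RAID superblocks can be compared to see which disks agree on the event count and which one was demoted to a spare. A minimal sketch, using standard mdadm options (nothing here is specific to this NAS):

```shell
# For each member partition, print the superblock lines that matter:
# "Events" shows how far in sync the member is, and "Device Role"
# shows whether it holds an active slot or is a spare.
for p in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1; do
    echo "== $p =="
    sudo mdadm --examine "$p" | grep -E 'Events|Device Role|Array State'
done
```

A member whose event count lags behind the others (as sdb1 does below, 62407 vs 62805) is the one `--assemble --force` has to bump; a member reporting a spare role is the one that keeps showing up with slot -1.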
Manual start:
xms@HS1581:$ sudo mdadm --assemble --force -v /dev/md1 /dev/sda1 /dev/sdb1 /dev/sdc1
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda1 is identified as a member of /dev/md1, slot 2.
mdadm: /dev/sdb1 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md1, slot 3.
mdadm: forcing event count in /dev/sdb1(1) from 62407 up to 62805
mdadm: no uptodate device for slot 0 of /dev/md1
mdadm: added /dev/sda1 to /dev/md1 as 2
mdadm: added /dev/sdc1 to /dev/md1 as 3
mdadm: added /dev/sdb1 to /dev/md1 as 1
mdadm: /dev/md1 has been started with 3 drives (out of 4).
xms@HS1581:$ mdadm -Ds
-bash: mdadm: command not found
xms@HS1581:~$ sudo mdadm -Ds
ARRAY /dev/md/0 metadata=1.2 name=HS1581:0 UUID=8f06b786:c11c6e73:09bd2005:8e7cb110
ARRAY /dev/md1 metadata=1.2 name=HS1581:1 UUID=383dc4df:e27fa8b4:2a6531ff:2ef7d1cd
Manually adding the fourth disk (sde) reports that it has become a spare:
sudo mdadm --add /dev/md1 /dev/sde1
mdadm: add new device failed for /dev/sde1 as 4: Device or resource busy
After that a rebuild can be seen, but at only 100-odd KB/s it is clearly abnormal. Is there anything else I can try?
sudo cat /proc/mdstat
[sudo] password for xms:
Personalities : [raid1] [linear] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde1[4] sdb1[1] sdc1[3] sda1[2]
23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
[>....................] recovery = 0.0% (379652/7813893120) finish=960662.0min speed=135K/sec
bitmap: 0/59 pages [0KB], 65536KB chunk
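A rebuild crawling at ~135 KB/s more often indicates a member disk struggling with reads (consistent with the SMART error logs mentioned above) than a tuning problem, but the kernel's md sync throttle can at least be ruled out first. A sketch using the standard md sysctls; the values written here are arbitrary examples, not recommendations from the vendor:

```shell
# Current per-device resync throttle, in KB/s
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# Raise the floor so the rebuild is not being capped artificially
echo 50000 | sudo tee /proc/sys/dev/raid/speed_limit_min

# If the speed does not improve, look for read errors or command
# timeouts on the member disks in the kernel log
sudo dmesg | grep -iE 'i/o error|ata[0-9]+\.|timeout' | tail -n 30
```

If dmesg shows repeated read errors or resets on one drive while the others stay quiet, the rebuild is being stalled by that drive, and no throttle setting will help.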
Manual mount attempt reports:
Disk /dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0: 21.83 TiB, 24004274421760 bytes, 46883348480 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 65536 bytes / 196608 bytes
sudo mount /dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0 /vol22
mount: /vol22: mount(2) system call failed: Structure needs cleaning.
dmesg(1) may have more information after failed mount system call.
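The "Structure needs cleaning" error (EUCLEAN) means the kernel found corrupt ext4 metadata, so the mount will keep failing until the filesystem is repaired. A cautious sketch: run a read-only check first, and only attempt an actual repair after the rebuild has completed (and ideally after imaging the volume), since repairing a filesystem on a degraded, still-rebuilding array risks making things worse. The device path is the one from the output above:

```shell
DEV=/dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0

# Read-only pass: -f forces a full check, -n answers "no" to every
# repair prompt, so this only reports the damage without writing
sudo e2fsck -fn "$DEV"

# Only once the resync is finished and the data is backed up or imaged
# would a real repair run make sense:
#   sudo e2fsck -fy "$DEV"
```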
Finally I tried this command. It looks like there may still be errors; the listing hangs at this point and makes no further progress.
sudo dumpe2fs /dev/mapper/trim_6b592a70_df7e_4cb7_a99e_51d7718e77f4-0
dumpe2fs 1.47.0 (5-Feb-2023)
Filesystem volume name: &lt;none&gt;
Last mounted on: /app/logs
Filesystem UUID: 5afaba2f-8b8d-4012-a51c-d3172b6404fb
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize quota bigalloc metadata_csum project
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366280704
Block count: 5860418560
Reserved block count: 25610
Overhead clusters: 1448775
Free blocks: 807425776
Free inodes: 366091437
First block: 0
Block size: 4096
Cluster size: 65536
Group descriptor size: 64
Blocks per group: 524288
Clusters per group: 32768
Inodes per group: 32768
Inode blocks per group: 2048
RAID stride: 16
RAID stripe width: 48
Flex block group size: 16
Filesystem created: Thu Jul 17 13:14:46 2025
Last mount time: Thu Oct 2 00:19:35 2025
Last write time: Thu Oct 2 00:30:37 2025
Mount count: 37
Maximum mount count: -1
Last checked: Thu Jul 17 13:14:46 2025
Check interval: 0 (&lt;none&gt;)
Lifetime writes: 20 TB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 7e9020a5-746d-4fe6-bc57-22f516839f94
Journal backup: inode blocks
User quota inode: 3
Group quota inode: 4
Project quota inode: 12
Checksum type: crc32c
Checksum: 0x5a1f88cb
Journal features: journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size: 1024M
Total journal blocks: 262144
Max transaction length: 262144
Fast commit length: 0
Journal sequence: 0x000329d8
Journal start: 0
Journal checksum type: crc32c
Journal checksum: 0x4c89d819
)
Frequency: (reproduces every time)
Contact: (18510957998)
Log file: (shared via netdisk: Debug_Log_20251009205626.7z
Link: https://pan.baidu.com/s/1MKkUWBgDTUGsh2s2zRNwrQ extraction code: fkxq)