bash-3.2# zpool status
pool: rpool
state: ONLINE
scan: resilvered 144G in 2h10m with 0 errors on Mon Dec 11 13:56:43 2017
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c0t0d0s0 ONLINE 0 0 0
c0t1d0s0 ONLINE 0 0 0
II. Fault description
One disk in the storage pool rpool has failed and needs to be replaced:
bash-3.2# zpool status
pool: rpool
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device repaired.
scan: resilvered 197G in 2h24m with 0 errors on Fri May 12 19:15:56 2017
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
c0t0d0s0 FAULTED 9 618 0 too many errors
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
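The faulted device can also be picked out of the status text by script. The following is a minimal sketch, not part of the original procedure: the helper name `list_unhealthy_devices` is hypothetical, and it assumes the `NAME STATE READ WRITE CKSUM` column layout shown above. It reports only entries in the FAULTED, OFFLINE, UNAVAIL or REMOVED states.

```shell
# List devices in `zpool status` output (read from stdin) whose state
# is FAULTED, OFFLINE, UNAVAIL or REMOVED.
# Hypothetical helper; it relies only on the column layout shown above.
list_unhealthy_devices() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the config table.
        $1 == "errors:" { in_config = 0 }
        # Inside the table, print name and state of unhealthy entries.
        in_config && $2 ~ /^(FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ { print $1, $2 }
    '
}

# Usage on a live system (not executed here):
#   zpool status rpool | list_unhealthy_devices
```

On Solaris 10 the default `/usr/bin/awk` is an older implementation, so `nawk` or `/usr/xpg4/bin/awk` may be a safer choice there. For a quick check without any parsing, `zpool status -x` reports only pools that have problems.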
{0} ok boot disk1 (Boot from disk1; replace the failed disk only after the system is up and running from disk1.)
3. Replace the disk
bash-3.2# zpool offline rpool c0t0d0s0
bash-3.2# zpool status
pool: rpool
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 2h7m with 0 errors on Thu Jan 24 20:31:42 2019
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
c0t0d0s0 OFFLINE 0 0 0
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
bash-3.2# df -h
Filesystem size used avail capacity Mounted on
rpool/ROOT/s10s_u11wos_24a
274G 109G 145G 43% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 92G 448K 92G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
fd 0K 0K 0K 0% /dev/fd
swap 92G 64K 92G 1% /tmp
swap 92G 72K 92G 1% /var/run
rpool/export 274G 34K 145G 1% /export
rpool/export/home 274G 13M 145G 1% /export/home
rpool 274G 106K 145G 1% /rpool
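Pull out the old disk and insert the new one.
4. Create the disk partition table
To mirror the system disk, the two disks must have identical partition tables.
bash-3.2# devfsadm -C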
bash-3.2# format
Searching for disks...done
c0t0d0: configured with capacity of 279.38GB
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>
/pci@0,600000/pci@0/scsi@1/sd@0,0
1. c0t1d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>
/pci@0,600000/pci@0/scsi@1/sd@1,0
2. c3t60060E8005638900000063890000000Cd0 <HITACHI-OPEN-V*19 -SUN-6008-4.94TB>
/scsi_vhci/ssd@g60060e8005638900000063890000000c
3. c3t60060E80056389000000638900000000d0 <HITACHI-OPEN-V*6 -SUN-6008-1.56TB>
/scsi_vhci/ssd@g60060e80056389000000638900000000
4. c3t60060E80056389000000638900000008d0 <HITACHI-OPEN-V*4 -SUN-6008-1.04TB> oracle
/scsi_vhci/ssd@g60060e80056389000000638900000008
5. c3t600507640081002FC0000000000000FCd0 <IBM-2145-0000-5.00TB>
/scsi_vhci/ssd@g600507640081002fc0000000000000fc
Specify disk (enter its number): 0
selecting c0t0d0
[disk formatted]
Disk not labeled. Label it now? y # A new disk must be labeled; disks over 2 TB can take an EFI label (labeling erases the data on the disk)
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> p
PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> p
Current partition table (original):
Total disk cylinders available: 46873 + 2 (reserved cylinders)
Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 20 128.17MB (21/0/0) 262500
1 swap wu 21 - 41 128.17MB (21/0/0) 262500
2 backup wu 0 - 46872 279.38GB (46873/0/0) 585912500
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 usr wm 42 - 46872 279.13GB (46831/0/0) 585387500
7 unassigned wm 0 0 (0/0/0) 0
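The table printed above is the default label of the new disk, not yet a copy of the surviving disk's layout. One common way to make the two tables identical on Solaris is to pipe the `prtvtoc` output of the healthy disk into `fmthard`. The sketch below is not taken from the original session. It assumes both disks have the same geometry and SMI labels, that c0t1d0 is the healthy disk and c0t0d0 the replacement, and it uses a hypothetical helper `copy_vtoc` that only prints the pipeline unless `DRY_RUN=0` is set.

```shell
# Copy the VTOC (slice table) from the healthy mirror half to the new disk.
# Hypothetical helper; with DRY_RUN=1 (the default) it only prints the
# pipeline so that it can be reviewed before anything is written.
copy_vtoc() {
    src=$1   # healthy disk, e.g. c0t1d0
    dst=$2   # replacement disk, e.g. c0t0d0
    # Slice 2 conventionally covers the whole disk on an SMI-labeled disk.
    cmd="prtvtoc /dev/rdsk/${src}s2 | fmthard -s - /dev/rdsk/${dst}s2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        sh -c "$cmd"
    fi
}

# Dry run: prints the command instead of executing it.
copy_vtoc c0t1d0 c0t0d0
```

After running it for real (`DRY_RUN=0`), printing the partition table of both disks in `format` should show the same slices on each.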
bash-3.2# zpool status
pool: rpool
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
scan: resilvered 197G in 2h24m with 0 errors on Fri May 12 19:15:56 2017
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
c0t0d0s0 FAULTED 9 618 0 too many errors
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
Add the new disk to the storage pool to rebuild the mirror; the data copy (resilver) starts:
bash-3.2# zpool replace rpool c0t0d0s0
Make sure to wait until resilver is done before rebooting.
bash-3.2# zpool status
pool: rpool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Dec 11 11:46:11 2017
12.4M scanned out of 144G at 906K/s, 46h25m to go
12.2M resilvered, 0.01% done
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
replacing-0 DEGRADED 0 0 0
c0t0d0s0/old FAULTED 9 618 0 too many errors
c0t0d0s0 ONLINE 0 0 0 (resilvering)
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
bash-3.2# iostat -xn 3
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 4.8 39.9 79.3 0.0 0.2 0.0 28.1 0 3 c0t0d0
1.5 6.7 35.0 114.8 0.0 0.2 0.0 23.5 0 3 c0t1d0
0.0 0.0 0.7 0.0 0.0 0.0 0.0 2.8 0 0 c3t60060E8005638900000063890000000Cd0
2.0 1.0 1.0 0.5 0.0 0.0 0.0 0.3 0 0 c3t60060E80056389000000638900000008d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 0 c3t60060E80056389000000638900000000d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t600507640081002FC0000000000000FCd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 rdms02b:vold(pid523)
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 358.8 0.0 10546.3 0.0 8.4 0.0 23.3 0 98 c0t0d0
475.3 37.0 10446.3 392.8 0.0 2.2 0.0 4.2 0 56 c0t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E8005638900000063890000000Cd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E80056389000000638900000008d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E80056389000000638900000000d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t600507640081002FC0000000000000FCd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 rdms02b:vold(pid523)
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 405.2 0.0 11338.2 0.0 9.1 0.0 22.4 0 100 c0t0d0
647.6 0.0 11317.5 0.0 0.0 1.4 0.0 2.1 0 44 c0t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E8005638900000063890000000Cd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E80056389000000638900000008d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060E80056389000000638900000000d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t600507640081002FC0000000000000FCd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 rdms02b:vold(pid523)
bash-3.2# zpool status
pool: backup
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
backup ONLINE 0 0 0
c3t60060E8005638900000063890000000Cd0 ONLINE 0 0 0
errors: No known data errors
pool: rpool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Dec 11 11:46:11 2017
107G scanned out of 144G at 17.5M/s, 0h36m to go
107G resilvered, 73.92% done
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
replacing-0 DEGRADED 0 0 0
c0t0d0s0/old FAULTED 9 618 0 too many errors
c0t0d0s0 ONLINE 0 0 0 (resilvering)
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
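Rather than rerunning `zpool status` by hand, the progress figure can be extracted from the output. This is a minimal sketch under the assumption that the progress line keeps the `... resilvered, NN.NN% done` form seen above; the helper name `resilver_percent` is hypothetical.

```shell
# Print the resilver completion percentage found in `zpool status`
# output read from stdin; prints nothing when no resilver is running.
# Hypothetical helper; it assumes a "<size> resilvered, NN.NN% done" line.
resilver_percent() {
    sed -n 's/.*resilvered, \([0-9.][0-9.]*\)% done.*/\1/p'
}

# Usage on a live system (not executed here): poll once a minute
# until the "resilver in progress" line disappears.
#   while zpool status rpool | grep "resilver in progress" >/dev/null; do
#       echo "resilver: $(zpool status rpool | resilver_percent)% done"
#       sleep 60
#   done
```

As the `zpool replace` message says, the system should not be rebooted until the resilver has finished.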
Once the data synchronization has finished, the status looks like this and the disk replacement is complete:
bash-3.2# zpool status
pool: backup
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
backup ONLINE 0 0 0
c3t60060E8005638900000063890000000Cd0 ONLINE 0 0 0
errors: No known data errors
pool: rpool
state: ONLINE
scan: resilvered 144G in 2h10m with 0 errors on Mon Dec 11 13:56:43 2017
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c0t0d0s0 ONLINE 0 0 0
c0t1d0s0 ONLINE 0 0 0
errors: No known data errors
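The original session does not show a boot block being written to the new disk. On Solaris 10 SPARC, a replaced root-pool disk usually needs `installboot` run against its root slice before the system can boot from it, so it is worth confirming this before the boot test. The sketch below only builds the command line. The helper name `installboot_cmd` is hypothetical, and the platform directory is taken from `uname -i` unless passed explicitly.

```shell
# Print the installboot command for a ZFS root slice on SPARC.
# Hypothetical helper; it only builds the command and runs nothing.
#   $1: root slice of the new disk (c0t0d0s0 in the transcript above)
#   $2: platform name; defaults to the output of `uname -i`
installboot_cmd() {
    slice=$1
    platform=${2:-$(uname -i)}
    echo "installboot -F zfs /usr/platform/${platform}/lib/fs/zfs/bootblk /dev/rdsk/${slice}"
}

# Usage on the Solaris host (not executed here):
#   installboot_cmd c0t0d0s0        # review the printed command, then run it as root
```

Printing the command first makes it easy to check the slice name and the bootblk path before anything is written to the disk.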
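6. Disk test
Reboot the system and, from the XSCF console (at the OpenBoot ok prompt), boot from the newly replaced disk. If the system boots without problems, the disk replacement was successful: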
{0} ok boot disk0
Boot device: /pci@0,600000/pci@0/scsi@1/disk@0 File and args: