ZFS RAIDZ disk change
2018-11-13
Here are some notes on replacing a failing disk in a RAIDZ pool.
This has been tested on FreeBSD 11.2. It may work with other versions,
but check gpart(8), zpool(8) and the handbook to be sure.
My NAS runs FreeBSD 11.2 with zroot on 4x3TB disks in raidz1. Some days ago one of those disks started to report quite a few SMART errors. ZFS itself did not report any errors, but I prefer to change the disk while it still works. It's probably faster (a copy instead of a rebuild) and safer, as one does not risk the failing disk dying while the RAID rebuilds.
In this particular case ada2 was failing, and ada4 was the new disk.
The device names will change once the failing disk is removed, but I
don't care as I use GPT labels.
I don't like GPT GUID labels or DiskID labels (although I see the point
of the latter when you have a bunch of disks ...). So, I have this in
/boot/loader.conf:
kern.geom.label.gptid.enable="0"
kern.geom.label.disk_ident.enable="0"
The first thing is to create the GPT partition table on the new disk:
gpart create -s GPT ada4
And then replicate the same partition scheme on the new disk (in my particular case the replacement disk is the same model as the one being replaced):
gpart backup ada2 | gpart restore -F ada4
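The dump that travels through that pipe is plain text: per gpart(8), the first line names the scheme and the entry count, and each following line describes one partition (index, type, start offset and size). As a sketch of what the format looks like, here is a count of partitions in a dump with made-up values (not this host's real table):

```shell
# Count partitions in a `gpart backup`-style dump: the first line names
# the scheme, each following line is one partition. The offsets and
# sizes below are hypothetical, not taken from the real disks.
dump='GPT 152
1 freebsd-boot 40 1024
2 freebsd-swap 2048 4194304
3 freebsd-zfs 4196352 5856028672'
echo "$dump" | awk 'NR > 1 { n++ } END { print n " partitions" }'
# prints: 3 partitions
```

On a live system you would feed `gpart backup ada2` into the awk instead of the sample variable.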
This only replicates the partition scheme, not the labels, so those have to be set manually:
gpart modify -i 3 -l zfs4 ada4
gpart modify -i 2 -l swap4 ada4
gpart modify -i 1 -l gptboot4 ada4
As you can see in my scheme, each disk has a boot partition, a swap partition and another partition which is part of the zpool.
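The three relabel commands above follow a fixed pattern (label base plus disk number), so they can be scripted. A dry-run sketch that only prints the commands, assuming this host's gptbootN/swapN/zfsN naming convention:

```shell
# Dry-run sketch: print the gpart commands that relabel ada4's three
# partitions following the <base><disknum> convention used on this host.
# Remove the `echo` to actually run them.
disk=ada4; n=4
i=1
for base in gptboot swap zfs; do
    echo gpart modify -i "$i" -l "${base}${n}" "$disk"
    i=$((i + 1))
done
```

The order differs from the manual commands above (index 1 first instead of 3), but gpart does not care which partition is relabeled first.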
At this time, we're ready to replace the disk:
zpool replace zroot gpt/zfs2 gpt/zfs4
This can take a lot of time. It all depends on your hardware. In my case it took over 10h.
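While it runs, the scan line of `zpool status` reads "resilver in progress" and shows how far along it is. A small helper to check for that, as a sketch (the helper name is mine, not a ZFS tool):

```shell
# resilver_active: read `zpool status` output on stdin and succeed
# while a resilver is still running (the scan line says
# "resilver in progress" until it completes).
resilver_active() {
    grep -q 'resilver in progress'
}
```

On a live system you could poll with it, e.g. `while zpool status zroot | resilver_active; do sleep 60; done`.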
It's a good idea to set up the bootloader on the new disk now:
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada4
Once it finishes, everything is back to normal:
  pool: zroot
 state: ONLINE
  scan: resilvered 2.15T in 10h23m with 0 errors on Tue Nov 13 04:31:35 2018
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
            gpt/zfs4  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0

errors: No known data errors
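With GPT labels and more than a few disks, it can be handy to pull just the device and state columns out of that output. A sketch, fed here with the post-resilver status shown above:

```shell
# Extract "device state" pairs from `zpool status` config output.
# The sample is the post-resilver status of this pool.
status='        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
            gpt/zfs4  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0'
echo "$status" | awk '$1 ~ /^gpt\// { print $1, $2 }'
# prints one "gpt/zfsN ONLINE" line per disk
```

On a live system you would pipe `zpool status` straight into the awk.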
As a bonus, these commands can help a lot in getting information about the disks, partitions and status:
zpool status
gpart show
gpart backup <provider>
camcontrol devlist
Take a look at the respective man pages before executing anything on your machine!
Have any comments? Send an email to the comments address.