How to replace a failed hard drive in a GELI encrypted ZFS root mirror installation on FreeBSD

Published on 2022-10-18

In this tutorial I am going to show you how to replace a broken hard drive in a FreeBSD GELI encrypted ZFS root mirror installation. I'll assume that you have installed FreeBSD using the FreeBSD installer and chose encryption on a guided (auto) ZFS installation. The installer partitions the two hard drives with an identical partition scheme, in which one of the partitions is fully encrypted using GELI. This is a great setup, but it requires a little knowledge of the internals to replace one of the drives.

When you run FreeBSD you should study gpart, the FreeBSD control utility for the disk partitioning GEOM class, and become really familiar with it. It's a really great tool - not to be confused with another tool, also called Gpart.

For this tutorial I am using image files and a virtual setup, but it is identical to a bare metal setup, with the only exception being the device names for the hard drives. I have two identical disks set up for an old-style BIOS (non-EFI) boot, with a partition scheme that looks like this:

# gpart show
=>    40  104857520  vtbd0  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064        984         - free -  (492K)
       2048    4194304      2  freebsd-swap  (2.0G)
    4196352  100659200      3  freebsd-zfs  (48G)
  104855552       2008         - free -  (1.0M)

=>    40  104857520  vtbd1  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064        984         - free -  (492K)
       2048    4194304      2  freebsd-swap  (2.0G)
    4196352  100659200      3  freebsd-zfs  (48G)
  104855552       2008         - free -  (1.0M)

If you are using EFI, it will look something like this instead (with an extra partition):

# gpart show
=>     40  2000409184  vtbd0 GPT  (954G)
          40      409600     1  efi  (200M)
      409640        1024     2  freebsd-boot  (512K)
      410664         984        - free -  (492K)
      411648    67108864     3  freebsd-swap  (32G)
    67520512  1932888064     4  freebsd-zfs  (922G)
  2000408576         648        - free -  (324K)

=>     40  2000409184  vtbd1 GPT  (954G)
          40      409600     1  efi  (200M)
      409640        1024     2  freebsd-boot  (512K)
      410664         984        - free -  (492K)
      411648    67108864     3  freebsd-swap  (32G)
    67520512  1932888064     4  freebsd-zfs  (922G)
  2000408576         648        - free -  (324K)

If we take a look at the mount points, it looks like this:

# mount
zroot/ROOT/default on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/usr/src on /usr/src (zfs, local, noatime, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)

INFO: For simplicity I have used the disk device names in the setup below. I highly recommend that you always use disk IDs for ZFS, not device names or disk UUIDs.
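
If you do want to use device-independent names, FreeBSD exposes them under /dev/diskid/ and /dev/gptid/, depending on which GEOM label options are enabled in /boot/loader.conf. One way to see which identifiers your system offers (one or both directories may be missing, e.g. virtual disks without a serial number get no diskid entry) is:

# glabel status
# ls /dev/diskid/ /dev/gptid/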

So, let's simulate that we have a failed hard drive. Checking the state of the ZFS pool will reveal that one of the drives in the mirror is unavailable:

# zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
config:

        NAME             STATE     READ WRITE CKSUM
        zroot            DEGRADED     0     0     0
          mirror-0       DEGRADED     0     0     0
            vtbd0p3.eli  ONLINE       0     0     0
            vtbd1p3.eli  UNAVAIL      0     0     0  cannot open

errors: No known data errors

And only one disk now shows up in gpart:

# gpart show
=>    40  104857520  vtbd0  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064        984         - free -  (492K)
       2048    4194304      2  freebsd-swap  (2.0G)
    4196352  100659200      3  freebsd-zfs  (48G)
  104855552       2008         - free -  (1.0M)

We cannot simply add a new disk without creating a GELI encrypted partition on it first, which means that we need to replicate the partition scheme that the installer used.

These are the steps we'll follow to restore the mirror with a new disk:

  1. Install a replacement disk.
  2. Boot the system.
  3. Create the same partition layout on the new disk; we can use gpart for that.
  4. If you're using an EFI setup, clone the EFI partition using dd.
  5. Copy the boot code from the working disk to the new disk; we can also use gpart for that.
  6. Initialize the relevant ZFS partition with geli, choosing the same passphrase that the original disk was set up with.
  7. Attach the GELI partition and check the GELI status.
  8. Then do a ZFS replacement of the *.eli disk partition.
  9. Let ZFS resilver the mirror.

This may sound complicated and time-consuming, but don't worry, it really isn't. FreeBSD has some really great tools for this, and it only takes a minute or two.

So, let's get to it.

I'll assume that you have attached the new disk and removed the old defective disk. So now we need to create a backup of the partition table from the working disk. In my case the working disk is vtbd0:

# gpart backup vtbd0 > vtbd0.backup
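
The backup is just a small plain-text description of the partition table, so if you are curious you can inspect it before restoring it (the exact contents depend on your layout):

# cat vtbd0.backup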

We can then use that partition table backup to make an identical partition table on the new disk:

# gpart restore -l vtbd1 < vtbd0.backup

Notice that the backup was made from the working disk "vtbd0" and transferred to the new disk "vtbd1". In your setup that might be "ada0" and "ada1", or something similar.
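
If you are not sure which device name the new disk received, you can list the disks the kernel sees. On my virtio setup geom is the one that shows them; on a typical SATA/SAS machine camcontrol will also list them. Both commands only read information and change nothing:

# geom disk list
# camcontrol devlist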

Now the disks have identical partition tables:

# gpart show
=>    40  104857520  vtbd0  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064        984         - free -  (492K)
       2048    4194304      2  freebsd-swap  (2.0G)
    4196352  100659200      3  freebsd-zfs  (48G)
  104855552       2008         - free -  (1.0M)

=>    40  104857520  vtbd1  GPT  (50G)
         40       1024      1  freebsd-boot  (512K)
       1064        984         - free -  (492K)
       2048    4194304      2  freebsd-swap  (2.0G)
    4196352  100659200      3  freebsd-zfs  (48G)
  104855552       2008         - free -  (1.0M)

If you're using EFI, you will also have an efi partition of about 200M. You need to copy the EFI partition using dd, but ONLY if you're using EFI; otherwise you can simply skip this step. Pay attention to the partition number in your setup. In this example I have marked the partition number as X, but most likely the EFI partition will be the first one.

# dd if=/dev/vtbd0pX of=/dev/vtbd1pX bs=1M
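
Once the copy has finished, you can optionally mount the new EFI system partition read-only and check that the loader files made it over. The mount point /mnt and the partition number 1 are assumptions based on the example layout above, and the exact files present depend on your FreeBSD version:

# mount -t msdosfs -o ro /dev/vtbd1p1 /mnt
# ls /mnt/efi/boot
# umount /mnt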

Next, we need to copy the boot code to the new boot partition on the new drive:

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd1
partcode written to vtbd1p1
bootcode written to vtbd1

The -b option writes the protective MBR boot code, /boot/pmbr, to the disk. The -p option specifies the partition bootstrap code, in this case /boot/gptzfsboot, and the -i option specifies the target partition for that partcode, which here is partition 1, the freebsd-boot partition.
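
If you are on the EFI layout shown earlier, which also has a freebsd-boot partition, that partition is index 2 rather than 1, so the -i argument changes accordingly. Check your own gpart show output before running it, for example:

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 vtbd1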

Then we need to initialize GELI (this is like using cryptsetup luksFormat /dev/sda on Linux). Make sure you choose the correct partition; you can look for it with ls:

# ls /dev/vtb*
/dev/vtbd0              /dev/vtbd0p3            /dev/vtbd1p1
/dev/vtbd0p1            /dev/vtbd0p3.eli        /dev/vtbd1p2
/dev/vtbd0p2            /dev/vtbd1              /dev/vtbd1p3

This reveals that on the working disk it is the third partition, vtbd0p3, that is GELI encrypted (it shows up attached as vtbd0p3.eli), so we need the third partition on the new drive as well.

So we need to initialize GELI on the third partition on the new drive (make sure you use the same passphrase as for the first drive):

# geli init -g /dev/vtbd1p3
Enter new passphrase:
Reenter new passphrase:

WARNING: Please note the -g option! The -g option enables booting from this encrypted root filesystem. The boot loader prompts for the passphrase and loads loader(8) from the encrypted partition. Without the -g option, you cannot boot from the drive. Should you forget the -g option, you can fix it afterwards with the command: geli configure -g /dev/vtbd1p3.

Now we need to attach the encrypted device (this would be like using cryptsetup luksOpen /dev/sda cryptdisk on Linux):

# geli attach /dev/vtbd1p3
Enter passphrase:

Let's check the GELI status:

# geli status
       Name  Status  Components
vtbd0p3.eli  ACTIVE  vtbd0p3
vtbd1p3.eli  ACTIVE  vtbd1p3
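
If you want to double-check that the boot flag from the -g option really got set, you can inspect the attached provider. The exact output varies between FreeBSD versions, but the Flags line should mention BOOT:

# geli list vtbd1p3.eli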

We can now replace the defective disk in the ZFS pool. ZFS knows which disk to replace:

# zpool replace zroot vtbd1p3.eli vtbd1p3.eli

Because the failed provider and its replacement end up with the same device name, that name appears twice in the command: the first vtbd1p3.eli refers to the unavailable device in the pool, and the second to the newly created GELI provider. ZFS understands that the offline disk is to be replaced with the new one of the same name.
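
As a side note, if you followed the advice above and built the pool on disk IDs rather than device names, the old provider may no longer be resolvable by name. In that case you can refer to it by its vdev GUID instead, which zpool status -g will show on a reasonably recent FreeBSD (the GUID below is just a placeholder):

# zpool status -g zroot
# zpool replace zroot 1234567890123456789 vtbd1p3.eli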

ZFS will begin resilvering immediately, and in my case, since this is a simple virtual setup, it's done in a matter of seconds:

# zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 2.08G in 00:00:13 with 0 errors on Fri Oct 14 23:45:17 2022
config:

        NAME             STATE     READ WRITE CKSUM
        zroot            ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            vtbd0p3.eli  ONLINE       0     0     0
            vtbd1p3.eli  ONLINE       0     0     0

errors: No known data errors

The pool has been restored and is working again.

That's basically it! Have a good one!