ZFS Copy-on-write raw image backups confusion

Hi. The UrBackup documentation mentions some steps for raw, incremental-forever style images using ZFS.

This involves creating a separate dataset with ZFS and adjusting the UrBackup config to reflect this.
I’ve done this, but I’m not quite sure what the benefit is.
Is less space being used this way? E.g. like incremental snapshots on btrfs?
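Concretely, what I did boils down to something like this (a hedged sketch based on my reading of the docs; the pool and dataset names are just my examples):

  # dedicated dataset for the raw image backups
  zfs create tank/urbackup_images
  # tell UrBackup which dataset to use for raw copy-on-write images
  echo "tank/urbackup_images" > /etc/urbackup/dataset
  # backup storage path as usual
  echo "/mnt/backups" > /etc/urbackup/backupfolder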

Secondly, I’ve noticed that with this approach, all image snapshots are constantly mounted on the backup server. This creates quite a big and unwieldy list in the output of ‘df -h’, for example.

Is this the desired behaviour? Do the image snapshots need to be ‘mounted’ and visible all the time, or is this simply ZFS default behaviour which can’t be circumvented?

By the way, when using ZFS, is there any kind of duplicate detection performed by UrBackup? Like if you back up 5 very similar clients? Will that result in 5 completely individual backups, or will they share identical data, resulting in less used storage space?


The real benefit of using ZFS or btrfs as the storage backend for UrBackup is the forever-incremental backup. If you do not use ZFS or btrfs, your incremental backups must be “chained” together. Chaining the incrementals means each subsequent incremental requires the previous incremental to be valid, unless you set the incremental’s base to the last full backup.

Example:

Full backup on Monday, incremental on Tuesday, incremental on Wednesday, incremental on Thursday. In order to restore a full machine backup from Thursday, the incrementals from Wednesday and Tuesday must be available and valid, as well as the full from Monday. If any of these are not available or become corrupt, you will not be able to perform a restore.

You can set UrBackup to base incrementals off of the last full backup. This is a differential backup. In this case you would need only that one incremental and the full backup to do a restore.

How does ZFS help? ZFS is a copy-on-write file system. As data gets written to the filesystem, only the data in a file that is actually different gets written as new blocks. Any blocks that are unchanged between the original file and the updated file are shared rather than duplicated. This saves space.

ZFS COW also provides the mechanism whereby incremental backups do not require any previous incremental to be available in order to be valid.
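Conceptually, each raw image backup ends up as its own dataset cloned from a snapshot of the previous one, so only changed blocks consume new space, yet every dataset is a complete image on its own. A rough sketch of the mechanism (not UrBackup’s exact commands; the dataset and snapshot names are made up):

  # snapshot the finished backup, then clone it as the base for the next one
  zfs snapshot tank/urbackup_images/client1/230101-2200@complete
  zfs clone tank/urbackup_images/client1/230101-2200@complete \
    tank/urbackup_images/client1/230102-2200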

ZFS also offers features like compression and dedup. Compression will save space as data is written to disk, and dedup will help reduce overall space if there are common blocks between separate files. Dedup is very expensive in regards to CPU and RAM, though. Compression is almost free.
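Both are one-liners to enable if you want to experiment (the dataset name is a placeholder; whether dedup pays off depends entirely on your data and RAM):

  zfs set compression=lz4 tank/urbackup_images  # cheap, usually a net win
  zfs set dedup=on tank/urbackup_images         # the dedup table must fit in RAM to stay fast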

Another great feature of ZFS and btrfs is replication. It is very easy under either filesystem to replicate your data block by block to remote storage. This can provide an off site backup of your backups without requiring the clients to communicate with 2 backup servers.
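A minimal send/receive sketch, assuming a second box reachable over SSH (host, pool, and snapshot names are examples):

  # initial full replication
  zfs snapshot -r tank/urbackup_images@rep1
  zfs send -R tank/urbackup_images@rep1 | ssh offsite zfs receive -F backup/urbackup_images
  # later runs only send the blocks changed since the previous snapshot
  zfs snapshot -r tank/urbackup_images@rep2
  zfs send -R -I @rep1 tank/urbackup_images@rep2 | ssh offsite zfs receive backup/urbackup_images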

UrBackup saves individual backups as separate ZFS datasets. UrBackup needs those datasets to be mounted in order to see the data when performing backups and restores. So yes, they need to be mounted. You can set the datasets to unmount, but UrBackup will not function properly.


Wouldn’t it be more elegant if UrBackup only mounted those datasets as needed? E.g. only when it’s actually performing a backup?

I would prefer that UrBackup have as little interaction with the filesystem as possible. Mounting and unmounting seems like it could introduce problems. But that is just me.

When looking at the output of df -h you could just use grep -v -e and exclude anything that matches the base dataset that UrBackup uses. This should clean up the output.
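Something along these lines should do it (the base dataset name is an example):

  # hide everything under the UrBackup base dataset
  df -h | grep -v -e 'tank/urbackup_images'
  # or skip df and ask ZFS directly
  zfs list -r -o name,used,avail tank/urbackup_images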

  • Edit: Adding link to ZFS-JAIL(8)

I’ve set up UrBackup on TrueNAS Core using ZFS copy-on-write raw images, mainly to take advantage of the “forever incremental backup”, following the instructions given here. FWIW, in contrast to the instructions, I had to put the dataset, dataset_file and backupfolder configuration files in /usr/local/etc/urbackup instead of /etc/urbackup. I’ve been using 12.3-RELEASE for the jail.

My biggest gripe is with mounting image backups on a client. Using VHD-based image backups, I could simply share the folder with the backups using SMB via TrueNAS and mount them using the Disk Management app.

The instructions linked above mention that the raw images can be mounted using iSCSI. However, this probably works only if the iSCSI target is created from inside the jail since, as I understand it, once a dataset is jail-controlled it cannot be mounted (or shared) outside of the jail. This in turn means I can’t use the sharing configuration UI of TrueNAS, and the iSCSI extent configuration doesn’t let me select the jailed dataset. See zfs-jail(8) in the OpenZFS documentation.

Mounting image backups (read-only, of course) on clients is extremely convenient. Not sure how to resolve this problem. It would be great if UrBackup could provide this as an option for clients. For now, I’ll either go back to VHD images or try to mount a non-jail-controlled dataset (which can be shared) into the jail and simply copy the raw image there from inside the jail using a shell.

Edit:

  • Formatting
  • Clarification what to mount as iSCSI extent

OK, I found a way to mount a raw ZFS COW image backup using only the TrueNAS GUI (a CLI equivalent is sketched after the list):

  1. Select the backup dataset containing the image to be mounted in TrueNAS>Storage>Pools. Fortunately, UrBackup creates individual datasets for each backup.
  2. Click “Create Snapshot” from the … pop-up menu.
  3. In TrueNAS>Storage>Snapshots select the created snapshot, unfold, and click “Clone to new dataset”. (For the location of the new dataset, I didn’t use the default path but instead selected something outside of the jail, putting it under a root dataset).
  4. Add the *.raw file in the cloned dataset as a file extent to the configured iSCSI target.
  5. Now I was able to mount the backup image as a Windows drive letter using the Windows iSCSI Initiator app.
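For reference, steps 1-3 should be roughly equivalent to the following shell commands (all dataset names are examples from my setup):

  # snapshot the backup dataset, then clone it to a path outside the jail's subtree
  zfs snapshot tank/urbackup/images/client1/230101-2200@export
  zfs clone tank/urbackup/images/client1/230101-2200@export tank/restore_clone
  # the *.raw file under /mnt/tank/restore_clone is what goes into the iSCSI file extent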

This procedure looks rather complex at first glance, but it took only a couple of minutes once figured out, even for a wannabe SOHO admin like myself. Not a recommended procedure for normal users like SOs or kids, though.

In principle, I think this approach is rather clean because a copy of the image is used for mounting. Thanks to ZFS COW mechanics, this copy took essentially zero additional space and time: the image was about 1.4 TB, yet snapshotting and cloning completed in under a second.
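To double-check that the clone really shares its blocks with the original backup, something like this can be used (dataset name again an example):

  zfs list -o name,used,refer,origin tank/restore_clone
  # "used" stays near zero as long as nothing in the clone is overwritten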