FreeBSD + ZFS (to dedup or not to dedup)

I’m running two UrBackup servers today, both on Linux with BTRFS. According to UrBackup, the storage usage is ~47 TB, but btrfs reports ~16.5 TB used. Both systems have 6 x 10 TB SATA disks for backup storage, one 250 GB SSD for the OS, and 48 GB of RAM.
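
For a rough sense of how much the sharing of identical data is saving here, the ratio between the two figures above can be worked out directly. This is a minimal sketch in Python using only the approximate numbers quoted in this post; the variable names are just for illustration.

```python
# Figures quoted above: what UrBackup reports vs. what btrfs reports as used.
logical_tb = 47.0    # storage usage according to UrBackup (approximate)
physical_tb = 16.5   # space actually used according to btrfs (approximate)

# Logical TB stored per physical TB consumed.
ratio = logical_tb / physical_tb
# Share of the logical data that did not need its own space on disk.
saved_fraction = 1 - physical_tb / logical_tb

print(f"effective ratio: {ratio:.2f}x")
print(f"space saved: {saved_fraction:.0%}")
```

Without that sharing, the same backups would need roughly the full ~47 TB, which is the gap any replacement setup has to cover.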

I want to move to FreeBSD + ZFS, since FreeBSD is the OS we usually run on our servers and we feel the ZFS tools are more mature than BTRFS, at least for now, and the future of BTRFS looked cloudy the last time I checked.

I know I won’t be able to just turn on deduplication, since the RAM usage would kill the server and adding enough RAM for dedup is unlikely. The nicest thing about dedup is that we can just do full file backups of every server instead of cherry-picking directories. Thanks to dedup, we’re not wasting space storing system files and libraries multiple times. Managing backups is so much easier this way: a server is either backed up or it isn’t. I’ve been in situations where it turned out that valuable data wasn’t backed up because someone forgot to add the directory to the backup set, or because data was moved to another directory and the backup set was not updated.
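
To put a rough number on the RAM concern, here is a back-of-the-envelope sketch in Python. It assumes about 320 bytes of RAM per unique block in the dedup table (DDT), which is a commonly quoted rule of thumb and not something measured on these servers, and it treats every block as full-sized. The function name `ddt_ram_gib` is made up for this example.

```python
# Rough estimate of RAM needed to keep the ZFS dedup table (DDT) in memory.
# ASSUMPTION: ~320 bytes of RAM per unique block, a commonly quoted rule of
# thumb, not a measured value. Real usage varies by pool and ZFS version.
BYTES_PER_DDT_ENTRY = 320


def ddt_ram_gib(unique_data_tb: float, avg_block_kib: float) -> float:
    """Return the estimated DDT size in GiB for the given unique data size."""
    unique_bytes = unique_data_tb * 1e12
    blocks = unique_bytes / (avg_block_kib * 1024)
    return blocks * BYTES_PER_DDT_ENTRY / 2**30


# 16.5 TB is the physical usage quoted earlier in this post.
for block_kib in (128, 1024):
    print(f"{block_kib:>5} KiB blocks: ~{ddt_ram_gib(16.5, block_kib):.1f} GiB")
```

Under these assumptions the table is on the order of tens of GiB at a 128 KiB average block size and about an eighth of that at 1 MiB, so the average block size matters a lot. Lots of small files would push the average down and the estimate up.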

Does anyone run Urbackup server on ZFS with either L2ARC on SSD and deduplication on, or without deduplication and can comment on it? Would compression give us most of the benefits of dedup?

Hi! I wouldn’t recommend it without dedup. Check “Copy-on-write file backups with ZFS” in the manual. AFAIK this is different from btrfs: without dedup, ZFS will store files many times.
We are using OpenZFS 2.0 (https://hup.hu/cikkek/20201202/openzfs_2_0) on Linux, on SSD, with dedup on and L2ARC. 40 cores, 128 GB RAM, ZIL and L2ARC on SSD, HDD storage.
IO speed is impressive, but the backup sometimes feels sluggish. We have not been able to figure out the reason yet.

Hi,
we’re using ZFS 0.8.3 with dedup=skein (much better performance than the default sha256), compression=lz4, and recordsize=1M on an AMD EPYC 7232P CPU with 128 GB RAM and a 500 GB NVMe SSD for L2ARC. We found no benefit from putting the ZIL on SSD, so we dropped it. The zpool layout is 4 stripes of mirrors with 4 TB SATA HDDs. Logical used = 25 TB, capacity = 35%.
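
To make the dataset settings above concrete, here is a small Python sketch that prints the matching `zfs set` commands. The dataset name `tank/urbackup` and the helper `zfs_set_commands` are hypothetical, and nothing is executed; the code only builds the command strings.

```python
# Dataset properties quoted in the post above.
PROPS = {
    "dedup": "skein",
    "compression": "lz4",
    "recordsize": "1M",
}


def zfs_set_commands(dataset: str, props: dict[str, str]) -> list[str]:
    """Build one `zfs set` command line per property (nothing is run)."""
    return [f"zfs set {name}={value} {dataset}" for name, value in props.items()]


# "tank/urbackup" is a made-up dataset name for illustration.
for cmd in zfs_set_commands("tank/urbackup", PROPS):
    print(cmd)
```

Note that these properties only apply to data written after they are set; existing blocks keep whatever settings they were written with.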

EDIT: Runs on Ubuntu 20.04.2 LTS

Cheers

Knut