On my main backup server I have a btrfs file system and btrfs functionality is enabled.
Most of my backups are incremental, so there are only small changes between them.
Now I want to mirror this offsite to a remote host (which doesn’t have btrfs) using rsync.
Does this now result in each incremental backup actually using and getting transferred as a full backup?
Is there a way around this? Do I have to turn off the btrfs functionality? Is there a better way to deal with this?
If the offsite host had btrfs you could use e.g. buttersink; if it does not, the only other option is to use some block-level functionality, e.g. put btrfs on LVM and then use lvmsync.
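If both ends did run btrfs, the transfer could be a plain btrfs send piped over ssh. A minimal sketch, where the snapshot path and the host name `offsite` are placeholders for your own layout (printed as a dry run here, since the real commands need root and a btrfs mount on both sides):

```shell
# Hypothetical paths/host; adjust to your setup.
SNAP=/mnt/backup/ro/daily-001   # read-only snapshot to mirror
DEST=offsite                    # remote host with a btrfs target

# The actual transfer would be:
#   btrfs send "$SNAP" | ssh "$DEST" btrfs receive /mnt/backup/ro
# Print it as a dry run:
printf 'btrfs send %s | ssh %s btrfs receive /mnt/backup/ro\n' "$SNAP" "$DEST"
```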
Thanks for the ideas. Right now I’m leaning towards either switching the offsite server over to btrfs and then using buttersink, or maybe just using a file as a loopback device on it with btrfs. Minio sounds interesting as well, but in the end it’s probably more complicated than really needed.
Adding a second server makes it more complicated to keep the configs in sync (like which clients and which directories to back up), plus it puts more load on the network between client and internet, whereas in the other case I can better control when the sync between the backup server and the offsite host happens.
Ok, converted the destination computer over to btrfs (went a lot smoother than expected).
Now if I understood it right, buttersink only mirrors read-only subvolumes, while urbackup creates read-write subvolumes. Do I need to create a read-only snapshot after each backup of each subvolume? Is anyone already doing this somehow? Is there something to hook into? Especially figuring out the correct parents looks a bit complicated to me.
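Since btrfs send (and therefore buttersink) needs a read-only source, one option would be to create a read-only twin of each backup subvolume after the backup finishes. A sketch, where the paths and naming scheme are my assumptions, not urbackup defaults (dry run only; the real command needs root on the btrfs mount):

```shell
# Hypothetical paths; adapt to your urbackup storage layout.
VOL=/mnt/backup/urbackup/client1/240101-0300   # read-write backup subvolume
RO=/mnt/backup/ro/client1-240101-0300          # read-only twin for sending

# The actual command would be:
#   btrfs subvolume snapshot -r "$VOL" "$RO"
printf 'btrfs subvolume snapshot -r %s %s\n' "$VOL" "$RO"
```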
Ok, now a step further: I failed with buttersink (it seems to somehow pick the wrong parents) and am now manually doing btrfs send/receive. It works, but I think I found one flaw in my thinking.
For incremental updates this works fine as there’s a parent/child relationship between the subvolumes.
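The incremental case can be sketched like this; the snapshot names and the host `offsite` are placeholders (dry run only), and the parent snapshot must already exist on both sides:

```shell
# Hypothetical snapshot paths; PARENT must already exist on the remote side.
PARENT=/mnt/backup/ro/client1-240101   # common ancestor on both hosts
CHILD=/mnt/backup/ro/client1-240102    # new snapshot to transfer

# Only the blocks that differ from PARENT go over the wire:
#   btrfs send -p "$PARENT" "$CHILD" | ssh offsite btrfs receive /mnt/backup/ro
printf 'btrfs send -p %s %s | ssh offsite btrfs receive /mnt/backup/ro\n' \
  "$PARENT" "$CHILD"
```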
But I think what’s missing is the “normal” deduplication. If I understood it correctly, if the same file is present on multiple clients or didn’t change between full backups, it’s just reflinked. As there’s no relationship between full-backup subvolumes, I can’t send them as parent/child, so the reflink is broken and each received subvolume contains the full file.
Is this correct or did I miss something here? Any idea how to solve this?
You either do offline dedup afterwards using bedup, duperemove or bees, or you list all the existing subvolumes as clone sources (with -c) in the btrfs send. I don’t know what that does to performance if you list hundreds of subvolumes there (plus at some point there is presumably a limit).
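The two options could look roughly like this; the paths are placeholders and the commands are printed as a dry run (both need root and a btrfs mount):

```shell
# Hypothetical destination pool holding the received snapshots.
DEST_POOL=/mnt/backup/ro

# Option 1: offline dedup on the destination after receiving, e.g.:
#   duperemove -rd --hashfile=/var/tmp/dupehash "$DEST_POOL"
# Option 2: name a sibling subvolume as a clone source during send, e.g.:
#   btrfs send -c "$DEST_POOL/client2-240101" "$DEST_POOL/client1-240101" \
#     | ssh offsite btrfs receive "$DEST_POOL"
# Dry run of option 1:
printf 'duperemove -rd --hashfile=/var/tmp/dupehash %s\n' "$DEST_POOL"
```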
I tried -c, but it fails with “did not find parent” as the subvolume does not have a parent. Dedup might work, but I’m not sure whether the next transfer can be based on a volume that was deduped.
Looking at another approach: if I set up a urbackup server on my remote host and set up the local urbackup server as a client, would the client send the same file twice, or would it already detect those identical files?
Another thought: is there any downside to switching to only doing incremental backups? As far as I understood it, with the btrfs setup each incremental backup is basically a full backup (it contains everything, as it is a snapshot of the last incremental backup, which is a snapshot of the last incremental … which is a snapshot of the last full backup).
So if I switched to only doing incremental backups, each backup/snapshot would have an existing parent, and the duplication would be eliminated.
I am trying to accomplish the same task: btrfs to btrfs.
I will try to synchronize two backup servers using this script. I hope this script will make my life easier.