Synthetic VHDs with RAW image backups on BTRFS/ZFS via clone copy


#1

I am a huge fan of what you have made with urBackup, especially this one line from the FAQ:

“The modified files are copied as fast as possible from the client. Only afterwards the server examines the files and puts them into the right directory.”

I was amazed to find, after months of searching, that no other consumer-priced backup solution cares about this! As you may tell from my previous thread, I am slowly deploying urBackup to my entire homelab and household. I have a dedicated Ubuntu VM running urBackup, backed by a BTRFS volume over iSCSI. Both file and image backups are set to incremental-forever, with no need for compacting synthetic full images because BTRFS is awesome. I adore this setup.

While doing this, I found what could be a solution for one of the few gripes I do have with urBackup’s setup here: the RAW images. I’m glad there is a bare-metal recovery option, but frankly, it is lacking; the recovery ISO is extremely limited compared to almost every other image backup solution. This isn’t really a problem for me, because once I get an image converted to a VHD, I can use the excellent Macrium restore disk to mount it and “clone” it to any target easily, start it in Hyper-V, or just mount it in diskpart to explore the volume. But alas, this beautiful setup using BTRFS for fast, simple incrementals forces me to use the RAW format, which requires the recovery ISO, which I have to boot in a VM on the same subnet as the server just to “restore” the backup to a VHD. But there is another way…

http://man7.org/linux/man-pages/man2/ioctl_ficlonerange.2.html

Clone copies! FICLONERANGE (the successor to BTRFS_IOC_CLONE_RANGE) lets you “copy” a range of data from one file to another, but really it just links the destination to the same underlying data blocks, pre-deduplicated (just like reflinks) and therefore copy-on-write. The result is that you can duplicate any range of data from one file into another both instantly and without consuming any extra space. So why not use this to assemble what I am calling “Synthetic VHDs”? The process as I see it would be:

[edit] My original post had the steps I’d worked out to rebuild a disk from the partition images, but of course you must have already done this in urbackuprestoreclient on the recovery CD. I don’t see the source for this utility in the repository, but I’d assume most of that code could be reused here. You’d have to use normal copies for the MBR and GPT data, since it is not block-aligned in the .raw.mbr file. Then you just replace the main partition-data copy call with a clone-copy ioctl:

    struct file_clone_range args;
    args.src_fd = raw_image_fd;         //file descriptor of the source RAW partition image
    args.src_offset = 524288;           //start of the partition data within the .raw file
    args.src_length = 0;                //0 means clone until EOF of the source
    args.dest_offset = partition_start; //byte offset where this partition starts on the disk image
    ioctl(vhd_fd, FICLONERANGE, &args); //called on the destination (the VHD being assembled)

Then you have to create the VHD footer. It looks like there is already a function for that too.[/edit]
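For reference, a fixed VHD’s footer is a single 512-byte structure with big-endian fields. Here is a minimal sketch based on the published VHD format spec; the `make_vhd_footer` name, the creator-app string, and the simplified large-disk CHS geometry are my own placeholder choices, not anything from urBackup:

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Write 32/64-bit values big-endian, as the VHD spec requires. */
static void put32(uint8_t *b, uint32_t v) { b[0] = v >> 24; b[1] = v >> 16; b[2] = v >> 8; b[3] = (uint8_t)v; }
static void put64(uint8_t *b, uint64_t v) { put32(b, (uint32_t)(v >> 32)); put32(b + 4, (uint32_t)v); }

/* Fill a 512-byte fixed-VHD footer for a disk of disk_size bytes
 * (disk_size assumed to be a multiple of 512). */
void make_vhd_footer(uint8_t footer[512], uint64_t disk_size)
{
    memset(footer, 0, 512);
    memcpy(footer, "conectix", 8);                  /* cookie */
    put32(footer + 8, 2);                           /* features: reserved bit always set */
    put32(footer + 12, 0x00010000);                 /* file format version 1.0 */
    put64(footer + 16, 0xFFFFFFFFFFFFFFFFULL);      /* data offset: none for fixed disks */
    put32(footer + 24, (uint32_t)(time(NULL) - 946684800)); /* seconds since 2000-01-01 UTC */
    memcpy(footer + 28, "ubck", 4);                 /* creator application (placeholder) */
    put32(footer + 32, 0x00010000);                 /* creator version */
    memcpy(footer + 36, "Wi2k", 4);                 /* creator host OS */
    put64(footer + 40, disk_size);                  /* original size */
    put64(footer + 48, disk_size);                  /* current size */
    /* CHS geometry: simplified large-disk branch of the spec's algorithm */
    uint64_t sectors = disk_size / 512;
    if (sectors > 65535ULL * 16 * 255)
        sectors = 65535ULL * 16 * 255;
    uint32_t cyls = (uint32_t)(sectors / (16 * 255));
    footer[56] = (uint8_t)(cyls >> 8);              /* cylinders (2 bytes) */
    footer[57] = (uint8_t)cyls;
    footer[58] = 16;                                /* heads */
    footer[59] = 255;                               /* sectors per track */
    put32(footer + 60, 2);                          /* disk type: 2 = fixed */
    /* checksum: one's complement of the byte sum with the checksum field zeroed */
    uint32_t sum = 0;
    for (int i = 0; i < 512; i++)
        sum += footer[i];
    put32(footer + 64, ~sum);
    /* footer + 68: 16-byte unique ID, left zero in this sketch */
}
```

Appending this footer to the cloned raw data should be all a fixed VHD needs.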

Once we have generated this VHD file, I just need to share it over Samba to unlock all the restoration options I’m used to from other backup solutions. Since this should in theory take only a few seconds and a few KB of disk space, urBackup could do it for every RAW image backup, with no need for new buttons or anything on the interface. (Maybe a “RAW + Synthetic VHD” option for the image backup format?)

I’ve spent the last couple days building up a way to test this. I have minimal Linux experience and never interfaced with the kernel like this, so I’m starting slow. For now, I’m trying to reverse engineer the urBackup .raw files to test it with normal copies, since the source code is, ahem, not terribly readable for a newbie like me. Once that’s done, I’ll make a wrapper for the ioctl call and try building a few VHDs manually.
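For what it’s worth, the wrapper I have in mind is only a few lines. This is a sketch; `clone_range` and the argument names are mine, and it assumes both files live on the same reflink-capable filesystem (BTRFS here, XFS would also work):

```c
/* Sketch: reflink a byte range from one file into another via FICLONERANGE.
 * Offsets and length must be aligned to the filesystem block size. */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/* Returns 0 on success, -1 with errno set on failure
 * (e.g. EOPNOTSUPP on filesystems without reflink support). */
int clone_range(int src_fd, __u64 src_offset, __u64 src_length,
                int dest_fd, __u64 dest_offset)
{
    struct file_clone_range args;
    memset(&args, 0, sizeof(args));
    args.src_fd = src_fd;           /* source RAW partition image */
    args.src_offset = src_offset;   /* e.g. 524288, past the image header */
    args.src_length = src_length;   /* 0 clones to EOF of the source */
    args.dest_offset = dest_offset; /* where the partition starts in the VHD */
    return ioctl(dest_fd, FICLONERANGE, &args);
}
```

On success the destination range immediately shares the source’s extents, so the “copy” finishes in roughly constant time and costs only metadata.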

Even then, all I can really do in the end is confirm this creates working VHD files within the expected time and storage consumption. I don’t see any way to implement this in the source myself at this point, urBackup is far more complex than any projects I’ve worked on.

So Martin, if this intrigues you, please let me know how I can help implement it with my limited abilities. This feature would mean so much to me.


#2

VHDs have a small bitmap every 2MB (by default), so it won’t work. But I think it might work with the VHDX format. Have you seen this? https://urbackup.atlassian.net/wiki/spaces/US/pages/78384925/Assemble+zero-copy+disk+image+on+Linux


#3

urbackuprestoreclient supports the command line?? Nice, I’ll have to look into this further. I’m guessing --restore-mbr doesn’t do anything with the backup GPT at the end of the disk? Is the source for urbackuprestoreclient available?

Yes, I think that link is after the same goal. It looks like you would need to keep the loop devices open, though, so the result would only be temporary. The ioctl clone approach keeps the result entirely within the filesystem.

VHD or VHDX would both work. I found this raw2vhd tool somewhere on the internet years ago, including the source. It only appends a single-sector footer, and the result is accepted by diskpart and Hyper-V. If I’m reading the VHD spec right, the sector bitmaps only exist in dynamic and differencing images, so a fixed-size VHD really is just the raw data plus that footer.


#4

Success!

I created a wrapper for the BTRFS_IOC_CLONE_RANGE ioctl and manually rebuilt a VHD from RAW images using it. It took about 5 seconds to clone my 500GB image, and the filesystem only shows a space usage difference of 0.03GiB. I shared the result over Samba and it is mountable with diskpart and can be copy-converted to a new dynamic VHD/VHDX using the Hyper-V virtual disk tool. These are the two primary functions I need.
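For anyone who wants to double-check that the blocks really are shared rather than copied, the FIEMAP ioctl reports per-extent flags. A sketch (`count_shared_extents` is my own name; the 64-extent buffer is an arbitrary cap, and a real tool would loop until the whole file is covered):

```c
/* Sketch: count how many of a file's extents are marked shared (reflinked). */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>   /* struct fiemap, FIEMAP_EXTENT_SHARED */

/* Returns the number of shared extents among the first 64,
 * or -1 if FIEMAP is unsupported on this filesystem. */
int count_shared_extents(int fd)
{
    size_t sz = sizeof(struct fiemap) + 64 * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    if (fm == NULL)
        return -1;
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;            /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush pending writes first */
    fm->fm_extent_count = 64;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        return -1;
    }
    int shared = 0;
    for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
        if (fm->fm_extents[i].fe_flags & FIEMAP_EXTENT_SHARED)
            shared++;
    free(fm);
    return shared;
}
```

Running this against the cloned VHD should show its data extents flagged as shared with the RAW image, which matches the near-zero space usage I’m seeing.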

Next step is to automate it and, with your help, integrate the feature into the urBackup server. I am now porting the raw2vhd tool to Linux.