Swapping Hard Drives for redundancy

Hobi · August 29, 2024, 9:15pm

I’m using UrBackup for a few computers.
Once every few days, I want to change the backup hard disk (drive D, which holds the URBACKUP folder)
so that everything also gets backed up to another hard disk for redundancy. If the connected hard drive fails or gets infected, I always have the other one (the one that’s not connected). It won’t be 100% up to date, but at least I’ll have most of the data. I tried just swapping the drives, but it was full of bugs.
Any solutions?

GilesP · August 29, 2024, 10:16pm

You could try multiple urbackup servers each with their own disk. Do a search on the forums.

Hobi · August 29, 2024, 11:34pm

Thank you, it’s a good idea, but it would be much easier with just different hard drives.
Not only because it avoids another computer and so on, but mostly because it’s more “protected”. Instead of having both hard drives running and “vulnerable” to ransomware or anything else, I always keep one hard drive with the data separated and disconnected for safety.
By the way, what I do now is this: once every few days, I disable the internet and CLONE the data disks, so I have a backup of a backup. It’s great and safe, but it takes a long time and is complicated.

Bearded_Blunder · August 30, 2024, 9:14pm

The only way you’ll make something like that work is to set up a RAID mirror, which you break from time to time to swap out the second drive, so you have the old mirror copy.

Which will work, but probably less reliably than setting up RAID for redundancy & just leaving it alone to be a redundant array.

GilesP · August 30, 2024, 9:45pm

If you alternated which machine you switch on, it would be functionally the same as your current setup. It is more effort to set up, though.

Hobi · September 8, 2024, 7:49am

Thanks, I thought about RAID 1, but it’s too unsafe for the data; it’s my last option if I don’t have a choice.
By the way, I tried swapping the HDDs once in a while (between hdd1 and hdd2, as I mentioned in the topic), but it “confuses” the system, probably because of the inconsistency between what the system is “supposed” to see and what it “really” sees.

Bearded_Blunder · September 9, 2024, 8:29pm

I thought about RAID 1, but it’s too unsafe for the data; it’s my last option if I don’t have a choice.

Depends what you spend on it. It’s perfectly possible to set up a 3-way mirror, reduce it to 2 drives for the swapping & go back to 3 to rebuild… It costs you an extra drive & either software RAID or a pricey controller.
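For anyone on Linux who wants to try the mirror-rotation idea with software RAID, here is a minimal sketch of the command sequence. It assumes mdadm is the RAID layer, which nobody in this thread has confirmed for their setup, and every device path is hypothetical. It only prints the commands so they can be reviewed before anything is run as root.

```python
import shlex


def rotation_commands(array, outgoing, incoming):
    """Build mdadm commands that detach one mirror member and attach another.

    All three arguments are hypothetical device paths supplied by the caller.
    """
    return [
        # Mark the outgoing member as failed so that it can be removed.
        ["mdadm", array, "--fail", outgoing],
        # Detach it from the array; it keeps the data as of this moment.
        ["mdadm", array, "--remove", outgoing],
        # Attach the drive coming back from storage; md then resyncs it.
        ["mdadm", array, "--add", incoming],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in rotation_commands("/dev/md0", "/dev/sdc1", "/dev/sdd1"):
        print(shlex.join(cmd))
```

The resync after the add step should be allowed to finish (progress shows up in /proc/mdstat) before the next rotation, otherwise the drive you pull will not hold a complete copy.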

Just swapping the drives isn’t going to work: the database of what’s stored won’t match what’s on the swapped-in drive.
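A toy illustration of that mismatch, not UrBackup's actual database schema: it assumes the server keeps its list of backups somewhere that stays put while the drives are swapped, and the backup names are made up.

```python
def compare(recorded, on_drive):
    """Return (missing, unknown) between database records and drive contents."""
    recorded, on_drive = set(recorded), set(on_drive)
    # Backups the database points at that are not on the connected drive.
    missing = sorted(recorded - on_drive)
    # Backups on the connected drive that the database has no record of.
    unknown = sorted(on_drive - recorded)
    return missing, unknown


if __name__ == "__main__":
    # The database was filled while drive A was connected...
    database = ["pc1/240901-2100", "pc1/240903-2100"]
    # ...but drive B, holding older backups, is now plugged in.
    drive_b = ["pc1/240825-2100", "pc1/240827-2100"]
    missing, unknown = compare(database, drive_b)
    print("recorded but not on drive:", missing)
    print("on drive but not recorded:", unknown)
```

Every incremental backup that depends on one of the "missing" entries would have nothing to build on, which is a plausible source of the confusion described above.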

About the only alternative I can think of would be either to set up a second server to periodically image the main one, or to use a different backup program to image or clone the entire server machine (OS & backup storage).

Dadr · September 16, 2024, 9:48pm

I’m in about the same position as Hobi: New to Urbackup and have it on a handful of home computers. I have an Ubuntu 22.04 system with a 6.8 kernel serving Urbackup. I’ve installed a separate backup drive formatted with btrfs. I’d like to have a pair of secondary drives where I replicate the data, and then have one off-line and the other off-site. It seems to me that there are several approaches that might make sense, and which I have questions about:

First the open-ended question - has anyone else done this? What did you do and why?
Bearded_Blunder mentioned making and breaking raid arrangements. It’s something that btrfs is supposed to be able to do, but I’m wary that making and breaking RAID with not just one but two other physical drives might find a few bugs in btrfs - and that it might not truly support this capability yet. I’m pretty sure that it was broken as recently as 2022. Does anyone have any experience with doing this on btrfs that they could share?
Sync the main backup drive to the two others. This might be as simple as rsync for Hobi, but with me using btrfs, it looks like something else might be necessary. I think it would be pretty safe to just dd the backup partition to one of the same size on each of the other drives. However, these are 14TB drives that are about 10TB full, and that’s a really long copy. To keep the copy consistent, urbackup would probably need to be offline for over a day. Not so good. So I think that scripting btrfs send/receive may be in order. I noticed a tool called btrfs-clone that looked pretty good, but it seems largely abandoned and would need work to get operational again. It also seems that btrbk might do this as well (in fact that seems to me like the best no-coding option right now). But I might be way overthinking this, and a really simple script to determine all the subvolumes and then send/receive them might be best overall. In these cases, I’m wondering if there are issues or requirements with syncing the btrfs subvolumes made by urbackup. The questions I have for these approaches include: Do the copies need to keep the same UUIDs? Is there a reasonable way to use the parent and clone flags on btrfs send so that reflinks remain in place across subvolume copies, or will they lose some level of deduplication? Has anyone done this?
GilesP mentioned using a second instance or second server of Urbackup to back up the first one. I think this gets recursive for me, since I want two other copies of the backup volume. Also, can Urbackup even back up the hierarchy of btrfs subvolumes on its backup volume? I suppose that one could use LVM underneath the backup drives, as described in the Developer Blog post:
How I backup LVM volumes on my Xen server | UrBackup Developer Blog
There is some question whether btrfs should be used over LVM or not.
Ask for a feature. Well, maybe, if one of the other approaches doesn’t work out. And maybe if it winds up being something that a lot of other people would like to see included. I’d like to not ask for a feature, as some of the other items in the roadmap look like things I’d rather see the developers work on. But it seems that backing up the backup repository and restoring it could be done much more intelligently if it were part of the urbackup system.
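To make the "really simple script" idea from the sync approach concrete, here is a rough sketch. It parses the output of `btrfs subvolume list` and builds one `btrfs send | btrfs receive` pair per subvolume, using the previous snapshot of the same client as the `-p` parent. Several things here are assumptions rather than facts about urbackup: that subvolumes are laid out as `<client>/<timestamp>`, that the names sort chronologically, that they are read-only (btrfs send requires that), and that the previous snapshot is a sensible parent. The mount points are hypothetical, and nothing is executed.

```python
import posixpath


def parse_subvolume_paths(listing):
    """Extract subvolume paths from `btrfs subvolume list` output."""
    paths = []
    for line in listing.splitlines():
        # Lines look like: "ID 257 gen 8 top level 5 path client1/240901-1200"
        _, sep, path = line.partition(" path ")
        if sep:
            paths.append(path.strip())
    return paths


def send_plan(paths):
    """Pair each snapshot with the previous snapshot of the same client."""
    plan = []
    last_seen = {}
    for path in sorted(paths):
        client = path.split("/", 1)[0]
        plan.append((path, last_seen.get(client)))
        last_seen[client] = path
    return plan


def pipeline(src_root, dst_root, path, parent):
    """Return the (send, receive) argument lists for one subvolume."""
    send = ["btrfs", "send"]
    if parent:
        # Incremental stream relative to a snapshot present on both sides.
        send += ["-p", posixpath.join(src_root, parent)]
    send.append(posixpath.join(src_root, path))
    # Receive into the matching client directory on the destination.
    recv = ["btrfs", "receive", posixpath.join(dst_root, posixpath.dirname(path))]
    return send, recv


if __name__ == "__main__":
    sample = (
        "ID 257 gen 8 top level 5 path client1/240901-1200\n"
        "ID 258 gen 9 top level 5 path client1/240902-1200\n"
        "ID 259 gen 9 top level 5 path client2/240901-1300\n"
    )
    for path, parent in send_plan(parse_subvolume_paths(sample)):
        send, recv = pipeline("/mnt/backup", "/mnt/copy", path, parent)
        print(" ".join(send), "|", " ".join(recv))
```

Whether reflinks between different clients survive such a copy is exactly the open question; this sketch only chains snapshots within one client and does not use `-c` clone sources at all.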

That was a pretty long-winded first post. Thanks for any clues you can help me with.

Dadr · November 6, 2024, 4:31pm

Ha, I guess my previous post was a tldr; I’ve tried some things since then and thought I would share the results.

First I tried to copy the entire drive / partition. I almost gave up on btrfs so that I could use the disk pool tools from ZFS as described in the urbackup manual. The similar operations for btrfs seem to destroy the data on the drive taken out of a pool. It did not seem like a solution.

I was hopeful that blocksync-fast would be able to make a copy and then only have to update blocks that changed. It worked well with my test setup, but when I tried it on the actual backup drive, the copy had a corrupted filesystem. So I then tried partclone.btrfs and I got the same results. It worked on small test drives, but cloning the actual backup drive wound up with a corrupted filesystem.

I pulled out a fair bit of my hair trying to figure out why these tools corrupted the filesystem. I took care not to mount a partition until I had assigned a new UUID with btrfstune -u. I even ran badblocks on the target drive - even though it was new - just to make sure that it was not a hardware problem. I never figured that out.
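In case it helps anyone repeat the experiment, the order of operations I was aiming for after a block-level clone can be sketched like this. The device and mount point are hypothetical, the mount table is passed in as text (for example the contents of /proc/mounts) so the check can be tried without touching real disks, and the commands are only returned, not run. Note that btrfstune asks for confirmation when changing the UUID, so this is not meant to run unattended.

```python
def is_mounted(device, mounts_text):
    """Return True if `device` is a mount source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def post_clone_commands(device, mountpoint, mounts_text):
    """Commands to run on a freshly cloned btrfs partition, in order."""
    if is_mounted(device, mounts_text):
        raise RuntimeError(f"{device} is mounted; unmount it first")
    return [
        # Verify the clone without modifying it.
        ["btrfs", "check", "--readonly", device],
        # Give the clone a new random filesystem UUID.
        ["btrfstune", "-u", device],
        # Only now mount it, read-only, to inspect the contents.
        ["mount", "-o", "ro", device, mountpoint],
    ]


if __name__ == "__main__":
    mounts = "/dev/sda1 / ext4 rw 0 0\n/dev/sdb1 /mnt/backup btrfs rw 0 0\n"
    for cmd in post_clone_commands("/dev/sdc1", "/mnt/verify", mounts):
        print(" ".join(cmd))
```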

So, even with the warnings about it being out of date, I tried btrfs-clone. That tool takes snapshots of the subvolumes and then uses btrfs send/receive to copy them along with figuring out choices for the parent option on btrfs-send. That worked for both the test drives and the actual backup drive. It also has the benefit of not having to stop the urbackup server to unmount the backup drive and make a copy. Currently btrfs-clone only makes a complete copy, and that takes more than a day to complete. Also the copy seems to get larger than the source was, so it might be missing reflinks. I may modify the tool (it’s python) to just copy new or changed subvolumes. And I would like to make sure that the tool is referencing appropriate subvolumes for the parent option in btrfs-send.
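The "only copy new or changed subvolumes" change could start from something like the sketch below. This is not how btrfs-clone works today; it is a guess at the selection logic. It assumes snapshots are named `<client>/<timestamp>` with names that sort chronologically, and it treats a destination subvolume with the same name as a faithful copy, whereas a real tool would compare received UUIDs instead.

```python
def incremental_plan(source, destination):
    """List (snapshot, parent) pairs for snapshots missing on the destination.

    The parent is the newest older snapshot of the same client that exists
    on both sides, or None when a full send is needed.
    """
    src = set(source)
    present = set(destination)
    plan = []
    for path in sorted(src):
        if path in present:
            continue  # already copied, nothing to send
        client = path.split("/", 1)[0]
        older = [
            p for p in present
            if p in src and p.split("/", 1)[0] == client and p < path
        ]
        plan.append((path, max(older) if older else None))
        # Once sent, this snapshot can serve as a parent for later ones.
        present.add(path)
    return plan


if __name__ == "__main__":
    on_source = ["c1/240901", "c1/240902", "c1/240903", "c2/240901"]
    on_copy = ["c1/240901", "c1/240902"]
    for snapshot, parent in incremental_plan(on_source, on_copy):
        print(snapshot, "parent:", parent)
```

A snapshot that urbackup has already cleaned up on the source cannot serve as a parent even if the copy still has it, which is why the candidates are restricted to names present on both drives.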

That leads to my question about the subvolumes that urbackup creates. Clearly, snapshot subvolumes are linked to their previous version - but are there reflinks across snapshot trees? For example, if I am backing up client1 and client2 then I expect client1 snapshots to be reflinked to earlier versions of client1 snapshots. But if both clients share the same file, will urbackup de-dupe it by reflinking across snapshots from both client1 and client2?

I would love to learn if there is an approach with btrfs send/receive that will make exact copies of the backup drive.

Thanks

GilesP · November 6, 2024, 8:14pm

Why do you need two copies of the entire version history?

I am not suggesting you use a second backup server to back up the first. The natural way to address redundancy in urbackup is to use two independent backup servers. Each client backs up to both servers independently. It is true that the two backup servers will have slightly different version histories depending on timing and the schedule. But there is redundancy - a version history is available for each image even if one server is unavailable. Even if you make an exact copy at one instant, it will soon be stale.

So my question is why do you need an exact copy of the backup server history?

Dadr · November 6, 2024, 9:31pm

Thank You GilesP - this is indeed something to think about. And forgive me, because now I can see that this is what you have already commented on in this thread, but I didn’t really get it until now.

It’s my intention to store a backup drive off site to recover from in case of disaster. I plan to keep it in the Safe Deposit Box at my bank, and swap it with the local one every month or so. I realize that a cloud service could host another instance of urbackup, and I think that would be really elegant, but such a service is a little pricey for me.

So I have a server with an internal drive performing backups. I also have the two additional external drives that I can connect with a USB dock. I intended to rotate those two every month or so and sync the primary backup drive to each of them. I thought that syncing was a better approach because most of my clients are not powered on most of the time. When they get used, they back up then. If the USB drive is also mostly offline, then just powering it up now and then will not give a good overlap in which backups could happen. That’s what drove me to seek a solution where I could clone the backup drive on demand.

But I think I get your point too - instead of trying to sync the data from the server’s internal drive to one of the external drives, I could use another instance of urbackup on my server (or even some other computer on my network) to perform a parallel backup to the drives that I rotate. Each drive would have its own instance of urbackup. It would duplicate LAN traffic and client activity, and I would have to leave the second drive online all the time, but I think that this could work.

Thanks again, I will look into how to set up multiple urbackup servers on the same LAN, and also it might be interesting to use qemu on the server so that when it sees the USB disk attached, it boots that instance of urbackup right off that drive. If I have to recover from a disaster, I only need to boot from that drive on some new system.