Advice needed on backup config

BatterPudding · May 23, 2021, 1:52pm

I have a simple, dumb Windows 10 Pro PC with multiple hard discs. No one uses it as a desktop; it is basically a file server \ media server.

C: - 160GB just Windows and Apps
D: - 500GB docs and home business data
E:, F:, G: - 4TB and 6TB drives full of music and videos

Drive D changes often. Usually daily.
Drives E, F, G rarely change.

Ideally I would have daily file backups of D with multiple rollback copies.
And monthly images of C, E, F and G as these don’t change often.

BUT I now find I can’t do this due to 2TB size limits on making image backups.

As I have to use File backups for E, F, G can I change the timings of the sets? i.e. D files hourly, E,F,G files monthly?

Currently the answer I think I see is to do Daily image backups of CD and Monthly file backups of EFG.

How would YOU backup this PC?

uroni · May 24, 2021, 6:49pm

You can create “virtual clients” (see manual) to backup different sets of files/images at different intervals.

BatterPudding · May 25, 2021, 9:12am

Thanks. I like the idea of that. I’ll go investigate. I can then make a “Fast lane” and “Slow lane”.

It has taken 3 days to do the first file backup of those three large drives so I need to reduce that to a much rarer event. I realise the incremental will be quicker, but still no need to check those drives that often as things change so rarely.

Don_Wright · May 26, 2021, 5:29am

The reality of UrBackup is - you don’t need to make any changes. The first backup takes longer because effectively all files have changed from their previous state of “not backed up.” The next file backup will know which files have changed and only send those files to your backup server. If all that changed was the metadata (directory information) that shows the file was read or even moved to a different location, then only that metadata is sent, not a fresh copy of the file. This is because the UrBackup Client program on your Windows 10 PC keeps track of what files have been sent for backup already. In an “incremental” file backup, the Client only offers the Server the files it knows have changed among the list of selected directories and files. In a “full” file backup, the only difference is the Client offers all the selected files and directories once again. It does this by sending only the metadata, and the Server program then requests the specific files that are not already backed up. This can correct the rare instance of the Client losing track of some files that actually changed.

There are other optimizations and efficiencies, but that’s the general idea. UrBackup Client and Server work together to send as little traffic across the network as possible, unlike other backup programs that send copy after copy of files that never change. This saves time and resources on both ends.
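
To make the incremental vs. full distinction concrete, here is a toy sketch in Python. It is not UrBackup's real protocol: the function names, the (size, mtime) change check, and the sample paths are all assumptions made up for illustration.

```python
# Toy model only: names and data structures are illustrative assumptions,
# not UrBackup's actual client/server protocol.

def incremental_offer(client_files, last_sent):
    """Client offers only files whose (size, mtime) differs from what it last sent."""
    return {path: meta for path, meta in client_files.items()
            if last_sent.get(path) != meta}

def full_offer(client_files):
    """Client offers metadata for every selected file."""
    return dict(client_files)

def server_requests(offer, server_index):
    """Server asks for file content only where its stored metadata does not match."""
    return sorted(path for path, meta in offer.items()
                  if server_index.get(path) != meta)

# Hypothetical file lists: path -> (size in bytes, modification time)
client_files = {
    "D:/docs/a.txt": (100, 1621700000),
    "D:/docs/b.txt": (250, 1621800000),        # changed since the last backup
    "E:/music/c.flac": (30_000_000, 1500000000),
}
last_sent = {
    "D:/docs/a.txt": (100, 1621700000),
    "D:/docs/b.txt": (200, 1621700000),
    "E:/music/c.flac": (30_000_000, 1500000000),
}
server_index = dict(last_sent)

inc = incremental_offer(client_files, last_sent)
full = full_offer(client_files)
print(sorted(inc))                           # incremental: only the changed file is offered
print(len(full))                             # full: every file's metadata is offered
print(server_requests(full, server_index))   # but only the changed file is transferred
```

In both cases only the changed file's content would cross the network; the full backup just costs more metadata checking.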

As Uroni said, UrBackup also offers many ways to handle difficult backup situations, but simple settings using mostly defaults work well for most people. Try it for a while and see how long the second, third, and later backups take. I think you’ll be pleased.

BatterPudding · May 26, 2021, 9:45am

I’m a patient man. It took four days just to zero the UNRAID server. So three days on initial backup isn’t too scary. I’ll watch how the ongoing tests then behave. What you say all makes sense to me. Trouble is I guess when I start making “virtual clients” I have to be careful to keep the main data with this original client to avoid repeating the 3 days.

As this data changes so little, I expect the next backup to complete in minutes.

Don_Wright · May 29, 2021, 7:56am

Good news! The UrBackup file deduplication feature works with any file from any source. If a file is common to several clients, real or virtual, only one copy is saved in the “pile of files” and only metadata bookkeeping entries are needed for each backup. This leads to a common technique for pre-seeding files for a remote computer.

In case the reader hasn’t encountered this hint: One takes (or sends) a portable USB drive to the remote site, copies the relevant files onto the portable storage, then returns the USB drive to the location of the UrBackup Server. One then connects the drive to a client local to the server, temporarily adds the USB drive to the backup selection for that client, and performs a full backup at local network speed. After a successful backup the portable drive can be disconnected and the Client’s backup settings restored to their original values. The Server now has copies of each of these files that match those on the remote Client, and the remote backups now are based on the pre-existing files rather than having to transfer them over the typically much slower Internet connection.

I did one of these about two weeks ago and plan to do another this weekend, so I know how much it can help.
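
The deduplication idea described above can be sketched roughly like this. It is a minimal in-memory model, assuming content is identified by a SHA-256 hash; the class name, the hash choice, and the sample paths are illustrative assumptions rather than how UrBackup actually stores things.

```python
import hashlib

class DedupStore:
    """Toy "pile of files": each unique content is stored once, and every
    backup entry is only a bookkeeping reference to it."""

    def __init__(self):
        self.blobs = {}     # content hash -> bytes (stored once)
        self.backups = {}   # (client, path) -> content hash (metadata only)

    def add(self, client, path, data):
        """Record a file for a client; return True if new content had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.backups[(client, path)] = digest
        return is_new

store = DedupStore()
film = b"same bytes on both clients"

# Pre-seed from a local client, then "back up" the same file from another client.
first = store.add("local-client", "U:/seed/film.mkv", film)
second = store.add("remote-client", "E:/films/film.mkv", film)

print(first, second)        # content stored the first time only
print(len(store.blobs))     # one stored copy
print(len(store.backups))   # two bookkeeping entries
```

The second `add` stores nothing new, which is why the pre-seeded remote backup only needs the bookkeeping entries.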

BatterPudding · May 29, 2021, 2:00pm

I have been noticing how clever the dedup is. But that is not the issue today. My initial problem is I am backing up a unique media server. Just the one machine, but 9TB of data: films and music.

First backup caught me out as I had to swap from images to file backup which meant needing more space. So I jammed my server full. That three day backup suddenly was showing over a billion days remaining which is when I realised something was wrong. Haha! So next test will be to see if urbackup can resume that stalled backup okay now I have added more storage to the server.

BatterPudding · May 31, 2021, 10:31am

AH. Turns out urbackup did not like a four day pause in a backup, as it won’t resume that previous backup. No acknowledgement of any of those files having been transferred. No log has it; only the stats page seems to be able to count those TBs.

I’ve restarted the server with more space, and hitting Continue on the client now gave me an even longer ETA of 9 days for the backup. Back at 0%.

Oh well… it is but a learning curve. I’ll give it a few hours in case the dedup side kicks in and it spots the data is already there and nothing needs backing up. Otherwise I may as well delete it all and start fresh.

I assume as I can’t see the big backup listed in the logs, when it stalled it was abandoned and now I have 8TB of orphaned data? Today’s Incremental backup seems to be writing out the same files again as if they don’t exist.

I’ll monitor it for another hour, but it looks like I’ll have no choice but to stop it and wipe it all out. I can’t have it “forgetting” about 8TB of already backed up data as I’ll end up filling up the server again.

Okay - so reading the manuals I see that I can run urbackupsrv remove-unknown to clear out anything that is unknown\unsuccessful from the folders? I guess if that 8TB of data is really “lost” to the urbackup server this is my saviour? Is this in the GUI somewhere? I get a Permission Denied when I try and run it from the command line on the server.

-=-=-

With the Virtual Clients I wanted to split the backup into “backup these three folders daily” and “Backup this 8TB of files monthly”. If I read this correctly I just need to add one “virtual” client. Then assign the “fast” directories to the first client, and the slower files to the virtual version. Giving me two clients listed: GERTY and GERTY[media]. With GERTY getting the daily backups on the core folders, and GERTY[media] getting monthly runs on a different set of folders?

(Edit note: This post has been re-written \ added to a few times as I answer my own puzzles)

BrainWaveCC · June 1, 2021, 1:42am

Yes, this will absolutely work.

Make sure you do it from an elevated CMD prompt.

I went through an interesting experience where that script came in super handy.

-ASB

BatterPudding · June 1, 2021, 9:30am

I’m logged in as root on a linux based UNRAID server. I’m not sure if the command fully works. Where am I supposed to run it from? Where is the file urbackupsrv? I may have to go back to the UNRAID forum and check if the command is available.

As this is basically the first backup I may just save time by nuking all of it and starting fresh. It has been a good learning curve to find what I can and can’t do.

Bingo. Got it. In UNRAID there is a Console attached to the Docker I can run this from. I have now triggered the command and success text has floated past. But it completed far too quickly for my liking.

Maybe it still thinks these files are part of the backup? Even though I can’t find them listed anywhere. I’ll have to look closer again later. What I don’t get is there is nothing saying the backup was a success, so no way to recover one of these files.

BatterPudding · June 1, 2021, 2:17pm

This doesn’t make sense… okay, here is a list of backups:

Notice the sizes.

Here is a list of folders:

If I go inside 210523-2047 it is huge - many TBs of data. The 8TBs of drives R,S,T. And yet I do not see it listed. I was expecting the urbackupsrv remove-unknown sweep to delete that folder, yet it survived. This is the abandoned backup from when the server filled to full.

Weird… I just copied the backup_server-files.db out to another PC to look at in DB Browser for SQLite and sure enough, there are all those files listed. Okay, so piecing the puzzle together: the backup software is hardy enough to recognise it has a copy of all the files, but as the backup never completed I assume this is why they are not listed for me to access.

A new backup is running again… but now I am certain I have totally stuffed it. It is slowly doing a “full backup” which should be identical to what already exists. And as the hours tick by it goes from 3 days, to 9 days, to 15 days, to 21 days… and still not even past 1%. Edit: Funny… 10 mins after posting this comment the time started dropping fast… now below 13 days and dropping… I guess this is it calculating the dedups. Even as I type this it is now down to 11 days. 30 mins later and we are at 4 days, 2% done. Zero data has actually been added to the server in that time - which would be correct as nothing has changed. Maybe I’ll leave it overnight to see what happens…

Main lesson - don’t use urbackup as a clock.

I think it is getting a little confused.

Okay… decision made… Nuke the entire site from orbit - YouTube

BrainWaveCC · June 2, 2021, 2:04pm

What is the final verdict?

BatterPudding · June 2, 2021, 2:48pm

Well, it hasn’t been shot down yet. And we are about 24 hours after I started the new “Full Backup”. Currently the status says 30% done and ETA 2 days 3 hours on the client. (we are talking 9TB or so)
(Edit: A few hours later and we are on 36% with 1 day 13 hours remaining…)
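
For anyone wondering why the ETA swings so much early on: a naive linear estimate (time so far, scaled by the fraction left) behaves the same way. This is only a sketch of that arithmetic; I am assuming a simple linear model, and UrBackup may well compute its ETA differently. The 3-hours-at-1% figure is a made-up example.

```python
def naive_eta_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: remaining time = elapsed * (work left / work done)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1 - fraction_done) / fraction_done

# Figures from this post: about 24 hours in, 30% done.
eta_now = naive_eta_hours(24, 0.30)
print(round(eta_now, 1))          # roughly 56 hours, same ballpark as the client's estimate

# Hypothetical early reading: 3 hours in, only 1% done.
eta_early = naive_eta_hours(3, 0.01)
print(round(eta_early / 24, 1))   # over 12 days, from a tiny sample
```

At 1% done, a few minutes either way changes the estimate by days, so the early numbers are not worth much.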

It is interesting to watch. I can see that ZERO files have been copied to the server. Nothing has changed in the storage levels. I took a little snapshot of free space before I started and that isn’t changing. I can see a lot of reads happening on the server, but nothing adding to the space used. That makes sense as the files are already there.

When I look into the folder on the server from the command line I do see files listed. As no extra space seems to be getting consumed I assume this is the dedup going on? To be honest, it is a little puzzling as a quick du shows folders have stuff in them equal in size to what was backed up.

I think I’ll let this one run to the end, and then test recover a few random files. See what I get.

I am still not totally against wiping it all out so it is a clean start. It just feels like it is now working correctly (if a bit slow). The fact no extra data is being written kinda seems right.

Though that leads to different questions as to how hardy things are if some files get corrupted - will that corrupt data in all backups?

OnlyMe · June 2, 2021, 7:26pm

If you are running Linux, then you will need to sudo the command, otherwise you will be running with the user-level access - this is the same as running an elevated CMD session, but in Windows the entire session is elevated; when in Linux, you can elevate a single command from the existing Terminal session with the sudo prefix…

BatterPudding · June 3, 2021, 3:51pm

The du command gave me the same result with or without sudo.

I am now at the second morning after starting this. It was at 62% with an ETA of 23 hours, and is now at 71% with an ETA of 18 hours. What is noticeable is that quota is now slowly being nibbled on the server: it has finally used an extra 1TB (it was 0.5TB when I first checked) compared with the starting level.

What puzzles me is how come, when I look into the folders and run a du, I get sizes returned that if added up would be too big for the server?

/urbackup/PCNAME/210523-2047/ is the backup that failed
and /urbackup/PCNAME/210601-1555/ is the backup currently running.

These both include three large Media Full drives. Films, TV and Music.

If I run the du command on these, the first one is 7TB and the currently filling one is 4.8TB. That is 11.8TB in total, which is technically impossible as there is only 9.45TB currently on the whole server storage.

BrainWaveCC · June 4, 2021, 11:50am

That’s almost certainly a factor of the symbolic and hard links.

When you check the size of folder A, which also points to folders B, C and D, you are obtaining an aggregate size for that parent folder structure.

Then, when you look at folder F, which also points to B, D and G, the same thing happens.

That’s dedupe working in your favor.
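
A small Python sketch of the effect, assuming a filesystem that supports hard links. The two folder names are borrowed from the thread and the file is a throwaway created in a temp directory; none of this touches a real backup store.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "210523-2047")
    new = os.path.join(root, "210601-1555")
    os.mkdir(old)
    os.mkdir(new)

    # One real file in the first backup folder...
    original = os.path.join(old, "film.mkv")
    with open(original, "wb") as f:
        f.write(b"x" * 4096)

    # ...and a hard link to it in the second: a new name, same data on disk.
    linked = os.path.join(new, "film.mkv")
    os.link(original, linked)

    a, b = os.stat(original), os.stat(linked)
    same_inode = (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
    link_count = a.st_nlink
    apparent_total = a.st_size + b.st_size   # what you get by sizing each folder separately

print(same_inode)       # both names point at the same inode
print(link_count)       # the inode has two names
print(apparent_total)   # twice the data that is really stored
```

Sizing each folder on its own counts the shared data once per folder, which is how the totals can exceed the disk.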

-ASB

BatterPudding · June 4, 2021, 2:42pm

What is interesting to me about this dedup is that I still have zero record of the original failed backup in the logs, but the 7TB of files are in the database when I looked with DB Browser for SQLite.

I’ll have to work out how to get du to tell me the real totals for the folder. When I look now, after the backup has completed, the “du -sh 210601-1555” has shrunk by a lot. Now it only says 2.3GB.
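
One way to get a real total is to count each hard-linked file only once across all the folders. GNU du already does this within a single invocation, so passing both folders to one `du -sh` call should avoid the double counting. The same idea in Python, as a rough sketch: `real_total` is a hypothetical helper name, and it sums apparent file sizes rather than disk blocks, so the figure will not match du exactly.

```python
import os

def real_total(*folders):
    """Sum file sizes across folders, counting each hard-linked file once.

    Files are identified by (device, inode), so a file that appears under
    several backup folders via hard links only contributes once.
    """
    seen = set()
    total = 0
    for folder in folders:
        for dirpath, _dirs, files in os.walk(folder):
            for name in files:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size   # apparent size, not allocated blocks
    return total
```

Calling it with both backup folders at once, for example `real_total("210523-2047", "210601-1555")`, gives one combined figure instead of two overlapping ones.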

The important thing is things seem to be working as expected. At the weekend I’ll do some test recovery sessions. I especially want to see that it handles Unicode text correctly as I have some Japanese titles in my collection.

The way that dedup is working is encouraging to see. It means I can mess around with the virtual clients more and it will recognise they are the same files already copied over. I need to change the backup of the main media server to be much less regular as it takes so long to do, but most of it rarely changes.

BatterPudding · June 5, 2021, 10:28am

Came back to the server a day later and a cleanup has happened, maybe due to starting another backup. I back up to an UNRAID server, so it spreads the data over three disks. Lots of shuffles have happened and lots of the original files have gone or moved. The total is still the same, but new writes have happened as the files are on different disks now.

Stupidly, my testing has got tangled as a virtual client is now running and repeating the first backup again. Lol. But I’ll have this tamed soon. Dedup is keeping the file totals static though.