Full file backup seems… excessive?

Just wanted to confirm my understanding: I’m backing up approx 3 TB worth of files currently, and growing daily (stupid raw photo format…). My offsite server doesn’t have a great connection, and my current ETA is 7 days for the full file backup.

And to my understanding, UrBackup doesn’t have the hardcore ‘incremental’ system where all incremental file backups are true deltas, i.e. where you need a full backup plus all the incrementals after it to recreate things.

Therefore:

1/ Am I correct that the next full backup will take another 7 days, since it will re-send all files?
2/ Is it a good idea to turn off full backups? Or should I just let it churn away on a full backup every (say) 30 days?

tl;dr: I don’t really see the benefit of doing full backups for files. Am I correct?

No, it will not send all files.
If you use a btrfs file system on the server, then it will APPEAR on the server as a full backup. But btrfs uses snapshots and copy-on-write.
I believe that if you use another file system, the server will still use smart linking.
I know that the client and server negotiate what has changed for an incremental and only transfer that.
It is true that the delta-based incremental scheme you describe is NOT what UrBackup does. IMHO what you describe is a somewhat (at least circa 1980) outdated method. Filesystems such as ZFS, btrfs and others are capable of linking in such a way that an incremental contains ALL files, as though it were a full backup, thus not requiring a restore to replay deltas in the correct order.
Maybe this is not what you want? Maybe you always want to be able to just restore the most recent delta? But you can still obtain the same result with some simple date-based copying.
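
In case it helps to see the linking idea concretely, here is a rough Python sketch of hard-link based incrementals, the same trick as rsync’s --link-dest. This is NOT UrBackup’s actual implementation, and the change detection here is just size + mtime:

```python
import os
import shutil

def hardlink_incremental(source_dir, prev_backup, new_backup):
    """Incremental backup that appears as a full one: unchanged files
    are hard-linked to the previous backup (no extra space), changed
    or new files are copied."""
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        os.makedirs(os.path.join(new_backup, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            prev = os.path.join(prev_backup, rel, name)
            dst = os.path.join(new_backup, rel, name)
            st = os.stat(src)
            # "Unchanged" here just means same size and mtime as before.
            if (os.path.exists(prev)
                    and os.stat(prev).st_size == st.st_size
                    and os.stat(prev).st_mtime == st.st_mtime):
                os.link(prev, dst)       # free: points at existing data
            else:
                shutil.copy2(src, dst)   # only changed data costs space
```

Every backup directory produced this way looks like a complete copy, so you can restore the most recent one directly, with no delta replay.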

  1. I use incrementals every 2 days and a full every 60 days. But I am using btrfs on the server, so the storage cost increase is minimal.

You can use a container version of UrBackup on the server (that’s what I do). If your server OS is Windows-based, you can even use btrfs (WinBtrfs, open source, for XP upwards).
You don’t specify the OS for either server or client, your average new-data volume, or your schedules, so I am unable to give more helpful information.
However, should you supply more detail, I will be happy to try and help.
Dave

Huh, I just noticed I basically asked the same question twice. But yes, I’m somewhat limited by my setup: I have a really old Synology server parked at my dad’s that I barely managed to get UrBackup set up on (had to hand-compile a package of a slightly older UrBackup, etc.). But the main point of the backup, of course, is that hopefully only one or the other fails: the backup dies or the source dies (and my source also has RAID 6).

The thing my server does not have is oodles of space beyond what I’ve agreed with the aforementioned dad, so I wonder if a full backup will just run up a huge storage bill with only a little bit of satisfaction for Mr. Justin.

Is there some way I can test if a full backup will just make me occupy 2x the ‘required’ space of my single backup so far? Other than hitting ‘backup’, of course… :slight_smile:
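
(If I do end up just hitting ‘backup’, I could at least verify afterwards whether the full actually doubled the on-disk usage, assuming the server dedups unchanged files with hard links. A quick Python sketch of that check, with made-up paths:)

```python
import os

def apparent_vs_actual(paths):
    """Sum file sizes two ways: 'apparent' counts every path,
    'actual' counts each inode once, so hard-linked duplicates
    between backups are not double-counted."""
    seen, apparent, actual = set(), 0, 0
    for path in paths:
        for root, _dirs, files in os.walk(path):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                apparent += st.st_size
                if (st.st_dev, st.st_ino) not in seen:
                    seen.add((st.st_dev, st.st_ino))
                    actual += st.st_size
    return apparent, actual

# hypothetical backup paths:
# a, b = apparent_vs_actual(["/backups/full_1", "/backups/full_2"])
# if b is roughly half of a, the second full cost ~no extra space
```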

It’s amazing how many people on these forums think that backups only need to go on obsolete old tech because it’s only there ‘if the main system fails’. Then, when the main system fails, they post on here complaining that the backup system is corrupt or broken and they cannot recover their data… and it’s the fault of UrBackup.

I don’t think that’s how the math works here. It’s all about risk. If you are diligent about checking whether a backup can be restored (and I am, I check this manually), then being able to recover from one catastrophic failure is reasonable. If my main system dies, I have my dad’s backup. Yes, it’s an old Syno box, but it has a standard Linux disk system and modern drives; it can fail too, but it’s also RAID 1. So for me to lose all my stuff, both my main system and my dad’s system need to die at the same time. (And even then I have a few extra safeguards.)

You don’t always have to have fancy gear to be in a decent state :wink:

Just to give a small update. I can confirm a few things:

  • A full backup does not need more space on the server than an incremental backup (assuming you have at least one copy of the data already), so even if you have almost no space left on your server, the backup will run fine if there’s nothing actually new.
  • A full backup will not send all data across again, only changed/new data.
  • A full backup rechecks all files and takes much longer.

Therefore I think the main difference between full and incremental is that with an incremental, the client decides if anything needs to be sent, while on a full backup both sides apparently check every file and compare notes (MD5 or similar hashes). Theoretically, if someone or something on the server side deletes or corrupts a file in the backup, an incremental backup will not notice, but a full backup will.
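
To make my guess concrete (and it really is a guess on my part, not UrBackup’s actual protocol), here’s a toy Python version of that ‘compare notes’ step:

```python
import hashlib
import os

def hash_tree(base):
    """MD5 every file under `base`, keyed by relative path."""
    manifest = {}
    for root, _dirs, files in os.walk(base):
        for name in files:
            path = os.path.join(root, name)
            h = hashlib.md5()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            manifest[os.path.relpath(path, base)] = h.hexdigest()
    return manifest

# Anything that differs, or is missing/corrupt on the server side,
# would get re-sent on a "full" run. Paths are hypothetical.
client = hash_tree("/data")
server = hash_tree("/backups/latest")
resend = {p for p, digest in client.items() if server.get(p) != digest}
```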

This is all in the manual. Your second point is contradicted in Section 6.1:

  • The server starts downloading files. If the backup is incremental only new and changed files are downloaded. If the backup is a full one all files are downloaded from the client.

Well, there you go, sadly the manual seems to be wrong then. The first time I actually ran this backup, the file transfer took about a week. My most recent ‘full’ backup was less than 12 hours. (And the bottleneck, at least in the initial backup, was the network, which hasn’t changed.)

No, you cannot make that conclusion, because you don’t understand how UrBackup works. Just because files are transferred, it doesn’t mean the DB is updated or the files are written out to storage.

We have no idea what you are measuring, so no one can possibly know why your first run took a long time.

So you have no idea what I’m doing but I’m wrong? That’s some prime sleuthing there, sir.

But have it your way. I guess the UrBackup logs confirm that my second full backup, which took 441 minutes (about 7h20m), managed to squeeze that 3 TB through my 80 Mbps uplink with… magic?

(For the mathematically inclined: in an ideal situation, and assuming my broadband provider actually gives me a solid 100 Mbps rather than the 80 that I’m paying for, 3 TB over 100 Mbps would take about 3 days, which indeed ballpark-matches the 4-ish days of transfer during the first backup.)
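
(Same back-of-envelope in Python, for the record:)

```python
# 3 TB over a 100 Mbps line, ignoring protocol overhead
size_bits = 3e12 * 8                  # 3 TB (decimal) in bits
rate_bps = 100e6                      # 100 Mbps
print(size_bits / rate_bps / 86400)   # ≈ 2.8 days
```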

Regardless, I’ve done a ton of spot-check restores, and the 3.4 TB the server is reporting my backup to be does somehow seem to contain the files I wanted to back up. I’m fine to rely on the magic. Cheers :slight_smile:

Internet mode transfers are compressed, so your numbers are meaningless.

Why don’t you run two full file backups in succession, so that very few files will have changed on the second full file backup? By your argument, the second full backup will be instantaneous because no files will have changed.

Please report back here with your results. The manual will need updating if you are right.

Not instantaneous, since as far as I can tell both machines, client and server, run MD5 or a similar hash to indeed check all files. My dad’s piddling Marvell Armada CPU based server especially takes a long time to hash all the files again. But yes, the amount of actual data transferred is minimal, which I confirmed with my network monitoring.
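
(For a rough feel of the hashing cost: the throughput below is an assumed number for an old ARM NAS, not a measurement, but it shows how hashing alone can eat hours:)

```python
# How long does rehashing ~3.4 TB take at an assumed hash throughput?
# 100 MB/s is a guess for an Armada-class CPU; measure your own box.
total_mb = 3.4e6                          # ~3.4 TB in MB
hash_mb_per_s = 100                       # assumed combined read+hash speed
print(total_mb / hash_mb_per_s / 3600)    # ≈ 9.4 hours
```

At 100 MB/s that works out to roughly 9–10 hours, which is at least the same order of magnitude as my full backup runs.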

And what happens when you do an incremental backup instead of a full file backup?

Currently, an incremental backup takes 140 minutes and a full backup 500 minutes, with very few if any changes in the actual data to be backed up.

This thread seems to suggest that in internet mode, even for full file backups, not all files are downloaded. Maybe that is what you are seeing. The documentation is incomplete, it seems.