Full file backups: what is the purpose?


#1

What is the purpose of full file backups? Why not just incrementals? I’m asking specifically about UrBackup and the way it does things.

I know the differences between full, incremental and differential backups in the traditional sense. And why one might be preferable over another, again, in the traditional sense. But it appears that UrBackup either emulates, or uses directly, the rsync functionality of --link-dest. So for files that are not copied during the incremental, hard links are created to the previous version of those non-copied files. Thus each backup appears as a full backup due to the magic of hard links.
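To make the --link-dest idea concrete, here is a minimal Python sketch of it. It only illustrates the hard-link technique and is not UrBackup's or rsync's actual code; the function name incremental_backup and its arguments are made up for the example, and it compares whole file contents rather than using rsync's delta algorithm.

```python
import filecmp
import os
import shutil


def incremental_backup(source, previous, target):
    """Back up `source` into `target`.

    Files identical to their counterpart in `previous` are hard-linked
    instead of copied, so `target` still looks like a full backup.
    Pass previous=None for the very first backup.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(target, rel, name))
            old = None
            if previous is not None:
                old = os.path.normpath(os.path.join(previous, rel, name))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: new name, same data on disk
            else:
                shutil.copy2(src, new)  # new or changed: real copy
```

Deleting an older backup directory only removes names; the data of a hard-linked file stays on disk until the last backup that references it is gone.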

So given this, why do you need to do full file backups? It seems a waste of resources to re-copy files that you already have a copy of. It seems like “incrementals into infinity” would be just fine.

This brings up another question - does UrBackup actually use rsync internally, with rsync’s nice algorithm to only copy the changed parts of files and not the entire file? The behavior of UrBackup seems rsync-like at times, but not completely. For example, UrBackup keeps some kind of database, which rsync does not need to do. But possibly this database is used more for storing configuration and statistics, rather than used to determine which files need to be backed up. I haven’t found any UrBackup documentation that goes into details like this.


#2

It’s almost that.

UrBackup has a slightly different meaning for “incrementals into infinity”. If you use btrfs, it can actually use reflinks for incrementals, which stores only the differences between file versions. It copies the file in reflink mode, which takes zero extra space, then records that only some chunks have been modified, and only those chunks take up more space.

UrBackup can do something like rsync’s --link-dest for identical files. But not only for a file that is in the same place: it can also link a file that was backed up from a different server or from another place on the same client (my understanding is that the database is required for this).
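A rough Python sketch of why a database helps here: if the server keeps an index from content hash to an already stored file, it can hard-link any later file with the same content, no matter which client or path it came from. This is an assumption about the general technique, not UrBackup's real schema or code; store_file and the plain dict used as the index are hypothetical.

```python
import hashlib
import os
import shutil


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def store_file(src, dest, index):
    """Store `src` at `dest` in the backup tree.

    `index` maps content hash -> path of a copy already in the backup
    storage (a stand-in for the server database). Returns True if the
    file was deduplicated with a hard link, False if it was copied.
    """
    digest = file_sha256(src)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    existing = index.get(digest)
    if existing and os.path.exists(existing):
        os.link(existing, dest)   # same content already stored somewhere
        return True
    shutil.copy2(src, dest)       # first time this content is seen
    index[digest] = dest
    return False
```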

None of this comes for free resource-wise. This is why there is a kind of slider you can set in the advanced parameters: transfer using raw, hashes, or chunk hashes. Raw uses no extra CPU compared to rsync, hashes allow tracking fully identical files, and chunk hashes allow tracking partially identical files. So this slider changes the CPU/bandwidth balance and is tunable for full/incremental, local/WAN, and file/image backups.
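To illustrate the difference between whole-file hashes and chunk hashes, here is a small Python sketch. It is a generic fixed-size chunking example, not UrBackup's transfer protocol; the function names and the chunk size are assumptions chosen for the demo.

```python
import hashlib


def chunk_hashes(data, chunk_size):
    """Hash each fixed-size chunk of `data` separately."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old, new, chunk_size):
    """Return the indices of chunks in `new` that differ from `old`.

    Only these chunks would need to be transferred; chunks past the
    end of `old` always count as changed.
    """
    old_h = chunk_hashes(old, chunk_size)
    new_h = chunk_hashes(new, chunk_size)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or old_h[i] != h
    ]


old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCC"
# A whole-file hash only says "different"; chunk hashes say where.
print(hashlib.sha256(old).digest() == hashlib.sha256(new).digest())  # False
print(changed_chunks(old, new, chunk_size=4))                        # [1]
```

The trade-off is the one described above: hashing costs CPU on both sides, but a large file with a small change only needs the changed chunks sent.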


#3

I am trying to understand the need of full file backups, or if incremental backups only would be an option as well.
So far, searching the forum and the urbackup manual didn’t help.
This thread looks promising, even though I do not understand the answer to the question.

What are the pros/cons of just doing incremental backups?
What do I need to take care of?
Does it make sense, or make something easier, more stable, or more secure, if I do full backups from time to time (scheduled or manually)?

And please, when answering, leave out the filesystem benefits (ZFS/BTRFS) and instead cover what could happen on a simple NTFS/EXT4 or any other filesystem that does not do dedup or snapshotting on its own.


#4

I hear what you are saying. Personally, I like to use full backups as historic markers. For example, we have a number of database files on certain clients that can be quite large, so if we ran daily incremental backups that are never removed, we would run out of backup space quickly, or not be able to go back very far in history. So it’s nice to have a full backup every month or so, with the ability to go back a year or maybe two, and the incrementals kept for the last 30 days.

I’m sure that if you are dealing with small flat files, it shouldn’t be a problem to just run never-ending incrementals.


#5

Full backups effectively give you a chance to “empty the trash”. If you have incrementals going back 7 years, you’re storing files you deleted 6 years and 11 months ago.


#6

I was thinking about this question too. I am not absolutely sure, but I think there is only one reason left for doing full backups.
Because of the way UrBackup stores files with hard links, there is no need for full backups in the traditional sense, to reduce the size of the incremental backups or to be able to delete older backups. You can always delete old backups, no matter whether they are full or incremental.

The only reason left for doing full backups once in a while is bit rot. Assume you have a file that does not change for years; it is only ever really written once to the hard disk. Over all those years a bit could flip and corrupt the file, and you would neither notice nor have it backed up anywhere else.

It would be interesting if UrBackup checked all files against a checksum once in a while, or used the checksums of ZFS or btrfs, and renewed a file from the client if it is corrupted.
Does anyone know this?
With this I can not see any need for full file backup.

Please correct me if I am wrong.


#7

You’ll have to run urbackupsrv verify-hashes -d -v "*" before the full backup. Otherwise it’ll just link identical files (again).

If you are using ZFS or btrfs this isn’t an issue anyway (if you have multiple disks in your btrfs/ZFS array).


#8

Some questions to this:

  1. If I do urbackupsrv verify-hashes -d -v "*" the corrupt files get deleted. Would they also be restored with an incremental backup?

  2. Would the mechanisms of ZFS and btrfs only detect (and hopefully fix) the corruption when I try to restore the file, or does making a backup with UrBackup somehow also trigger this?

So anyway, is there any good reason to do full file backups at all if you have a RAID with ZFS or btrfs and scrub regularly? That should be the best method to keep your backups safe.


#9

It only deletes the database entries, not the files, such that full backups don’t use the files as deduplication source.

It’s independent of UrBackup. You can run regular scrubs to repair those errors. But depending on your RAID level I’d say scrubs are unnecessary (but that also depends on the hardware you use, I guess). See https://raidsim.urbackup.org/ for appropriate RAID levels.

Yes. The only thing full backups will do is provide a kind of back-stop in case there is some bug in UrBackup (either client or server) that misses changes, etc.
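As a conceptual illustration of the first point (a verify pass that drops database entries but leaves the files alone), here is a short Python sketch. It is not how urbackupsrv verify-hashes is implemented; verify_index and the dict standing in for the database are hypothetical.

```python
import hashlib


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def verify_index(index):
    """Re-hash every stored file and drop entries that no longer match.

    `index` maps recorded hash -> stored path. The files themselves are
    not touched; a mismatching file just stops being offered as a
    deduplication source, so the next full backup copies it again.
    Returns the list of paths whose entries were dropped.
    """
    dropped = []
    for digest, path in list(index.items()):
        try:
            ok = file_sha256(path) == digest
        except OSError:
            ok = False  # missing or unreadable counts as failed
        if not ok:
            del index[digest]
            dropped.append(path)
    return dropped
```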


#10

Ah yes, I read the command line help wrong.

Thank you very much for your detailed description!