Testing Feasibility of Using in Enterprise Environment

Hello Everyone!

I am currently using CommVault to back up approximately 100-120TB of data to on-prem servers/storage arrays. That data is then written to tape and sent offsite periodically. The customer's requirement is that the data be retained for 10 years.

Currently we only do file backups on our servers, but we really want to do image backups of both our servers and our client machines. Since CommVault has a per-TB licensing structure, the cost is already silly in my opinion, and adding image backups for all the clients would be even crazier.

I want your honest opinion on whether you believe this would be a viable alternative for my situation. We have approximately 120 servers and 1200 client machines at various locations around the world. The backup server and storage array are on-prem at each site and would handle that site's backups. I'm guessing the total data load would balloon to 200TB or more if we take full backup images of everything. The backup servers and large data (e.g. file) servers are all on a 10Gb network.

  • Our standard backup schedule is to take incremental backups of the server data Monday through Thursday. We then take a full backup on Friday and write to tape on the last Sunday of the month.

  • Image backups we will only do on an as-needed basis (before a major change is made to the system), and we are really only interested in keeping the last good copy rather than multiple instances of it, unless of course it uses data deduplication/incremental images and doesn't take up much more disk space.

My other questions:

  • Does this use data deduplication for the backup images and “regular” data backups?
  • Does this support backing up to tape? I know this is old-school, and I’d love to see us move to off-site disk based backup at some point in the future, but this is what we have now.
  • Can the client be deployed and backup jobs initiated remotely from the server?
  • Is there any issue running this on an on-prem server/storage array with no internet connection?
  • What is the cost for this type of setup (software…I know what hardware we’ll need :slight_smile: ). I realize you might not be able to give me that info in the forum…but if there’s a contact I can reach out to that would be awesome.

Thanks everyone in advance for your help!

  • Does this use data deduplication for the backup images and “regular” data backups?
    With the correct setup, yes (look up btrfs and raw images in the documentation; there is a short btrfs sketch further down).

  • Does this support backing up to tape? I know this is old-school, and I’d love to see us move to off-site disk based backup at some point in the future, but this is what we have now.
    No.
    Honestly, tapes are a mess. I think I would trust Amazon Glacier more: it's a few €/TB/month, and you pay more when you retrieve the data (see the cost sketch further down).

  • Can the client be deployed and backup jobs initiated remotely from the server?
    Yes.

  • Is there any issue running this on an on-prem server/storage array with no internet connection?
    There's no support for the array's special protocol (I can't remember the name); you have to share it first.

  • What is the cost for this type of setup (software…I know what hardware we’ll need :slight_smile: ). I realize you might not be able to give me that info in the forum…but if there’s a contact I can reach out to that would be awesome.

UrBackup isn't what you'd call expensive for the server licence (neither the AWS version nor the appliance is); the client-side licences can add up, but they're a one-time fee.
https://www.infscape.com/ (40 €/month, unlimited everything)
https://www.urbackup.org/commercial.html (AWS: m3.large, software $0.024/h, hardware $0.133/h)

If you want image backups, get at least a single licence to try it out; it's about 15 € per client for the change block tracking licence on Windows (you can do images without it, but it's about 100x faster with it).
Also, I wouldn't recommend backing up 200TB to a single server. Start by expecting more like 20TB per server with 100 clients (because lots of small files may be an issue), then scale up from there if the server handles it.
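
To put rough numbers on the Glacier idea above and the client licences, here is a back-of-envelope sketch. The 15 € CBT licence, 200TB estimate, 10-year retention and client/server counts come from this thread; the 4 €/TB/month cold-storage price is just a placeholder for "a few €/TB/month", so check current pricing before relying on it.

```python
# Back-of-envelope cost estimate. All figures are assumptions or ballpark
# numbers from this thread, not quotes; verify against current price lists.

CLIENTS = 1200                 # client machines mentioned above
SERVERS = 120                  # servers mentioned above
CBT_LICENCE_EUR = 15           # ~15 EUR one-time per Windows client for change block tracking
COLD_EUR_PER_TB_MONTH = 4      # assumed placeholder for "a few EUR/TB/month" (e.g. Glacier)
TOTAL_TB = 200                 # estimated total once image backups are added
RETENTION_YEARS = 10           # customer's retention requirement

cbt_total = (CLIENTS + SERVERS) * CBT_LICENCE_EUR     # one-time licence cost
cold_per_month = TOTAL_TB * COLD_EUR_PER_TB_MONTH     # recurring storage cost
cold_total = cold_per_month * 12 * RETENTION_YEARS    # over the retention period

print(f"CBT licences (one-time):      {cbt_total:,} EUR")
print(f"Cold storage per month:       {cold_per_month:,} EUR")
print(f"Cold storage over {RETENTION_YEARS} years:  {cold_total:,} EUR (excluding retrieval fees)")
```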

Anyhow, I'd recommend you try a pilot at one location. The initial setup of a single client + server is very fast, an afternoon or so. Clients auto-register with the first server they see, so after a week you could have scripted and deployed a 100-client site (see the deployment sketch below).
If I were you, I would try the free version for some time, then use the 30-day trial to evaluate the paid options (the AWS version has a federation option to group multiple servers that may interest you).
You'd need a reasonable amount of RAM. At first you could try with at least 16 cores and 48GB of RAM for 100 clients. Use an SSD for the UrBackup database (almost mandatory) and an array of rotational drives for the images and files.
Then back up maybe 20TB-40TB and see how the software works. At that point you'll get a good idea of it.
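
As a rough illustration of the "scripted deployment" part, here is a minimal sketch of the silent install step you'd push to each client through whatever deployment tooling you already have (GPO startup script, SCCM, Ansible, …). The UNC path is hypothetical and the NSIS-style /S switch is an assumption; check the client installer's documented silent-install options.

```python
# Minimal sketch of a per-client silent install step. Assumptions: the
# Windows client installer accepts the NSIS-style /S switch and sits on a
# (hypothetical) UNC share; check the documentation for the exact options.
import subprocess
import sys

INSTALLER = r"\\fileserver\deploy\UrBackupClient.exe"  # hypothetical path

def install_client() -> int:
    """Install the client silently; it then auto-registers with the first
    UrBackup server it discovers on the local network."""
    result = subprocess.run([INSTALLER, "/S"], check=False)
    return result.returncode

if __name__ == "__main__":
    sys.exit(install_client())
```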

A bit to differentiate here: UrBackup is operating-system independent (both server and client run on Windows, FreeBSD and Linux). It saves files from the client as files on the server (deduplicating them) and images in standard image formats (VHD and raw). It is Open Source, so you'd pay nothing for it, of course.
If you want to further archive those files/images, you could, for example on Linux, use tar to archive the current directory of a client (or of all clients) to tape.
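
A minimal sketch of that tar-to-tape step, assuming a standard SCSI tape drive at /dev/st0, a backup storage path of /media/backup, and a hypothetical client name (adjust all three to your environment):

```python
# Sketch of archiving a client's latest file backup to tape with tar.
# Assumptions: a standard SCSI tape drive at /dev/st0, the backup storage
# path /media/backup, and the per-client "current" directory mentioned above.
import subprocess

TAPE_DEVICE = "/dev/st0"        # first SCSI tape drive (assumption)
BACKUP_ROOT = "/media/backup"   # backup storage path (assumption)

def archive_to_tape(client: str) -> None:
    src = f"{BACKUP_ROOT}/{client}/current"   # latest file backup of this client
    subprocess.run(["tar", "-cvf", TAPE_DEVICE, src], check=True)

if __name__ == "__main__":
    archive_to_tape("fileserver01")           # hypothetical client name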

Of course, there is now a large choice of file systems (XFS, btrfs, ZFS, NTFS, ReFS) and operating systems (Linux, Windows, FreeBSD), plus management software for those (FreeNAS, OpenMediaVault, Rockstor, etc.). But I'd narrow it down to ZFS (with FreeBSD) and btrfs (Linux). I guess this can be discussed further if you want to go down this path…
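
If you go the Linux + btrfs route, the storage preparation is roughly the following sketch. The device name and mount point are placeholders, and the UrBackup-specific part is simply pointing the backup storage path at the btrfs mount in the web interface settings.

```python
# Rough sketch (Linux + btrfs) of preparing the backup storage volume.
# /dev/sdb and /media/backup are placeholders; adapt to your array.
# Needs to run as root.
import subprocess

DEVICE = "/dev/sdb"            # hypothetical block device on the storage array
MOUNTPOINT = "/media/backup"   # where the server will store backups (assumption)

def prepare_btrfs_storage() -> None:
    subprocess.run(["mkfs.btrfs", "-L", "urbackup", DEVICE], check=True)
    subprocess.run(["mkdir", "-p", MOUNTPOINT], check=True)
    subprocess.run(["mount", "-o", "compress=zstd", DEVICE, MOUNTPOINT], check=True)
    # Then set the backup storage path to MOUNTPOINT in the server's web
    # interface settings; with btrfs it can use snapshots and the raw image
    # format mentioned in the documentation.

if __name__ == "__main__":
    prepare_btrfs_storage()
```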

The appliance orogor mentioned is based on Linux + btrfs. It has a large number of setup options, but it is built around doing offsite backups via the Internet. For example, it can mirror its storage to S3, archive backups to S3, or replicate backups to other appliances. I guess archiving to tape could be implemented, but it isn't currently present…

Unless it is done by e.g. ZFS, it only does whole-file deduplication. There is a random-IO/RAM/cost trade-off with deduplication that gets worse the larger hard disks get, so dedup doesn't really help whenever I try it…
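
To illustrate that trade-off, here is a rough calculation of the in-RAM dedup table needed for block-level dedup. The ~320 bytes per entry and the 64KB average block size are commonly cited ZFS ballpark figures, not guarantees.

```python
# Rough illustration of why block-level dedup (e.g. ZFS) gets expensive:
# the dedup table wants an in-RAM entry per unique block. The figures below
# are commonly cited ballpark values, not guarantees.

POOL_TB = 100            # deduplicated pool size being considered
AVG_BLOCK_KB = 64        # assumed average block size
DDT_ENTRY_BYTES = 320    # rough size of one dedup-table entry in RAM

blocks = POOL_TB * 1024**3 / AVG_BLOCK_KB    # pool size in KB / block size in KB
ddt_gb = blocks * DDT_ENTRY_BYTES / 1024**3  # dedup table size in GB

print(f"~{blocks / 1e6:.0f} million blocks -> ~{ddt_gb:.0f} GB of RAM for the dedup table")
```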

How many clients a single server handles is, in my experience, mostly limited by the amount of random IOPS the backup storage can do, but 100TB with 200 clients that have a lot of changing data is, for example, doable on low-end hardware.
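
As a rough way to reason about that IOPS limit, here is a small back-of-envelope sketch; every figure in it is an assumption for illustration, not a measurement of UrBackup.

```python
# Back-of-envelope check of how many backups can run in parallel before the
# array's random IOPS become the bottleneck. Every figure is an assumption.

DISKS = 12                    # rotational drives in the backup array
IOPS_PER_DISK = 120           # typical 7.2k-rpm drive under random I/O
IOPS_PER_ACTIVE_BACKUP = 30   # assumed random IOPS one running backup consumes

array_iops = DISKS * IOPS_PER_DISK
concurrent_backups = array_iops // IOPS_PER_ACTIVE_BACKUP

print(f"~{array_iops} random IOPS -> roughly {concurrent_backups} backups in parallel")
# Spread the clients across the backup window accordingly, keep the database
# on SSD, and add spindles if the queue gets too long.
```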