Hi everyone, this is my first post, so it's going to be a bit of a shot in the dark as far as etiquette goes.
Firstly: this is a herculean effort for one man, even over a ten-year time span! So I take my hat off to you, Martin, sir, on a job well done.
I am using UrBackup to provide a better backup solution for a fairly narrow use case where I think the UrBackup codebase excels: namely, Hyper-V VM backups.
The reason for posting on this forum is to discuss UrBackup performance. Time is money, time is life, and time is short, so performance is important to us / me. Always has been, always will be. One critical piece of context here is the environmental, power-consumption side of things. We do not run all our servers all the time, especially backup servers. We bring them up, we back up, and we turn them off. So we want, no, "we need", backup performance! This whole notion that all the servers run all the time is old news.
So you don't get bored, where I am going with this is: I want to know whether RSS [receive side scaling] is something which could theoretically be leveraged in UrBackup image backups. My testing shows the OS (Debian 13) has RSS enabled on all the NICs in the box, across all 8 cores, but it does not look like it is being used by UrBackup during a backup, judging both by the copy speed and by how the eth0/1/2/3 interrupts on the server are distributed between the processors at the time of the copy operation.
The rest is now just some context on the question and some thoughts on the matter as it has unfolded for us.
Coming from the way we backed up Hyper-V disks before, this feature is key. We used the PowerShell function below in a loop, once per VM hard disk, to make a straight copy of the disk. This was actually a lot faster than Export-VM can do it, but it required the server to be off. (Not so hot.)
Why? The reason is that Explorer will utilise RSS if it is enabled on the client and, obviously, on the receiving side as well (strictly speaking, I believe what an Explorer copy leverages across multiple NICs is SMB Multichannel, which in turn depends on RSS-capable adapters). This meant we got around 1 GB of data every two seconds over the backup network from each Hyper-V host OS server to the backup server. Then it's just simple maths to calculate the job run time from the image sizes.
So for a freshly installed server, on say Server 2019, a complete backup used to take about 80 seconds.
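(Sanity-checking my own numbers: 1 GB every two seconds is roughly 500 MB/s, so an 80-second job works out to about 500 MB/s × 80 s ≈ 40 GB of image data, which is about what those fresh installs amounted to.)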
function Copy-File {
    # Copy one or more files with the Explorer copy engine (Shell.Application),
    # which gives us the progress dialog and the multi-NIC SMB throughput.
    # The trick: stage the requested files into a temporary GUID-named subfolder,
    # let Explorer copy that folder to the destination, then unpack it and move
    # the originals back. Note that Shell COM copies run asynchronously, so the
    # moves after CopyHere rely on the copy having finished.
    param([Collections.ArrayList]$from, [string]$to)

    $FOF_CREATEPROGRESSDLG = 0x0   # show the progress dialog
    $FOF_NOCONFIRMATION    = 0x10  # suppress confirmation prompts

    $objShell      = New-Object -ComObject "Shell.Application"
    $objdestFolder = $objShell.NameSpace($to)

    if ($from.Count -gt 1) {
        # Group the requested files by their parent directory
        [Collections.ArrayList]$Foldernames = @()
        $from | ForEach-Object { [void]$Foldernames.Add([IO.Path]::GetDirectoryName($_)) }

        $Foldernames | Group-Object | ForEach-Object {
            $return = Get-Location
            Set-Location $_.Name
            $objorigFolder = $objShell.NameSpace($_.Name)

            # Stage the wanted files into a temporary GUID-named subfolder
            $guid          = [guid]::NewGuid()
            $newFolderPath = Join-Path $_.Name $guid
            New-Item $newFolderPath -ItemType Directory | Out-Null
            $objcopyFolder = $objShell.NameSpace($newFolderPath)
            ($objorigFolder.Items() | Where-Object { $_.Path -in $from }).Path |
                ForEach-Object { $objcopyFolder.MoveHere($_, $FOF_NOCONFIRMATION) }

            # Explorer copy of the staging folder to the destination
            $objdestFolder.CopyHere($newFolderPath, $FOF_CREATEPROGRESSDLG)

            # Unpack the copied folder at the destination...
            $objmoveFolder = $objShell.NameSpace((Join-Path $to $guid))
            $objmoveFolder.Items() | ForEach-Object { $objdestFolder.MoveHere($_.Path, $FOF_NOCONFIRMATION) }
            # ...and put the originals back where they came from
            $objcopyFolder.Items() | ForEach-Object { $objorigFolder.MoveHere($_.Path, $FOF_NOCONFIRMATION) }

            # Clean up the now-empty staging folders on both sides
            Remove-Item $guid
            Remove-Item (Join-Path $to $guid)
            Set-Location $return
        }
    }
    else {
        [string]$file = $from[0]
        $objdestFolder.CopyHere($file, $FOF_CREATEPROGRESSDLG)
    }
}
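For reference, a call from our loop looks something like this (the paths here are made up for illustration):

[Collections.ArrayList]$disks = @('D:\Hyper-V\web01.vhdx', 'D:\Hyper-V\web01-data.vhdx')
Copy-File -from $disks -to '\\backupsrv\images'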
Now that we are switching to UrBackup, RCT means our differential image backups are SO much faster once the first disk image has completed. This is simply a killer feature, and we can back the VMs up while they are online; bonus. But the initial backups are now taking something like 5 minutes for the same size of image as before.
Obviously, to avoid flooding the network with contending traffic and to keep performance as high as possible, we would never dream of allowing multiple hosts on the same network to be backed up concurrently. "Max simultaneous backups" therefore has to be 1. We do not have additional VLANs for separate backup networks at this site.
Long question short: I seem to have lost RSS, which is a pity, since I am now limited to a mere 100 MB/s, as opposed to the 400 MB/s we saw with RSS utilising all the NICs simultaneously. I did try increasing the number of threads a client can use for a copy, but I am pretty sure this only affects file copies, not image operations.
IMHO the network IS the bottleneck in the product codebase, so far as I can tell. I am taking CPU and RAM out of the equation, since I don't think these resources should even be a discussion in a server environment. Disk performance is fairly simple to fix these days, for instance with a RAID 0 [or RAID 10 if the backups are that critical] PCIe M.2 adapter card and a few M.2 NVMe disks. Then the limiting factors could be [but should not be] RAM and CPU.
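For what it's worth, on a Debian backup server that is only a couple of commands with mdadm (device names below are examples, not our actual layout):

# Stripe two NVMe drives for the backup store; use --level=10 and four drives if the backups are that critical
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mkfs.ext4 /dev/md0   # or btrfs, which UrBackup can make good use of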
So we are down to network performance, and obviously we could go and stick in a 10G network, but that's an expense for a future, more stable economy, not for now.
With the following command on the server I can watch the Ethernet interrupts spike the CPUs, and I would expect at least four cores to be in use. No dice.
watch -n 1 'cat /proc/interrupts | grep -E "(enp|eth|MSI)" | head -20'
Using ethtool I can see that RSS is for sure enabled and active, so what gives?
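For anyone who wants to check the same things, this is roughly what I am looking at (using eth0 as the example interface):

# How many RX queues the driver exposes and how many are enabled
ethtool -l eth0
# The RSS indirection table that maps hash buckets to RX queues
ethtool -x eth0
# Per-queue interrupt counts; with a single TCP stream, expect only one queue to move
grep eth0 /proc/interrupts

My understanding is that RSS hashes per flow, so a single TCP connection will only ever land on one RX queue and one core; spreading the receive load across cores would need multiple parallel connections.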
Since the Hyper-V hosts are Windows Server boxes, and I am going to assume the UrBackup client does not replace the Windows network stack, we already know RSS works fine on the client (sending) side. But our copy speeds indicate it is not being utilised any more.
Obviously this stuff is really complicated and I am trying to say a huge amount in one post which is always tricky when so many factors are in play.
Thanks in advance.