High CPU load on clients after server restarts

So, this is a bit of a weird one.

tl;dr: Restarting the server makes the client (UrBackupClientBackend.exe) on every machine sit at around 50 % CPU, regardless of the type of machine.

Config:

  • UrBackup Server 2.4.13 on Debian Stretch (Jessie when it first occurred)
  • UrBackup Client 2.4.10 (Windows 7, Windows 10, Server 2008 and 2016, i.e. everywhere. No UrBackup clients on the Linux VMs, which I back up differently.)

The long story:
My UrBackup server is running on a physical Debian machine that up until November had an uptime of 500+ days, and then it crashed. I went to the site, the server booted up okay (it’s an HP Microserver N54L, I believe), and backups seemed to be working (not much else runs on this server apart from some rsync scripts every now and then)…

Except, a week later users started reporting that the apps on a terminal server were too slow, which was a bit weird to me since I had just moved the terminal server VMs from old-school HDDs to NVMe SSDs.

Went to Zabbix and sure enough, the CPU load was constantly high across all machines (servers and workstations), and it had started the very hour I booted the UrBackup server back up (it was down for two days). Weird.

So I logged in to Hyper-V and noticed all the Windows VMs had unusually high CPU load for the time of day (10 pm). Long story short, I went into each one, and on every VM there it was: UrBackupClientBackend.exe, averaging 50 % CPU regardless of how much CPU was assigned. Roughly 50 % on all cores, even on the Hyper-V host itself. Compounded, it made everything noticeably slower.

The solution was easy though: simply restarting the UrBackup service everywhere.
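For anyone who has to do the same on a lot of machines, here is a minimal sketch of scripting that restart from one admin workstation with the built-in sc.exe. It assumes the client service is registered as UrBackupClientBackend (check with sc query first), that you have admin rights on the remote machines, and the host names in the usage comment are made up. Not an official UrBackup tool, just one way to do it.

```python
import subprocess
import time

# Assumption: Windows service name of the UrBackup client; verify with `sc query`.
SERVICE_NAME = "UrBackupClientBackend"


def restart_commands(host, service=SERVICE_NAME):
    """Return the sc.exe command lines that stop and then start a service on a remote host."""
    target = "\\\\" + host  # sc.exe expects the remote machine as \\hostname
    return [
        ["sc", target, "stop", service],
        ["sc", target, "start", service],
    ]


def restart_service(host, service=SERVICE_NAME, run=subprocess.run, wait_seconds=10):
    """Stop the service, wait a moment, then start it again."""
    stop_cmd, start_cmd = restart_commands(host, service)
    run(stop_cmd, check=False)  # a non-zero exit here may just mean it was already stopped
    time.sleep(wait_seconds)    # sc returns before the service has fully stopped
    run(start_cmd, check=True)  # raise if the service cannot be started again


# Usage (hypothetical host names):
#   for host in ["ts01", "ws-reception"]:
#       restart_service(host)
```

The fixed wait is the crude part; polling sc query until the service reports STOPPED would be more robust.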

Everything worked fine for a few weeks, then I decided to upgrade the server from Jessie to Stretch (and then eventually to Buster), which of course needed yet another restart.

I’ve since tinkered with some Zabbix CPU load triggers and sure enough, within a day I got emails that the CPU load had been excessively high for a long period of time. Logged into Zabbix and yet again the high CPU load had started almost immediately after the UrBackup server rebooted.
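The idea behind such a trigger is simply "load stays above a threshold for N samples in a row". A rough sketch of that logic in Python, not actual Zabbix trigger syntax; the threshold and sample count are made-up defaults you would tune to your own baseline:

```python
def sustained_high(samples, threshold=40.0, min_consecutive=6):
    """Return True if min_consecutive samples in a row are at or above threshold.

    samples is a sequence of CPU utilisation percentages in time order.
    """
    streak = 0
    for value in samples:
        # Extend the streak on a high sample, reset it on a low one.
        streak = streak + 1 if value >= threshold else 0
        if streak >= min_consecutive:
            return True
    return False
```

Requiring consecutive samples keeps short spikes, such as a backup actually running, from firing the alert, while a client stuck at around 50 % would trip it quickly.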

Restarted the services again, everything back to normal.

Now I don’t plan on restarting the server often, but does anyone have any ideas what might be causing this and where to look the next time I restart the server (the whole physical machine)?

I am not yet sure if service urbackupsrv restart would cause the same issue, I’d assume it would, but that’s definitely an experiment for a weekend.

Could you create crash dumps when it occurs, like in the thread “UrBackup Client at idle time it uses one core per 100%”? Also, could you log it with procmon?

Disabling IPv6 would confirm whether it is an IPv6 issue.

I can’t fix the issue until I have hints about what happens and why, it happens on a client I can debug, or somebody else does the debugging…

So I still had one VM where it was stuck (Windows 7), but I am not 100% sure how to use NotMyFault.exe to create a crash dump (it looks like I only have one shot at it). I’ve left it running in the stuck state for now so I can still try. Luckily it’s a VM that doesn’t have a whole lot of CPU assigned. In my case, it seems like the load always takes about 50% of each available core.

Hopefully I’ve managed to make somewhat useful exports from procmon. One is filtered and covers a few minutes, and the other is all events over a few seconds. I don’t know if it’s useful, but I also included a dump of the running process from Task Manager.

I’ll try disabling IPv6 on some clients and see what that does.

https://www.dropbox.com/s/72voozghdbif61k/urbackup-issue-9556-in-forums.zip?dl=1

Thanks! I might have found the issue. If you can reproduce it, please check whether it is fixed with this version:

https://beta.urbackup.org/tmp/UrBackup%20Client%202.4.11.exe

Awesome, glad the dumps were useful. I’ll put the beta on some of the computers, restart the server and see what happens.

Yes, it seems the problem no longer appears with 2.4.11!