HELP! Server 2.1.18 & 2.2.20 exits with signal 6 at line 527 of server_writer.cpp

Hi,

I’m running URBackup server in a FreeNAS jail. It’s been running very well since July '16.
I updated to 2.1.18 a couple of weeks ago.
It ran completely fine until this:

Mar 27 19:36:45 freenas kernel: pid 25941 (urbackupsrv), uid 1001: exited on signal 6

and again:

Mar 28 13:29:08 freenas kernel: pid 42176 (urbackupsrv), uid 1001: exited on signal 6

Google tells me signal 6 is SIGABRT (see /usr/include/sys/signal.h). A process dying with this signal is usually due to it calling the abort(3) function. That generally indicates that the process itself has found that some essential prerequisite for correct function is not available and is voluntarily killing itself, rather than the process being killed by the kernel because it ran over resource limits or looked at memory addresses funny or something.

Any clue as to what’s going on greatly appreciated.

Thanks,
Glenn

It just happened again…
2017-03-28 18:58:41: Whole block. currpos=498073600 block_for_chunk_start=498073600 chunk_start=498597888
2017-03-28 18:58:41: FileClientChunked: Whole block start=498597888
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

And again…
2017-03-29 01:04:29: Connecting to ClientService of "BSutherlandGeneral" failed: Error sending 'running' (2) ping to client
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

It just happened again, so I've taken the plunge and upgraded to the 2.2.20 beta…
Server 2.2.20 beta has been stable so far, but my average backup times have gone from 1 min (it runs every two hours) to between 10 and 20 mins…

The backup server process just aborted with no error…
I think I’ll install the FreeNAS updates and see if it makes any difference.

OK, so I woke up to 2017-03-29 21:39:47: Loading file "99626793352D7DEB0F40125C2327FE649B520160"
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

Can anyone tell me how to rectify this? It’s a real problem now.

So it appears to be a problem trimming the hash file…
Line 527 is the second assert, assert(empty_end%vhd_blocksize == 0). Can anyone help me out here, Uroni?

bool ServerVHDWriter::emptyVHDBlock(int64 empty_start, int64 empty_end)
{
	assert(empty_start%vhd_blocksize == 0);
	assert(empty_end%vhd_blocksize == 0);

	_i64 block_start = empty_start / vhd_blocksize;
	if (empty_start%vhd_blocksize != 0)
	{
		++block_start;
	}

	_i64 block_end = empty_end / vhd_blocksize;

	bool ret = true;
	for (; block_start<block_end; ++block_start)
	{
		if (hashfile->Seek(block_start*sha_size))
		{
			if (hashfile->Write(reinterpret_cast<const char*>(zero_hash), sha_size) != sha_size)
			{
				Server->Log("Error writing to hashfile while setting block to empty.", LL_WARNING);
				ret = false;
			}
		}
		else
		{
			Server->Log("Error seeking in hashfile setting block to empty.", LL_WARNING);
			ret = false;
		}
	}
	return ret;
}
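Interestingly, the function already handles a misaligned empty_start by rounding block_start up, yet still asserts on it. Purely as a sketch (not the project's actual fix, and firstWholeBlock/lastWholeBlock are names I made up), the same clamping idea could replace both asserts:

```cpp
#include <cstdint>

// Hypothetical helpers (not from the UrBackup source): clamp a byte
// range to whole VHD blocks instead of asserting on misalignment.
// int64_t stands in for the project's int64/_i64 typedefs.
static int64_t firstWholeBlock(int64_t empty_start, int64_t vhd_blocksize)
{
	// round up: a partial block at the start is skipped
	return (empty_start + vhd_blocksize - 1) / vhd_blocksize;
}

static int64_t lastWholeBlock(int64_t empty_end, int64_t vhd_blocksize)
{
	// round down: a partial block at the end is skipped
	return empty_end / vhd_blocksize;
}
```

The loop would then run from firstWholeBlock(empty_start, vhd_blocksize) to lastWholeBlock(empty_end, vhd_blocksize), so only fully-covered blocks get their hashes zeroed. Whether silently skipping the partial block is actually safe here is something only Uroni can answer, of course.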

It just quit again… :frowning:
2017-03-30 13:53:37: HT: Linked file: "/mnt/backups/AMD-Server/170330-0844/glenn/.cache/mozilla/firefox/bs977und.default/cache2/entries/0DB6FAF618DF47390A39E915DD4FA451B81EADCF"
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.

I’m getting desperate now :frowning:
2017-03-30 17:55:25: Flushing FileClient…
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

Same again last night. Mar 30 23:47:18 freenas kernel: pid 65157 (urbackupsrv), uid 1001: exited on signal 6
This is really causing me problems now.

Can anyone help at all?

Once again
2017-03-31 18:53:18: Flushing FileClient…
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

Can nobody offer any help?

Apr 1 00:01:32 freenas kernel: pid 16350 (urbackupsrv), uid 1001: exited on signal 6 :frowning:

And again.
Assertion failed: (empty_end%vhd_blocksize == 0), function emptyVHDBlock, file urbackupserver/server_writer.cpp, line 527.
Abort

I really can't understand why this suddenly started. I noticed the ZFS pool was 73% used, so I reduced the number of retained incremental and full backups and then ran a full clean-up, which removed more than 1 TB of unwanted data.

I crossed my fingers and went out with my family for the day. Came home and it had exited again.

In desperation I just created a new jail, compiled from source and copied /usr/local/var/urbackup from the original jail.

It's now running in the new jail. I will report back either way.

Thanks for reading my rant…

I just realised that the assertion fails in the emptyVHDBlock function… So, if it exits again, I will disable image backups for now and at least get all the file backups completed…

Well, it failed yet again: Apr 2 03:00:03 freenas kernel: pid 1562 (syslog-ng), uid 0: exited on signal 6 (core dumped)

After a restart I disabled image backup support. So far so good…

UPDATE: It’s now been 24 hours since the last signal 6. Looks as though it’s going to be ok as long as I don’t use image backup.

LAST UPDATE: Since completely disabling image backup the server has been running fine for almost three weeks. Let’s hope I can turn it back on with the next version.

We had this same issue running it in a jail on FreeNAS as well. What we've done now is install it on a full-blown distro running off a 120 GB USB stick, with a cron job running cat /dev/sda > /dev/sdb to a second USB stick to get the same redundancy as FreeNAS. The drives we're using as backup drives are in RAID 5, mounted on /media/BACKUP. I know that's just a workaround, but hopefully it helps. shrugs

Thanks for the info.
It’s not really an option here. I have 20 x 3TB drives in RAIDZ2 with 25TB of backup data.
I've re-enabled image backups since the last update and it's working, but it does still crash occasionally; three times so far today, in fact.
I’ve been using immortal to monitor and restart when needed. It’s from immortal.run

Did you try disabling AppArmor and/or SELinux like we were discussing in the other thread? I know this is a different issue, but I'm curious if it's the same root cause.