Segmentation fault related to block hashes - Server 2.1.19

aj_potc · September 27, 2017, 3:12pm

After some time of operating without trouble, my server hit a segfault yesterday. I was able to manually restart it, and now it seems to be running fine again, but I wanted to report this bug just in case it would be helpful.

Sep 26 11:50:05 backup4 urbackupsrv: WARNING: Block hash wrong. Getting whole block. currpos=769654784
     
[snip many similar entries]

Sep 26 11:51:23 backup4 urbackupsrv: MD5::finalize:  Already finalized this digest!
Sep 26 11:51:23 backup4 urbackupsrv: WARNING: Block hash wrong. Getting whole block. currpos=769654784
Sep 26 11:51:23 backup4 kernel: fbackup load[1240]: segfault at 7fe225e552b8 ip 000000000054f6b5 sp 00007fe225e55260 error 6 in urbackupsrv[400000+63f000]
Sep 26 11:51:24 backup4 systemd: urbackup-server.service: main process exited, code=killed, status=11/SEGV
Sep 26 11:51:24 backup4 systemd: Unit urbackup-server.service entered failed state.
Sep 26 11:51:24 backup4 systemd: urbackup-server.service failed.
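
For anyone reading the kernel line above, its fields can be decoded mechanically. The following is a minimal sketch, not something posted in the thread. It assumes the usual x86 message layout and page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode); the function name `decode_segfault` is hypothetical.

```python
import re

# Layout of the x86 kernel "segfault at ..." message as it appears in the log above.
# All numeric fields, including the error code, are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<image>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of a kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr, ip, sp, err, base = (
        int(m.group(name), 16) for name in ("addr", "ip", "sp", "err", "base")
    )
    return {
        "image": m.group("image"),
        "page_present": bool(err & 1),  # 0 means the page was not mapped
        "write": bool(err & 2),         # 1 means the faulting access was a write
        "user_mode": bool(err & 4),     # 1 means the fault happened in user space
        "ip_offset": ip - base,         # instruction offset inside the mapped image
        "addr_minus_sp": addr - sp,     # distance of the faulting address from the stack pointer
    }


if __name__ == "__main__":
    sample = (
        "Sep 26 11:51:23 backup4 kernel: fbackup load[1240]: segfault at 7fe225e552b8 "
        "ip 000000000054f6b5 sp 00007fe225e55260 error 6 in urbackupsrv[400000+63f000]"
    )
    print(decode_segfault(sample))
```

Under these assumptions, the entry above describes a user-mode write to an unmapped page a short distance above the stack pointer. That alone does not identify the cause.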

uroni · October 3, 2017, 4:37pm

Is this a server build that includes debug info (e.g. the Debian packages)?

aj_potc · October 3, 2017, 5:24pm

This is on CentOS 7, using the Red Hat package downloaded from your site. I don’t know whether it includes debug info or has any debugging enabled, but I would be happy to check if you could tell me where to look.

aj_potc · October 18, 2017, 5:03pm

This error appears to have occurred again. The log entries are slightly different, however:

Oct 17 22:51:43 backup4 urbackupsrv: MD5::finalize:  Already finalized this digest!
Oct 17 22:51:43 backup4 urbackupsrv: WARNING: Block hash wrong. Getting whole block. currpos=697827328
Oct 17 22:51:44 backup4 urbackupsrv: *** stack smashing detected ***: /usr/bin/urbackupsrv terminated
Oct 17 22:51:44 backup4 urbackupsrv: ======= Backtrace: =========
Oct 17 22:51:44 backup4 urbackupsrv: /lib64/libc.so.6(__fortify_fail+0x37)[0x7f13b71b0d87]
Oct 17 22:51:44 backup4 urbackupsrv: /lib64/libc.so.6(__fortify_fail+0x0)[0x7f13b71b0d50]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x550507]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x551b8b]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x42bfa0]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x42d613]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x569b93]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x45da7c]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x423846]
Oct 17 22:51:44 backup4 urbackupsrv: /lib64/libpthread.so.0(+0x7e25)[0x7f13b746be25]
Oct 17 22:51:44 backup4 urbackupsrv: /lib64/libc.so.6(clone+0x6d)[0x7f13b719934d]
Oct 17 22:51:44 backup4 urbackupsrv: ======= Memory map: ========

[snip]
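
The raw frame addresses in a backtrace like the one above only become readable once they are resolved against a binary that carries debug info. Below is a minimal sketch of collecting them from the syslog lines and building an `addr2line` command. It assumes the log format shown above and a non-relocated binary at `/usr/bin/urbackupsrv`; the helper names are hypothetical and the command is only printed, not executed.

```python
import re

# One glibc backtrace frame as logged via syslog:
#   "<timestamp> <host> urbackupsrv: <module>(<symbol+offset>)[0x<address>]"
# The "(symbol+offset)" part is optional.
FRAME_RE = re.compile(
    r"urbackupsrv: (?P<module>/\S+?)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-f]+)\]$"
)


def frame_addresses(log_text, module="/usr/bin/urbackupsrv"):
    """Return the frame addresses that belong to `module`, in backtrace order."""
    addresses = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line.strip())
        if m and m.group("module") == module:
            addresses.append(m.group("addr"))
    return addresses


def addr2line_command(addresses, binary="/usr/bin/urbackupsrv"):
    """Build an addr2line invocation: -f prints function names, -C demangles C++ names."""
    return ["addr2line", "-f", "-C", "-e", binary] + list(addresses)


if __name__ == "__main__":
    sample = """\
Oct 17 22:51:44 backup4 urbackupsrv: /lib64/libc.so.6(__fortify_fail+0x37)[0x7f13b71b0d87]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x550507]
Oct 17 22:51:44 backup4 urbackupsrv: /usr/bin/urbackupsrv[0x551b8b]
"""
    print(" ".join(addr2line_command(frame_addresses(sample))))
```

Without debug symbols for the installed build, `addr2line` will typically print only `??`, which is why the question about debug info matters.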

uroni · October 18, 2017, 8:35pm

Sorry, I cannot see the problem from this. Could you do one of the following:

Put it into debug logging mode
Create (and send) a core dump
Or follow https://urbackup.atlassian.net/wiki/spaces/US/pages/8323075/Debugging+with+gdb+on+Linux

Thanks!

Other information would be appreciated as well, e.g. what kind of client it is (Windows or Linux) or whether it is backing up a sparse file.

aj_potc · October 19, 2017, 1:42pm

Thanks for your reply.

I’ve now put the server into debug logging mode, so hopefully this will catch something interesting next time.

How can I create a core dump? Or would this need to be done at the time of the crash?

Unfortunately I can’t find any info in the server or client logs to indicate if any system was performing a backup during the time when the failure occurred. In the Web UI, I can’t see any backups happening around this time. Can you tell me how I could find this out?

Thanks.

uroni · October 19, 2017, 1:46pm

The debug log should show it next time.

Add LimitCORE=infinity to the [Service] section of the systemd service file. Make sure the location specified by sysctl kernel.core_pattern is writable by urbackup. By default it is a file named core in the working directory (/var/urbackup).
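
To check where a core file would actually land, the value of `kernel.core_pattern` can be interpreted as in the minimal sketch below. It assumes standard Linux behaviour: a pattern starting with `|` is piped to a helper program, an absolute pattern names the file directly, and a relative pattern is resolved against the crashing process's working directory. The function name is hypothetical, and `/var/urbackup` is the working directory mentioned above.

```python
import os


def core_dump_directory(core_pattern, workdir="/var/urbackup"):
    """Return the directory a core file would be written to, or None if it is piped."""
    pattern = core_pattern.strip()
    if pattern.startswith("|"):
        # The kernel hands the dump to a helper program instead of writing a file.
        return None
    if os.path.isabs(pattern):
        return os.path.dirname(pattern)
    # A relative pattern is resolved against the crashing process's working directory.
    return os.path.normpath(os.path.join(workdir, os.path.dirname(pattern)))


if __name__ == "__main__":
    # On a live system the current value is in /proc/sys/kernel/core_pattern.
    with open("/proc/sys/kernel/core_pattern") as f:
        target = core_dump_directory(f.read())
    if target is None:
        print("core dumps are piped to a helper program")
    else:
        # os.access checks the *current* user, so run this as the urbackup user.
        print(target, "writable:", os.access(target, os.W_OK))
```

If the result is `None`, the dump goes to whatever crash handler the distribution configured, and the writability check on `/var/urbackup` does not apply.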