urbackup_srv crashing with internet clients

Hi I have been testing a setup of urbackup server backing up 3 internet windows clients (WinXP & Win7). The urbackup_srv process appears to
be crashing after about 45 mins of processing clients.

Server is Ubuntu 12.04.02 LTS x64 and I have tried both installing the Ubuntu package and downloading source and building.

Have triied both urbackup_Srv version 1.2.2 & 1.2.3 with backup dir being a ZFS filesystem (zfs-fuse)

I have been struggling to find the cause of the crash from the log files (Set to debug), here is the last section of my last logfile (from a 1.2.2 server version)

2013-08-02 08:13:01: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:13:14: Error sending 'running' ping to client
2013-08-02 08:13:14: Timeout: Error sending 'running' ping to client
2013-08-02 08:13:24: Connecting to target service...
2013-08-02 08:13:24: Established internet connection. Service=0
2013-08-02 08:13:25: InternetService: New Connection incomming 5113056 s: 109
2013-08-02 08:13:25: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:13:42: Connecting to target service...
2013-08-02 08:13:43: Established internet connection. Service=0
2013-08-02 08:13:43: InternetService: New Connection incomming 5131466 s: 113
2013-08-02 08:13:44: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:13:50: Connecting to target service...
2013-08-02 08:13:50: Established internet connection. Service=0
2013-08-02 08:13:50: Sending Identity to client "DRYSERVER" failed. Retrying in 10min...
2013-08-02 08:13:50: InternetService: New Connection incomming 5138446 s: 82
2013-08-02 08:13:50: Authed+capa for client 'DRYSERVER' (token auth) - 1 spare connections
2013-08-02 08:14:02: Connecting to target service...
2013-08-02 08:14:02: Established internet connection. Service=0
2013-08-02 08:14:02: InternetService: New Connection incomming 5150704 s: 109
2013-08-02 08:14:03: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:14:14: Connecting to target service...
2013-08-02 08:14:14: Established internet connection. Service=0
2013-08-02 08:14:14: InternetService: New Connection incomming 5162504 s: 113
2013-08-02 08:14:14: HTTP: New Connection incomming 5162905 s: 114
2013-08-02 08:14:14: Parsing done... starting handling request_num: 0 5162916
2013-08-02 08:14:14: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:14:16: HTTP: New Connection incomming 5164558 s: 114
2013-08-02 08:14:16: Parsing done... starting handling request_num: 0 5164568
2013-08-02 08:14:18: HTTP: New Connection incomming 5166067 s: 114
2013-08-02 08:14:18: Parsing done... starting handling request_num: 0 5166077
2013-08-02 08:14:19: HTTP: New Connection incomming 5167562 s: 114
2013-08-02 08:14:19: Parsing done... starting handling request_num: 0 5167573
2013-08-02 08:14:21: HTTP: New Connection incomming 5169069 s: 114
2013-08-02 08:14:21: Parsing done... starting handling request_num: 0 5169077
2013-08-02 08:14:22: HTTP: New Connection incomming 5170581 s: 109
2013-08-02 08:14:22: Parsing done... starting handling request_num: 0 5170592
2013-08-02 08:14:24: HTTP: New Connection incomming 5172091 s: 109
2013-08-02 08:14:24: Parsing done... starting handling request_num: 0 5172101
2013-08-02 08:14:25: HTTP: New Connection incomming 5173582 s: 109
2013-08-02 08:14:25: Parsing done... starting handling request_num: 0 5173591
2013-08-02 08:14:27: HTTP: New Connection incomming 5175073 s: 109
2013-08-02 08:14:27: Parsing done... starting handling request_num: 0 5175080
2013-08-02 08:14:28: HTTP: New Connection incomming 5176572 s: 109
2013-08-02 08:14:28: Parsing done... starting handling request_num: 0 5176583
2013-08-02 08:14:30: HTTP: New Connection incomming 5178088 s: 109
2013-08-02 08:14:30: Parsing done... starting handling request_num: 0 5178099
2013-08-02 08:14:31: HTTP: New Connection incomming 5179589 s: 109
2013-08-02 08:14:31: Parsing done... starting handling request_num: 0 5179599
2013-08-02 08:14:32: Connecting to target service...
2013-08-02 08:14:32: Established internet connection. Service=0
2013-08-02 08:14:32: InternetService: New Connection incomming 5180693 s: 109
2013-08-02 08:14:33: HTTP: New Connection incomming 5181074 s: 114
2013-08-02 08:14:33: Parsing done... starting handling request_num: 0 5181084
2013-08-02 08:14:33: Authed+capa for client 'brad-PC' (token auth) - 1 spare connections
2013-08-02 08:14:34: HTTP: New Connection incomming 5182577 s: 114
2013-08-02 08:14:34: Parsing done... starting handling request_num: 0 5182587
2013-08-02 08:14:36: HTTP: New Connection incomming 5184078 s: 114
2013-08-02 08:14:36: Parsing done... starting handling request_num: 0 5184087
2013-08-02 08:14:37: HTTP: New Connection incomming 5185625 s: 114
2013-08-02 08:14:37: Parsing done... starting handling request_num: 0 5185635
2013-08-02 08:14:37: HTTP: New Connection incomming 5185875 s: 114
2013-08-02 08:14:38: Parsing done... starting handling request_num: 0 5186921
2013-08-02 08:14:40: HTTP: New Connection incomming 5188559 s: 114
2013-08-02 08:14:40: Parsing done... starting handling request_num: 0 5188566
2013-08-02 08:14:42: HTTP: New Connection incomming 5190045 s: 114
2013-08-02 08:14:42: Parsing done... starting handling request_num: 0 5190055
2013-08-02 08:14:43: HTTP: New Connection incomming 5191543 s: 114
2013-08-02 08:14:43: Parsing done... starting handling request_num: 0 5191554
2013-08-02 08:14:45: HTTP: New Connection incomming 5193069 s: 114
2013-08-02 08:14:45: Parsing done... starting handling request_num: 0 5193079
2013-08-02 08:14:46: HTTP: New Connection incomming 5194614 s: 114
2013-08-02 08:14:46: Parsing done... starting handling request_num: 0 5194625
2013-08-02 08:14:48: HTTP: New Connection incomming 5196130 s: 114
2013-08-02 08:14:48: Parsing done... starting handling request_num: 0 5196139
2013-08-02 08:14:49: HTTP: New Connection incomming 5197614 s: 114
2013-08-02 08:14:49: Parsing done... starting handling request_num: 0 5197624
2013-08-02 08:14:51: HTTP: New Connection incomming 5199141 s: 114
2013-08-02 08:14:51: Parsing done... starting handling request_num: 0 5199151
2013-08-02 08:14:52: HTTP: New Connection incomming 5200684 s: 114
2013-08-02 08:14:52: Parsing done... starting handling request_num: 0 5200695
2013-08-02 08:14:54: HTTP: New Connection incomming 5202226 s: 114
2013-08-02 08:14:54: Parsing done... starting handling request_num: 0 5202236
2013-08-02 08:14:55: HTTP: New Connection incomming 5203745 s: 114
2013-08-02 08:14:55: Parsing done... starting handling request_num: 0 5203755
2013-08-02 08:14:57: Error sending 'running' ping to client
2013-08-02 08:14:57: Timeout: Error sending 'running' ping to client  

The server stopped performing backups at 8:14.57 (same time as the last log in the logfile) and was 1% complete doing a full file backup from 1 on the internet clients…

root@backup:/mnt/backups/live/urbackup_tmp_files# ls -ltr
total 1545
-rwxr-x--- 1 urbackup urbackup   12913 Aug  2 07:13 2
-rwxr-x--- 1 urbackup urbackup 1572864 Aug  2 08:14 85
root@backup:/mnt/backups/live/urbackup_tmp_files#  

When the server process initially freezes and stops backing up the web interface is accessible for approx 5 mins after this and then the urbackup_srv process
on the server terminates.

Few extra details about my config…

root@backup:/var/log# uname -a
Linux backup 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux  

root@backup:/var/log# curl --version
curl 7.31.0 (x86_64-unknown-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtmp rtsp smtp smtps telnet tftp
Features: GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP  
  • I have both compression and encryption turned on all clients for transfers
  • I have compression turned on for the ZFS filesystem on the server
  • If I backup an internet client with only small backup amount select (< 1Mb) then it suceedes
  • Internet clients only have limited upstream bandwidth 30-75Kbit/s and I only plan to perform internet backups and no local network backups
  • Using file backups, image backups disabled
  • My Ubuntu server is up to date and currently curl & urbackup_srv are the only 2 packages built from source
  • IPv6 disabled on my machine

If anyone could give me some ideas on how to further debug I would be much appreciated, I will looking at writing up and instal
guide for how ive installed on Ubuntu from scratch if this would help. I am also thinking of attempting an install on another distro
such as debian or centOS as this particular server will be dedicated to backups, does anyone have any suggestions for me.

Many Thanks,
Brad.

Thanks for the detailed report. I’ll definitely look at this area (internet connections).

Another thing which would probably help:

  • With the compiled server, replace “exec” in the last line of /usr/local/bin/start_urbackup_server with “gdb --args” (Install gdb if necessary).
  • Start the server with “start_urbackup_server --loglevel debug”
  • Enter “run”
  • Wait till it crashes
  • Enter “bt” and post the result.

That would show the approximate location where the crash occurs.

What sometimes helps more is replacing the “exec” with “valgrind” (Install if not already present). But this would slow things down considerably.

Firstly I think i have linked it to only 1 client causing the server to crash, still testing

Ive ran the server process through gdb but at the moment the process has hung (or died or exited maybe running thorough gdb im not sure). That is it exists still…

root@backup:/mnt# ps -ef | grep urbackup
root      2094  1162  0 01:22 pts/0    00:00:00 /bin/sh /usr/local/sbin/start_urbackup_server --loglevel debug
root      2097  2094  0 01:22 pts/0    00:00:00 gdb --args /usr/local/sbin/urbackup_srv --daemon --plugin /usr/local/lib/liburbackupserver_urlplugin.so --plugin /usr/local/lib/liburbackupserver_cryptoplugin.so --plugin /usr/local/lib/liburbackupserver_downloadplugin.so --plugin /usr/local/lib/liburbackupserver_fsimageplugin.so --plugin /usr/local/lib/liburbackupserver_httpserver.so --plugin /usr/local/lib/liburbackupserver.so --http_root /usr/local/var/urbackup/www --workingdir /usr/local/var --user urbackup --snapshot_helper /usr/local/bin/urbackup_snapshot_helper --port 55413 --logfile /var/log/urbackup.log --loglevel debug --http_port 55414 --pidfile /var/run/urbackup_srv.pid
root      2723  2228  0 03:43 pts/1    00:00:00 grep --color=auto urbackup  

but backup is not complete and it is not responding to http

Also gdb gives no stack (usually the process dies 5 mins after the backup stops and http hangs)… This is my gdb session…

root@backup:~# start_urbackup_server --loglevel debug
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /usr/local/sbin/urbackup_srv...done.
(gdb) run
Starting program: /usr/local/sbin/urbackup_srv --daemon --plugin /usr/local/lib/liburbackupserver_urlplugin.so --plugin /usr/local/lib/liburbackupserver_cryptoplugin.so --plugin /usr/local/lib/liburbackupserver_downloadplugin.so --plugin /usr/local/lib/liburbackupserver_fsimageplugin.so --plugin /usr/local/lib/liburbackupserver_httpserver.so --plugin /usr/local/lib/liburbackupserver.so --http_root /usr/local/var/urbackup/www --workingdir /usr/local/var --user urbackup --snapshot_helper /usr/local/bin/urbackup_snapshot_helper --port 55413 --logfile /var/log/urbackup.log --loglevel debug --http_port 55414 --pidfile /var/run/urbackup_srv.pid
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Inferior 1 (process 2099) exited normally]
(gdb) bt
No stack.
(gdb) bt
No stack.
(gdb) bt
No stack.
(gdb)  

Im going to give the process another few hours to see what happens then ill try valgrind, not too worried about the speed impact as its only internet clients and only
have the crash suspect client connected at the moment

Aww, sorry. I think you have to start it with

start_urbackup_server --no_daemon --loglevel debug

.

The bug may also be occuring in connection with multiple internet clients.

Hi uroni,
Finally caught the crash with gdb, (did not want to crash for me for a few days). It seems to occur when there are client connections errors for any particular internet client.

Here is the output from gdb…

2013-08-06 00:30:39: Connecting Channel to ClientService failed - CONNECT error -55
2013-08-06 00:30:43: Connecting to ClientService of "brad-PC" failed: Error sending 'running' ping to client
2013-08-06 00:30:49: Connecting Channel to ClientService failed - CONNECT error -55
2013-08-06 00:30:53: Connecting to ClientService of "brad-PC" failed: Error sending 'running' ping to client
2013-08-06 00:30:59: Connecting Channel to ClientService failed - CONNECT error -55
terminate called after throwing an instance of 'CryptoPP::ZlibDecompressor::Adler32Err'
  what():  ZlibDecompressor: ADLER32 check error  

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff0b95700 (LWP 7297)]
0x00007ffff711a425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb)
(gdb)
(gdb) bt
#0  0x00007ffff711a425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff711db8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff798d69d in __gnu_cxx::__verbose_terminate_handler() ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff798b846 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff798b873 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff798b96e in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff5160051 in CryptoPP::ZlibDecompressor::ProcessPoststreamTail() ()
   from /usr/lib/libcrypto++.so.9
#7  0x00007ffff515dfd1 in CryptoPP::Inflator::ProcessInput(bool) () from /usr/lib/libcrypto++.so.9
#8  0x00007ffff515e299 in CryptoPP::Inflator::Put2(unsigned char const*, unsigned long, int, bool)
    () from /usr/lib/libcrypto++.so.9
#9  0x00007ffff54c6317 in Put (blocking=true, length=<optimized out>, inString=<optimized out>,
    this=0x7fffd401fe48) at /usr/include/cryptopp/cryptlib.h:788
#10 ZlibDecompression::decompress (this=0x7fffd401fe40, input=<optimized out>,
    input_size=<optimized out>, output=0x7fffd40b66b0, flush=true, output_off=0)
    at ZlibDecompression.cpp:6
#11 0x00007ffff3ca4e70 in CompressedPipe::Process (this=0x7fffd40b6690, buffer=<optimized out>,
    bsize=<optimized out>) at ../urbackupcommon/CompressedPipe.cpp:159
#12 0x00007ffff3ca51a5 in Read (timeoutms=120000, bsize=4096,
    buffer=0x7ffff0b8d620 "n\025-}m\263C\221\332bS�\031\335B\267X\240\374\221mH\256\252\004+dk\237j\250\217\064F0\203SE\300Gٓ\265oC¸\240\353_߯\345Kz\"\355\nN\b\264\270\263mR\316\037Px\371\021x\330\036\367\347'^\361 j\205yG\263\020\331ã\250\310\324T2\006\204M{\230\063\312\003tdž\342\257aTw\230\353\026\343\244#z8\370\333.a\241ʶ}(\177\277\365\026^\302\322S\201UE\366L\272\030\177\317x\r]\261\066}\216k\375\242\031\260\212\335\065\035\274Zm|\366\201\375j\037\256\031\236\347J\367Y$\")\316\024\210\026]\232\257\253\261MZct\274\335f\257V"..., this=0x7fffd40b6690)
    at ../urbackupcommon/CompressedPipe.cpp:101
#13 CompressedPipe::Read (this=0x7fffd40b6690,
    buffer=0x7ffff0b8d620 "n\025-}m\263C\221\332bS�\031\335B\267X\240\374\221mH\256\252\004+dk\237j\250\217\064F0\203SE\300Gٓ\265oC¸\240\353_߯\345Kz\"\355\nN\b\264\270\263mR\316\037Px\371\021x\330\036\367\347'^\361 j\205yG\263\020\331ã\250\310\324T2\006\204M{\230\063\312\003tdž\342\257aTw\230\353\026\343\244#z8\370\333.a\241ʶ}(\177\277\365\026^\302\322S\201UE\366L\272\030\177\317x\r]\261\066}\216k\375\242\031\260\212\335\065\035\274Zm|\366\201\375j\037\256\031\236\347J\367Y$\")\316\024\210\026]\232\257\253\261MZct\274\335f\257V"..., bsize=4096, timeoutms=120000)
    at ../urbackupcommon/CompressedPipe.cpp:67
#14 0x00007ffff3c283cc in FileClient::GetFile (this=0x7ffff0b92300, remotefn=..., file=
    0x7fffd8043360, hashed=<optimized out>) at fileclient/FileClient.cpp:612
#15 0x00007ffff3bf5462 in BackupServerGet::load_file (this=0x7fffec01ebf0, fn=..., short_fn=...,
    curr_path=..., os_path=..., fc=..., with_hashes=true, last_backuppath=...,
    last_backuppath_complete=..., download_ok=@0x7ffff0b8ff38: false, hashed_transfer=true)
    at server_get.cpp:1630
#16 0x00007ffff3bf6d95 in BackupServerGet::load_file_patch (this=0x7fffec01ebf0, fn=...,
    short_fn=..., curr_path=..., os_path=..., last_backuppath=..., last_backuppath_complete=...,
    fc=..., fc_normal=..., download_ok=@0x7ffff0b8ff38: false) at server_get.cpp:1555
#17 0x00007ffff3c0a2b8 in BackupServerGet::doIncrBackup (this=0x7fffec01ebf0, with_hashes=true,
    intra_file_diffs=true, on_snapshot=false, disk_error=@0x7ffff0b94e1f: false,
    log_backup=<optimized out>) at server_get.cpp:2166
#18 0x00007ffff3c1021d in BackupServerGet::operator() (this=0x7fffec01ebf0) at server_get.cpp:521
#19 0x000000000044c6f8 in CPoolThread::operator() (this=0x7fffec01c1c0) at ThreadPool.cpp:40
#20 0x0000000000417329 in thread_helper_f (t=<optimized out>) at Server.cpp:1146
#21 0x00007ffff74aae9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#22 0x00007ffff71d7ccd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#23 0x0000000000000000 in ?? ()
(gdb)  

Thanks, Brad.

Thanks! Tried to fix this bug and fixed two other bugs, which I could produce (race conditions) for internet clients in http://urpc.dyndns.org/urbackup_unstable/urbackup-server-1.2.4.281.tar.gz .

Good work uroni, the new version does not appear to be crashing the server with my internet clients now. I will continue to test.

I do have an issue with connectivity from 1 of the clients still I will open a new thread for it, to see if you have any ideas.

Thanks again.