Server dies each night with the same last message in the debug log file

Hi all,

For two nights now the server has died, each time with the same last log entries:

Client HASH=WbR645sdC3LmOgRuRGNYDBUUWt+xkoX5kFBrHFfGAVE= Server hash=bt8UO18fzsAK95fM38kdbiR1Wpu9m1hP/ld5zqrgD8= hblock=44672
WARNING: checksum for image block wrong. Retrying…

I just restarted it. The problem is that I am running out of space, because one cannot start the cleanup manually. I changed the cleanup window from 2 to 5; hopefully it won't die today.

I just found the following messages in /var/log/Messages.log:
Feb 22 03:46:09 backuppc rsyslogd: [origin software=“rsyslogd” swVersion=“5.8.10” x-pid=“1024” x-info=“http://www.rsyslog.com”] rsyslogd was HUPed
Feb 22 23:02:41 backuppc kernel: md: md0: data-check done.
Feb 25 07:27:23 backuppc kernel: urbackup_srv[27764]: segfault at 9485260 ip 09485260 sp b6bfd0dc error 15
Feb 26 02:49:14 backuppc kernel: urbackup_srv[12845]: segfault at 0 ip 02369502 sp b55e78e0 error 4 in liburbackupserver.so.0.0.0[22c4000+20f000]
Feb 26 07:50:52 backuppc kernel: libauparse.so.0[14646]: segfault at 1 ip 00000001 sp bf89fe14 error 14 in libauparse.so.0.0.0[b65000+14000]
Feb 27 02:09:53 backuppc kernel: urbackup_srv[14707]: segfault at 0 ip 006fa502 sp b3dc68e0 error 4 in liburbackupserver.so.0.0.0[655000+20f000]

This might be useful. I think at least it's not a config problem caused by the idiotic user (me…).
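One way to check whether the two crashes inside liburbackupserver.so hit the same spot is to subtract the library's load address (the first number in the square brackets) from the instruction pointer (`ip`). The following is only a small sketch; the helper name `library_offset` and the regular expression are made up for this illustration and are not part of any tool.

```python
import re

# Matches kernel lines such as:
#   "... segfault at 0 ip 02369502 sp b55e78e0 error 4 in lib.so[22c4000+20f000]"
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def library_offset(line):
    """Return (library, offset of the faulting instruction inside it), or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None  # e.g. lines without an "in <lib>[base+size]" part
    return m.group("lib"), int(m.group("ip"), 16) - int(m.group("base"), 16)

# The two liburbackupserver.so lines from the log above.
lines = [
    "Feb 26 02:49:14 backuppc kernel: urbackup_srv[12845]: segfault at 0 ip 02369502 sp b55e78e0 error 4 in liburbackupserver.so.0.0.0[22c4000+20f000]",
    "Feb 27 02:09:53 backuppc kernel: urbackup_srv[14707]: segfault at 0 ip 006fa502 sp b3dc68e0 error 4 in liburbackupserver.so.0.0.0[655000+20f000]",
]
for line in lines:
    lib, offset = library_offset(line)
    print(lib, hex(offset))
```

Both lines come out at offset 0xa5502, which would suggest the same instruction is faulting each time, and a fault address of 0 looks like a read through a NULL pointer.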

I just manually started Recalculate Statistics and it died again. From the log file:

2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE clients SET bytes_used_files=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE backups SET size_bytes=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE backups SET size_calculated=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [END;]: library routine called out of sequence

Greetings

I have found a way to manually force the server to do the cleanup now.

I have set jobcount to 0 and the cleanup window to 0 24, and the cleanup is running right now. It is progressing nicely, but these messages come up nonstop:

2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE clients SET bytes_used_files=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE backups SET size_bytes=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [UPDATE backups SET size_calculated=0]: library routine called out of sequence
2015-02-27 07:50:13: ERROR: Error preparing Query [END;]: library routine called out of sequence

Is this a race condition? Is it trying to process multiple clients at once?

Thank you

Sorry for the issues you are having.

Regarding the first crash: I’ll test the recovery from hash errors during image transfers. It would be really helpful if you ran UrBackup in gdb. Then we would have the exact location where it is crashing. See instructions here: https://urbackup.atlassian.net/wiki/display/US/Debugging+with+gdb+on+Linux

The other error is a threading issue with the statistics recalculation. I have fixed that. Thank you for reporting the problem.
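For readers wondering what this kind of problem looks like in general: "library routine called out of sequence" appears to be SQLite's message for API misuse, which can happen when several threads use one database handle without coordination. The sketch below is only an illustration of serialising access with a lock. It is not the actual UrBackup fix, the table is a toy one that merely borrows the `backups` and `size_bytes` names from the log, and `db_lock` and `reset_sizes` are hypothetical names.

```python
import sqlite3
import threading

# One shared connection. check_same_thread=False lets several threads use it,
# so access has to be serialised by the caller.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE backups (id INTEGER PRIMARY KEY, size_bytes INTEGER)")
conn.executemany("INSERT INTO backups (size_bytes) VALUES (?)", [(100,), (200,), (300,)])
conn.commit()

db_lock = threading.Lock()  # hypothetical guard around the shared handle

def reset_sizes():
    """Reset the statistics inside one transaction while holding the lock."""
    with db_lock:      # only one thread touches the connection at a time
        with conn:     # commits on success, rolls back on an exception
            conn.execute("UPDATE backups SET size_bytes=0")

# Several threads request the same recalculation step concurrently.
threads = [threading.Thread(target=reset_sizes) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = conn.execute("SELECT COALESCE(SUM(size_bytes), 0) FROM backups").fetchone()[0]
print(total)
```

Python's own sqlite3 module will not reproduce the original error here; the point is only the pattern of holding one lock for the whole begin/update/end sequence.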

Thank you very much for your response!
I just downloaded the source and compiled it on CentOS. I had to alter the config file, since CentOS only has libcurl 7.19.xx and not 7.20.0. With that change it compiled. At the end it complained about not finding automake, which is installed, but I just ignored that.
I made backups of the original urbackup_server and start_urbackup_srv executables and copied only these two files over from the compiled folder. I started it with the gdb command, and it said the debug headers were found. I did a run, and it is working right now. I will hopefully be able to send you the necessary files.

As for the other error: I don't have two UrBackup servers. What I do have is a Linux box on which UrBackup is listening on two interfaces! This is because the previous sysadmin segmented the network without real need, and I am not able to change it back (for now). The client computer from which I ran the command was contacting the server on the 192.168.88.0 / 255.255.255.0 network.

I think we had best wait until the error pops up. I have a hunch that it's a DB problem (I probably managed to screw it up somehow…) or a timeout issue… we’ll see.

Thank you !

It died just now, with the following error:

2015-03-05 13:05:19: GT: Loaded file “Unknown.Log”
2015-03-05 13:05:19: Loading file “Syscache.hve”
2015-03-05 13:05:19: PT: Hashing file “Unknown.Log”
2015-03-05 13:05:19: GT: Loaded file “Syscache.hve”
2015-03-05 13:05:19: PT: Hashing file “Syscache.hve”
2015-03-05 13:05:19: Loading file “{4febeb68-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:21: HT: Linked file: “/backuppc/urbackup/a-miljic1/150305-1250/C/ProgramData/Microsoft/Windows Defender/Scans/History/Service/History.Log”
2015-03-05 13:05:21: HT: Copying file: “/backuppc/urbackup/a-miljic1/150305-1250/C/ProgramData/Microsoft/Windows Defender/Scans/History/Service/Unknown.Log”
2015-03-05 13:05:21: HT: Copying file: “/backuppc/urbackup/a-miljic1/150305-1250/C/System Volume Information/Syscache.hve”
2015-03-05 13:05:25: GT: Loaded file “{4febeb68-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:25: Loading file “{4febeb92-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:25: PT: Hashing file “{4febeb68-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:31: Copying files from tmp table…
2015-03-05 13:05:31: done.
2015-03-05 13:05:40: GT: Loaded file “{4febeb92-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:40: Loading file “{4febebf0-b734-11e4-ae30-5cff35003dd9}{3808876b-c176-4e48-b7ae-04046e6cc752}”
2015-03-05 13:05:40: Client hash=3SuDFkHSklVOyiogx37vq8BaPpj6J6y7f3ryAhRScvE= Server hash=mAlECeQ0tCqRcNAaX7tBo/6KgqzoCO+0tUus8N3jvxA= hblock=2932864
2015-03-05 13:05:40: WARNING: Checksum for image block wrong. Retrying…

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb3bf1b70 (LWP 26396)]
0x009c6502 in BackupServerGet::doImage(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int, int, bool, std::basic_string<char, std::char_traits<char>, std::allocator<char> >) () from /usr/lib/liburbackupserver.so
Missing separate debuginfos, use: debuginfo-install urbackup-server-1.4.7-11.1.i686

Is this more helpful? (from gdb… run)

I tried to compile again, since I don't get the liburbackup .so files… I guess that is what is still missing for debugging. I have the necessary dev package installed,
and I still get this error:

UrlFactory.cpp:108: error: ‘CURLOPT_MAIL_FROM’ was not declared in this scope
UrlFactory.cpp:116: error: ‘CURLOPT_MAIL_RCPT’ was not declared in this scope
make[3]: *** [UrlFactory.lo] Error 1
make[3]: Leaving directory `/home/urbackup-server-1.4.7/urlplugin'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/urbackup-server-1.4.7/urlplugin'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/urbackup-server-1.4.7'
make: *** [all] Error 2

I would just build it without mail support:

./configure --without-mail

Thank you. It is too late now, I am typing from home; I will recompile again tomorrow.
Anyway, it died again and I did a bt. Is this more helpful? It seems to have died at the same place… the image checksum.

2015-03-05 19:14:03: Client hash=IFvx8Fw+IP1K1yq/e00JWoxtroD3PX//KSZ0KgfLnr0= Server hash=RYCDEnsfXsPQdqkaI12LOPVTqJ8BeY+03zVRAtAUkFs= hblock=62080
2015-03-05 19:14:03: WARNING: Checksum for image block wrong. Retrying…

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb5de6b70 (LWP 26946)]
0x009c6502 in BackupServerGet::doImage(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int, int, bool, std::basic_string<char, std::char_traits<char>, std::allocator<char> >) () from /usr/lib/liburbackupserver.so
Missing separate debuginfos, use: debuginfo-install urbackup-server-1.4.7-11.1.i686
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) n
Program not restarted.
(gdb) bt

#0  0x009c6502 in BackupServerGet::doImage(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int, int, bool, std::basic_string<char, std::char_traits<char>, std::allocator<char> >) () from /usr/lib/liburbackupserver.so
#1  0x009b0aa0 in BackupServerGet::operator()() () from /usr/lib/liburbackupserver.so
#2  0x0809043f in CPoolThread::operator() (this=0xb7d256d0) at ThreadPool.cpp:41
#3  0x08059750 in thread_helper_f (t=0xb7d256d0) at Server.cpp:1312
#4  0x0026fb39 in start_thread () from /lib/libpthread.so.0
#5  0x00366c2e in clone () from /lib/libc.so.6

/home/urbackup-server-1.4.7/Server.cpp:1331: undefined reference to `pthread_attr_setstacksize'
/home/urbackup-server-1.4.7/Server.cpp:1335: undefined reference to `pthread_create'
/home/urbackup-server-1.4.7/Server.cpp:1336: undefined reference to `pthread_detach'
Mutex_lin.o: In function `CMutex::TryLock()':
/home/urbackup-server-1.4.7/Mutex_lin.cpp:72: undefined reference to `pthread_mutex_trylock'
Mutex_lin.o: In function `CMutex':
/home/urbackup-server-1.4.7/Mutex_lin.cpp:28: undefined reference to `pthread_mutexattr_init'
/home/urbackup-server-1.4.7/Mutex_lin.cpp:33: undefined reference to `pthread_mutexattr_settype'
/home/urbackup-server-1.4.7/Mutex_lin.cpp:43: undefined reference to `pthread_mutexattr_destroy'
sqlite3.o: In function `pthreadMutexTry':
/home/urbackup-server-1.4.7/sqlite/sqlite3.c:18355: undefined reference to `pthread_mutex_trylock'
sqlite3.o: In function `pthreadMutexAlloc':
/home/urbackup-server-1.4.7/sqlite/sqlite3.c:18223: undefined reference to `pthread_mutexattr_init'
/home/urbackup-server-1.4.7/sqlite/sqlite3.c:18224: undefined reference to `pthread_mutexattr_settype'
/home/urbackup-server-1.4.7/sqlite/sqlite3.c:18226: undefined reference to `pthread_mutexattr_destroy'
collect2: ld returned 1 exit status
make[2]: *** [urbackup_srv] Error 1
make[2]: Leaving directory `/home/urbackup-server-1.4.7'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/urbackup-server-1.4.7'
make: *** [all] Error 2

Any ideas what I could do now? (CentOS 6…)

I think I may have fixed the issue with 1.4.8.