Appliance problems

Jason_Marsh · November 6, 2019, 7:44pm

I have set up an appliance for my cousin so she doesn’t lose precious photos of her kids again, but the appliance is giving me problems now.

In August 2019 it alerted me that it couldn’t update because there was no room. It had a 16GB USB boot drive. When I went over to check on it, the backup storage wasn’t showing up. On the status page it said “There was an error. The backup storage is currently not available.”

I thought something got hosed, never even checked the USB, and just ripped it out and started with a clean install to a new 32GB stick. Simple enough, the boys were home so all machines that needed to be backed up were present. Done, everything going fine…

…Until today. Same alert by email: can’t update because there’s no room. I log in and get the same message on the status page: “There was an error. The backup storage is currently not available.”

So, something is going horribly wrong and I don’t know what. It appears the backup storage isn’t mounting, though it is shown on storage page as offline.

What will I have to do to get this thing healthy and happy again?

Appliance is Community Edition 8.25, on Kernel 4.19.72

jason@twila:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:           7.5G        266M        7.0G         10M        267M        7.1G
Swap:          2.0G          0B        2.0G

jason@twila:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.8G 0 3.8G 0% /dev
tmpfs 773M 11M 762M 2% /run
/dev/sdc2 27G 9.9G 16G 39% /
tmpfs 3.8G 8.0K 3.8G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.8G 0 3.8G 0% /sys/fs/cgroup
tmpfs 3.8G 0 3.8G 0% /tmp
/dev/sdc1 94M 42M 48M 47% /boot
/dev/sdc2 27G 9.9G 16G 39% /var/urbackup
/dev/sdc2 27G 9.9G 16G 39% /var/log
tmpfs 773M 0 773M 0% /run/user/1001

jason@twila:~$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=3907172k,nr_inodes=976793,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=791028k,mode=755)
/dev/sdc2 on / type btrfs (rw,noatime,compress-force=lzo,nossd,space_cache=v2,subvolid=1675,subvol=/sys_root_current)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,relatime)
tmpfs on /tmp type tmpfs (rw,relatime)
/dev/sdc1 on /boot type ext2 (rw,noatime,nodiratime,errors=remount-ro)
/dev/sdc2 on /var/urbackup type btrfs (rw,noatime,compress-force=lzo,nossd,space_cache=v2,subvolid=1677,subvol=/urbackup)
/dev/sdc2 on /var/log type btrfs (rw,noatime,compress-force=lzo,nossd,space_cache=v2,subvolid=1678,subvol=/log)
tmpfs on /run/user/1001 type tmpfs (rw,nosuid,nodev,relatime,size=791024k,mode=700,uid=1001,gid=1001)

uroni · November 6, 2019, 9:43pm

Sorry about the problems. Could you use the “Report problem” link at the bottom to report this, attaching system information?

Otherwise /var/log/app.log may contain more information, as well as /var/log/kern.log.

Output of lsblk would be interesting as well.

Jason_Marsh · November 26, 2019, 7:24pm

Sorry it took so long. I’ve been busy… So, I went ahead and hit the “report problem” button today. I was going to post the logs mentioned, but app.log was massive, and kernel log wasn’t accessible (permissions??). If I was to post output of app.log, would it be necessary to include log entries from before the problem?

Here’s lsblk…

jason@twila:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 2.7T 0 disk
-sda1 8:1 0 2.7T 0 part
sdb 8:16 0 2.7T 0 disk
-sdb1 8:17 0 2.7T 0 part
sdc 8:32 1 28.7G 0 disk
-sdc1 8:33 1 97M 0 part /boot
-sdc2 8:34 1 26.5G 0 part /media/fs
-sdc3 8:35 1 2G 0 part [SWAP]

uroni · November 26, 2019, 8:17pm

Thanks for sending the log files.

The problem is that the Linux kernel and the system version (kernel modules) are mismatched. The best option would be to finish upgrading…

I’ve worked on making updates + the system use less space, so hopefully this doesn’t reoccur in the future and I’ll look into improving this error case.

You could temporarily add more storage (btrfs device add /dev/sdx / && upgrade && btrfs device delete /dev/sdx /), or perhaps running sudo /root/emergency-system-make-space.sh helps. A more error-prone option would be going to /media/fs and deleting previous system versions not needed for the upgrade with btrfs subvol del sys_root_xx. Don’t delete 7.8 or 8.25, as those are needed for the upgrade.
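
The temporary-storage option above can be sketched as a dry run that only prints the commands in order, so they can be reviewed before anything touches the disk. This is a minimal sketch, not part of the appliance: the function name plan_temp_storage is hypothetical, /dev/sdx is the placeholder from the post, and the upgrade step is left as a comment because the thread does not give its exact command.

```shell
#!/bin/sh
# Print (do not run) the "lend a spare device, upgrade, take it back" sequence.
# plan_temp_storage is a hypothetical helper name used only for illustration.
plan_temp_storage() {
    dev="$1"   # spare block device to add temporarily (placeholder: /dev/sdx)
    mnt="$2"   # mount point of the btrfs system filesystem (the post uses /)
    echo "btrfs device add $dev $mnt"
    echo "# run the appliance upgrade here (exact command not given in the thread)"
    echo "btrfs device delete $dev $mnt"
}

plan_temp_storage /dev/sdx /
```

The removal step matters: btrfs device delete moves data off the spare device before releasing it, so the device should stay attached until that command has finished.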

Jason_Marsh · November 27, 2019, 10:36pm

I ran the “spacemaker” because all attempts to rm -R anything in the /media/fs directory failed (read-only filesystem). Is it possible there’s something wrong with the flash boot device? It was still less than half full. I didn’t df -h after the spacemaker, though I wish I had, just to see how much space it made. Anyways, it’s rebooted and I see “Mount operation in progress. Backup storage will become available shortly.” It’s been five minutes or so now. I’ll go have a cup of coffee and come back.

I noticed that it’s showing a 9.x update available. Should I go for this one once it’s survived a reboot or two, or should I wait it out a bit?

uroni · November 28, 2019, 1:09am

Did you try deleting it with btrfs subvol del sys_root_xx (as described)? rm -R creates metadata, which might cause it to run out of metadata space…
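
Metadata space is tracked separately from what df -h reports, which may be why the stick looked less than half full. One way to look at it is btrfs filesystem df, which lists data and metadata allocation separately. The helper below is a rough sketch under assumptions: metadata_pct is a hypothetical name, and it assumes the raw-bytes output (-b) has lines of the form "Metadata, <profile>: total=<bytes>, used=<bytes>".

```shell
#!/bin/sh
# Read `btrfs filesystem df -b <path>` output on stdin and print how full
# the metadata allocation is, as a whole-number percentage.
# metadata_pct is a hypothetical helper name used only for illustration.
metadata_pct() {
    # Splitting on "=" and "," gives: $3 = total bytes, $5 = used bytes.
    awk -F'[=,]' '/^Metadata/ && $3 > 0 { printf "%d\n", ($5 * 100) / $3 }'
}

# Usage on the appliance (may need sudo):
#   btrfs filesystem df -b / | metadata_pct
```

A value close to 100 would be consistent with the behaviour described here, where file operations fail even though the filesystem still shows free space.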

Yeah, that likely won’t finish (still the same issue, probably).

Yeah…

If you want to make your current system version work, the following should fix it:

wget https://app.urbackup.com/4f8588b7-34de-4af8-9673-5eae126d2eea/linux-image-4.19.72_4.19.72-2_amd64.deb
dpkg -i linux-image-4.19.72_4.19.72-2_amd64.deb
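
A quick way to check for the kernel/module mismatch mentioned earlier is to see whether a module directory exists for the running kernel. This is a sketch assuming the usual Debian-style layout under /lib/modules; the function name modules_match is hypothetical and not part of the appliance.

```shell
#!/bin/sh
# Report whether a modules directory exists for a given kernel release.
# modules_match is a hypothetical helper name used only for illustration.
modules_match() {
    release="${1:-$(uname -r)}"    # kernel release, defaults to the running kernel
    moddir="${2:-/lib/modules}"    # where module trees are assumed to live
    if [ -d "$moddir/$release" ]; then
        echo "match: $moddir/$release"
    else
        echo "mismatch: no $moddir/$release"
        return 1
    fi
}

# Usage: run with no arguments to check the running kernel:
#   modules_match
```

If it reports a mismatch, the running kernel has no modules to load, so the appliance may be unable to load the drivers it needs, which would fit the offline backup storage seen here.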

Jason_Marsh · December 1, 2019, 5:54pm

So, I used your final suggestion, and it worked. Since I was doing it remotely I couldn’t paste in, but at least I can type, sort of… after a few failed attempts at wget-ting nonexistent files I found all my typos and, wahoo, the storage is online and it’s backing up machines!!!

Now, for the $10,000 question…
It seems the only problems I’ve had out of UrBackup have stemmed from problems with updates. So, should I abandon any thoughts of updating the system forever, or should I go for that 9.x update?

uroni · December 1, 2019, 7:00pm

At some point, you’d probably want to update it. But as long as it isn’t reachable from the Internet you can put that off (perhaps till there also is some hardware problem)…
Unfortunately I can’t change the update system such that it retroactively works with lower free space.