High iowait CPU percentage

I’m using the infscape Appliance on a bare-metal install. It has worked great for 5+ months. In the last couple of days, though, I’ve noticed backups taking much longer than normal and slow load times on the web interface. I checked the netdata stats today and saw that iowait is sitting at 95-100% on the CPUs almost all the time. It was not like that a few weeks ago. I have added more endpoints to be backed up, but iowait seemed to jump from nothing to very high almost overnight rather than rising gradually as I added endpoints. I have included my hardware below. Does anyone have any insight on ways to fix this? Would replacing the RAID controller with one that has 1GB of cache help? Or do I just need to buy one or more additional servers and spread the load across them?

Dell PowerEdge R510

  • (2) x Quad-Core Xeon L5630 2.13 GHz (8 cores total)
  • 32GB Memory Kit - (4 x 8GB) PC3 DDR3 10600R
  • (1) x Dell PERC H700 512MB Cache w/BBU
  • (1) x 256GB SSD (for OS)
  • (8) x 4TB HDD (Auto-layout RAID, BTRFS)
  • 300 backup clients
    • 22 of those are image + file backup
    • rest are file backup only
    • all are internet clients

I figured out the culprit.

I was at 90% storage capacity on my RAID volume, and the software had gotten stuck during the nightly cleanup. Once it had purged enough to get below 90%, it was running like a dream again. I’m not sure if there is a setting somewhere that imposes some sort of limit at 90%, but I’m buying more hard drives and will install them to see if that keeps it from happening again.
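For anyone who wants an early warning before hitting the same wall, here is a minimal sketch of a usage check. It is not part of the appliance. The path and the 90% threshold are assumptions taken from my case above, and on BTRFS the numbers from this kind of generic check can differ from what the file system itself reports, so treat it as a rough indicator only.

```python
import shutil


def usage_percent(path):
    """Return the used share of the file system containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_near_full(percent_used, threshold=90.0):
    """True when usage is at or above the threshold (90% is an assumption)."""
    return percent_used >= threshold


if __name__ == "__main__":
    # Hypothetical path: replace "/" with the mount point of the backup storage.
    path = "/"
    pct = usage_percent(path)
    status = "WARNING: near full" if is_near_full(pct) else "ok"
    print(f"{path}: {pct:.1f}% used ({status})")
```

Running something like this from cron once a day would have told me about the problem before the cleanup started struggling.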

The RAID uses the disks individually, so the cache in the disk controller should be unused (or irrelevant compared to the individual disk caches). A controller with more cache RAM wouldn’t help.

What would help is a larger, separate NVMe (preferably with low latency) for use as RAID cache. You can also configure it to use RAM as RAID cache in the advanced settings.

For speeding up deletion (and reading), the RAID metadata disks help, but I see you already went this way (see “Error adding RAID Metadata device”). Often when it cleans up, it needs to read file system metadata first, which causes the iowait if that metadata is not in cache.
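To watch iowait without netdata, it can be computed from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. The following is a rough sketch under that assumption. The function names are made up for illustration, and it relies on iowait being the fifth counter on that line.

```python
import time


def parse_cpu_line(line):
    """Parse an aggregate 'cpu' line from /proc/stat into (iowait, total) ticks."""
    fields = [int(x) for x in line.split()[1:]]
    # Counter order: user nice system idle iowait irq softirq steal ...
    iowait = fields[4] if len(fields) > 4 else 0
    # Guest time is already included in user time, so sum only the first eight.
    total = sum(fields[:8])
    return iowait, total


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two (iowait, total) samples."""
    d_iowait = after[0] - before[0]
    d_total = after[1] - before[1]
    if d_total <= 0:
        return 0.0
    return 100.0 * d_iowait / d_total


def read_cpu_sample(path="/proc/stat"):
    """Read the aggregate CPU counters (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return parse_cpu_line(line)
    raise RuntimeError("no aggregate cpu line found")


def sample_iowait(interval=1.0):
    """Measure iowait percentage over `interval` seconds."""
    first = read_cpu_sample()
    time.sleep(interval)
    return iowait_percent(first, read_cpu_sample())
```

Calling `sample_iowait(5.0)` during the nightly cleanup and again while idle should show whether the cleanup is what drives iowait up.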