Hello. After a backup, the server creates the hash records for the files on the same disk as the backup, which reduces performance.
My suggestion is to move the hash records to a different disk. I think using an SSD for the hashes would give a big performance boost.
What do you think?
Why don’t you try mounting a different disk where the UrBackup DB is stored? Hashing performance will also depend on the CPU.
The DB is located on the OS disk (an SSD).
In the task manager, all loads are at low or normal levels. However, the HDD is at 100%.
The DB doesn’t have to be on the OS disk - try mounting a different disk at /var/urbackup
That’s normal, since hashing reads all of the backup’s files back from storage in order to hash them.
I would like this not to be the normal behavior.
If the hash data is saved to a different disk, it will increase overall performance.
This is not the topic/suggestion
You don’t understand. The hash data is already stored on the SSD. But creating the hashes requires reading the backed-up files, which always means reading from wherever the files are, i.e. the HDDs (unless you back up to SSDs, but…)
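To illustrate the point, here is a minimal Python sketch of what hashing a file involves. It is not UrBackup's code: SHA-256 is only a stand-in for whatever algorithm the server actually uses, and `hash_file` is a hypothetical helper name. What it shows is that every byte of the file has to be read from the disk where the file lives, while the digest that comes out is tiny.

```python
import hashlib


def hash_file(path, chunk_size=1024 * 1024):
    """Return (hex_digest, bytes_read) for the file at `path`.

    SHA-256 is an assumption for illustration only, not necessarily
    the algorithm UrBackup uses.
    """
    digest = hashlib.sha256()
    bytes_read = 0
    with open(path, "rb") as f:
        while True:
            # Each read hits the disk that holds the backed-up file.
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            bytes_read += len(chunk)
    return digest.hexdigest(), bytes_read
```

For a 10 GB backed-up file, `bytes_read` would be 10 GB, all of it read from the backup storage, while the resulting digest is 64 hex characters. Where that small digest gets written afterwards makes little difference to the load on the HDD.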
I don’t really understand the purpose of this request.
If the bottleneck is uploading and then reading files on the server HDD (rather than hashing and DB ops), then you could mount a fast SSD on tmp and set the flag in the advanced settings for ‘Temporary files as file backup buffer’.
The hash records are in the backup folder, with the same folder tree.
I'm not really an expert on this, but do you think the "Temporary File Buffers" would help?
Take a look at Administration Manual - Enabling temporary file buffers
Give it a try and let us know how it went.
The temporary folder is already on an SSD, but I don’t think this will help the hashing process.
My guess is that the files are first written to the temporary folder, copied to the main folder once the transfer is finished, and only then does the hashing process begin.
Makes sense.
Your understanding (which I think is correct) is that the temp folder would only help during the transmission of data, not hashing.
Yes. In that case, a separate hash disk feature is a really good idea.
It’s not necessarily a good idea.
First, it complicates the code base - there would now potentially be two stores of data on different disks, with increased potential for them to get out of sync. This would increase the number of people asking for support on the forums with corrupt data stores.
Second, it does not necessarily increase performance. I already use an SSD for my data store so I would see no improvement. Some data stores may also have an SSD for good small file performance. Has there been any performance testing?
Third, someone has to implement it without introducing new bugs for people. There is a .hashes folder for each client and its location is hard coded in the code base.
I don’t know about the coding side, but there won’t be two different data stores.
Disk mapping may be required during the restore process. This could be solved by selecting the hash disk at the beginning.
I’m not sure how much it complicates the code, because even if the location of the “.hashes” folder is fixed in the code, a folder could be created on the second disk at the beginning of the hash process, and the hash files could be redirected there with a junction for the .hashes folder.
Even if you use an SSD for backup, the demand for reading and writing to the same disk will decrease, which will increase performance and extend the write life of the SSD. Maybe the effect will be smaller than with a mechanical disk, but it will definitely be beneficial.
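For anyone who wants to experiment with the junction approach mentioned above, here is a rough Python sketch of the filesystem side only. It has not been tested with UrBackup, and the function name and paths are hypothetical. It uses a symbolic link; on Windows an NTFS junction (created with `mklink /J`) would play the same role, and creating symlinks there may need extra privileges. The server should be stopped while the folder is moved.

```python
import os
import shutil


def redirect_directory(original, new_location):
    """Move `original` to `new_location` and leave a symlink behind.

    Hypothetical helper for illustration; not part of UrBackup.
    """
    if os.path.islink(original):
        raise ValueError(f"{original} is already a link")
    if os.path.exists(new_location):
        raise FileExistsError(new_location)
    target = os.path.abspath(new_location)
    # Make sure the parent folder on the second disk exists.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    # Move the existing folder (and its contents) to the second disk.
    shutil.move(original, target)
    # Anything opening the old path now follows the link to the new disk.
    os.symlink(target, original, target_is_directory=True)
```

A call might look like `redirect_directory("/backups/client1/.hashes", "/mnt/ssd/client1_hashes")`, with both paths made up for the example. Note that this only relocates where the hash records are stored. The backed-up files still have to be read from the backup disk to compute the hashes, and the concern about the two locations getting out of sync still applies.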