I have a question about the finer points of how and when a file moves across the network between the client and the server.
From reading the manual and forums and watching the logs, I think I understand the process as follows: a file is copied from the client to the server, the server computes a hash of the file, and that hash is compared against the hashes of files already stored on the server. If a match is found, the files are assumed to be identical and a hard link is created in lieu of storing another copy of the file.
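To make sure I have the idea right, here is a minimal sketch of what I understand the server side to be doing. This is just my mental model, not the actual implementation: the SHA-256 choice, the `store` function, and the flat content-addressed pool directory are all my own assumptions for illustration.

```python
import hashlib
import os


def store(pool_dir, incoming_path):
    """Store a file just received from a client, deduplicating by content hash.

    Hypothetical sketch: the hash algorithm and pool layout are assumptions,
    not necessarily what the real server does.
    """
    with open(incoming_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    pooled = os.path.join(pool_dir, digest)

    if os.path.exists(pooled):
        # Identical content is already in the pool: replace the fresh copy
        # with a hard link to the existing pooled file.
        os.remove(incoming_path)
        os.link(pooled, incoming_path)
        return "linked"

    # First time this content has been seen: add it to the pool by hard
    # linking, so the pool entry and the stored file share one inode.
    os.link(incoming_path, pooled)
    return "stored"
```

The point of the hard link is that two clients backing up the same file end up sharing a single copy on disk.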
I understand that there are options and variations on this process, such as going ahead and making that second copy of the file rather than using a hard link. As nearly as I can tell, there is also an option to have the client compute its own hash of the file, which can be compared with the server's hash to verify that the file copied from client to server correctly.
Suppose the client computes a hash of the file and then sends the file to the server; the server computes its own hash, which matches the client's (confirming the copy was successful); and the server then finds an existing hash matching a file already copied from some other client and hard links to that existing file. In that case, isn't the actual copy between client and server redundant? Is there a scenario where the client sends just its hash to the server before the copy, the server finds an existing copy of the file, and hard links to it, bypassing the file transfer entirely?
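To illustrate the exchange I'm imagining, here is a rough sketch of a hash-first handshake. Everything here is hypothetical: the function names, the in-memory pool index, and the SHA-256 choice are mine, purely to show the idea of the server answering "already have it, skip the upload."

```python
import hashlib


def client_offer(path):
    """Client computes and sends only the hash first (hypothetical step),
    before transferring any file data."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def server_needs_file(pool_index, digest):
    """Server checks its index of pooled hashes. If the digest is already
    known, it can hard link to the pooled copy and tell the client not to
    upload at all; only unknown content needs to cross the network."""
    return digest not in pool_index
```

Under this scheme the full transfer only happens when `server_needs_file` returns `True`; a second client offering the same file would get a "no upload needed" answer.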
If that’s not the way things work now, then this could potentially save a whole ton of bandwidth in some scenarios.