Currently, all hard links are resolved as regular files, so the same content may be fetched once per link and excessive bytes are read from the remote server. This can be problematic for repositories making heavy use of hard links, e.g., Fedora.
Rsync has hard link support and can accurately transfer hard links between servers. Device and inode IDs are transmitted over the wire during the file-list transfer stage, so the client can recognize duplicated (dev, ino) pairs and initiate file content transfer only for the first instance.
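For reference, here is a minimal sketch (not rsync's actual code) of how a receiver could group files into hard-link clusters from their (dev, ino) pairs using Unix metadata; the function name and types are illustrative:

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::PathBuf;

/// Group regular files into hard-link clusters keyed by (device, inode).
/// Clusters with more than one member are hard links of each other, so
/// only one member per cluster needs its content transferred.
fn hard_link_clusters(paths: &[PathBuf]) -> io::Result<HashMap<(u64, u64), Vec<PathBuf>>> {
    let mut clusters: HashMap<(u64, u64), Vec<PathBuf>> = HashMap::new();
    for path in paths {
        // symlink_metadata: do not follow symlinks.
        let meta = fs::symlink_metadata(path)?;
        if meta.is_file() {
            clusters
                .entry((meta.dev(), meta.ino()))
                .or_default()
                .push(path.clone());
        }
    }
    Ok(clusters)
}
```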
One naive approach is to pick a "source" by some heuristic and treat the rest as symlinks, reusing the existing symlink handling. However, this implementation has problems. Hard links are non-directional, so it's better to see them as a cluster rather than a link: if the "virtual" source is removed later, a new source must be chosen, and all other files initially sharing the same inode must be rewritten to point to the new source. Furthermore, detecting this case without changing the metadata format (to bookkeep hard links) is expensive, because we must reverse-track all entries pointing to the removed source. Therefore, reusing the existing symlink handling is not a good choice.
Another possible implementation is to use hard link info only as an optimization. When the generator requests a file, it first checks whether another file with the same (dev, ino) pair has already been requested. If so, it skips the request and reuses that file's hash (remember, we address files by their content hash). The only extra cost we need to pay, other than receiving and storing the dev and ino fields in `FileEntry`s, is a hash table from (dev, ino) to file index.
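A minimal sketch of that bookkeeping, assuming hypothetical `Generator` state, `FileEntry` fields, and hash type (the real structures will differ):

```rust
use std::collections::HashMap;

type ContentHash = [u8; 20]; // placeholder; the actual hash type may differ

struct FileEntry {
    dev: u64,
    ino: u64,
    hash: Option<ContentHash>, // set once the content is known
}

/// Hypothetical generator-side state: the (dev, ino) -> file index map is
/// the only extra cost on top of storing dev/ino in each FileEntry.
struct Generator {
    entries: Vec<FileEntry>,
    requested: HashMap<(u64, u64), usize>,
}

impl Generator {
    /// Returns true if the file at `idx` must actually be requested from the
    /// server; returns false if it is a hard link to an already-requested
    /// entry, in which case its content hash is simply reused.
    fn needs_transfer(&mut self, idx: usize) -> bool {
        let key = (self.entries[idx].dev, self.entries[idx].ino);
        if let Some(&first) = self.requested.get(&key) {
            // Same inode already requested: reuse the content hash, since
            // files are addressed by content hash anyway.
            self.entries[idx].hash = self.entries[first].hash;
            return false;
        }
        self.requested.insert(key, idx);
        true
    }
}
```

A plain hash-table lookup keeps the happy path cheap: files that are not hard links pay one map insertion each, and duplicates are skipped without any extra round trip to the server.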