Explain the chunker a bit in the DESIGN document - dedup - deduplicating backup program
git clone git://bitreich.org/dedup/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/dedup/
---
commit 08600b08eec99d0c6fce2749ade192cadd4a0ba5
parent af4f203b687f0d19bb16036c882fbf2dad994393
Author: sin <sin@2f30.org>
Date:   Thu, 16 May 2019 16:43:35 +0300

Explain the chunker a bit in the DESIGN document

Diffstat:
  M DESIGN | 10 +++++++++-

1 file changed, 9 insertions(+), 1 deletion(-)
---
diff --git a/DESIGN b/DESIGN
@@ -51,4 +51,12 @@ block hashes of the data stored in the snapshot.
 The chunker interface
 ---------------------
 
-TBD
+The chunker issues variable length blocks. The minimum block size is
+512KB, the maximum block size is 8MB and the average block size is
+2MB. These configuration parameters can be modified by editing
+config.h but it can be tricky to tune it properly.
+
+The buzhash[0] rolling hash algorithm is used to fingerprint the input
+stream.
+
+[0] http://www.serve.net/buz/Notes.1st.year/HTML/C6/rand.012.html
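
The chunker text added in this commit can be illustrated with a small sketch.
It is not the code from the dedup repository. The minimum and maximum block
sizes come from the DESIGN text. Everything else is an assumption made for
illustration: the names buztab, buzinit, nextblk, rol32, WINSIZE and HASHMASK
are hypothetical, the window size is arbitrary, the table is filled by a
xorshift generator rather than a fixed table, and the 21-bit mask is just one
way to make a boundary occur about once every 2MB of hashed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZE_MIN (512UL * 1024)            /* minimum block size (DESIGN) */
#define BLKSIZE_MAX (8UL * 1024 * 1024)       /* maximum block size (DESIGN) */
#define HASHMASK    ((2UL * 1024 * 1024) - 1) /* assumed: 21 low bits */
#define WINSIZE     4095                      /* assumed rolling window size */

static uint32_t buztab[256];

/* Rotate left by n bits; n is reduced mod 32 to avoid undefined shifts. */
static uint32_t
rol32(uint32_t v, unsigned n)
{
	n &= 31;
	return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Assumption: fill the byte table with xorshift32 output. */
static void
buzinit(void)
{
	uint32_t x = 2463534242U;
	int i;

	for (i = 0; i < 256; i++) {
		x ^= x << 13;
		x ^= x >> 17;
		x ^= x << 5;
		buztab[i] = x;
	}
}

/*
 * Return the length of the next block starting at buf.
 * A block ends where the rolling hash matches the mask, but never
 * before BLKSIZE_MIN and never after BLKSIZE_MAX.  The last block of
 * the input may be shorter than BLKSIZE_MIN.
 */
static size_t
nextblk(const uint8_t *buf, size_t len)
{
	uint32_t h = 0;
	size_t i;

	assert(WINSIZE <= BLKSIZE_MIN);
	if (len <= BLKSIZE_MIN)
		return len;
	if (len > BLKSIZE_MAX)
		len = BLKSIZE_MAX;

	/* Prime the hash with the window that ends at the minimum size. */
	for (i = BLKSIZE_MIN - WINSIZE; i < BLKSIZE_MIN; i++)
		h = rol32(h, 1) ^ buztab[buf[i]];

	for (i = BLKSIZE_MIN; i < len; i++) {
		if ((h & HASHMASK) == 0)
			return i;
		/* Slide the window: drop the oldest byte, add the newest. */
		h = rol32(h, 1) ^ rol32(buztab[buf[i - WINSIZE]], WINSIZE) ^
		    buztab[buf[i]];
	}
	return len;
}
```

A caller would run buzinit() once and then call nextblk() repeatedly,
advancing the buffer by the returned length each time. Because a boundary
depends only on the bytes inside the window, identical data tends to produce
identical blocks even when it is shifted within the stream, which is what
makes deduplication of variable length blocks work.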