[HN Gopher] The PolymurHash universal hash function ___________________________________________________________________ The PolymurHash universal hash function Author : fanf2 Score : 60 points Date : 2023-08-18 14:39 UTC (1 days ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | nullc wrote: | The proof of almost-universality won't apply to subsets of bits, | and many applications of hashes only use a subset of bits. The | construction techniques used to get almost-universality might | even make some subset of bits very poor in their performance: or | at least my experience with designing linear codes for error | detection/correction has been that strong codes often have some | bits that are surprisingly weak. | | So hopefully people have tested this empirically that the | collision rates for the top and bottom n bits are all reasonable | looking for varrious values of n (and might be useful to know | which bits are the strongest, assuming you get a choice). | Assuming there are no hidden gotchas like that, this looks pretty | great. | orlp wrote: | I am working on a new hash that uses more memory (aiming around | 4KiB that's shared in a process), but is a whole lot faster. | The finalizer for that hash is a universal hash which I believe | is 2^(2-k) almost-delta universal mod 2^k for any k <= 64. So | the bottom bits are always high quality. | | PolymurHash has no such guarantee, the claims are only over the | full hash. That said, I don't expect problems in practice due | to the Murmur style bit mixing at the end. | nullc wrote: | > I am working on a new hash that uses more memory (aiming | around 4KiB that's shared in a process), but is a whole lot | faster. | | Though it's hard to imagine anything with 4kib of state being | very fast! is it just a precomputed power table or something | where the whole thing isn't even accessed for short inputs? | | > The finalizer for that hash is a universal hash which I | believe is 2^(2-k) almost-delta universal mod 2^k for any k | <= 64. So the bottom bits are always high quality. | | That's interesting! | | > PolymurHash has no such guarantee, the claims are only over | the full hash. That said, I don't expect problems in practice | due to the Murmur style bit mixing at the end. | | That mix at the end is just an addition with a secret | constant, right? I'd expect then the least significant bits | to not be improved much by it. | rurban wrote: | Confirmed, I tested it. https://github.com/rurban/smhasher | fanf2 wrote: | Nice! Does smhasher have a benchmark for short unaligned | strings, up to about 16 bytes? | rurban wrote: | Only aligned, from 1--32 | cyanydeez wrote: | I'd love this as a wasm library. | aappleby wrote: | MurmurHash author here, looks interesting and I will take a | closer look later today. ___________________________________________________________________ (page generated 2023-08-19 23:00 UTC)