[HN Gopher] The PolymurHash universal hash function
       ___________________________________________________________________
        
       The PolymurHash universal hash function
        
       Author : fanf2
       Score  : 60 points
       Date   : 2023-08-18 14:39 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | nullc wrote:
       | The proof of almost-universality won't apply to subsets of bits,
       | and many applications of hashes only use a subset of bits. The
       | construction techniques used to get almost-universality might
       | even make some subset of bits very poor in their performance: or
       | at least my experience with designing linear codes for error
       | detection/correction has been that strong codes often have some
       | bits that are surprisingly weak.
       | 
       | So hopefully people have tested this empirically that the
       | collision rates for the top and bottom n bits are all reasonable
       | looking for varrious values of n (and might be useful to know
       | which bits are the strongest, assuming you get a choice).
       | Assuming there are no hidden gotchas like that, this looks pretty
       | great.
        
         | orlp wrote:
         | I am working on a new hash that uses more memory (aiming around
         | 4KiB that's shared in a process), but is a whole lot faster.
         | The finalizer for that hash is a universal hash which I believe
         | is 2^(2-k) almost-delta universal mod 2^k for any k <= 64. So
         | the bottom bits are always high quality.
         | 
         | PolymurHash has no such guarantee, the claims are only over the
         | full hash. That said, I don't expect problems in practice due
         | to the Murmur style bit mixing at the end.
        
           | nullc wrote:
           | > I am working on a new hash that uses more memory (aiming
           | around 4KiB that's shared in a process), but is a whole lot
           | faster.
           | 
           | Though it's hard to imagine anything with 4kib of state being
           | very fast! is it just a precomputed power table or something
           | where the whole thing isn't even accessed for short inputs?
           | 
           | > The finalizer for that hash is a universal hash which I
           | believe is 2^(2-k) almost-delta universal mod 2^k for any k
           | <= 64. So the bottom bits are always high quality.
           | 
           | That's interesting!
           | 
           | > PolymurHash has no such guarantee, the claims are only over
           | the full hash. That said, I don't expect problems in practice
           | due to the Murmur style bit mixing at the end.
           | 
           | That mix at the end is just an addition with a secret
           | constant, right? I'd expect then the least significant bits
           | to not be improved much by it.
        
       | rurban wrote:
       | Confirmed, I tested it. https://github.com/rurban/smhasher
        
         | fanf2 wrote:
         | Nice! Does smhasher have a benchmark for short unaligned
         | strings, up to about 16 bytes?
        
           | rurban wrote:
           | Only aligned, from 1--32
        
       | cyanydeez wrote:
       | I'd love this as a wasm library.
        
       | aappleby wrote:
       | MurmurHash author here, looks interesting and I will take a
       | closer look later today.
        
       ___________________________________________________________________
       (page generated 2023-08-19 23:00 UTC)