[HN Gopher] On modern hardware the min-max heap beats a binary heap
       ___________________________________________________________________
        
       On modern hardware the min-max heap beats a binary heap
        
       Author : pedro84
       Score  : 141 points
       Date   : 2020-09-02 14:11 UTC (8 hours ago)
        
 (HTM) web link (probablydance.com)
 (TXT) w3m dump (probablydance.com)
        
       | twotwotwo wrote:
       | Like folks mentioned there, I wonder if a higher-fanout heap
       | (people asked about 4-ary) might also do well in practice.
       | Looking at Wikipedia (https://en.wikipedia.org/wiki/D-ary_heap),
       | it looks like maybe so -- the growth in number of comparisons
       | isn't that bad, and it mentions 4-ary heaps working well
       | specifically.
       | 
       | (Like how linear search wins for small N, or insertion sort helps
       | with small subarrays in introsort: more steps, but each step is
       | also a lot cheaper on modern hardware due to fewer branch
       | mispredictions, better cache locality, or something else that
       | better fits the hardware's strengths.)
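        | (For concreteness, the index arithmetic behind a d-ary heap --
        | a sketch of the textbook flat-array layout with a hypothetical
        | sift_down, not code from the article:)

```python
# Index arithmetic for a d-ary heap laid out in a flat array (root at
# index 0), plus a sift-down -- a sketch of the textbook layout, not any
# particular library's code.

def parent(i, d):
    return (i - 1) // d

def sift_down(heap, i, d=4):
    """Restore the min-heap property below index i of a d-ary min-heap."""
    n = len(heap)
    while True:
        first = d * i + 1                  # first child of node i
        if first >= n:                     # node i is a leaf
            break
        # Picking the smallest of up to d children is the extra
        # comparison cost versus a binary heap -- but the tree is only
        # log_d(n) levels deep instead of log_2(n), and the d children
        # sit contiguously in memory.
        smallest = min(range(first, min(first + d, n)),
                       key=heap.__getitem__)
        if heap[i] <= heap[smallest]:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

h = [9, 1, 2, 3, 4]
sift_down(h, 0)        # 9 sinks below its smallest child
```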
        
         | barbecue_sauce wrote:
         | I don't have a formal background in CS. Any insight into why
         | "d-" is used as the prefix?
        
           | dpbriggs wrote:
            | It's the number of children each node can have.
        
             | adrianmonk wrote:
             | Maybe they're asking why the letter "d". That is, why
             | "d-ary" instead of "n-ary" or "k-ary".
        
         | innocenat wrote:
          | I remember doing tests years ago and found that the extra
          | comparisons and loop control can tank the performance for
          | n-ary heaps when n>2. But that was >10 years ago.
        
         | mehrdadn wrote:
          | I've tried d-ary heaps before (even a min-max d-ary heap).
          | Somehow my memory was that d = 3 or 4 sometimes performed
          | better than 2, but now that I checked, in my code (which I
          | spent a lot of time optimizing) I settled on a plain binary
          | heap in the end. So maybe my memory was faulty? Though I could
          | swear I saw a performance improvement for d > 2 at some point.
          | Sadly I don't recall exactly why I reverted to a binary heap,
          | or whether it was related to speed or something else. Has
          | anybody tried it and remembered how it turned out?
        
       | noctune wrote:
        | And if you only need a monotone priority queue (i.e. a priority
        | queue where the popped elements are monotonically
        | increasing/decreasing) you should consider using a radix heap.
        | This monotone requirement is satisfied more often than you would
        | expect, e.g. when using time as the key, or in pathfinding. I
        | have a simple radix heap implementation here:
        | https://github.com/mpdn/radix-heap
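        | (A bare-bones sketch of the radix-heap idea, assuming
        | non-negative keys below 2**64 -- keys are bucketed by the
        | highest bit in which they differ from the last popped key, so a
        | pop only ever re-buckets one bucket. The class here is made up
        | for illustration, not the linked implementation:)

```python
# Toy radix heap for monotone pops, assuming non-negative keys < 2**64.
# A sketch of the idea only -- not the linked Rust implementation.
class RadixHeap:
    def __init__(self):
        self.last = 0                       # last key popped so far
        # bucket b (for b > 0) holds keys whose highest bit differing
        # from `last` is bit b-1; bucket 0 holds keys equal to `last`
        self.buckets = [[] for _ in range(65)]

    def push(self, key):
        assert key >= self.last, "keys must be pushed in monotone order"
        self.buckets[(key ^ self.last).bit_length()].append(key)

    def pop(self):
        # find the first non-empty bucket
        i = next(j for j, b in enumerate(self.buckets) if b)
        if i > 0:
            # its minimum becomes the new `last`; re-bucket the rest
            items, self.buckets[i] = self.buckets[i], []
            self.last = min(items)
            for k in items:
                self.buckets[(k ^ self.last).bit_length()].append(k)
        return self.buckets[0].pop()
```

        | (Each key moves only to lower-numbered buckets over its
        | lifetime, which is where the amortized bound comes from.)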
        
       | ghj wrote:
       | In certain domains, the trend has been to give up constant
       | factors in order to increase programmer productivity (e.g.,
       | python pays a 10x slowdown but is batteries included).
       | 
       | So in that case I would use this data structure even if it
       | weren't faster. I can't count the number of times I have had to
       | mess with inserting negative priorities into a min heap to create
       | a max heap! We should just have one data structure that does
       | both.
       | 
       | (though taking this idea to the logical extreme means we should
       | just use Order Statistic Tree for everything since it not only
       | gives you log(n) min/max, but also log(n) find kth and get rank
       | of x)
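        | (The negation trick in question, with Python's min-only heapq:)

```python
import heapq

nums = [5, 1, 3, 2]

# heapq is min-only: heappop always returns the smallest key.
min_heap = list(nums)
heapq.heapify(min_heap)
smallest = heapq.heappop(min_heap)     # 1

# To get a max-heap you negate keys on the way in and out.
max_heap = [-x for x in nums]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)     # 5
```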
        
         | brandon_edens wrote:
         | Looks like 50x to me. https://benchmarksgame-
         | team.pages.debian.net/benchmarksgame/...
         | 
         | Taking 27 milliseconds just to start the interpreter.
        
         | twotwotwo wrote:
          | So, kind of a non sequitur, but I occasionally think about Qt's
          | QList<T>. When introduced, it was unusual in reserving space at
          | the beginning as well as the end, so you had constant-time
          | inserts/removals at the beginning. That and storing pointers
         | for any T larger than a pointer also reduced the constant
         | factor on insertion/deletions from the middle. Since it was
         | always stored as a list of pointer-sized items, some code could
         | be shared between implementations for different types, so your
         | program's binary could be a bit smaller. I think they said at
         | the initial release that they benchmarked a bunch of options in
         | some real code and this won.
         | 
          | That was waaay at odds with the mindset of the STL, which
          | seemed to me, and I think to a lot of folks, like the state of
          | the art in containers at the time: with the STL, if you
         | prepend/insert much you need to use a deque/linked list, you
         | decide pointer vs. value each time you declare a type, and if
         | you want to know about the constant factor on an insert in the
         | middle of the list you probably already lost.
         | 
         | (Qt has/had some other fun pragmatic things, like copy-on-write
         | types in a lot of places. I didn't really do nearly enough with
         | it to discover the pitfalls, but was fun to read.)
         | 
         | So I think about that, about how there are likely some trees
         | out there that could just as well be sorted arrays, about
         | adaptive structures that change their behavior based on
         | dynamically observing how they're used (one recently discussed
         | here: http://blog.erlang.org/the-new-scalable-ets-
          | ordered_set/), even about tricks in JS string
         | implementations and VMs. I don't have answers, but I sometimes
         | wonder if we should be thinking _more_ in terms of tools that
         | may not be perfect on all axes but soften or eliminate
          | performance cliffs we take for granted. Not that the zero-
          | overhead fine-tuned types are going away, just that they
          | aren't the only imaginable approach.
         | 
         | I wonder if we'll see progress in those sort of imperfect-but-
         | resilient tools over time. I think the long-run trend is
         | towards _some_ form of programming being accessible to more
         | people more easily (like how, as you note, folks do real work
         | in Python now). That certainly fits with trying to make your
          | tools resilient to "misuse"--or, better, properly supporting
         | more patterns so they're not even misuse. No conclusions, just
         | hand-waving, but I do wonder if anyone working in software
         | construction tools will eventually more usefully wave their
         | hands in that direction :P
        
           | ghj wrote:
           | You would probably enjoy reading about an alternative
           | implementation for STL's std::deque, called a tier
           | vector[1][2].
           | 
           | It supports O(1)
           | push_front/pop_front/push_back/pop_back/operator[]/at (like
            | you would expect from a deque) but also O(sqrt(N)) insert and
            | remove from the middle!!!
           | 
            | The papers are from 1999 and 2001, but I only learned about
            | it from a recent HN post[3] where some guy rediscovered it
            | (20 years too late). I still wonder why this design lost to
            | the std::deque we have in the STL today.
           | 
           | [1] http://www.cphstl.dk/Report/Deque-first-attempt/cphstl-
           | repor...
           | 
           | [2] https://www.ics.uci.edu/~goodrich/pubs/wads99.pdf
           | 
           | [3] https://news.ycombinator.com/item?id=20872696
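            | (A toy version of the trick, assuming fixed-size blocks and
            | skipping the circular buffers the papers use to also make
            | push_front O(1) -- insert costs O(B) inside one block plus
            | an O(1) hand-off per later block, i.e. O(sqrt(N)) when B is
            | around sqrt(N):)

```python
# Simplified tiered vector: fixed-capacity blocks, each a deque.
# Indexing is O(1); insert is O(B + N/B) = O(sqrt(N)) for B ~ sqrt(N).
# A sketch of the idea, not the cphstl.dk implementation.
from collections import deque

class TieredVector:
    def __init__(self, block=32):
        self.B = block
        self.blocks = [deque()]            # every block is full except the last

    def __getitem__(self, i):
        return self.blocks[i // self.B][i % self.B]   # O(1)

    def __len__(self):
        return sum(len(b) for b in self.blocks)

    def insert(self, i, x):
        b, j = divmod(i, self.B)
        if b == len(self.blocks):          # appending just past the last block
            self.blocks.append(deque())
        self.blocks[b].insert(j, x)        # O(B) shuffle within one block
        # ripple the overflow rightward: one O(1) hand-off per block
        while len(self.blocks[b]) > self.B:
            carry = self.blocks[b].pop()
            b += 1
            if b == len(self.blocks):
                self.blocks.append(deque())
            self.blocks[b].appendleft(carry)
```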
        
         | gpderetta wrote:
          | Surely you should be able to just use a different comparator
          | instead (> instead of <).
        
           | ghj wrote:
            | Not in Python! But I guess using negatives is the same as
            | using a different key (which is what Python uses instead of
            | comparators).
           | 
           | The real problem is if you need min/max at the same time.
           | Then you not only need a min heap and max heap, you also need
           | to track what was already popped from either heap! This is
           | because in python you can't delete an element if it's not at
           | the root. So tracking "tombstones" will let you pop from one
           | heap, then know to ignore it the next time it comes up in the
           | other heap.
           | 
            | This is of course super complicated to maintain, and I get
            | it wrong all the time. Luckily it only shows up in algo
            | interview problems.
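            | (Roughly that pattern: two heapq heaps over the same keys,
            | plus counters of values already popped from the other side;
            | the class name is made up for illustration:)

```python
import heapq
from collections import Counter

class MinMaxHeapq:
    """Two ordinary heapq heaps over the same keys, with lazy deletion
    ("tombstones"), since heapq can only delete at the root."""
    def __init__(self):
        self._min = []                  # min-heap of keys
        self._max = []                  # max-heap, stored negated
        self._stale_min = Counter()     # popped via pop_max, still in _min
        self._stale_max = Counter()     # popped via pop_min, still in _max

    def push(self, x):
        heapq.heappush(self._min, x)
        heapq.heappush(self._max, -x)

    def pop_min(self):
        # discard tombstones left behind by earlier pop_max calls
        while self._stale_min[self._min[0]] > 0:
            self._stale_min[heapq.heappop(self._min)] -= 1
        x = heapq.heappop(self._min)
        self._stale_max[x] += 1         # tombstone for the other heap
        return x

    def pop_max(self):
        while self._stale_max[-self._max[0]] > 0:
            self._stale_max[-heapq.heappop(self._max)] -= 1
        x = -heapq.heappop(self._max)
        self._stale_min[x] += 1
        return x
```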
        
             | nicoburns wrote:
             | Sounds like python is _not_ batteries included compared to
             | other languages in this regard!
        
       | gliese1337 wrote:
       | I have used a min-max heap once. I don't remember why I needed it
       | at the time--it was a previous job--but I do remember that I had
       | to roll my own, because it's just not that popular of a data
       | structure, and it was the obvious and only good solution to the
       | problem at the time.
       | 
       | So, it's nice to see a detailed analysis of the structure like
       | this! Perhaps if it becomes more popular, I will find more places
       | to use it.
        
         | mav3rick wrote:
         | IIRC it's used in chess programs to evaluate moves.
        
           | nwallin wrote:
           | Perhaps you're thinking of minimax? It's an unrelated concept
           | to min-max heaps.
           | 
           | https://en.wikipedia.org/wiki/Minimax
           | 
           | https://en.wikipedia.org/wiki/Min-max_heap
        
           | Hello71 wrote:
           | you're thinking of minimax.
        
         | bjo590 wrote:
         | I used a min heap in a FANG interview. It's an obvious/good
         | solution if the problem has a mix of reading/removing the
         | smallest number in a data structure and writing new numbers to
         | the data structure.
        
           | rgossiaux wrote:
           | A min heap is different from a min-max heap. A min-max heap
           | supports the operations of both a min heap and a max heap
           | (essentially by interleaving the two). A normal min heap is a
           | standard data structure, a min-max heap less so.
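            | (The interleaving, concretely: in the array form, even-depth
            | levels are min levels and odd-depth levels are max levels,
            | so the overall min is the root and the overall max is one of
            | the root's two children:)

```python
# Min-max heap invariant, array form (root at index 0): nodes on even
# levels (the "min levels") are <= all their descendants; nodes on odd
# levels (the "max levels") are >= all their descendants. So both ends
# sit at the top of the array:
def peek_min(h):
    return h[0]                            # root of the first min level

def peek_max(h):
    # the max is on the first max level: one of the root's two children
    return max(h[1:3]) if len(h) > 1 else h[0]

# e.g. [1, 9, 8, 3, 2, 7, 4] satisfies the invariant:
# level 0 (min): 1; level 1 (max): 9, 8; level 2 (min): 3, 2, 7, 4
```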
        
       | gpderetta wrote:
       | Impressive. Looking forward to the d-heap article.
        
         | IshKebab wrote:
          | Yeah, I was going to say: it sounds like it is faster because
          | it has higher arity rather than because it is min-max. So if
         | you only need min or max a d-heap is probably better. Hopefully
         | he will update the article.
         | 
         | https://en.wikipedia.org/wiki/D-ary_heap
         | 
         | (Also I didn't know they were called d-heaps, thanks!)
        
       | nickcw wrote:
       | If you need a min-max heap (or double ended heap) in Go here is
       | one I've used: https://github.com/aalpar/deheap
       | 
       | Very useful when you need it!
        
       | cellularmitosis wrote:
        | It would be neat to fire this up on an older processor that
        | doesn't have modern instruction-level parallelism and verify the
        | difference in performance.
        
         | brandmeyer wrote:
         | On x86 you'd have to search pretty far back before the
         | available ILP really dropped off. Some of the lower-end OoO
         | ARMs might be a good testing ground, though. Say, a Raspberry
         | Pi 4? Earlier-gen RPi used in-order cores.
        
       ___________________________________________________________________
       (page generated 2020-09-02 23:00 UTC)