[HN Gopher] Ideal divisors: when a division compiles down to jus...
       ___________________________________________________________________
        
       Ideal divisors: when a division compiles down to just a
       multiplication
        
       Author : chmaynard
       Score  : 85 points
       Date   : 2021-04-28 17:18 UTC (5 hours ago)
        
 (HTM) web link (lemire.me)
 (TXT) w3m dump (lemire.me)
        
       | whalesalad wrote:
       | I have been seeing this Wordpress theme a lot lately and it is so
       | bad the way the content is pushed so far to the right. On my 27"
       | monitor the left margin of the text is essentially the center of
       | the display.
        
         | jlarocco wrote:
         | I have a 27" monitor also, and it looks fine to me.
        
           | tachyonbeam wrote:
           | 27" monitor on Chrome here. Looks pretty far to the right.
        
           | tzs wrote:
           | How about if the browser window is maximized?
           | 
           | What I get in that case is 23 cm for the darkish blue left
           | sidebar and 36 cm for the light blue right area that contains
           | the article. Within that right area, the article is in 19 cm
           | white rectangle with just under 2 cm margins.
           | 
           | So, starting from the left here is what I see:
           | 
           | 15 cm empty dark blue background,
           | 
           | 6 cm of sidebar text on dark blue background,
           | 
           | 2 more cm of empty dark blue background,
           | 
           | 2 cm of empty light blue background,
           | 
           | A tad under 2 cm of empty white background,
           | 
           | About 14.5 cm of text on white background,
           | 
           | A tad under 2 cm of empty white background,
           | 
           | 15.5 cm of empty light blue background.
           | 
           | It really is quite ugly. It looks like something I would do
           | because I completely suck at CSS.
        
             | dhosek wrote:
             | I almost never1 run a browser maximized on anything bigger
             | than my 16" laptop screen (and even that not so much). Most
             | of the time, I have browser windows that are 1/2 (on my 24"
             | monitor) or 1/3 (on my ultrawide monitor) of the width and
             | the full height (2/3 width on my laptop)).
             | 
             | 1 The exception being browser windows to run sites like
             | Netflix where I'm really using it as a video interface.
        
         | tyingq wrote:
         | One thing that often works in this situation with chrome is the
         | "Emulate CSS Media Type: Print" option from the rendering tab
         | in Dev Tools. The theme appears to be the wordpress default
         | theme from 2015, so you should see less of it as times goes on.
        
       | omgJustTest wrote:
       | Floating point, while convenient does have substantial drawbacks
       | in the implementations of circuits that we have today. Efficiency
       | gains like this somewhat expose facets that are normally hidden.
       | It will be interesting to see the transition to either analog NN
       | or quantum computers, which inherently have better
       | representations of continuous values.
       | 
       | Edit: Additionally the serial vs parallel processing paradigm
       | shift is a main advantage of the switch too! Queued systems seem
       | to have severe scheduling limitations, which the real-world does
       | not always inherently have.
        
         | svat wrote:
         | The linked article is about integer division and has nothing to
         | do with floating point, and in fact I can't see how this idea
         | could be applied to floating-point arithmetic. Can it?
        
           | wizzwizz4 wrote:
           | Not directly, no. The underlying mathematics _kind of_ can,
           | but you 'd need dedicated circuitry (or bit-hacking) for it
           | because of the floating point parts, and it would fall over
           | around subnormals.
        
       | andrewfromx wrote:
       | i'd like to find these numbers for base9 vs base10. see
       | https://stackoverflow.com/questions/67226727/is-there-a-way-...
       | if you look at that sample golang code, is that the right way to
       | go about this? Or is there a much simpler way already built into
       | a high level language like go?
        
         | [deleted]
        
       | jvickers wrote:
       | In JavaScript (specifically node / V8) if dividing multiple
       | values by the same divisor, is it fastest to calculate the
       | reciprocal and then multiply it?
       | 
       | How about other languages (and compiler systems)?
       | 
       | Are there binary tricks worth knowing about to quickly calculate
       | reciprocals?
        
         | brrrrrm wrote:
         | It's typically faster to calculate a reciprocal (especially at
         | compile time) and then multiply it, but this is often at the
         | expense of precision. In C++, for example, the -O3 flag on gcc
         | will still emit a division (divss) to preserve compliant
         | floating point precision in this example:
         | https://godbolt.org/z/PY3Wjno4P
         | 
         | However, when we enable `-ffast-math`, we permit the compiler
         | to do a bit more aggressive optimization in this domain
         | (trading off precision), and we see multiplication (mulss) is
         | emitted: https://godbolt.org/z/KEb1nc43z
        
       | svat wrote:
       | If you've never seen it before, this is a beautiful blog post on
       | related matters: "Optimising pointer subtraction with 2-adic
       | integers" http://blog.sigfpe.com/2010/05/optimising-pointer-
       | subtractio...
        
         | kevinventullo wrote:
         | A few months back I wrote a blog post very much inspired by
         | Dan's post, though a little more academic:
         | https://kevinventullo.com/2020/12/21/2-adic-logarithms-and-f...
         | 
         | The idea is to exploit deeper structures in p-adic number
         | theory to develop useful and novel algorithms with unsigned
         | integer arithmetic, e.g. fast integer exponentiation. Feedback
         | welcome!
         | 
         | (Warning: math heavy)
        
         | thechao wrote:
         | > I don't know how gcc generates its approximate 2-adic
         | reciprocals. Possibly it uses something based on the Euclidean
         | GCD algorithm. I wasn't able to find the precise line of source
         | in a reasonable time.
         | 
         | It's 0x1FFF...F / 7 = 0x249...249249249.
         | 
         | Just about the oddest thing I've seen; I guess, if I were the
         | author, it'd be my "1 in 10000" day?
        
           | xyzzy_plugh wrote:
           | Makes more sense in binary:                 0xFFF..F
           | 111111111111111111111111       0x7
           | 000000000000000000000111       0x249..9
           | 001001001001001001001001
        
       | tragomaskhalos wrote:
       | Genuine question: why would any compiler bother to optimise
       | division with such obtuse dividends, other than a purists'
       | satisfaction ?
        
         | warkdarrior wrote:
         | Division is orders of magnitude slower than multiplication.
        
           | Tuna-Fish wrote:
           | Less than one order of magnitude these days. On Tiger Lake,
           | integer multiply is 3 cycles, while integer division is 12
           | for 32-bit integers and 15 for 64-bit.
           | 
           | Of course, the multiply is fully pipelined while the division
           | is only partially, but even that leaves only a 10-fold
           | difference.
           | 
           | Division is much faster these days than it used to be. It's
           | still worth it to have compiler optimizations for it, but
           | it's not important to avoid it anymore.
        
             | gowld wrote:
             | Why do some people assume decimal ten as the base of the
             | orders of magnitude when discussing binary arithmetic?
        
               | cma wrote:
               | Timespan/cycle count is almost always denoted in base 10
               | in conversation.
        
             | Fordec wrote:
             | > Less than one order of magnitude these days.
             | 
             | > only a 10-fold difference.
             | 
             | A ten fold difference is _literally_ one order of
             | magnitude. Which is what the post you 're replying to
             | factually stated.
        
               | cma wrote:
               | orders usually means 2 or more
        
             | jbay808 wrote:
             | Embedded systems programmers don't always have such luxury!
        
         | martincmartin wrote:
         | They're often special cases of general algorithms. You can do
         | many divisions as combinations of multiplications & shifts; in
         | this case, the resulting code simplifies down to the smallest
         | possible.
        
         | Denvercoder9 wrote:
         | There's no specific optimisation for these specific numbers.
         | There's a more general optimisation to turn division by a fixed
         | number into a multiplication and shift where possible (because
         | division is a _lot_ slower than multiplication), and there is
         | another optimisation that eliminates useless shifts. Combined
         | they yield this result for these specific numbers.
        
           | joe_the_user wrote:
           | Yeah, a more detailed description of this general method
           | seems missing from the article. Anyone care to provide it?
        
             | chrchang523 wrote:
             | See the "Labor of Division" blog posts by the author of
             | libdivide:
             | 
             | https://ridiculousfish.com/blog/posts/labor-of-division-
             | epis...
             | 
             | https://ridiculousfish.com/blog/posts/labor-of-division-
             | epis...
        
               | joe_the_user wrote:
               | So basically calculating the multiplicative inverse of a
               | number and storing it as something like a float (digits +
               | exponent) with the digits and exponent stored separately
               | to avoid overflow, underflow, accuracy issues.
               | 
               | And it's faster if and only if you're repeating division
               | by a single number since you're still doing the
               | equivalent of divide to get the invest (but repeated
               | division is actually quite common so this is useful).
        
             | pkaye wrote:
             | The book "Hackers Delight" goes over a lot of these kinds
             | of software algorithms for integers.
        
       | layoutIfNeeded wrote:
       | >I call the corresponding divisors ideal
       | 
       | Not a particularly good name, as "ideal" already has an
       | established meaning in algebra:
       | https://en.m.wikipedia.org/wiki/Ideal_(ring_theory)
        
         | gowld wrote:
         | It's not an ideal name, even though it's ideal.
        
         | withinboredom wrote:
         | This is computer science, not algebra.
        
           | Tainnor wrote:
           | algebra plays a huge role in CS, e.g. cryptography
        
       | chrchang523 wrote:
       | See also https://libdivide.com/ , when you need to repeatedly
       | divide by a constant that isn't known until runtime.
        
       ___________________________________________________________________
       (page generated 2021-04-28 23:00 UTC)