GNU GCC does not round floating-point divisions to the nearest value

I know that floating-point arithmetic is a bit crazy on modern computers. For example, floating-point addition is not associative:

    0.1+(0.2+0.3) == 0.599999999999999978
    (0.1+0.2)+0.3 == 0.600000000000000089

But, at least, this is fairly consistent in my experience. You should simply not assume that fancy properties like associativity work in the real world.

Today I stumbled on a fun puzzle. Consider the following:

    double x = 50178230318.0;
    double y = 100000000000.0;
    double ratio = x/y;

If God did exist, the variable ratio would be 0.50178230318 and the story would end there. Unfortunately, there is no floating-point number that is exactly 0.50178230318. Instead, it falls between the floating-point number 0.501782303179999944 and the floating-point number 0.501782303180000055.

It is important to be a bit more precise. The 64-bit floating-point standard represents numbers as a 53-bit mantissa followed by a power of two. So let us spell it out the way the computer sees it:

    0.501782303179999944 == 4519653187245114 * 2 ** -53
    0.501782303180000055 == 4519653187245115 * 2 ** -53

We have to pick the mantissa 4519653187245114 or the mantissa 4519653187245115. There is no way to represent exactly anything that falls in between using 64-bit floating-point numbers. So where does 0.50178230318 fall exactly? We have approximately...

    0.50178230318 = 4519653187245114.50011795456 * 2 ** -53

So the number is best approximated by the larger of the two values (0.501782303180000055).
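You can check these two neighbours yourself. Here is a minimal C sketch (my own illustration, not part of the original experiment) that rebuilds each candidate from its integer mantissa using the standard ldexp function and verifies which one a correctly rounded decimal-to-double conversion picks:

    #include <stdio.h>
    #include <math.h>

    int main() {
      /* Build the two neighbouring doubles from their integer mantissas.
         Both mantissas are below 2**53, so ldexp is exact here. */
      double lower = ldexp(4519653187245114.0, -53);
      double upper = ldexp(4519653187245115.0, -53);
      printf("lower = %18.18f\n", lower); /* 0.501782303179999944 */
      printf("upper = %18.18f\n", upper); /* 0.501782303180000055 */
      /* A correctly rounded conversion of the literal must pick upper: */
      printf("%d\n", 0.50178230318 == upper); /* expect 1 */
      return 0;
    }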
Python gets it right:

    >>> "%18.18f" % (50178230318.0/100000000000.0)
    '0.501782303180000055'

JavaScript gets it right:

    > 0.50178230318 == 0.501782303180000055
    true
    > 0.50178230318 == 0.501782303179999944
    false

So the story would end there, right? Let us spin up the GNU GCC 7 compiler for x86 systems and run the following C/C++ program:

    #include <stdio.h>

    int main() {
      double x = 50178230318.0;
      double y = 100000000000.0;
      double ratio = x/y;
      printf("x/y = %18.18f\n", ratio);
    }

Can you predict the result?

    $ g++ -o round round.cpp
    $ ./round
    x/y = 0.501782303179999944

So GNU GCC actually picks the smaller and farther value, instead of the nearest value. This remains true even if you specifically ask the compiler to round to nearest (fesetround(FE_TONEAREST)). You may doubt me, so I have created a docker-based test.

You might think that this is a bug that should be reported, right? There are dozens if not hundreds of similar reports to the GNU GCC team. They are being flagged as invalid.

Let me recap: the GNU GCC compiler may round the result of a division between two floating-point numbers to a value that is not the nearest. And it is not considered a bug.

The explanation is that the compiler first rounds the division to the nearest 80-bit extended-precision value and then rounds that result again, to 64 bits. This is what fancy numerical folks call FLT_EVAL_METHOD = 2. However, the value of FLT_EVAL_METHOD remains at 2 even if you add optimization flags such as -O2, and yet the result will change. The explanation is that the optimizer figures out the solution at compile time, ignoring the FLT_EVAL_METHOD value.

You can also try to pass GNU GCC the flags -msse -mfpmath=sse, as experts recommend, but as my script demonstrates, this alone does not solve the issue (and then you get FLT_EVAL_METHOD = -1). You also need to add an appropriate target (e.g., -msse -mfpmath=sse -march=pentium4).

If you are confused as to why all of this could be possible without any of it being a bug, welcome to the club. You can examine the assembly on godbolt.
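If you want to reproduce the runtime behaviour while ruling out compile-time constant folding, here is a small sketch (one possible variation, not the docker script itself): marking the operands volatile forces the division to happen at run time, so you observe what the floating-point unit actually produces under round-to-nearest:

    #include <stdio.h>
    #include <float.h>
    #include <fenv.h>

    int main() {
      printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
      fesetround(FE_TONEAREST); /* the default, set explicitly for emphasis */
      /* volatile prevents the compiler from doing the division at
         compile time, even at -O2: */
      volatile double x = 50178230318.0;
      volatile double y = 100000000000.0;
      printf("x/y = %18.18f\n", x / y);
      return 0;
    }

Compiled for 32-bit x86 with the x87 unit, this should print 0.501782303179999944 even at -O2; with -msse -mfpmath=sse -march=pentium4 it should print 0.501782303180000055.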
Published by Daniel Lemire, a computer science professor at the University of Quebec (TELUQ). Posted on June 26, 2020.

6 thoughts on "GNU GCC does not round floating-point divisions to the nearest value"

1. Brian Kessler says:
June 27, 2020 at 3:20 am

I think you should be more specific and state that the issue is with the x87 floating-point unit. The x87 has one internal precision that defaults to 80 bits, though this can be set to 32 or 64 bits using the FLDCW instruction to load the control word for the floating-point unit. For most cases, people want and expect the internal precision to equal the operand precision (FLT_EVAL_METHOD = 0), which is what SSE instructions do. Note that explicitly requesting SSE does actually work in gcc as long as the processor supports SSE. See here on godbolt: https://godbolt.org/z/Xqi_S7

For your docker test, you are using the i386/ubuntu image and so you need to explicitly tell gcc that you have a machine with sse instructions, e.g. -march=pentium4, which gives the desired behavior:

    compiling round with -mfpmath=sse -march=pentium4
    x/y = 0.501782303180000055
    expected x/y (round to even) = 0.501782303180000055
    Expected assumes FLT_EVAL_METHOD = 1
    Equal
    Equal
    FE_DOWNWARD  : 0.501782303179999944
    FE_TONEAREST : 0.501782303180000055
    FE_TOWARDZERO: 0.501782303179999944
    FE_UPWARD    : 0.501782303180000056
    FLT_EVAL_METHOD = 0

The real question is why gcc is emitting any x87 instructions (except for long double) on architectures that support sse. I would expect -march=pentium4 to use sse by default, but sse has to be specifically requested.

1. Daniel Lemire says:
June 27, 2020 at 2:31 pm

"I think you should be more specific and state that the issue is with the x87 floating point unit. The x87 has one internal precision that defaults to 80-bits, though this can be set to 32 or 64 bits using the FLDCW instruction to load the control word for the floating point unit. For most cases, people want and expect the internal precision to equal the operand precision (FLT_EVAL_METHOD=0), which is what SSE instructions do."

Did you get the part where, if you add the flag -O2, then FLT_EVAL_METHOD remains unchanged, but the rounding becomes correct (meaning that it rounds to the nearest 64-bit float)?

"For most cases, people want and expect the internal precision to equal the operand precision (FLT_EVAL_METHOD = 0), which is what SSE instructions do."

If the x87 unit can do it, why isn't it the default, since we agree that it is what most people want?

1. Brian Kessler says:
June 27, 2020 at 4:18 pm

Looking at godbolt, gcc -O2 appears to compute the division at compile time using the precision of the operands. I don't see any division instructions issued at all, so FLT_EVAL_METHOD is not relevant in this case since the x87 is not actually performing the calculations. Compile-time and run-time floating-point calculations not matching is a whole other can of worms.

As mentioned above, the x87 unit can perform the calculations at only a single given precision, set by altering the precision field in the control word. It could emulate FLT_EVAL_METHOD = 0, but the precision would have to be updated when switching between any calculations using float, double or long double. I imagine this might not be the default behavior because of the performance overhead of loading and restoring the control word for different precisions. For code that mostly or solely uses doubles, the compiler could avoid adjusting the precision most of the time, at the cost of tracking the precision state throughout the code. It would be interesting to see if any compiler has a mode where it adjusts the x87 precision automatically to emulate FLT_EVAL_METHOD = 0.

1. Daniel Lemire says:
June 27, 2020 at 4:38 pm

"I don't see any division instructions issued at all, so FLT_EVAL_METHOD is not relevant in this case since the x87 is not actually performing the calculations. Compile-time and run-time floating-point calculations not matching is a whole other can of worms."

This is insanity.

2. Simon Lindholm says:
June 27, 2020 at 9:02 am

Even with SSE enabled you still don't necessarily get complete IEEE 754 compliance, since GCC by default emits fused multiply-add instructions whenever it can:

    #include <stdio.h>

    double a = 0.1, b = 10.0, c = -1;

    int main() {
      double ab = a * b;
      printf("%.5e\n", ab + c);
    }

This prints 5.55112e-17 with GCC -O2 -march=native (or -mfma), and 0.00000e+00 with GCC -O2. (Can be turned off with -ffp-contract=off.)
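[Editor's note: Brian Kessler's point about the FLDCW control word can be demonstrated directly. Below is a minimal sketch assuming glibc on 32-bit x86; the _FPU_GETCW and _FPU_SETCW macros come from glibc's fpu_control.h and are not standard C.]

    #include <stdio.h>
    #include <fpu_control.h> /* glibc-specific, x86 only */

    int main() {
      /* Narrow the x87 internal precision from the default 64-bit
         significand to 53 bits, matching double. */
      fpu_control_t cw;
      _FPU_GETCW(cw);
      cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;
      _FPU_SETCW(cw);
      /* volatile forces a runtime division on the (reconfigured) x87: */
      volatile double x = 50178230318.0;
      volatile double y = 100000000000.0;
      printf("x/y = %18.18f\n", x / y); /* expect 0.501782303180000055 */
      return 0;
    }

The exponent range of the x87 registers stays extended even at 53-bit precision, so this does not eliminate every double-rounding problem (results near the subnormal range can still round twice), but for this division it should produce the correctly rounded value.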
3. A. J. S. says:
June 27, 2020 at 1:34 pm

"If God did exist", really? I stopped reading right after that. Get your stuff together or don't write at all.