[HN Gopher] One does not simply calculate the absolute value ___________________________________________________________________ One does not simply calculate the absolute value Author : mot2ba Score : 93 points Date : 2021-08-22 18:12 UTC (4 hours ago) (HTM) web link (habr.com) (TXT) w3m dump (habr.com) | Sharlin wrote: | Computing the absolute value of a signed (fixed-size, two's | complement) integer is not trivial either, because there is no | positive i such that `abs(INT_MIN) == i`. Some languages like | Java just give you `abs(INT_MIN) == INT_MIN` because: | -100...000 = ~100...000 + 1 = 011...111 + 1 = 100_000 | | where ~ is the bitwise complement. Even worse, in C and C++ | specifically, because overflow of signed integers is undefined | behavior, so is `abs(INT_MIN)`! To be able to represent the | absolute value, you need to either return the corresponding | unsigned type (if your language has them), promote to a wider | type, or return a tuple `(carry_bit, result)`. | m4r35n357 wrote: | Bah! Just square & take the square root! | kurthr wrote: | And don't forget complex numbers or higher dimensionality. | beervirus wrote: | Does this correctly handle all the different NaN values, weird | corner cases, etc. properly? | pedrocr wrote: | For double it would be harder to check but for float (that | should have the same corner cases) it's easy to just check all | possible values if you have a gold standard. I've been doing | that for image processing functions and it's amazing how many | corner cases you fix and keep fixed by writing unit tests that | iterate over all possible values or a good subset of them when | that's not possible. | JadeNB wrote: | > Does this correctly handle all the different NaN values, | weird corner cases, etc. properly? | | Of course the answer to "have you covered all weird corner | cases" is _always_ 'no', but the article addresses the issue of | different NaNs: | | > [...] However, it turns out that doubleToLongBits is also not | entirely trivial, because it canonicalizes NaNs. There are many | ways to encode a not-a-number as a double, but only one of them | is canonical. These different NaNs are even more similar twins, | they can be distinguished neither via Double.compare, nor via | any operation. Even string representation is the same. But they | look different in computer memory. To avoid surprises, | doubleToLongBits converts any NaN to the canonical form, which | is encoded in long as 0x7ff8000000000000L. Of course, this | procedure adds more conditions, which we do not need here. | | > What can we do? It appears that there's another method | doubleToRawLongBits. It doesn't do any smart conversion of NaN | and just returns exactly the same bit representation: [...] | sp332 wrote: | The question is whether that last technique, of just flipping | the sign bit, handles NaNs and infinities. | pm215 wrote: | It's the required semantics of the IEEE floating point | spec's absolute-value operation. This does mean that abs() | doesn't behave like eg addition in that it won't set fp | exception flags for signaling NaNs, or canonicalize NaNs, | or set a flag to note that this was a denormal number. But | at the language level "follows IEEE semantics" is probably | overall less surprising. | | chs() ("flip the sign bit") is the other operation that's | defined as effectively just an operation on the | representation rather than a "do a real floating point | computation". | MontyCarloHall wrote: | Yes. | | The sign bit is totally independent of the value being | encoded, whether that's a number, q/sNaN, or infinity. | | Nitpick: we are not flipping the sign bit, but rather | setting it to 0. | klyrs wrote: | Yeah, the title is a head-scratcher. It _is_ simple to | calculate the absolute value. Only, it doesn 't look simple | in a high-level language. | andi999 wrote: | Does anybody know a good reference on the minus 0 thing? How do | you end up with it? Is 4-4=+0.0 and -4 +4=-0.0? Or is it | different? | aaaaaaaaaaab wrote: | Both are exactly 0.0. | | Regarding signed zero, read this paper from Kahan: | https://homes.cs.washington.edu/~ztatlock/599z-17sp/papers/b... | gvx wrote: | The easiest way to get a signed zero is with a unary minus | operation. That is, the expression -x produces -0.0 when x is | an unsigned zero (so 0.0) | MontyCarloHall wrote: | No, those both yield +0. | | 0/(-x) would return -0, for all x > 0 | | Likewise, underflows on negative numbers would also return -0. | thaumasiotes wrote: | > 0/(-x) would return -0, for all x > 0 | | I find this one a little weird, but it seems to be true. | | The more normal way to get -0 would be to divide a positive | number by negative infinity, or a negative number by positive | infinity. | 63 wrote: | From some quick testing, -1.0 * 0.0 = -0.0 and the order | doesn't matter. | da_chicken wrote: | > _For example, if you divide 1.0 by +0.0 and -0.0, you get | completely different answers: +Infinity and -Infinity._ | | This must be a Java-ism. In most languages I've used you get a | division by zero error or NaN. | titzer wrote: | It's IEEE-754. | magicalhippo wrote: | And there are algorithms that rely on this, like this[1] | branchless axis-aligned bounding box intersection test. | | It exploits the division by zero sign logic of IEEE-754, so | that the min/max operations used also handles the edge cases | where the ray is parallel to an axis. | | [1]: https://tavianator.com/2011/ray_box.html | wodenokoto wrote: | While parent isn't correct in their assumption, this was also | my first thought after reading the article and I appreciate the | ensuing discussion. | | I hope others will take the entire thread into consideration | before downvoting. | kmm wrote: | Which languages did you try? I found that dividing 1.0 by 0.0 | gives Infinity in Java, C, Julia, JavaScript, Ruby, Clojure, | Haskell and Rust. I get division errors in Python and Perl. | | Curiously, Go gives me a compiler error if I divide constant | literals, but assigning 0.0 to a variable and dividing by it | does work. On top of that, assigning -0.0 to a float variable | leads to it having the value +0.0, which I would call a | borderline compiler bug. Both are probably a consequence of Go | constant literals not being IEEE-754 floats but some arbitrary | precision number types, but it's very unexpected behavior. | da_chicken wrote: | > Which languages did you try? | | Python and C#. | Kranar wrote: | Python does throw an exception but C# does not. | kmm wrote: | Dividing 1.0 by 0.0 gives [?] in C# as well when I try it. | da_chicken wrote: | You're right. I must've divided 0 by 0 and -0. That's the | only way C# gives me NaN. I thought I'd specifically | avoided doing that. | ketzu wrote: | +/- Infinity seems to be correct according to IEEE754. | | IEEE754 section 7.3: | | The divideByZero exception shall be signaled if and only if an | exact infinite result is defined for an operation on finite | operands. The default result of divideByZero shall be an [?] | correctly signed according to the operation: -- For division, | when the divisor is zero and the dividend is a finite non-zero | number, the sign of the infinity is the exclusive OR of the | operands' signs (see 6.3). -- For logB(0) when logBFormat is a | floating-point format, the sign of the infinity is minus | (-[?]). | Agentlien wrote: | This is what you get according to IEE 754. | | You'd get NaN from 0.0/0.0 or division by zero if you did 1/0 | as integer division. | titzer wrote: | Haha, -0. After spending literal _years_ of my life dealing | with that little encoding nuisance, in everything from the V8 | TurboFan JIT, to WebAssembly, to adding floating point to | Virgil, I just gotta say, that little turd is so not worth it. | I 've read a lot of different justifications for it (it's a | limit, it represents _approaching_ 0 from the negatives, some | algorithms _need_ it, etc), I have to admit, I just don 't buy | it. It's an encoding artifact that's produced so much | accidental complexity that, in my book, has been _far_ more | trouble than it 's been worth. | | Change my mind. | darig wrote: | -mind | nimish wrote: | How else would you represent a branch cut or even a jump at | 0? | | https://people.freebsd.org/~das/kahan86branch.pdf from the | father of ieee-754 is illuminating. | FabHK wrote: | Haha, was just going to link that. Here's an ELI5, though | not that illuminating: | | https://hackernoon.com/negative-zero-bbd5fd790af3 | | But, bottom line, Kahan (not only the father of IEEE 754, | but also, with Golub, father of the SVD!) has thought about | it a lot, and though he called the signed zero "a pain in | the ass", he also said "There really wasn't a way around | that and you were stuck with it." | | http://history.siam.org/pdfs2/Kahan_final.pdf | croes wrote: | So for programming convenience we ignore correct math? | FabHK wrote: | We cannot represent "correct math" on a computer, because | the set of reals is uncountable, but in 64 bits we can | only represent finitely many numbers, not infinitely | many, let alone uncountably many. | | And given that we have to implement some sort of | approximation to the usual math on a closed subset of the | reals (rationals, even), why not extend it with some | "numbers" (negative zero, plus/minus infinity, NaNs) | that, yes, enable programming convenience? | croes wrote: | There is a difference between 2.4987665678 as 2.5 and | undefined as +-infinity especially because 1.0/+0.0 is | +infinity and 1/0 is just an exception in Java. Shouldn't | it be the same result? | FabHK wrote: | > In most languages I've used you get a division by zero error | or NaN | | Or it raises an exception, which once disabled a navy warship | for hours: | | > Now decommissioned, the USS Yorktown was among the first | warhips extensively computerized to reduce crew (by 10% to 374) | and costs (by $2.8 million per year). | | > On 21 Sept. 1997, the Yorktown was maneuvering off the coast | of Cape Charles, VA, when a crewman accidentally ENTERed a | blank field into a data base. The blank was treated as a zero | and caused a Divide-by-Zero Exception which the data-base | program could not handle. It aborted to the operating system, | Microsoft Windows NT 4.0, which crashed, bringing down all the | ship's LAN consoles and miniature remote terminals. | | > The Yorktown was paralyzed for 2 3/4 hours, unable to control | steering, engines or weapons, until the operating system had | been re-booted. Fortunately the Yorktown was not in combat nor | in crowded shipping lanes. | | > If IEEE 754's default had been in force, the division by zero | would have insinuated into the data-base an [?] and/or NaN, | which would have been detected afterwards without a crash. | | http://people.eecs.berkeley.edu/~wkahan/Boulder.pdf | | https://www.wired.com/1998/07/sunk-by-windows-nt/ | | EDIT: It was 2 3/4 hours, not 23 hours. I copied/pasted | carelessly from the PDF. | da_chicken wrote: | Eh, there's like six separate issues that led to this. | | 1. Interface accepted blank value for a form when a numeric | value is required. | | 2. Application and database permitted a null or coerced a | zero when the value is required and cannot be zero. You've | got to sanitize your input. | | 3. Application did not check for or gracefully handle | something as common as a division by zero exception. Instead | it proceeded to a buffer overrun and crashed. | | 4. Application crashing apparently caused an OS crash? Yes, | okay, it's NT 4.0 and a buffer overrun, but that still should | not happen. Protected mode is a thing. | | 5. There is no redundancy in a system that manages the | steering, weapons, and engines of a warship. | | 6. A warship was unable to restore the single CNC system to | working order for 23 hours. | | Furthermore... there's no evidence that the software would | have behaved more sensibly or more controllably if it had a | value of infinity instead of an error. It shouldn't have had | a zero in the first place. | togaen wrote: | Easy solution: switch to C++, use std::copysign | literallyaduck wrote: | Just convert to string then replace the "-" char with empty, then | you can convert to the type you want. Sure performance suffers, | but it gets you home by quitting time. Don't actually do this. | prof-dr-ir wrote: | "1e-3" | toxik wrote: | It's exp abs, and it's a beautiful feature. | [deleted] | [deleted] | im3w1l wrote: | > return Double.longBitsToDouble( > | Double.doubleToRawLongBits(value) & 0x7fffffffffffffffL); | | Technically, this will change NaN values into different ones. | FabHK wrote: | The sign of the NaN might change (as you'd expect from an | absolute value), but the NaN payload (which IEEE 754 envisaged | might carry debug information) won't change, right? | HWR_14 wrote: | The NaN payload's debug information is allowed to use the | sign bit as part of the debug code (according to my knowledge | of IEEE 754). Personally, I cannot imagine needing that many | debug codes in most common usage, and I don't know if many | implementations do, but it would definitely be something to | check in your specific implementation details. | ectopod wrote: | Why do you say that? It extracts the bits using | doubleToRawLongBits which does not do NaN canonicalization by | design. And it gets back to a double with longBitsToDouble | which "returns the double floating-point value with the same | bit pattern". So it doesn't change anything either. | kennywinker wrote: | They really missed an opportunity to call it caNaNicalization | im3w1l wrote: | NaN values can have sign bit set. This will unset it. | toxik wrote: | So we defined abs(NaN), sounds like a feature to me ;-) | gene91 wrote: | You made a good point. But one might argue that -nan should | become nan after an abs operation. I don't know enough | about this topic to determine whether such an argument has | merit though. | HWR_14 wrote: | It's not -NaN vs NaN. There are a bunch of NaN types, | which are not related like that. It could (according to | the spec) change "NaN: Attempt to arcsin value greater | than 1" to "NaN: Zero divided by Zero operation" | athrowaway3z wrote: | Wouldn't max( x,-x ) cover everything ? | Kranar wrote: | That will not handle -0.0 unfortunately since 0 == -0. | alisonkisk wrote: | That doesn't matter because Math.max() knows the difference | between +0.0 and -0.0. | | Try it and you'll see. | bordercases wrote: | It does that because of the way they implement arithmetic, | not about an inherent property of the mathematical function | `max`. The problem has to be solved once. | alisonkisk wrote: | The article's (dubious, as mentioned at the end) goal is | finding the most efficient way in CPU time. | kmm wrote: | Naively, max(a, b) would be implemented as | return a >= b ? a : b; | | But if x is -0.0, you're back to the same problem that -0.0 >= | +0.0 actually turns out to be true and not false, hence the | return value will be -0.0. | | Looking at the actual implementation[0], doubleToRawLongBits is | used again here to take care of this edge case. So it would | work, but a lot slower than the version given in the article. | | 0: | https://github.com/openjdk/jdk/blob/36e2ddad4d2ef3ce27475af6... | andi999 wrote: | Couldn't you also implement max as return a>b? a:b; | | (changing greater equal to just greater). Would the problem | then still persist? | kmm wrote: | Positive and negative zero cannot be distinguished by | normal equality checks, meaning +0.0 == -0.0. Therefore, | for consistency's sake +0.0 > -0.0 ought to be false, which | would yield a return value of -0.0 for max(+0.0, -(+0.0)) | gvx wrote: | That wouldn't solve it: max(0.0, -0.0) would be false ? 0.0 | : -0.0 | | The problem is that 0.0 == -0.0. As long as you don't take | care of signed zero, you cannot guarantee that abs(x) has | no negative sign. | amelius wrote: | > After such interviews, there will be rumors about your company. | | It makes no sense to bring this up in interview questions. Just | about every programmer knows that IEEE 754 contains some quirks. | If you want to know the boring details, just look them up in the | manual. | MontyCarloHall wrote: | It's interesting how fairly complex bit twiddling hacks are well- | known for integers, but even very simple but useful bit twiddles | for floats (like this one) are obscure enough to warrant their | own article. | | Of course, the most famous bit manipulation of floats is the fast | inverse square root trick [0]. The bits of a float as an integer | literal are proportional to its log2 value (plus a constant), | which is handy for computing a reciprocal square root in | logspace, which when converted back to a float gets | exponentiated, yielding the answer (after a couple more | refinement steps). If you actually take the time to manually | hammer out the bitwise manipulations, the fast inverse square | root is quite simple, but it is nonetheless regarded as black | magic, likely due to the totally non-explanatory comments in the | Quake source code. | | Funnily enough, directly following the fast inverse square root | in the Quake source code is a function to compute absolute values | by zeroing the float sign bit [1], just as described in the | article. | | [0] | https://en.wikipedia.org/wiki/Fast_inverse_square_root#Alias... | | [1] https://github.com/id-Software/Quake-III- | Arena/blob/dbe4ddb1... | HWR_14 wrote: | Well, with bit twiddling for floats, you have some very | important questions. Like, do you maintain the distinct NaN | values or are you okay with all NaNs producing a valid (but | possibly different) NaN value. | Y_Y wrote: | What if the twiddle wasn't obscure and the article was | unwarranted? | MontyCarloHall wrote: | Given the amount of discussion this article has sparked on | HN, I'd argue that as obvious as it might be, the technique | is nonetheless obscure. The amount of highly experienced and | accomplished programmers I've encountered who have no idea | how floating point numbers work is astoundingly high. | | If a similar article were written about how shifting an | integer to the right by one bit is equivalent to dividing it | by two, I highly suspect it would attract no attention on | this site, aside from a couple dismissive comments grousing | about why such a basic fact needs its own article. | kevin_thibedeau wrote: | Bit twiddling integers doesn't require unsafe type punning like | this. It's also largely safe to assume a nominal two's | complement machine is all your code will ever run on. If not, | you'll know to be careful. You can't make such blanket | assumptions with floating point representations. ___________________________________________________________________ (page generated 2021-08-22 23:00 UTC)