[HN Gopher] There Are Only Four Billion Floats-So Test Them All ... ___________________________________________________________________ There Are Only Four Billion Floats-So Test Them All (2014) Author : albrewer Score : 94 points Date : 2023-02-09 16:51 UTC (6 hours ago) (HTM) web link (randomascii.wordpress.com) (TXT) w3m dump (randomascii.wordpress.com) | FpUser wrote: | Silly me. As far as I can remember I've always used epsilon when | comparing floats. | eru wrote: | That's not a general solution. | | (And if you know what you are doing, you can also compare | floats exactly in some circumstances.) | | But the article already mentions that. | mattbillenstein wrote: | Not often you can employ exhaustive testing, but it's useful in | some cases. | | I did a project in college for a VLSI minor where we implemented | a 16bit multiplier - after we got the chips back we were supposed | to do scan testing, but I just wrote up a little program in an | fpga, wired up my chip to test, and swept all the possible inputs | verifying the output. I think I could only run this at 10MHz, and | you had to shift in the operands one bit at a time, but it still | only took on the order of an hour or two to complete. | | I also verified this setup by capturing some cycles on a logic | analyzer - then pulled together a little awk script to turn a | dump of that into the actual numbers I could check. Fun project. | chriswarbo wrote: | 4 billion inputs is a little on the extreme end, especially for | slower interpreted languages, but it's always useful to look for | functions which can be exhaustively tested; e.g. booleans and | enums. | | I spotted a bug in a code review today, which was formatting | dates with ordinal numbers, like "9th Feb". The ordinal logic | only has to work for inputs 1 to 31, so I wrote an exhaustive | test, which it failed by outputting "11st", "12nd" and "13rd" | (due to only looking at the last digit) | ben0x539 wrote: | A while back for some videogame related data processing I ended | up writing a "regular number to roman numeral" (or maybe the | other way around) function in idk ruby or something. Not super | convinced that I had gotten it right, I sat down to write a few | tests, and then I realized I only needed it to work for numbers | up to like 16. So I deleted the actual code, extended the table | of test cases to cover 1 through 20 just in case, and just did | a lookup in that. | SnowHill9902 wrote: | That's actually the right solution to that specific problem. | giraffe_lady wrote: | true senior eng shit right there. | | https://i.imgflip.com/7alx6q.jpg | archi42 wrote: | I loved this, since this resonates with my own experience. If the | problem space is small enough, just test against a known good | (but slower) implementation. With custom algorithms this can also | be very interesting for the person writing the test: Their | reference code needs to be correct beyond doubt, but can forfeit | mostly every manual optimization. Also, for larger input spaces I | like to do a nested for-loop over vectors of "interesting" | values, plus add a large set of random values. | bhaney wrote: | > For very small numbers (less than about FLT_EPSILON * 0.25) | adding 0.5 gives 0.5 exactly, and this then rounds to zero | | Is that not correct? Isn't it expected that very small numbers | round to zero? | zerocrates wrote: | This is in the context of the implementation of a ciel() | function: a very small but still positive float should have a | ciel() of 1. As said near this part, IEEE default rounding is | round half to nearest even: 0.5 rounds to 0, the closest even | integer, so if your addition of 0.5 to the argument to ciel() | results in exactly 0.5, the result will incorrectly be 0 | instead of 1. | teraflop wrote: | The quote is talking about code that is _supposed_ to find the | ceiling of a number, and does so by adding 0.5 and then | rounding to the nearest value (breaking ties by rounding | towards even). | | If you have a very small but non-zero positive number, it gets | rounded to 0.5 after the addition, and then the second rounding | incorrectly rounds toward 0 instead of 1. | | The result is that you have a value x such that x > ceil(x), | which shouldn't ever happen. | bhaney wrote: | > supposed to find the ceiling | | Oh duh. I somehow forgot that over the span of a few | sentences. Thanks. | zX41ZdbW wrote: | I also did this when searching for a fast string-to-float | conversion function for ClickHouse: | | https://github.com/ClickHouse/ClickHouse/blob/master/src/IO/... | | - we create a table, insert all four billion floats there, and | check our functions with SQL queries. | | Currently, the state-of-the-art library is | https://github.com/lemire/fast_double_parser, but we use a | slightly worse (but more performant) routine. | | It is also interesting how to format floats in string in an | efficient and user-friendly way. | | For this problem, we switched iteratively: strtod -> Google's | Double Conversion -> Ryu (patched) -> Dragonbox. Currently, we | use Dragonbox. | ignoramous wrote: | > _...but we use a slightly worse (but more performant) | routine._ | | If it ( _dragonbox_?) is more performant, why is it also worse? | wiredfool wrote: | Presumably it's not as accurate? | neilv wrote: | I used a similar idea the other day, while implementing in Rust | the CRC algorithm that a device required, but the vendor didn't | document 100% clearly. | | Fortunately, I had some device vendor sample code in C (of | varying quality and provenance), which I believed was probably | correct, but I wasn't 100% certain which (if any) of numerous CRC | standard variants it was actually implementing (so I couldn't | just use an off-the-shelf Rust library). | | So, the fastest/best way to get a perfectly compatible Rust | implementation seemed to be: | | 1. Write idiomatic Rust code to try to mimic the behavior of the | C code. | | 2. Write some C code to generate a text file of exhaustive | results from the known-good C implementation, for all possible | inputs up to length N. | | 3. Exhaustively check the Rust implementation against that text | file of inputs and outputs. | AlotOfReading wrote: | For future reference there's an excellent tool for discovering | the parameters of a CRC model with pairs of inputs and outputs | called RevEng [0] (github copy: [1]). Once you have the | parameters you can use the CRC crate [2] with a predefined CRC | or a custom algorithm (highly unlikely). | | [0] https://reveng.sourceforge.io/ | | [1] https://github.com/berney/reveng | | [2] https://crates.io/crates/crc ___________________________________________________________________ (page generated 2023-02-09 23:00 UTC)