[HN Gopher] How much does Rust's bounds checking cost?
       ___________________________________________________________________
        
       How much does Rust's bounds checking cost?
        
       Author : glittershark
       Score  : 118 points
       Date   : 2022-11-30 18:38 UTC (4 hours ago)
        
 (HTM) web link (blog.readyset.io)
 (TXT) w3m dump (blog.readyset.io)
        
       | killingtime74 wrote:
        | Can someone smarter than me enlighten me as to when you would
        | consider disabling bounds checking for performance, in ways the
        | compiler is not already doing? The article starts with a bug
        | that would have been prevented by bounds checking. It's like
        | talking about
       | how much faster a car would go if it didn't have to carry the
       | extra weight of brakes.
        
         | pornel wrote:
         | In tight numeric code that benefits from autovectorization.
         | 
          | Bounds checks prevent some optimizations, since they're a branch
         | with a significant side effect that the compiler must preserve.
        
         | masklinn wrote:
         | > It's like talking about how much faster a car would go if it
         | didn't have to carry the extra weight of brakes.
         | 
         | And there's folks who do exactly that.
        
         | returningfory2 wrote:
         | I think the point of the article is the other way around: when
          | starting from a language like C that doesn't have bounds
          | checking, moving to Rust will involve adding bounds checks, and
         | then an argument will be made that this will regress
         | performance. So to test that hypothesis you start with the safe
         | Rust code, and then remove the bounds check to emulate what the
         | C code might be like. If, as in the article, you find that
         | performance is not really affected, then it makes a C-to-Rust
         | migration argument more compelling.
        
         | pitaj wrote:
         | Sometimes the programmer can prove that bounds checks are
         | unnecessary in a certain situation, but the compiler can't
         | prove that itself, and the programmer can't communicate that
         | proof to the compiler. Bounds checks _can_ result in lost
         | performance in some cases (very tight loops), so unsafe
         | facilities exist as a workaround (like `get_unchecked`).
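          | 
          | For example, something roughly like this (illustrative sketch,
          | hypothetical function):
          | 
          |     // Safety argument: `i % len` is always < data.len() because
          |     // len > 0 is checked first, but the compiler may not prove
          |     // it, so `get_unchecked` skips the runtime check.
          |     fn sum_wrapped(data: &[u64], indices: &[usize]) -> u64 {
          |         if data.is_empty() {
          |             return 0;
          |         }
          |         let len = data.len();
          |         indices
          |             .iter()
          |             .map(|&i| unsafe { *data.get_unchecked(i % len) })
          |             .sum()
          |     }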
        
         | heleninboodler wrote:
          | I don't really read this as someone who _wants_ to turn off
          | bounds checking for any real reason, but as someone who just
         | wants to have an idea of what the cost of that bounds checking
         | is.
        
         | constantcrying wrote:
         | >Can someone smarter than me enlighten me when you would
         | consider disabling bounds checking for performance?
         | 
         | Because it is faster. Worst case you are triggering a branch
         | miss, which is quite expensive.
         | 
         | >It's like talking about how much faster a car would go if it
         | didn't have to carry the extra weight of brakes.
         | 
         | So? Every wasted CPU cycle costs money and energy. Especially
         | for very high performance applications these costs can be very
          | high. Not every car needs brakes: if it never needs to stop on
          | its own and a crash hurts nobody, the brakes are just wasted
          | weight.
        
         | jackmott42 wrote:
         | In Rust you can usually use iterators to avoid bounds checks.
         | This is idiomatic and the fast way to do things, so usually
         | when using Rust you don't worry about this at all.
         | 
         | But, occasionally, you have some loop that can't be done with
          | an iterator, AND it's part of a tight loop where removing a
         | single conditional jump matters to you. When that matters it is
         | a real easy thing to use an unsafe block to index into the
         | array without the check. The good news is then at least in your
         | 1 million line program, the unsafe parts are only a few lines
         | that you are responsible for being sure are correct.
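          | 
          | A rough sketch of the two styles (illustrative code, not from
          | the article):
          | 
          |     // Idiomatic: the iterators never produce an out-of-range
          |     // index, so no per-element bounds check is needed.
          |     fn dot(a: &[f32], b: &[f32]) -> f32 {
          |         a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
          |     }
          | 
          |     // Escape hatch for a hot loop that can't be written with
          |     // iterators; the assert documents the invariant the unsafe
          |     // block relies on.
          |     fn dot_unchecked(a: &[f32], b: &[f32]) -> f32 {
          |         assert_eq!(a.len(), b.len());
          |         let mut acc = 0.0;
          |         for i in 0..a.len() {
          |             // Safety: i < a.len() == b.len() by the loop bound
          |             // and the assert above.
          |             acc += unsafe { *a.get_unchecked(i) * *b.get_unchecked(i) };
          |         }
          |         acc
          |     }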
        
         | emn13 wrote:
         | Sometimes the extra speed can be relevant. Knowing what the
          | upside _can_ be can help inform the choice of whether it's worth
         | it.
         | 
         | Secondly, even assuming you want runtime bounds checking
          | everywhere, this is still a useful analysis, because if you
         | learn that bounds-checking has no relevant overhead - great! No
         | need to look at that if you need to optimize. But if you learn
         | that it _does_ have an overhead, then you have the knowledge to
         | guide your next choices - is it enough to be worth spending any
         | attention on? If you want the safety, perhaps there's specific
         | code paths you can restructure to make it easier for the
          | compiler to elide the checks, or for the branch predictor to
          | make them cheaper? Perhaps you can do fewer indexing operations
         | altogether? Or perhaps there's some very specific small hot-
         | path you feel you can make an exception for; use bounds-
         | checking 99% of the time, but not in _that_ spot? All of these
          | avenues are only worth even exploring if there's anything to
         | gain here in the first place.
         | 
         | And then there's the simple fact that having a good intuition
         | for where machines spend their time makes it easier to write
         | performant code right off the bat, and it makes it easier to
          | guess where to look first when you're trying to eke out better
         | perf.
         | 
         | Even if you like or even need a technique like bounds checking,
         | knowing the typical overheads can be useful.
        
         | d265f278 wrote:
         | I've seen bounds checks being compiled to a single integer
         | comparison followed by a jump (on x86 at least). This should
         | have a negligible performance impact for most programs running
         | on a modern, parallel CPU. However, for highly optimized
         | programs that constantly saturate all processor instruction
         | ports, bounds checks might of course become a bottleneck.
         | 
         | I think the most preferable solution (although not always
         | possible) would be to use iterators as much as possible. This
         | would allow rustc to "know" the entire range of possible
         | indexes used at runtime, which makes runtime bounds checking
         | redundant.
         | 
         | Some old benchmarks here: https://parallel-rust-
         | cpp.github.io/v0.html#rustc
        
         | dathinab wrote:
          | In some very hot code (usually loops doing math, where the
          | check prevents some optimizations and/or the compiler fails to
          | eliminate it), removing it can lead to relevant performance
          | improvements.
         | 
          | Because of this you might find some Rust code which opts out of
          | bounds checks for such code using unsafe.
         | 
         | But this code tends to be fairly limited and often encapsulated
         | into libraries.
         | 
          | So I agree that for most code doing so is just plain stupid. In
          | particular, I believe doing it at the program or compilation-
          | unit level (instead of on a case-by-case basis) is (nearly)
          | always a bad idea.
        
       | moloch-hai wrote:
       | The instructions generated make a big difference. Modern
       | processor specifications commonly quote how many instructions of
       | a type can be "retired" in a cycle. They can retire lots of
       | conditional branches at once, or branches and other ops, when the
       | branches are _not taken_.
       | 
       | So it matters whether the code generator produces dead branches
       | that can be retired cheaply. Probably, optimizers take this into
       | account for built-in operations, but they know less about the
       | happy path in libraries.
       | 
       | This is a motivation for the "likely" annotations compilers
       | support. The likely path can then be made the one where the
       | branch is not taken. Code on the unhappy path can be stuck off in
       | some other cache line, or even another MMU page, never fetched in
       | normal operation.
       | 
       | The cost seen here is likely from something else, though. Keeping
       | array size in a register costs register pressure, or comparing to
       | a stack word uses up cache bandwidth. Doing the comparison burns
       | an ALU unit, and propagating the result to a branch instruction
       | via the status register constrains instruction order.
       | 
       | Even those might not be at fault, because they might not add any
       | extra cycles. Modern processors spend most of their time waiting
       | for words from memory: just a few cycles for L1 cache, many more
       | for L2 or L3, an eternity for actual RAM. They can get a fair bit
       | done when everything fits in registers and L1 cache, and loops
       | fit in the micro-op cache. Blow any of those, and performance
       | goes to hell. So depending how close your code is to such an
       | edge, extra operations might have zero effect, or might tank you.
       | 
       | Results of measurements don't generalize. Change something that
       | looks like it ought to make no difference, and your performance
       | goes up or down by 25%. In that sense, the 10% seen here is noise
       | just because it is hard to know what might earn or cost you 10%.
        
         | dathinab wrote:
          | In Rust there is `#[cold]` for functions as well as (nightly
          | only) `likely(cond)`/`unlikely(cond)`, and with some tricks you
          | can get something similar in stable Rust.
          | 
          | Also, branch paths which are guaranteed to lead to a panic tend
          | to be treated as "unlikely", but I'm not sure how far this is
          | guaranteed.
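          | 
          | One stable-Rust approximation looks roughly like this (sketch,
          | hypothetical helper names):
          | 
          |     // Keep the failure path in a separate #[cold] function so
          |     // the hot path stays small and the unlikely side is obvious
          |     // to the compiler.
          |     #[cold]
          |     #[inline(never)]
          |     fn index_failure(i: usize, len: usize) -> ! {
          |         panic!("index {i} out of bounds for length {len}");
          |     }
          | 
          |     fn checked_get(data: &[u8], i: usize) -> u8 {
          |         if i < data.len() {
          |             data[i] // the explicit check usually lets this be elided
          |         } else {
          |             index_failure(i, data.len())
          |         }
          |     }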
        
       | pitaj wrote:
       | Very interesting. One thing that I'm curious about is adding the
       | bounds-check assertion to `get_unchecked` and seeing if that has
       | a significant effect.
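        | 
        | Something along these lines, presumably (illustrative sketch):
        | 
        |     // Put the check back in front of the unchecked access, so the
        |     // benchmark compares "check + unchecked" against plain
        |     // "unchecked" with everything else equal.
        |     fn get_with_assert(data: &[u8], i: usize) -> u8 {
        |         assert!(i < data.len());
        |         unsafe { *data.get_unchecked(i) }
        |     }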
        
         | masklinn wrote:
         | Happens from time to time, I've seen folks going around
          | libraries looking for "perf unsafe" and benchmarking whether
          | removing the unsafe actually lowered performance.
         | 
         | One issue on that front is a question of reliability /
          | consistency: on a small benchmark, chances are the optimizer
          | will always kick in to its full potential because there's
          | relatively little code; codegen could be dodgier in a context
          | where the code is more complicated or larger.
         | 
         | Then again the impact of the bounds check would also likely be
         | lower on the non-trivial code (on the other hand there are also
         | threshold effects, like branch predictor slots, icache sizes,
         | ...).
        
       | jackmott42 wrote:
       | Occasionally small changes like this will result in bigger than
       | expected performance improvements. An example of this happened
        | once with C#, where two very tiny changes, each barely measurable
        | on its own, combined to make a big difference.
       | 
       | IIRC it was in the List.Add method, a very commonly used function
       | in the C# core libs. First one programmer refactored it to very
       | slightly reduce how many instructions were output when compiled.
        | Then a second programmer made some JIT compiler optimizations
        | which also affected this Add method, making it a little smaller
        | as well.
       | 
        | Alone, each change was hard to even measure, but each seemed like
        | it should be a net win, at least in theory. Combined, the two
        | changes made the Add method small enough to be an inlining
        | candidate! Which meant that in real programs there were sometimes
        | very measurable performance improvements.
       | 
       | As others in this post have noted, a removed bounds check might
       | also unblock vectorization optimizations in a few cases. One
       | might be able to construct a test case where removing the check
        | speeds things up by a factor of 16!
        
       | downvotetruth wrote:
       | Too much - write vague question titles & get comments in kind
        
       | tayistay wrote:
       | What if a compiler were to only allow an array access when it can
       | prove that it's in bounds? Wherever it can't you'd have to wrap
       | the array access in an if, or otherwise refactor your code to
       | help the compiler. Then you'd have no panicking at least and more
       | predictable performance.
        
         | constantcrying wrote:
         | >What if a compiler were to only allow an array access when it
         | can prove that it's in bounds?
         | 
         | Even very good static analysis tools have a hard time doing
         | this. In a language like C++ this would effectively mean that
          | very few index operations could be done naively, and compile
          | times would be significantly increased. Performance would likely
          | be reduced as well compared to the trivial alternative of using
          | a safe array.
        
         | glittershark wrote:
         | there's a cheeky link to idris's vector type in the second
         | paragraph: https://www.idris-
         | lang.org/docs/idris2/current/base_docs/doc... which
         | accomplishes just that
        
         | jackmott wrote:
        
         | tylerhou wrote:
         | With bounds checking by default, even if a compiler can't
          | statically prove that an index is in bounds: if the index is in
          | practice always in bounds, the compiler inlines the
          | check/branch into the calling function, and you're not starved
          | of branch prediction resources, then the check will be "free"
          | because the branch will always be predicted correctly.
        
         | est31 wrote:
          | Rust has a tool for that: iterators. It's limited, however.
        
           | dathinab wrote:
            | also `slice.get()` will return an Option
            | 
            | and using a manual range check before an index will normally
            | let the compiler optimize the internal bounds check away
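            | 
            | e.g. (illustrative sketch):
            | 
            |     // `get` makes the bounds check explicit and returns an
            |     // Option instead of panicking:
            |     fn maybe_at(data: &[u32], i: usize) -> Option<u32> {
            |         data.get(i).copied()
            |     }
            | 
            |     // A manual range check up front; the `data[i]` below then
            |     // normally compiles without its own bounds check:
            |     fn double_at(data: &[u32], i: usize) -> u32 {
            |         if i < data.len() {
            |             data[i] * 2
            |         } else {
            |             0
            |         }
            |     }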
        
         | nigeltao wrote:
         | That's how WUFFS (Wrangling Untrusted File Formats Safely)
         | works:
         | 
         | https://github.com/google/wuffs#what-does-compile-time-check...
        
       | constantcrying wrote:
        | I imagine the reason bounds checks are cheap is the branch
        | predictor. If you always predict the in-bounds path, the
       | check is almost free.
       | 
        | You also do not really care about flushing the pipeline on an
        | out-of-bounds index, since normal operation very likely cannot
        | continue anyway and you move over to handling/reporting the
        | error, which has no need for significant throughput.
       | 
       | Also I would just like to note that safe arrays aren't a unique
       | rust feature. Even writing your own in C++ is not hard.
        
         | jackmott42 wrote:
         | Yep, unless your code is wrong, the bounds check will always be
         | predictable. Which makes it free in a sense. But sometimes it
         | will block other optimizations, and it takes up space in the
         | caches.
        
         | pantalaimon wrote:
         | That would be bad on Embedded where MCUs usually don't do any
         | branch prediction.
        
         | dathinab wrote:
         | > If you always predict the in bounds path, the check is almost
         | free.
         | 
          | Note that you normally only branch in the "bad" case, which
          | means that even on systems without branch prediction it tends
          | not to be very expensive, and compilers can also eliminate a
          | lot of bounds checks.
        
       | Jach wrote:
       | Always amuses me that it's current year and people think about
       | turning off checks, even when they're pretty much free in modern*
       | (since 1993 Pentium, which got like 80% accuracy with its
       | primitive branch prediction?) CPUs...
       | 
       | "Around Easter 1961, a course on ALGOL 60 was offered ... After
       | the ALGOL course in Brighton, Roger Cook was driving me and my
       | colleagues back to London when he suddenly asked, "Instead of
       | designing a new language, why don't we just implement ALGOL60?"
       | We all instantly agreed--in retrospect, a very lucky decision for
       | me. But we knew we did not have the skill or experience at that
       | time to implement the whole language, so I was commissioned to
       | design a modest subset. In that design I adopted certain basic
       | principles which I believe to be as valid today as they were
       | then.
       | 
       | "(1) The first principle was _security_ : The principle that
       | every syntactically incorrect program should be rejected by the
       | compiler and that every syntactically correct program should give
       | a result or an error message that was predictable and
       | comprehensible in terms of the source language program itself.
       | Thus no core dumps should ever be necessary. It was logically
       | impossible for any source language program to cause the computer
       | to run wild, either at compile time or at run time. A consequence
       | of this principle is that every occurrence of every subscript of
       | every subscripted variable was on every occasion checked at run
       | time against both the upper and the lower declared bounds of the
       | array. Many years later we asked our customers whether they
       | wished us to provide an option to switch off these checks in the
       | interests of efficiency on production runs. Unanimously, they
       | urged us not to -- they already knew how frequently subscript
       | errors occur on production runs where failure to detect them
       | could be disastrous. I note with fear and horror that even in
       | 1980, language designers and users have not learned this lesson.
       | In any respectable branch of engineering, failure to observe such
       | elementary precautions would have long been against the law."
       | 
       | -Tony Hoare, 1980 Turing Award Lecture
       | (https://www.cs.fsu.edu/~engelen/courses/COP4610/hoare.pdf)
        
         | dataangel wrote:
          | They're nowhere near free. The branch prediction table has
          | finite entries, the instruction cache has finite size,
          | autovectorization is broken by bounds checks, inlining (the
          | most important
         | optimization) doesn't trigger if functions are too big because
         | of the added bounds checking code, etc. This is just not great
         | benchmarking -- no effort to control for noise.
        
         | titzer wrote:
         | > I note with fear and horror that even in 1980, language
         | designers and users have not learned this lesson. In any
         | respectable branch of engineering, failure to observe such
         | elementary precautions would have long been against the law.
         | 
         | Here we are, 42 years later, and bounds checks are still not
         | the default in some languages. Because performance, or
         | something. And our computers are literally 1000x as fast as
         | they were in 1980. So instead of paying 2% in bounds checks and
          | getting a _mere_ 980x faster, we get 2-3x more CVEs, costing
         | the economy billions upon billions of dollars a year.
        
           | nine_k wrote:
           | Removing bounds checks is a stark example of a premature
           | optimization.
           | 
           | You can remove bounds checks when you can _prove_ that the
            | index won't ever get out of bounds; this is possible in many
           | cases, such as iteration with known bounds.
        
       | bugfix-66 wrote:
       | Similarly, you can turn off bounds-checking in Go like this:
        |     go build -gcflags=-B
       | 
       | and see if it helps. Generally the assembly looks better, but it
       | doesn't really run faster on a modern chip.
       | 
       | Do your own test, and keep the results in mind next time somebody
       | on Hacker News dismisses Go because of the "overwhelming cost of
       | bounds checking".
        
         | masklinn wrote:
         | > next time somebody on Hacker News dismisses Go because of the
         | "overwhelming cost of bounds checking".
         | 
         | That's certainly one criticism I don't remember ever seeing.
        
           | viraptor wrote:
           | There's a few examples like this
           | https://news.ycombinator.com/item?id=32256038 if you search
           | comments for "go bounds checking"
        
       | lowbloodsugar wrote:
       | I am a Rust fan, but 10% degradation in performance (29ms to
       | 33ms) is not "a pretty small change" nor "within noise
       | threshold". If the accuracy of the tests are +/- 10% then that
       | needs to be proven and then fixed. I didn't see any evidence in
       | the article that there is, in fact, a 10% error, and it looks
       | like there is a genuine 10% drop in performance.
        
         | glittershark wrote:
          | a 10% drop in performance with bounds checks _removed_, mind
         | you - so if anything the bounds checks are improving
         | performance.
        
           | pantalaimon wrote:
           | The more likely explanation is that the test is bunk.
           | 
           | Or maybe the unsafe access acts like volatile in C and
           | disables any optimization/reordering because the compiler
           | thinks it's accessing a register.
        
         | hra5th wrote:
         | To be clear, _removing_ the bounds checks led to the observed
         | performance degradation. Your statement beginning with  "I am a
         | Rust fan, but..." suggests that you might have interpreted it
          | as the other way around.
        
           | lowbloodsugar wrote:
           | I certainly did. Thank you for pointing that out. That
           | suggests then that the problem is the tests are bogus.
        
       | tragomaskhalos wrote:
       | I would expect an iterator into a slice to not incur any bounds
        | checking, as the compiler can compute the end pointer once as
        | start + size. So idiomatic looping should be maximally efficient,
        | you'd hope.
        
         | lmkg wrote:
          | The compiler shouldn't have to deduce anything; an iterator
          | shouldn't have a bounds check to begin with. It ought to be
         | using unsafe operations under the hood, because it can
         | guarantee they will only be called with valid arguments.
        
           | meindnoch wrote:
           | Calling 'next' on an iterator involves a bounds check.
        
           | tylerhou wrote:
           | Safe iterators have to have bounds checks for dynamically
            | sized arrays; otherwise, how do you prevent them from walking
            | past the end?
        
             | apendleton wrote:
             | In Rust at least, once you instantiate the iterator, the
             | array it's iterating over can't be mutated until the
             | iterator is dropped, and that can be statically guaranteed
             | at compile time. So you don't need to bounds-check at every
             | access; you can decide at the outset how many iterations
             | there are going to be, and doing that number of iterations
             | will be known not to walk past the end.
        
               | vore wrote:
               | I don't think that's always possible in practice:
               | consider Vec<T>, whose size is only known at runtime. A
               | Vec<T>'s iterator can only do runtime bounds checking to
               | avoid walking past the end.
               | 
               | That said, this is unavoidable in C/C++ too.
        
               | apendleton wrote:
               | I think we're suffering from some fuzziness about what
               | bounds checks we're referring to. Even in your example,
               | you only need to check the size of the Vec<T> when you
               | instantiate the iterator, not each time the iterator
               | accesses an element, because at the time the iterator
               | over the Vec<T>'s contents is instantiated, the Vec<T>'s
               | size is known, and it can't change over the life of the
               | iterator (because mutation is disallowed when there's an
               | outstanding borrow). With a regular for-loop:
                | 
                |     for i in 0..v.len() {
                |         println!("{:?}", v[i]);
                |     }
               | 
               | you check the length at the top (the `v.len()`) and
               | _also_ for each `v[i]`. The first is unavoidable, but the
               | second can be skipped when using an iterator instead,
               | because it can be statically guaranteed that, even if you
                | don't know at compile time what the length concretely
                | is, whatever it ends up being, the index will never
               | exceed it. Rust specifically differs from C++ in this
               | respect, because nothing in that language prevents the
               | underlying vector's length from changing while the
               | iterator exists, so without per-access bounds checks it's
               | still possible for an iterator to walk past the end.
        
               | tylerhou wrote:
               | When I read "iterator" I think of an object that points
               | into the vector and can be advanced. For Rust's vector,
               | that is std::slice::Iter (https://doc.rust-
               | lang.org/std/slice/struct.Iter.html). When you advance an
               | iterator, you must do a bounds check if the vector is
               | dynamically sized; otherwise, you don't know when to
                | stop. I.e., if I have
                | 
                |     let mut it = vec.iter();
                |     println!("{:?}", it.next());
                |     println!("{:?}", it.next());
                |     println!("{:?}", it.next());
               | 
               | This needs to do bounds checking on each call to next()
               | to either return Some(a) or None (assuming the length of
                | vec is unknown at compile time). (https://doc.rust-
               | lang.org/beta/src/core/slice/iter/macros.rs....)
               | 
               | You are right that theoretically a range-based for loop
               | _that uses iterators_ does _not_ need to do bounds
               | checking because a compiler can infer the invariant that
                | the iterator is always valid. In practice I don't know
               | enough about LLVM or rustc to know whether this
               | optimization is actually happening.
        
               | oever wrote:
               | The Rust compiler guarantees that the memory location and
               | size of the iterated array do not change during the
               | operation. So the iterator can be a pointer that iterates
               | until it points to the end of the array. There is no need
               | to do bounds checks: the pointer only goes over the valid
               | range.
               | 
                | In C/C++, the array _can_ change. It might be moved,
                | deallocated or resized in the current thread or another
                | thread. So the pointer that iterates until it is equal
                | to the end pointer might point to invalid data if the
                | size, location or existence of the vector changes.
        
               | kaoD wrote:
                | What parent means is that you won't have any bounds checks
               | on array access, just a len(arr) loop.
        
       | gigatexal wrote:
       | TL;DR - in the test bounds checking vs no checks showed no
       | noticeable difference. Very good article though. Worth reading.
        
         | carl_dr wrote:
         | Not too long, did read :
         | 
         | The benchmark went from 28.5ms to 32.9ms.
         | 
          | That's about 15%, which is huge; it's not noise.
         | 
          | The test is flawed in some way; the article is disappointing in
          | that the author didn't investigate further.
        
           | dale_glass wrote:
           | MySQL is a huge amount of code doing a variety of things in
           | each query -- networking, parsing, IO, locking, etc. Each of
           | those can easily have significant and hard to predict
           | latencies.
           | 
           | Benchmarking that needs special care, and planning for
           | whatever it is you want to measure. A million trivial queries
           | and a dozen very heavy queries are going to do significantly
           | different things, and have different tradeoffs and
           | performance characteristics.
        
             | carl_dr wrote:
             | The benchmark was specifically testing the hot path of a
             | cached query in their MySQL caching proxy. MySQL wasn't
             | involved at all.
             | 
             | I agree completely that benchmarks need care, hence my
             | point that the article is disappointing.
             | 
             | The author missed the chance to investigate why removing
             | bounds checks seemed to regress performance by 15%, and
             | instead wrote it off as "close enough."
             | 
             | It would have been really interesting to find out why, even
             | if it did end up being measurement noise.
        
       | bjourne wrote:
       | The reason performance decreased when he removed bounds checking
        | is that asserting bounds is very useful to a compiler.
       | Essentially, the compiler emits code like this:
        |     1. if (x >= 0) && (x < arr_len(arr))
        |     2.     get element from array index x
        |     3. else
        |     4.     throw exception
        |     5. do more stuff
       | 
        | The compiler deduces that at line 5, 0 <= x < arr_len(arr) holds.
        | From that it can deduce that abs(x) is a no-op, that 2*x won't
        | overflow (because arrays can only have 2^32 elements), etc.
        | Without bounds checking the compiler emits:
        | 
        |     1. get element from array index x
        |     2. do more stuff
       | 
       | So the compiler doesn't know anything about x, which is bad. The
       | solution which apparently is not implemented in Rust (or LLVM,
        | idk) is to emit code like the following:
        | 
        |     1. assert that 0 <= x < arr_len(arr)
        |     2. get element from array index x
        |     3. do more stuff
        
         | vore wrote:
         | I'm not sure I follow: where is abs(x)?
        
           | layer8 wrote:
           | It's an example of what could occur within "do more stuff".
           | The mentioned 2*x is another example.
        
         | est31 wrote:
         | Interesting observation. So one should instead do the
          | comparison with something like:
          | 
          |     1. if (x >= 0) && (x < arr_len(arr))
          |     2.     get element from array index x
          |     3. else
          |     4.     core::hint::unreachable_unchecked
          |     5. do more stuff
         | 
         | Where unreachable_unchecked transmits precisely such
         | information to the optimizer: https://doc.rust-
         | lang.org/stable/std/hint/fn.unreachable_unc...
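          | 
          | In concrete Rust that might look like this (illustrative;
          | note that `unreachable_unchecked` is itself unsafe, so a wrong
          | hint is undefined behavior):
          | 
          |     fn load(arr: &[u32], x: usize) -> u32 {
          |         if x < arr.len() {
          |             arr[x] // the check above lets this one be elided
          |         } else {
          |             // promise the optimizer this branch never happens
          |             unsafe { core::hint::unreachable_unchecked() }
          |         }
          |     }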
        
       | [deleted]
        
       | titzer wrote:
       | For Virgil, there is a switch to turn off bounds checking, for
        | the sole purpose of measuring their cost. (It's not expected that
        | anyone ever do this for production code.) Bounds checks do not
       | appear to slow down any program that matters (TM) by more than
       | 2%. That's partly because so many loops automatically have bounds
       | checks removed by analysis. But still. It's negligible.
        
       | dathinab wrote:
        | Anything between nothing and a single "branch if int/pointer >
        | int/pointer" that is almost always correctly predicted to _not_
        | jump.
       | 
        | This kind of bounds check is normally never violated (in well-
        | formed code), so branch prediction predicts it correctly nearly
        | always.
       | 
        | It also (normally) only jumps in the bad case, which means that
        | with correct branch prediction it can be really cheap.
       | 
        | And then CPU "magic" tends to be optimized for this kind of
        | check, as it appears in a lot of languages (e.g. Java).
       | 
        | Then, in many cases the compiler can partially eliminate the
        | checks.
       | 
        | For example, in many kinds of for-each iteration the compiler
        | can infer that the loop continuation check implies the bounds
        | check. Combine that with loop unrolling, which can reduce the
        | number of continuation checks, and you might end up with even
        | less.
       | 
        | Also, bounds checks tend to be an emergency guard, so you
        | sometimes do checks yourself before indexing, and the compiler
        | can often use those to eliminate the bounds check.
       | 
        | And even if you ignore all optimizations, it's (assuming the
        | index is in bounds) "just" at most one int/pointer compare
        | (cheap) followed by a conditional branch which is not taken
        | (cheap).
        
       | api wrote:
       | It's a lot cheaper than having an RCE and being completely pwned.
        
       | mastax wrote:
       | One technique is to add asserts before a block of code to hoist
       | the checks out. The compiler is usually smart enough to know
       | which conditions have already been checked. Here's a simple
       | example: https://rust.godbolt.org/z/GPMcYd371
       | 
       | This can make a big difference if you can hoist bounds checks out
       | of an inner loop. You get the performance without adding any
       | unsafe {}.
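        | 
        | A minimal version of the idea (sketch; the godbolt link has the
        | real example):
        | 
        |     fn sum_first_n(data: &[u64], n: usize) -> u64 {
        |         // One check up front...
        |         assert!(n <= data.len());
        |         let mut total = 0;
        |         for i in 0..n {
        |             // ...so the per-iteration check on data[i] can usually
        |             // be dropped, since i < n <= data.len().
        |             total += data[i];
        |         }
        |         total
        |     }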
        
         | est31 wrote:
          | Yeah, this is because the error message printed contains the
          | location of the error as well as the attempted index. Thus the
          | individual bounds failures differ from each other, and the
          | optimizer can't hoist the checks out on its own (plus there are
          | probably some concerns due to side effects of opaque
          | functions).
        
         | [deleted]
        
       | tester756 wrote:
        | I've been shocked to hear C programmers actually concerned about
        | the performance penalty of checks.
        | 
        | Like, why bother? CPUs in the next 2 years will win that
        | performance back anyway,
       | 
       | and your software will be safer
        
         | Gigachad wrote:
         | Most online debates are filled with illogical opinions on
         | theoretical issues. You get people on this site complaining
         | that they have to spend money on a catalytic converter because
         | it's not required for the car to run and only prevents other
         | people from getting cancer.
        
         | humanrebar wrote:
          | _All_ of the CPUs? C runs in a lot of places.
        
           | tester756 wrote:
           | If you need perf, then consider
           | 
           | better algorithms, better data structures, multi-threading,
           | branchless programming (except safety), data-oriented design
           | 
            | and only then elimination of checks, not as the first step.
        
       | rfoo wrote:
       | A consistent 5 ms difference in micro-benchmarks is definitely
       | not "measurement noise". Noise averages out way before
        | accumulating to 5ms. There must be a reason and it most likely
       | relates to the change. So you can confidently say that removing
       | bounds checking (at least with how you did it) is a regression.
       | 
       | ... that being said, I'd argue that the most beneficial memory-
       | safety feature of Rust is about temporal things (i.e. prevents
       | UAF etc) instead of spatial ones.
        
         | spullara wrote:
         | A benchmarking harness without error bars?
        
         | whatshisface wrote:
          | Well, there is both random and systematic error in any
         | experiment, and if 5ms is small relative to anything you'd
         | expect (or there is some other reason to discount it) then it
         | might be related to a problem in the benchmarking setup that's
         | too small to be worth resolving. Any test is good to within
         | some level of accuracy and they don't always average out to
         | infinitely good if you rerun them enough times.
        
           | joosters wrote:
           | The 5ms isn't the key number. It's 5ms extra over a 28ms
           | baseline, that's about 18% difference. If your noise
           | threshold is 18%, then I think you have to accept that the
           | benchmark probably isn't any good for this stated task.
        
             | viraptor wrote:
             | https://github.com/bheisler/criterion.rs is good for tests
             | like that. It will give you much more than a single number
             | and handle things like outliers. This makes identifying
             | noisy tests simpler.
        
               | glittershark wrote:
               | The benchmarking harness that the post uses is based on
               | criterion
        
       | piwi wrote:
       | The article mentions measurement noise several times without
       | addressing the uncertainty. It would help to add statistical
        | tests; otherwise the spread could let us conclude the opposite of
        | what really happened, just because we were out of luck.
        
       ___________________________________________________________________
       (page generated 2022-11-30 23:00 UTC)