hngopher.com

       [HN Gopher] Using unwrap() in Rust is okay
       ___________________________________________________________________
        
       Using unwrap() in Rust is okay
        
       Author : cp9
       Score  : 156 points
       Date   : 2022-08-08 13:27 UTC (9 hours ago)
        
 (HTM) web link (blog.burntsushi.net)
 (TXT) w3m dump (blog.burntsushi.net)
        
       | nnoitra wrote:
       | This blog post will age out in 6 months.
        
         | burntsushi wrote:
         | Why? It's basically an elaboration of the same advice I gave
         | eight years ago. And I was not the first.
        
       | ComputerGuru wrote:
       | One runtime panic I wish was a compile-time error in the rust
       | standard library is the use of an incorrect memory order, eg
       | Ordering::Release with AtomicBool::load().
       | 
       | It would have been fairly trivial to set up generic constraints
       | specifying if a read, write, or read-write ordering semantic is
       | expected and to fail to compile if it wasn't met.
        
         | pitaj wrote:
         | This could be a good application of a Clippy lint.
        
         | nynx wrote:
         | This is going to happen at some point now that const generics
         | arrived.
        
           | ComputerGuru wrote:
           | Likely only once we finally get const parameters specified as
           | regular arguments.
           | 
           | It doesn't need const at all, though. You just need three
           | traits and either first class enum variants as types or a
           | pseudo enum (mod/struct Ordering with ZST structs Acquire,
           | Release, etc)
        
       | jeffrallen wrote:
       | When you've gotten to the point of saying, "well, it's ok to
       | panic in example code" you've already lost the game. Novice
       | programmers learn from example code, and novice programmers are
       | an order of magnitude more common than experienced ones. A
       | programming ecosystem that depends on 9 out of every 10 people
       | being able to intuit that which the other 1 understands is not an
       | ecosystem that's going to produce good code.
       | 
       | Rust is a minefield of bear traps laid by experts, and I fear for
       | the future of our industry if Go and Java programmers are
       | required by some quirk of network effects or first mover
       | advantage or whatever to starting programming (badly) in Rust.
        
         | burntsushi wrote:
         | Did you read the blog? I addressed all of that.
        
       | schubart wrote:
       | > When checking preconditions, make sure the panic message
       | relates to the documented precondition, perhaps by adding a
       | custom message. For example,
       | 
       | > `assert!(!xs.is_empty(), "expected parameter 'xs' to be non-
       | empty")`.
       | 
       | This panics with
       | 
       | > thread 'main' panicked at 'expected parameter 'xs' to be non-
       | empty', src/main.rs:79:5
       | 
       | Without the custom message it's
       | 
       | > thread 'main' panicked at 'assertion failed: !xs.is_empty()',
       | src/main.rs:79:5
       | 
       | Given that panics should be for bugs, i.e. interpreted by
       | developers, I'd say the second message is clear enough and a
       | custom message just adds noise in the source code.
        
       | sitkack wrote:
       | One can set the env var themselves in main(), many errors are
       | transient, best to capture it the first time.
       | // use std::env;         env::set_var("RUST_BACKTRACE", "full");
        
         | burntsushi wrote:
         | Hah. Ironically, set_var might be deprecated at some point and
         | replaced with an unsafe alternative. (Long story. Short story
         | is that it's currently unsound. If you have C code trying to
         | read from the environment at the same time you end up with UB.
         | If everything is Rust code though, then I believe you're fine.)
        
           | sitkack wrote:
           | I was trying to find a "turn on backtrace" as an official api
           | but couldn't find one in 90s of looking. I see the backtrace
           | crate for catching them in user space in process, which is
           | nice.
           | 
           | What would you recommend? Forking and execing yourself to set
           | the env var? backtrace::enable_full() or something to that
           | effect would be nice.
        
       | ww520 wrote:
       | I generally just use expect(), assert!, or unreachable! rather
       | than simply unwrap() to document the unexpected invalid state.
        
       | c7DJTLrn wrote:
       | The problem is that Rust developers don't just use unwrap() when
       | it _should_ panic. I 've seen plenty of "production grade" crates
       | basically just unwrap because the author didn't know how to
       | handle it gracefully or just wanted to get the code compiling,
       | then forgot about it.
        
         | bern4444 wrote:
         | When writing a library, wouldn't the prudent thing to do be to
         | return a Try/Either or have the method accept a function to
         | invoke in case of an error (like an IOC solution)?
        
         | ironmagma wrote:
         | But luckily when that time does come to refactor the code to
         | handle the error case, the Rust compiler does a lot to help you
         | make sure it's handled properly.
        
         | librexpr wrote:
         | Maybe it would have been a good idea to have two names for
         | unwrap, one which would mean "I'm certain that this value will
         | always be okay", and another which would mean "I'm taking a
         | shortcut because I'm writing a script or just want it to
         | compile for now". Maybe a longer name like "assert_valid()" for
         | the one where you're sure it's okay. That might make it easier
         | to find the places where shortcuts were taken and forgotten.
        
           | [deleted]
        
           | jeffrallen wrote:
           | Maybe if Rust code wasn't so hard to get working at all, then
           | more of it would work correctly (including checking errors
           | instead of giving up and unwrapping).
        
             | dureuill wrote:
             | not sure what you're referring to, but yes having a first
             | iteration of code with unwrap here and there, and then a
             | second one with proper error handling is a viable strategy.
             | the "loudness" of unwrap makes it super easy to notice any
             | leftover in code review
        
           | burntsushi wrote:
           | Consider using anyhow. At that point, unwrap isn't a shortcut
           | anymore except for one place: you are in a deeply nested
           | function call and you suddenly realize you need to bubble up
           | an error a few layers. Now you have to change a bunch of
           | function signatures.
           | 
           | I don't find myself in that position too frequently.
           | Certainly not enough to warrant two identical but differently
           | named functions.
           | 
           | In that case, I would suggest coming up with your own
           | pattern. Perhaps an unwrap() with a FIXME comment. Or a
           | expect("FIXME").
        
           | r3un1 wrote:
           | There is! `expect()`
        
             | sanderjd wrote:
             | I think this is a good answer. People should use `expect`
             | for this "I'm sure this is fine" case, with descriptive
             | text about why it's fine.
        
         | dymk wrote:
         | I'm confused, you seem to be saying opposite things. You're
         | saying authors don't use unwrap when it should panic, and in
         | the next sentence, you say they use unwrap (which causes a
         | panic) when the failure could be gracefully recovered.
        
           | c7DJTLrn wrote:
           | "don't just"
        
             | dymk wrote:
             | Got it. OJFord has it right, my parens were in the wrong
             | place.
        
             | OJFord wrote:
             | Heh, GP (I assume) and I initially parsed it as "don't
             | just-use" (don't simply use) rather than "don't-just use"
             | (don't only use).
        
         | dureuill wrote:
         | not sure why you're getting downvoted. maybe the scope of what
         | you're saying ('plenty of "production grade" crates unwrap
         | [when they shouldn't]') calls for a reference? Or maybe because
         | you're singling out Rust developers while the issue is
         | certainly observed in all languages with similar mechanisms
         | (see unchecked exceptions abuse in java, or aborting asserts
         | abuse in C)
         | 
         | Anyway i wouldn't say "plenty", but i did came across crates
         | (parsers :/) that would unwrap on malformed input. the
         | workaround is to encapsulate their use in a catch_unwind.
         | 
         | for the record, i had a similar issue in a c++ lib where the
         | author elected to abort on the unsupported input, so i'm
         | somewhat thankful that the idiomatic mechanism is panic (which
         | is recoverable if needs be) in Rust
        
         | burntsushi wrote:
         | That is _a_ problem. Another similar problem is that
         | programmers write buggy code. I don 't see anything
         | particularly special about 'unwrap()'. 'unwrap()' is a common
         | manifestation of it. But as others have pointed out to you,
         | this same phenomenon manifests everywhere and in other
         | programming languages, but through different means.
        
       | dcsommer wrote:
       | I totally agree with de-emphasizing the old "recoverable" vs.
       | "unrecoverable" dichotomy
       | (https://blog.burntsushi.net/unwrap/#what-about-
       | recoverable-v...). Every time I've heard programmers (especially
       | in the context of exceptions) try to define it, I've found it
       | imprecise and open to debate.
       | 
       | When invariant violations or mistakes by programmers (aka bugs)
       | are detected, the program should halt as it is in an inconsistent
       | state and continuing could be very dangerous (think
       | privacy/security/data corruption). Otherwise, don't halt (handle
       | it or have the caller handle it).
        
         | arcticbull wrote:
         | Yep, there's not really any such thing (IME) as a 'recoverable'
         | error - except with respect to I/O.
         | 
         | There's either I/O errors - or there's logic errors. A failure
         | with logic should nuke due to the app being in an inconsistent
         | state; trust is lost. An I/O error should fail softly.
        
           | ithkuil wrote:
           | A parser library can return an error when the input is wrong.
           | It doesn't necessarily mean the app is in an inconsistent
           | state when that happens; it all depends on what the
           | application does. It follows that aborting is often a
           | sensible decision in application code and rarely in library
           | code.
        
             | arcticbull wrote:
             | I would personally consider parsers as part of I/O, but
             | point taken.
        
           | fatherzine wrote:
           | Partially true. In practice people implement multiplexed
           | servers, for many reasons, including performance /
           | throughput. A logic failure should nuke _the offending
           | request_, not the entire server with all the unrelated
           | concurrent requests.
        
             | AshamedCaptain wrote:
             | This is an attitude that I often see -- library authors who
             | believe they own the process. Aborting may be the only
             | sensible thing to do in runtimes where you can end up
             | corrupting the entire process memory or the like, making
             | recovery "dubiously possible", but absolutely not for
             | anything higher level, where recovery may be safe and
             | possible.
        
               | burntsushi wrote:
               | I gave examples covering this area in the post. How would
               | you rewrite the code in this section[1], for example, to
               | conform to your views? (Assume this code is in a
               | library.)
               | 
               | [1]: https://blog.burntsushi.net/unwrap/#what-about-when-
               | invarian...
        
               | fatherzine wrote:
               | * Short term: add explicit runtime checks for every step
               | that may panic. With the way jump prediction works in
               | modern processors, the runtime cost may be smaller than
               | one may naively assume it is. Some of us would take, say,
               | a 10% perf. degradation instead of chasing prod panics in
               | the middle of the night.
               | 
               | * Long term: plug-in a richer type (logic) system, so one
               | can safely prove the most costly runtime invariants at
               | compile time.
        
               | winstonewert wrote:
               | Your short term suggestion appears confused. Panics are
               | already the result of runtime checks. The issue isn't
               | whether or not we have runtime checks, but what to do
               | when those checks fail.
               | 
               | But to your original point, its true that a multiplexing
               | server should probably try to stay up even if there is a
               | bug uncovered in handling a request. But that's precisely
               | why rust allows you to catch panics.
        
               | fatherzine wrote:
               | "It is _not_ recommended to use this function for a
               | general try /catch mechanism. The Result type is more
               | appropriate to use for functions that can fail on a
               | regular basis. Additionally, this function is not
               | guaranteed to catch all panics, see the "Notes" section
               | below."
               | 
               | https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
               | 
               | Edit. Re. already existing runtime checks, I'd expect a
               | decent compiler to optimize away redundant checks, i.e.
               | explicitly coded checks vs. checks injected automatically
               | by the compiler, as long as their semantics is identical.
        
               | burntsushi wrote:
               | Can you point to any compiler in the world that can see
               | through the construction of a DFA from user input to the
               | point that it can automatically optimize away the
               | transition table lookup?
               | 
               | > The Result type is more appropriate to use for
               | *functions that can fail on a regular basis.*
               | 
               | Emphasis mine. The DFA::is_match routine can never fail
               | on any input. Yet, you want to use a 'Result' with it.
               | 
               | Seriously. Please please pretty please read the blog
               | post. You are literally caught in precisely the tangled
               | mess that I tried to untangle in the post. If my post
               | didn't do it for you, then please consider responding to
               | specific points you disagree with in the post.
        
               | fatherzine wrote:
               | Sorry, I only meant to address with some nuance the
               | statement "A failure with logic should nuke due to the
               | app being in an inconsistent state". If you are confident
               | that the specific DFA code in the blog post will not
               | fail, more power to you! And to us, which transitively
               | trust the code via your judgment ;)
               | 
               | Edit. For technical geeking around:
               | 
               | * The point is about _redundant_ check elimination, not
               | full check elimination. A decent compiler should be able
               | to eliminate redundant checks, though perhaps the
               | language could offer some intrinsics (was: pragmas) to
               | make this 100% reliable.
               | 
               | * For DFA proofs, perhaps something like https://courses.
               | grainger.illinois.edu/cs373/fa2013/Lectures/... could be
               | a start. A great intern may tackle it over a summer ;)
        
               | burntsushi wrote:
               | > add explicit runtime checks for every step that may
               | panic
               | 
               | And do... what when the check fails? Can you please write
               | the code for it? Because I don't understand what the heck
               | you mean here. If you don't know Rust, pseudo code is
               | fine.
               | 
               | > Long term: plug-in a richer type (logic) system, so one
               | can safely prove the most costly runtime invariants at
               | compile time.
               | 
               | This is just copying what I already said in the blog
               | post. Until someone can show me how to prove the
               | correctness of arbitrary DFA construction _and_ search
               | from a user provided regular expression pattern _and_
               | demonstrate its use in a practical programming language
               | like Rust, I consider this a total non-answer, not a
               | "long term" answer.
               | 
               | Basically, your answer here confirms for me that your
               | views on how code should be structured are incoherent.
        
               | fatherzine wrote:
               | fn is_match(&self, haystack: &[u8]) -> Result<bool,
               | Error> { ... }
        
               | burntsushi wrote:
               | That's exactly what the next section of my post explores
               | and explains why it's complete bonkers:
               | https://blog.burntsushi.net/unwrap/#why-not-return-an-
               | error-...
               | 
               | So it looks like I already provided a rewrite of the code
               | for you in your preferred style:                   //
               | Returns true if the DFA matches the entire 'haystack'.
               | // This routine always returns either Ok(true) or
               | Ok(false) for all inputs.         // It never returns an
               | error unless there is a bug in its implementation.
               | fn is_match(&self, haystack: &[u8]) -> Result<bool,
               | &'static str> {           let mut state_id =
               | self.start_id;           for &byte in haystack {
               | let row = match state_id.checked_mul(256) {
               | None => return Err("state id too big"),
               | Some(row) => row,             };             let
               | row_offset = match row.checked_add(usize::from(byte)) {
               | None => return Err("row index too big"),
               | Some(row_offset) => row_offset,             };
               | state_id = match self.transitions.get(row_offset) {
               | None => return Err("invalid transition"),
               | Some(&state_id) => state_id,             };
               | match self.is_match_id.get(state_id) {               None
               | => return Err("invalid state id"),
               | Some(&true) => return Ok(true),
               | Some(&false) => {}             }           }
               | Ok(false)         }
               | 
               | My favorite part is the docs that say "this never returns
               | an error." Because if it did, the code would be buggy.
               | And now all sorts of internal implementation details are
               | leaked.
               | 
               | You can't even document the error conditions in a way
               | that is related to the input of the routine, because if
               | you could, you would have discovered a bug!
               | 
               | Another nail in the coffin of your preferred style is
               | that this will absolutely trash the performance of a DFA
               | search loop.
        
               | fatherzine wrote:
               | Man, I only gave some friendly feedback ;)
               | 
               | "this will absolutely trash the performance of a DFA
               | search loop."
               | 
               | Do you happen to have some hard data to quantify? I'm
               | curious.
        
               | ninkendo wrote:
               | One additional data point: The Rust compiler got ~2%
               | faster when they made some of the core traits in the
               | serializer infallible (ie. panic instead of using
               | Result): https://github.com/rust-lang/rust/pull/93066
               | 
               | Writeup from the contributor:
               | https://nnethercote.github.io/2022/02/25/how-to-speed-up-
               | the...
               | 
               | > The Decoder trait used for metadata decoding was
               | fallible, using Result throughout. But decoding failures
               | should only happen if something highly unexpected happens
               | (e.g. metadata is corrupted) and on failure the calling
               | code would just abort. This PR changed Decoder to be
               | infallible throughout--panicking immediately instead of
               | panicking slightly later--thus avoiding lots of pointless
               | Result propagation, for wins across many benchmarks of up
               | to 2%.
        
               | burntsushi wrote:
               | Yes. I work on regex engines. Here's the code with all
               | the crazy checks that return errors that can never
               | happen:                   fn find_fwd_imp<A: Automaton +
               | ?Sized>(             dfa: &A,             input:
               | &Input<'_, '_>,             pre: Option<&'_ dyn
               | Prefilter>,             earliest: bool,         ) ->
               | Result<Option<HalfMatch>, MatchError> {             let
               | mut mat = None;             let mut sid = init_fwd(dfa,
               | input)?;             let mut at = input.start();
               | while at < input.end() {                 let byte = match
               | input.haystack().get(at) {                     None =>
               | return Err(MatchError::GaveUp { offset: at }),
               | Some(&byte) => byte,                 };
               | sid = dfa.try_next_state(sid, byte)?;                 if
               | dfa.is_special_state(sid) {                     if
               | dfa.is_match_state(sid) {                         let
               | pattern = dfa.match_pattern(sid, 0);
               | mat = Some(HalfMatch::new(pattern, at));
               | } else if dfa.is_dead_state(sid) {
               | return Ok(mat);                     }                 }
               | at = match at.checked_add(1) {                     None
               | => return Err(MatchError::GaveUp { offset: at }),
               | Some(at) => at,                 };             }
               | eoi_fwd(dfa, input, &mut sid, &mut mat)?;
               | Ok(mat)         }
               | 
               | And here's a search (I ran it a few times):
               | $ regex-cli count matches dfa dense @$medium '\w{42}'
               | parse time:  45.331us                 translate time:
               | 28.767us               compile nfa time:  9.577558ms
               | nfa memory:  698148         compile dense dfa time:
               | 516.497476ms               dense dfa memory:  6562848
               | dense alphabet length:  113                   dense
               | stride:  128                    search time:
               | 531.808124ms                         counts:  [2]
               | 
               | Here's code that's safe that will panic when there's a
               | bug:                   fn find_fwd_imp<A: Automaton +
               | ?Sized>(             dfa: &A,             input:
               | &Input<'_, '_>,             pre: Option<&'_ dyn
               | Prefilter>,             earliest: bool,         ) ->
               | Result<Option<HalfMatch>, MatchError> {             let
               | mut mat = None;             let mut sid = init_fwd(dfa,
               | input)?;             let mut at = input.start();
               | while at < input.end() {                 sid =
               | dfa.next_state(sid, input.haystack()[at]);
               | if dfa.is_special_state(sid) {                     if
               | dfa.is_match_state(sid) {                         let
               | pattern = dfa.match_pattern(sid, 0);
               | mat = Some(HalfMatch::new(pattern, at));
               | } else if dfa.is_dead_state(sid) {
               | return Ok(mat);                     }                 }
               | at += 1;             }             eoi_fwd(dfa, input,
               | &mut sid, &mut mat)?;             Ok(mat)         }
               | 
               | And here's a search with it:                   $ regex-
               | cli count matches dfa dense @$medium '\w{42}'
               | parse time:  33.701us                 translate time:
               | 24.264us               compile nfa time:  9.55935ms
               | nfa memory:  698148         compile dense dfa time:
               | 550.959698ms               dense dfa memory:  6562848
               | dense alphabet length:  113                   dense
               | stride:  128                    search time:
               | 468.932568ms                         counts:  [2]
               | 
               | So that's 10% right there. We can keep going though.
               | Let's get rid of the bounds checks. We have to use unsafe
               | though. The stakes have risen. Now if there's a bug, we
               | probably won't get a panic. Instead we get UB. Which
               | might lead to all sorts of bad stuff.
               | fn find_fwd_imp<A: Automaton + ?Sized>(             dfa:
               | &A,             input: &Input<'_, '_>,             pre:
               | Option<&'_ dyn Prefilter>,             earliest: bool,
               | ) -> Result<Option<HalfMatch>, MatchError> {
               | let mut mat = None;             let mut sid =
               | init_fwd(dfa, input)?;             let mut at =
               | input.start();             while at < input.end() {
               | sid = unsafe {                     let byte =
               | *input.haystack().get_unchecked(at);
               | dfa.next_state_unchecked(sid, byte)                 };
               | if dfa.is_special_state(sid) {                     if
               | dfa.is_match_state(sid) {                         let
               | pattern = dfa.match_pattern(sid, 0);
               | mat = Some(HalfMatch::new(pattern, at));
               | } else if dfa.is_dead_state(sid) {
               | return Ok(mat);                     }                 }
               | at += 1;             }             eoi_fwd(dfa, input,
               | &mut sid, &mut mat)?;             Ok(mat)         }
               | 
               | And now the same search:                   $ regex-cli
               | count matches dfa dense @$medium '\w{42}'
               | parse time:  22.511us                 translate time:
               | 26.843us               compile nfa time:  8.709352ms
               | nfa memory:  698148         compile dense dfa time:
               | 565.365005ms               dense dfa memory:  6562848
               | dense alphabet length:  113                   dense
               | stride:  128                    search time:
               | 456.402963ms                         counts:  [2]
               | 
               | For a smaller gain.
        
               | fatherzine wrote:
               | This is so awesome, thanks for sharing! I am happy that
               | the "~10% perf hit" intuition is reasonably close to what
               | the data shows. What is the right tradeoff? I am not an
               | author of widely popular regex engines, so I'll leave
               | that evaluation to more qualified people.
        
               | burntsushi wrote:
               | 10% is an excellent win. And given that I consider the
               | first code snippet to be effectively nonsense, the right
               | trade off is obvious to me. :)
               | 
               | I bet if me and you sat down with a big codebase side-by-
               | side, I could convince you that the style you're
               | proposing is totally unworkable.
               | 
               | Now, whether it's worth it to use unsafe and elide bounds
               | checks is harder. The gain I showed here is small, but it
               | can be bigger in other examples. It depends. But it is
               | certainly less clear. It helps that this is probanly the
               | easiest kind of unsafe to justify, because you just have
               | to make sure the index is in bounds. Other types of
               | unsafe can be quite tricky to justify.
        
               | AshamedCaptain wrote:
               | Exceptions.
        
           | jcelerier wrote:
           | > There's either I/O errors - or there's logic errors. A
           | failure with logic should nuke due to the app being in an
           | inconsistent state; trust is lost.
           | 
           | nah. in GUI apps for instance you want the failure in the
           | logic of a sub-sub-function to just tell the error "wops"
           | when the button that triggered the action was clicked, not
           | nuke the app (unless you hate your users). e.g. imagine a 3D
           | software which allows to do mesh operations - user clicks on
           | the "Smooth the mesh" button somewhere. Programmer forgot to
           | handle a division by zero in some degenerate case of the
           | smoothing computation which ends up leading to an exception:
           | a value becomes zero, someone used unsigned integers for n in
           | an "n - 1" computation which ends up in a call to
           | array_of_floats.resize(0xffffffffffffffff) (and a likely
           | std::bad_alloc being thrown if you're in c++).
           | 
           | The original mesh is unchanged as the operation waits until
           | the computation is complete to replace the old mesh with the
           | new.
           | 
           | If you ever decide to crash in this situation I am sure you
           | will have _great_ reviews on 3D modeling software
           | comparisons.
        
             | pcwalton wrote:
             | > Programmer forgot to handle a division by zero in some
             | degenerate case of the smoothing computation which ends up
             | leading to an exception: a value becomes zero, someone used
             | unsigned integers for n in an "n - 1" computation which
             | ends up in a call to
             | array_of_floats.resize(0xffffffffffffffff) (and a likely
             | std::bad_alloc being thrown if you're in c++).
             | 
             | To me this is a textbook case of why panic::catch_unwind
             | exists in Rust. Conceptually, the smoothing algorithm
             | failed in an unexpected way, so a panic is appropriate:
             | there is nothing else for the procedure to do but crash.
             | But crashing would be a very poor user experience, because
             | the smoothing operation is conceptually _isolated_ from the
             | application as a whole, so the program shouldn 't be
             | affected by a single operation crashing. This is why Rust
             | goes to the trouble to implement unwinding on panic: in
             | certain domains, software fault isolation is important.
             | 
             | Another possibility, of course, would be to spawn a
             | separate process for the smoothing operation, which would
             | effectively replace software fault isolation with hardware
             | fault isolation. But this might be awkward (and slow),
             | which is why few modeling software packages do this.
             | 
             | ("A complex smoothing operation in a 3D modeling package
             | fails" is maybe the best example of the need for
             | panic::catch_unwind I've ever heard of--thanks for offering
             | it. I wish I had been able to deploy this example back when
             | we had the debate as to whether catch_unwind should exist
             | at all. Thankfully, we kept it in.) :)
        
             | Snarwin wrote:
             | The problem with this approach is that, when a panic
             | actually happens at runtime, there is no way to tell
             | whether it was caused by an isolated mistake like
             | forgetting to check for zero, or by memory corruption
             | writing garbage all over your data structures. If it's the
             | first, recovering is fine; if it's the second, you risk
             | exposing the user to data loss and security
             | vulnerabilities. So the only safe thing to do is crash.
             | 
             | Now, it may be true that you get better reviews selling
             | software that corrupts data or gets its users hacked than
             | selling software that crashes. But unless you hate your
             | users, those are probably things you want to avoid.
        
             | arcticbull wrote:
             | I would actually want that to crash, yes. That would ensure
             | it gets caught and resolved by developers during
             | development. Or at least increase the odds thereof.
             | Further, if it allocates a few gb, or if it's allocating a
             | large amount of memory because the size parameter got
             | smashed?
             | 
             | If it crashes enough that you'd get terrible reviews it
             | would _definitely_ be caught during development.
             | 
             | If it messes up your mesh silently then you'll definitely
             | still get bad reviews.
             | 
             | And actually having an std::bad_alloc thrown is even worse
             | since you have no idea what state it left things after
             | running some destructors when you weren't planning for it.
        
               | jcelerier wrote:
               | > That would ensure it gets caught and resolved by
               | developers during development
               | 
               | i don't know in which reality you live but in mine
               | there's not much relationship between the existence of
               | crashes, and them being resolved during development
               | 
               | > And actually having an std::bad_alloc thrown is even
               | worse since you have no idea what state it left things
               | after running some destructors when you weren't planning
               | for it.
               | 
               | i don't even know what to say. that's the whole point of
               | destructors - you _know_ that things will be unwound in
               | the reverse order from which they were created on
               | automatic storage. that 's, like, why c++ exists
        
               | outworlder wrote:
               | Only if you have implemented your destructors correctly.
               | Forgot to add a 'virtual' on one of them? Too bad. Have
               | some pointers to objects? Better remember to call delete
               | on them all. You never really 'know' things are
               | destroyed, that's why tools like Valgrind exist.
               | 
               | The real fun is really the combination with exceptions
               | and destructors. So much so, some projects essentially
               | ban exceptions in C++ code.
        
         | stormbrew wrote:
         | To me the real issue is this is an extremely forced binary and
         | there's really at least three meaningful categories (especially
         | in software with a UI of any sort):
         | 
         | - unactionable invariant violation (poisoned mutex, hard memory
         | errors): crash immediately, something that should ever happen
         | happened and there's no way to either handle or present the
         | error to the user in a meaningful way.
         | 
         | - unactionable (at the call site) but normal errors (couldn't
         | open a file, disconnected from the remote end of a connection,
         | etc): these need to be propagated up to where they can be
         | turned into actionable information for a user, ideally. This is
         | _rarely_ a thing the call site where it happened can usefully
         | do.
         | 
         | - immediately actionable and normal errors (user input didn't
         | validate, file user wanted to open doesn't exist, connection
         | failed but can be retried with a backoff, etc). These need to
         | be handled at the call site or maybe one or two levels up.
         | 
         | You _need_ an exception-like mechanism (or at least a process
         | for emulating one, a la go MRV or C errno) to handle the second
         | case, you often _want_ it for the third case, but it never
         | really makes sense to use it for the first.
         | 
         | That said, I think in non-test rust code you should use expect
         | instead of unwrap, because sometimes invariants do trip and
         | that little tiny extra bit of info can make a huge difference
         | to resolving it.
        
         | jcranmer wrote:
         | The criteria I tend to prefer is "expected" versus "unexpected"
         | errors. I/O errors, especially network errors, are things that
         | are going to be expected under reasonable operation, therefore
         | it make sense that code should handle them. Similarly, user
         | input resulting in incorrectly formatted code should be
         | reasonably expected and therefore handled.
         | 
         | But the same kinds of failures might not be reasonably expected
         | in other circumstances--I wouldn't expect that the internal
         | configuration files of an application should occur in
         | reasonable operation, and therefore it makes sense to panic if
         | they're corrupted... even if the cause is an I/O operation on a
         | local disk, or parsing some JSON or TOML or INI or whatnot
         | file.
         | 
         | One implication of this is that it needs to be easy for any
         | error system to promote an "expected error" into an "unexpected
         | error"--which is what unwrap/expect does. The
         | recoverable/unrecoverable error suggests that there ought to be
         | no reason to do this, but there is absolutely a reason to do
         | so: what category an error falls into is ultimately decided by
         | the context of the error, not the generation of the error
         | itself.
        
           | taeric wrote:
           | I'm a fan of, "is this going to a user, or to another part of
           | the calling program?" If expecting it to go to a user,
           | exception make sense. Need to bubble it all the way up to the
           | part of the system that is closest to the user.
        
           | vbezhenar wrote:
           | I write Java and for me it's not exactly "expected" versus
           | "unexpected".
           | 
           | There are unexpected errors for sure. For example
           | "StackOverflowError" which could be thrown from any method
           | call.
           | 
           | There should be unexpected errors handler which does some
           | sane thing in given circumstances. Usually it involves
           | logging error and its details (stacktrace, may be something
           | else) and returning some kind of generic error to the caller
           | (e.g. HTTP 500).
           | 
           | But the thing is, this handler very often suitable for
           | handling errors that I expect but I don't want to bother
           | handling them. For example those errors might be rare enough
           | that writing code would be a net negative (every line of code
           | is maintenance burden and error handling code is maintenance
           | burden power 2). And I'm totally OK with those errors being
           | handled in generic way.
           | 
           | Even if it's user input sometimes. For example user could be
           | me. And I know input format. I don't want to write error
           | handling for myself, all I need is to prevent data
           | corruption. Getting HTTP 500 InvalidNumberFormatException is
           | totally fine in some situations.
           | 
           | And language should provide means for writing that kind of
           | code. At least that's my opinion and that's what I truly miss
           | in those languages with explicit error handling of every
           | function call.
           | 
           | You might call it lazy coding. I call it reasonable coding.
           | 
           | Exceptions for the win!
        
             | spion wrote:
             | Rust does allow you to easily call functions or methods
             | that may error by using the `?` operator (formerly `try!`
             | macro)
             | 
             | https://doc.rust-lang.org/std/macro.try.html
             | 
             | You can also implement `From` (see `impl From` in the above
             | doc) to define how you want external error types to be
             | converted.
             | 
             | Its one of the rare languages where there isn't much
             | boilerplate involved in using `Result`-style errors, even
             | compared to exceptions (Swift is like this too)
             | 
             | There is even backtrace support for Result types:
             | https://docs.rs/trace-error/latest/trace_error/
        
           | throwawaymaths wrote:
           | It's kind of the opposite. If you have a network error, often
           | you shouldn't bother to try handling it, the best strategy is
           | often to zero out your state, give up and try again later.
           | 
           | By contrast, you should give up parsing a JSON if it's a
           | config file read on startup but probably not if it's user
           | input.
        
           | merb wrote:
           | Network errors might be retryable/routable differently, but
           | often (especially when starting out) should probably returned
           | to the User. I mean if s3 is down you can retry the call but
           | often it is down then
        
             | dymk wrote:
             | Sort of. At my company, if we removed retries from our
             | services, our reliability would drop precipitously.
             | Something like 99.99% of retries succeed on the second try,
             | if there's not a hard service outage. If there is a hard
             | outage, well, not much to do about that.
        
           | saghm wrote:
           | > The criteria I tend to prefer is "expected" versus
           | "unexpected" errors
           | 
           | The way I've often heard this phrased is "exceptions are for
           | exceptional behavior", and it's always rubbed me the wrong
           | way a bit (although maybe this is partially just because I
           | don't think wordplay is a sufficient argument to do
           | something; I've made similar arguments in the past to friends
           | who sung the praises of "no-shave November" and "thirsty
           | Thursdays"). From digging a bit deeper when I've heard this
           | opinion espoused, it seems like it mostly boils down to the
           | fact that exceptions tend not to be as efficient as happy-
           | path code, so using them for circumstances that are too
           | common is not going to lead to good performance. I guess I
           | don't really find this a subtle enough concept to warrant
           | needing to introduce another abstraction layer into the
           | discussion, especially one that's much vaguer like "is this
           | behavior expected?" If there's a performance concern, I think
           | it's much better addressed directly rather than shifting the
           | discussion to a proxy.
        
             | spion wrote:
             | In languages with exceptions there typically isn't much of
             | a distinction between expected and unexpected errors.
        
         | hinkley wrote:
         | Recoverable vs unrecoverable comes down to requirements.
         | Certain companies known for having software that 'just works'
         | tend to have both very few unrecoverable errors and very
         | conservative feature sets to help facilitate that short list.
         | 
         | It is very clearly a choice, even if many people are deciding
         | by default. By not tackling an issue, you've chosen to have
         | that issue.
        
           | dcsommer wrote:
           | I think we have compatible views. Each layer of the software
           | must decide it's requirements and handle errors appropriately
           | per requirements. You're right I didn't articulate when to
           | handle an issue locally vs. pass it up. I think that's where
           | requirements (and also explicit API guarantees) come into
           | play.
           | 
           | I do think that APIs that "overpromise" by not returning the
           | errors they do not handle to the caller, and instead halt or
           | throw an exception, do their users a disservice in the long-
           | run. These just become undocumented cases that bite you later
           | on. Better libraries have all these conditions baked into the
           | API itself.
        
             | tsimionescu wrote:
             | But an exception is a part of the API, why are you putting
             | it at the same level as halting?
        
               | dcsommer wrote:
               | In my experience, APIs that throw rarely define _all_ the
               | exceptions that can come from it, especially
               | transitively. I see exceptions as a failed (because
               | undocumented, but still important for correctness)
               | attempt at compromising between halting and returning an
               | error.
        
               | hinkley wrote:
               | I wonder if there's a moral equivalent of borrow
               | semantics where we more formally define error
               | propagation.
        
             | hinkley wrote:
             | > You're right I didn't articulate when to handle an issue
             | locally vs. pass it up.
             | 
             | I'm saying this is an everybody problem. Most of us just
             | pass the buck by default. It can take quite a bit of
             | cajoling to get error passing pushed back down to the layer
             | that can best cope with it, once it has escaped up the
             | stack. And when we split teams horizontally, instead of
             | vertically, this happens practically all the time.
             | 
             | This is one of the primary reasons I push for feature teams
             | instead of client/server teams. The distance between, "I
             | think this is done" and "this is ready to go to production"
             | is mercifully small, instead of unknowable.
        
         | alerighi wrote:
         | > When invariant violations or mistakes by programmers (aka
         | bugs) are detected, the program should halt as it is in an
         | inconsistent state and continuing could be very dangerous
         | (think privacy/security/data corruption). Otherwise, don't halt
         | (handle it or have the caller handle it).
         | 
         | Well it's not always the case. There are situations in which if
         | you detect errors you want the program to continue running, and
         | have only that particular functionality to fail.
         | 
         | I tend to write resilient code, since I work in embedded
         | systems and what you never want is the system to crash. Halting
         | a CPU on an invariant violation (i.e. and assert failing) is
         | something useful for debugging (you trigger the debugger and
         | you then analyze why it happened), but something you generally
         | don't want in production.
         | 
         | Bette to have a ton of checks more and in case of an invariant
         | violation (that maybe is resulting from a programmer mistake,
         | but there is always the possibility of hardware memory
         | corruption errors) to return an error and handle it in some
         | ways (for example restart the task that returned the error,
         | trying to go back to the last working state).
        
           | mook wrote:
           | > I tend to write resilient code, since I work in embedded
           | systems and what you never want is the system to crash.
           | 
           | That might be okay too depending on what your system is. I've
           | had a cell phone (the monochrome dumb kind) display an
           | assertion error at me, that was kind of cool. A screw was
           | coming loose, so there were hardware issues. Worst case, I'd
           | have to bring it to the store and get it repaired or
           | replaced; there was no threat of injury or large monetary
           | damage.
        
           | burntsushi wrote:
           | > There are situations in which if you detect errors you want
           | the program to continue running, and have only that
           | particular functionality to fail.
           | 
           | Yes, like a web server. If a request handler fails by
           | panicking, in a Rust program, you catch the panic, respond
           | with a 500 error and log the panic somewhere. But you
           | continue serving other requests.
           | 
           | I talked about this in the blog post.
           | 
           | The problem with your strategy is that it requires you to be
           | aware of your own mistakes. That doesn't sound like a robust
           | strategy, unless you're investing huge resources into
           | sophisticated tooling and have drastically restricted the
           | expressivity of your programming environment. That exists and
           | is fine, and I even addressed that in the blog post too.
        
             | marshray wrote:
             | Great article BTW, loved it! Will certainly become a
             | classic.
             | 
             | The web server example scares me.
             | 
             | Something happened during execution that the programmer
             | didn't expect. There's a bug in the program. What if the
             | panic is due to memory corruption (less likely in Rust) or
             | internal data structure corruption?
             | 
             | Without knowledge to the contrary, swallowing a panic and
             | YOLO'ing execution is driving full speed down the road of
             | very poorly defined behavior.
             | 
             | If the programmer had sufficient knowledge to conclude it
             | was safe, they could have just used a Result<> to report
             | the error.
             | 
             | So ... don't make panic part of your API, and don't recover
             | from panics?
        
               | ninkendo wrote:
               | > Without knowledge to the contrary, swallowing a panic
               | and YOLO'ing execution is driving full speed down the
               | road of very poorly defined behavior.
               | 
               | I mean, if you're worried about code "corrupting" things
               | such that it may not be safe to even serve the next
               | request, you probably need to worry about successful
               | requests as well. I mean, if all bets are off, all bets
               | are truly off.
               | 
               | At some point, if you decide to colocate code from
               | multiple "apps" in one server, you gotta have some level
               | of trust in said apps. You'd obviously never do this for
               | untrusted code. And you'd never do this for code that
               | (hand-waving as you have) "corrupts" things in such a way
               | that it's not safe to handle the next request, panic or
               | no.
        
               | marshray wrote:
               | > You'd obviously never do this for untrusted code.
               | 
               | Doesn't panic imply that the program entered a state from
               | which it was never explicitly designed to recover?
               | 
               | Since that state may persist across invocations of a
               | function, doesn't that make the code effectively
               | 'untrusted' for subsequent execution?
        
               | burntsushi wrote:
               | Doesn't scare me in a memory safe language. Request
               | handlers tend to be very well isolated. It's standard
               | practice for web servers written in Go too.
               | 
               | > So ... don't make panic part of your API, and don't
               | recover from panics?
               | 
               | That's the wrong lesson. As my blog says, don't use
               | panicking for error handling. Making panicking part of
               | the API is standard practice. Otherwise `slice[i]`
               | wouldn't exist. (Its API is to panic when `i` is out of
               | bounds.)
               | 
               | Lest you think you should just never panic and convert
               | everything into errors, that's addressed in the blog too.
               | :)
        
               | marshray wrote:
               | As the sibling post said:
               | 
               | > I mean, if all bets are off, all bets are truly off.
               | 
               | We have to expect that panic (e.g. from slice index
               | operator) exercised an execution path that the library
               | developer had never even considered, much less validated.
               | 
               | So then, post-panic, are we left with _any_ guarantees
               | about the validity of postconditions?
        
         | simion314 wrote:
         | Don't exception also halt your program if you ignore them?
         | 
         | Also if using a library I don't want a bug in it to bring my
         | program down, then I am forced to use workarounds like create a
         | child process to use the library, start the child process from
         | the main process and check on it to see if it fails or
         | succeeds, that would be bad for performance and ugly.
        
       | renewiltord wrote:
       | Very neat. Just discovered anyhow and the context stuff from this
       | post.
        
       | ArrayBoundCheck wrote:
       | Does this mean using C inside of rust is ok? I'm pretty sure the
       | original team kept hitting memory problems and it was in fact not
       | ok. Browsers crash often enough that I don't want unwrap making
       | it worse
        
         | TheDong wrote:
         | > Does this mean using C inside of rust is ok? I'm pretty sure
         | the original team kept hitting memory problems and it was in
         | fact not ok
         | 
         | This reads a little like flamebait.
         | 
         | Memory unsafety is bad. panic is _not_ a memory unsafe
         | operation, it does not result in memory problems.
         | 
         | This post is not related to memory safety in any way.
        
           | ArrayBoundCheck wrote:
           | The third sentence was the point. I don't want browsers being
           | less reliable and terminating randomly due to unwraps
        
             | burntsushi wrote:
             | And how do you propose they do that? What you're saying is
             | tautological. It's like saying, "I don't want browsers to
             | use runtime invariants because they can be broken and thus
             | causing my browser to terminate."
             | 
             | Like... yeah, cool. How do you do that exactly?
             | 
             | Also, please do consider reading the blog post I wrote. I
             | wrote it to try to clarify a lot of confusing around this
             | topic. It is nuanced, and it just can't be untangled in a
             | few sentences in a HN comment.
        
             | 0x457 wrote:
             | I think you missed the point of the article. In fact, the
             | article specifically covers catching panics, so it doesn't
             | bring down the entire application. As well as covering when
             | panic is okay. It doesn't say you should just unwrap
             | everything and treat panic as error handling.
        
       | richardwhiuk wrote:
       | I think the `expect()` bad examples are something of a strawman.
       | 
       | The `.expect()` for regex for example would say what the regex is
       | matching for.
       | 
       | I think it'd be desirable to have a
       | `.unwrap_with_context("Context: {}")`, and the you'd get
       | `Context: Inner Panic Info`.
        
         | burntsushi wrote:
         | So you're saying that the 'expect()' message when a regex
         | compilation error occurs should be a translation from a terse
         | domain specific language to bloviating prose? :-)
         | 
         | What 'expect()' message would you write for this regex?
         | https://github.com/BurntSushi/ucd-generate/blob/6d3aae3b8005...
         | 
         | I think 'unwrap()' there is perfectly appropriate.
         | 
         | > I think it'd be desirable to have a
         | `.unwrap_with_context("Context: {}")`, and the you'd get
         | `Context: Inner Panic Info`.
         | 
         | Why?
        
           | richardwhiuk wrote:
           | .expect("UnicodeData::from_str regex failed to compile")
           | 
           | With this, even with out backtrace, you can work out what
           | happened.
           | 
           | Without it, you just know that some regex somewhere is
           | invalid.
        
             | burntsushi wrote:
             | Have you ever seen a Regex::new(..).unwrap() fail? It
             | sounds like maybe not. It also sounds like you haven't seen
             | an 'unwrap()' fail either.
             | 
             | > Without it, you just know that some regex somewhere is
             | invalid.
             | 
             | That's bologna. As I discuss in the blog post, 'unwrap()'
             | tells you the line number at which it panicked. There's
             | even an example showing exactly this. There's even
             | _another_ example showing what happens when you call
             | 'Regex::new(..).unwrap()' and it fails[1]:
             | fn main() {
             | regex::Regex::new(r"foo\p{glyph}bar").unwrap();         }
             | 
             | And running it:                   $ cargo run
             | Finished dev [unoptimized + debuginfo] target(s) in 0.00s
             | Running `target/debug/rust-panic`         thread 'main'
             | panicked at 'called `Result::unwrap()` on an `Err` value:
             | Syntax(         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~         regex parse
             | error:             foo\p{glyph}bar                ^^^^^^^^^
             | error: Unicode property not found         ~~~~~~~~~~~~~~~~~
             | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             | ~~~         )', main.rs:4:36         note: run with
             | `RUST_BACKTRACE=1` environment variable to display a
             | backtrac
             | 
             | So you get the line number (as is standard with any
             | 'unwrap()') and you get the actual error message from the
             | regex. No need to even enable the backtrace. You just don't
             | need 'expect()' here.
             | 
             | [1]: https://blog.burntsushi.net/unwrap/#why-not-use-
             | expect-inste...
        
         | xvedejas wrote:
         | When you unwrap/expect on an error, the inner error value is
         | already printed. Are you suggesting something more? Reference
         | the implementation details from the article:
         | impl<T, E: std::fmt::Debug> Result<T, E> {           pub fn
         | unwrap(self) -> T {             match self {
         | Ok(t) => t,               Err(e) => panic!("called
         | `Result::unwrap()` on an `Err` value: {:?}", e),             }
         | }         }
        
         | vincenv wrote:
         | You can get that with anyhow [1] by calling unwrap after
         | `.with_context`.
         | 
         | [1]: https://docs.rs/anyhow/latest/anyhow/
        
       | berryton wrote:
       | The Rust book has a small discussion about his in chapter 9.3
       | (https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-
       | pa...).
       | 
       | I think it is good to have some references (the blog post and the
       | book) for when the horde comes after you.
        
       | koala_man wrote:
       | tl;dr: Author convincingly argues that Rust `unwrap()` (Java
       | `Optional.get`, Haskell `fromJust`) is fine when you either have
       | checked that the call will not fail, or when you're in a unit
       | test or similar where a panic is a helpful result.
       | 
       | (And, conversely, that it's not fine to use it to avoid doing
       | real error handling)
        
         | OJFord wrote:
         | Buried in your 'or similar' is the more controversial and
         | helpful (IMO) use - the infamous 'this should never happen'
         | exceptions.
        
           | marcosdumay wrote:
           | Well, if you want a definitive answer, it's "it depends".
           | Those errors are not all alike.
        
           | burntsushi wrote:
           | No, that isn't buried in the "or similar" of the GP. They
           | mention that explicitly with "checked that the call will not
           | fail." The "or similar" refers to documentation examples and
           | prototyping/one-off scripts.
        
         | drexlspivey wrote:
         | Is this supposed to be controversial? Even rust docs say so
         | https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-pa...
        
           | burntsushi wrote:
           | Nope. But people are thoroughly confused by it. Comes up all
           | of the time. And a lot of people have more extremist
           | positions. Banning unwrap. Or banning any panicking branches
           | at all. That's why my blog post covers an example where we
           | convert a function that never panics (sensible) to a function
           | whose signature says it might return an error, but it
           | actually will never return an error (bonkers).
           | 
           | People advocate for this. After publishing this article, it
           | almost seems like people are more confused than I thought.
           | Idk.
           | 
           | Anyway, no, this blog is not meant to be controversial. It is
           | meant to untangle knots.
        
       ___________________________________________________________________
       (page generated 2022-08-08 23:00 UTC)