[HN Gopher] Using unwrap() in Rust is okay ___________________________________________________________________ Using unwrap() in Rust is okay Author : cp9 Score : 156 points Date : 2022-08-08 13:27 UTC (9 hours ago) (HTM) web link (blog.burntsushi.net) (TXT) w3m dump (blog.burntsushi.net) | nnoitra wrote: | This blog post will age out in 6 months. | burntsushi wrote: | Why? It's basically an elaboration of the same advice I gave | eight years ago. And I was not the first. | ComputerGuru wrote: | One runtime panic I wish was a compile-time error in the rust | standard library is the use of an incorrect memory order, eg | Ordering::Release with AtomicBool::load(). | | It would have been fairly trivial to set up generic constraints | specifying if a read, write, or read-write ordering semantic is | expected and to fail to compile if it wasn't met. | pitaj wrote: | This could be a good application of a Clippy lint. | nynx wrote: | This is going to happen at some point now that const generics | arrived. | ComputerGuru wrote: | Likely only once we finally get const parameters specified as | regular arguments. | | It doesn't need const at all, though. You just need three | traits and either first class enum variants as types or a | pseudo enum (mod/struct Ordering with ZST structs Acquire, | Release, etc) | jeffrallen wrote: | When you've gotten to the point of saying, "well, it's ok to | panic in example code" you've already lost the game. Novice | programmers learn from example code, and novice programmers are | an order of magnitude more common than experienced ones. A | programming ecosystem that depends on 9 out of every 10 people | being able to intuit that which the other 1 understands is not an | ecosystem that's going to produce good code. | | Rust is a minefield of bear traps laid by experts, and I fear for | the future of our industry if Go and Java programmers are | required by some quirk of network effects or first mover | advantage or whatever to starting programming (badly) in Rust. | burntsushi wrote: | Did you read the blog? I addressed all of that. | schubart wrote: | > When checking preconditions, make sure the panic message | relates to the documented precondition, perhaps by adding a | custom message. For example, | | > `assert!(!xs.is_empty(), "expected parameter 'xs' to be non- | empty")`. | | This panics with | | > thread 'main' panicked at 'expected parameter 'xs' to be non- | empty', src/main.rs:79:5 | | Without the custom message it's | | > thread 'main' panicked at 'assertion failed: !xs.is_empty()', | src/main.rs:79:5 | | Given that panics should be for bugs, i.e. interpreted by | developers, I'd say the second message is clear enough and a | custom message just adds noise in the source code. | sitkack wrote: | One can set the env var themselves in main(), many errors are | transient, best to capture it the first time. | // use std::env; env::set_var("RUST_BACKTRACE", "full"); | burntsushi wrote: | Hah. Ironically, set_var might be deprecated at some point and | replaced with an unsafe alternative. (Long story. Short story | is that it's currently unsound. If you have C code trying to | read from the environment at the same time you end up with UB. | If everything is Rust code though, then I believe you're fine.) | sitkack wrote: | I was trying to find a "turn on backtrace" as an official api | but couldn't find one in 90s of looking. I see the backtrace | crate for catching them in user space in process, which is | nice. | | What would you recommend? Forking and execing yourself to set | the env var? backtrace::enable_full() or something to that | effect would be nice. | ww520 wrote: | I generally just use expect(), assert!, or unreachable! rather | than simply unwrap() to document the unexpected invalid state. | c7DJTLrn wrote: | The problem is that Rust developers don't just use unwrap() when | it _should_ panic. I 've seen plenty of "production grade" crates | basically just unwrap because the author didn't know how to | handle it gracefully or just wanted to get the code compiling, | then forgot about it. | bern4444 wrote: | When writing a library, wouldn't the prudent thing to do be to | return a Try/Either or have the method accept a function to | invoke in case of an error (like an IOC solution)? | ironmagma wrote: | But luckily when that time does come to refactor the code to | handle the error case, the Rust compiler does a lot to help you | make sure it's handled properly. | librexpr wrote: | Maybe it would have been a good idea to have two names for | unwrap, one which would mean "I'm certain that this value will | always be okay", and another which would mean "I'm taking a | shortcut because I'm writing a script or just want it to | compile for now". Maybe a longer name like "assert_valid()" for | the one where you're sure it's okay. That might make it easier | to find the places where shortcuts were taken and forgotten. | [deleted] | jeffrallen wrote: | Maybe if Rust code wasn't so hard to get working at all, then | more of it would work correctly (including checking errors | instead of giving up and unwrapping). | dureuill wrote: | not sure what you're referring to, but yes having a first | iteration of code with unwrap here and there, and then a | second one with proper error handling is a viable strategy. | the "loudness" of unwrap makes it super easy to notice any | leftover in code review | burntsushi wrote: | Consider using anyhow. At that point, unwrap isn't a shortcut | anymore except for one place: you are in a deeply nested | function call and you suddenly realize you need to bubble up | an error a few layers. Now you have to change a bunch of | function signatures. | | I don't find myself in that position too frequently. | Certainly not enough to warrant two identical but differently | named functions. | | In that case, I would suggest coming up with your own | pattern. Perhaps an unwrap() with a FIXME comment. Or a | expect("FIXME"). | r3un1 wrote: | There is! `expect()` | sanderjd wrote: | I think this is a good answer. People should use `expect` | for this "I'm sure this is fine" case, with descriptive | text about why it's fine. | dymk wrote: | I'm confused, you seem to be saying opposite things. You're | saying authors don't use unwrap when it should panic, and in | the next sentence, you say they use unwrap (which causes a | panic) when the failure could be gracefully recovered. | c7DJTLrn wrote: | "don't just" | dymk wrote: | Got it. OJFord has it right, my parens were in the wrong | place. | OJFord wrote: | Heh, GP (I assume) and I initially parsed it as "don't | just-use" (don't simply use) rather than "don't-just use" | (don't only use). | dureuill wrote: | not sure why you're getting downvoted. maybe the scope of what | you're saying ('plenty of "production grade" crates unwrap | [when they shouldn't]') calls for a reference? Or maybe because | you're singling out Rust developers while the issue is | certainly observed in all languages with similar mechanisms | (see unchecked exceptions abuse in java, or aborting asserts | abuse in C) | | Anyway i wouldn't say "plenty", but i did came across crates | (parsers :/) that would unwrap on malformed input. the | workaround is to encapsulate their use in a catch_unwind. | | for the record, i had a similar issue in a c++ lib where the | author elected to abort on the unsupported input, so i'm | somewhat thankful that the idiomatic mechanism is panic (which | is recoverable if needs be) in Rust | burntsushi wrote: | That is _a_ problem. Another similar problem is that | programmers write buggy code. I don 't see anything | particularly special about 'unwrap()'. 'unwrap()' is a common | manifestation of it. But as others have pointed out to you, | this same phenomenon manifests everywhere and in other | programming languages, but through different means. | dcsommer wrote: | I totally agree with de-emphasizing the old "recoverable" vs. | "unrecoverable" dichotomy | (https://blog.burntsushi.net/unwrap/#what-about- | recoverable-v...). Every time I've heard programmers (especially | in the context of exceptions) try to define it, I've found it | imprecise and open to debate. | | When invariant violations or mistakes by programmers (aka bugs) | are detected, the program should halt as it is in an inconsistent | state and continuing could be very dangerous (think | privacy/security/data corruption). Otherwise, don't halt (handle | it or have the caller handle it). | arcticbull wrote: | Yep, there's not really any such thing (IME) as a 'recoverable' | error - except with respect to I/O. | | There's either I/O errors - or there's logic errors. A failure | with logic should nuke due to the app being in an inconsistent | state; trust is lost. An I/O error should fail softly. | ithkuil wrote: | A parser library can return an error when the input is wrong. | It doesn't necessarily mean the app is in an inconsistent | state when that happens; it all depends on what the | application does. It follows that aborting is often a | sensible decision in application code and rarely in library | code. | arcticbull wrote: | I would personally consider parsers as part of I/O, but | point taken. | fatherzine wrote: | Partially true. In practice people implement multiplexed | servers, for many reasons, including performance / | throughput. A logic failure should nuke _the offending | request_, not the entire server with all the unrelated | concurrent requests. | AshamedCaptain wrote: | This is an attitude that I often see -- library authors who | believe they own the process. Aborting may be the only | sensible thing to do in runtimes where you can end up | corrupting the entire process memory or the like, making | recovery "dubiously possible", but absolutely not for | anything higher level, where recovery may be safe and | possible. | burntsushi wrote: | I gave examples covering this area in the post. How would | you rewrite the code in this section[1], for example, to | conform to your views? (Assume this code is in a | library.) | | [1]: https://blog.burntsushi.net/unwrap/#what-about-when- | invarian... | fatherzine wrote: | * Short term: add explicit runtime checks for every step | that may panic. With the way jump prediction works in | modern processors, the runtime cost may be smaller than | one may naively assume it is. Some of us would take, say, | a 10% perf. degradation instead of chasing prod panics in | the middle of the night. | | * Long term: plug-in a richer type (logic) system, so one | can safely prove the most costly runtime invariants at | compile time. | winstonewert wrote: | Your short term suggestion appears confused. Panics are | already the result of runtime checks. The issue isn't | whether or not we have runtime checks, but what to do | when those checks fail. | | But to your original point, its true that a multiplexing | server should probably try to stay up even if there is a | bug uncovered in handling a request. But that's precisely | why rust allows you to catch panics. | fatherzine wrote: | "It is _not_ recommended to use this function for a | general try /catch mechanism. The Result type is more | appropriate to use for functions that can fail on a | regular basis. Additionally, this function is not | guaranteed to catch all panics, see the "Notes" section | below." | | https://doc.rust-lang.org/std/panic/fn.catch_unwind.html | | Edit. Re. already existing runtime checks, I'd expect a | decent compiler to optimize away redundant checks, i.e. | explicitly coded checks vs. checks injected automatically | by the compiler, as long as their semantics is identical. | burntsushi wrote: | Can you point to any compiler in the world that can see | through the construction of a DFA from user input to the | point that it can automatically optimize away the | transition table lookup? | | > The Result type is more appropriate to use for | *functions that can fail on a regular basis.* | | Emphasis mine. The DFA::is_match routine can never fail | on any input. Yet, you want to use a 'Result' with it. | | Seriously. Please please pretty please read the blog | post. You are literally caught in precisely the tangled | mess that I tried to untangle in the post. If my post | didn't do it for you, then please consider responding to | specific points you disagree with in the post. | fatherzine wrote: | Sorry, I only meant to address with some nuance the | statement "A failure with logic should nuke due to the | app being in an inconsistent state". If you are confident | that the specific DFA code in the blog post will not | fail, more power to you! And to us, which transitively | trust the code via your judgment ;) | | Edit. For technical geeking around: | | * The point is about _redundant_ check elimination, not | full check elimination. A decent compiler should be able | to eliminate redundant checks, though perhaps the | language could offer some intrinsics (was: pragmas) to | make this 100% reliable. | | * For DFA proofs, perhaps something like https://courses. | grainger.illinois.edu/cs373/fa2013/Lectures/... could be | a start. A great intern may tackle it over a summer ;) | burntsushi wrote: | > add explicit runtime checks for every step that may | panic | | And do... what when the check fails? Can you please write | the code for it? Because I don't understand what the heck | you mean here. If you don't know Rust, pseudo code is | fine. | | > Long term: plug-in a richer type (logic) system, so one | can safely prove the most costly runtime invariants at | compile time. | | This is just copying what I already said in the blog | post. Until someone can show me how to prove the | correctness of arbitrary DFA construction _and_ search | from a user provided regular expression pattern _and_ | demonstrate its use in a practical programming language | like Rust, I consider this a total non-answer, not a | "long term" answer. | | Basically, your answer here confirms for me that your | views on how code should be structured are incoherent. | fatherzine wrote: | fn is_match(&self, haystack: &[u8]) -> Result<bool, | Error> { ... } | burntsushi wrote: | That's exactly what the next section of my post explores | and explains why it's complete bonkers: | https://blog.burntsushi.net/unwrap/#why-not-return-an- | error-... | | So it looks like I already provided a rewrite of the code | for you in your preferred style: // | Returns true if the DFA matches the entire 'haystack'. | // This routine always returns either Ok(true) or | Ok(false) for all inputs. // It never returns an | error unless there is a bug in its implementation. | fn is_match(&self, haystack: &[u8]) -> Result<bool, | &'static str> { let mut state_id = | self.start_id; for &byte in haystack { | let row = match state_id.checked_mul(256) { | None => return Err("state id too big"), | Some(row) => row, }; let | row_offset = match row.checked_add(usize::from(byte)) { | None => return Err("row index too big"), | Some(row_offset) => row_offset, }; | state_id = match self.transitions.get(row_offset) { | None => return Err("invalid transition"), | Some(&state_id) => state_id, }; | match self.is_match_id.get(state_id) { None | => return Err("invalid state id"), | Some(&true) => return Ok(true), | Some(&false) => {} } } | Ok(false) } | | My favorite part is the docs that say "this never returns | an error." Because if it did, the code would be buggy. | And now all sorts of internal implementation details are | leaked. | | You can't even document the error conditions in a way | that is related to the input of the routine, because if | you could, you would have discovered a bug! | | Another nail in the coffin of your preferred style is | that this will absolutely trash the performance of a DFA | search loop. | fatherzine wrote: | Man, I only gave some friendly feedback ;) | | "this will absolutely trash the performance of a DFA | search loop." | | Do you happen to have some hard data to quantify? I'm | curious. | ninkendo wrote: | One additional data point: The Rust compiler got ~2% | faster when they made some of the core traits in the | serializer infallible (ie. panic instead of using | Result): https://github.com/rust-lang/rust/pull/93066 | | Writeup from the contributor: | https://nnethercote.github.io/2022/02/25/how-to-speed-up- | the... | | > The Decoder trait used for metadata decoding was | fallible, using Result throughout. But decoding failures | should only happen if something highly unexpected happens | (e.g. metadata is corrupted) and on failure the calling | code would just abort. This PR changed Decoder to be | infallible throughout--panicking immediately instead of | panicking slightly later--thus avoiding lots of pointless | Result propagation, for wins across many benchmarks of up | to 2%. | burntsushi wrote: | Yes. I work on regex engines. Here's the code with all | the crazy checks that return errors that can never | happen: fn find_fwd_imp<A: Automaton + | ?Sized>( dfa: &A, input: | &Input<'_, '_>, pre: Option<&'_ dyn | Prefilter>, earliest: bool, ) -> | Result<Option<HalfMatch>, MatchError> { let | mut mat = None; let mut sid = init_fwd(dfa, | input)?; let mut at = input.start(); | while at < input.end() { let byte = match | input.haystack().get(at) { None => | return Err(MatchError::GaveUp { offset: at }), | Some(&byte) => byte, }; | sid = dfa.try_next_state(sid, byte)?; if | dfa.is_special_state(sid) { if | dfa.is_match_state(sid) { let | pattern = dfa.match_pattern(sid, 0); | mat = Some(HalfMatch::new(pattern, at)); | } else if dfa.is_dead_state(sid) { | return Ok(mat); } } | at = match at.checked_add(1) { None | => return Err(MatchError::GaveUp { offset: at }), | Some(at) => at, }; } | eoi_fwd(dfa, input, &mut sid, &mut mat)?; | Ok(mat) } | | And here's a search (I ran it a few times): | $ regex-cli count matches dfa dense @$medium '\w{42}' | parse time: 45.331us translate time: | 28.767us compile nfa time: 9.577558ms | nfa memory: 698148 compile dense dfa time: | 516.497476ms dense dfa memory: 6562848 | dense alphabet length: 113 dense | stride: 128 search time: | 531.808124ms counts: [2] | | Here's code that's safe that will panic when there's a | bug: fn find_fwd_imp<A: Automaton + | ?Sized>( dfa: &A, input: | &Input<'_, '_>, pre: Option<&'_ dyn | Prefilter>, earliest: bool, ) -> | Result<Option<HalfMatch>, MatchError> { let | mut mat = None; let mut sid = init_fwd(dfa, | input)?; let mut at = input.start(); | while at < input.end() { sid = | dfa.next_state(sid, input.haystack()[at]); | if dfa.is_special_state(sid) { if | dfa.is_match_state(sid) { let | pattern = dfa.match_pattern(sid, 0); | mat = Some(HalfMatch::new(pattern, at)); | } else if dfa.is_dead_state(sid) { | return Ok(mat); } } | at += 1; } eoi_fwd(dfa, input, | &mut sid, &mut mat)?; Ok(mat) } | | And here's a search with it: $ regex- | cli count matches dfa dense @$medium '\w{42}' | parse time: 33.701us translate time: | 24.264us compile nfa time: 9.55935ms | nfa memory: 698148 compile dense dfa time: | 550.959698ms dense dfa memory: 6562848 | dense alphabet length: 113 dense | stride: 128 search time: | 468.932568ms counts: [2] | | So that's 10% right there. We can keep going though. | Let's get rid of the bounds checks. We have to use unsafe | though. The stakes have risen. Now if there's a bug, we | probably won't get a panic. Instead we get UB. Which | might lead to all sorts of bad stuff. | fn find_fwd_imp<A: Automaton + ?Sized>( dfa: | &A, input: &Input<'_, '_>, pre: | Option<&'_ dyn Prefilter>, earliest: bool, | ) -> Result<Option<HalfMatch>, MatchError> { | let mut mat = None; let mut sid = | init_fwd(dfa, input)?; let mut at = | input.start(); while at < input.end() { | sid = unsafe { let byte = | *input.haystack().get_unchecked(at); | dfa.next_state_unchecked(sid, byte) }; | if dfa.is_special_state(sid) { if | dfa.is_match_state(sid) { let | pattern = dfa.match_pattern(sid, 0); | mat = Some(HalfMatch::new(pattern, at)); | } else if dfa.is_dead_state(sid) { | return Ok(mat); } } | at += 1; } eoi_fwd(dfa, input, | &mut sid, &mut mat)?; Ok(mat) } | | And now the same search: $ regex-cli | count matches dfa dense @$medium '\w{42}' | parse time: 22.511us translate time: | 26.843us compile nfa time: 8.709352ms | nfa memory: 698148 compile dense dfa time: | 565.365005ms dense dfa memory: 6562848 | dense alphabet length: 113 dense | stride: 128 search time: | 456.402963ms counts: [2] | | For a smaller gain. | fatherzine wrote: | This is so awesome, thanks for sharing! I am happy that | the "~10% perf hit" intuition is reasonably close to what | the data shows. What is the right tradeoff? I am not an | author of widely popular regex engines, so I'll leave | that evaluation to more qualified people. | burntsushi wrote: | 10% is an excellent win. And given that I consider the | first code snippet to be effectively nonsense, the right | trade off is obvious to me. :) | | I bet if me and you sat down with a big codebase side-by- | side, I could convince you that the style you're | proposing is totally unworkable. | | Now, whether it's worth it to use unsafe and elide bounds | checks is harder. The gain I showed here is small, but it | can be bigger in other examples. It depends. But it is | certainly less clear. It helps that this is probanly the | easiest kind of unsafe to justify, because you just have | to make sure the index is in bounds. Other types of | unsafe can be quite tricky to justify. | AshamedCaptain wrote: | Exceptions. | jcelerier wrote: | > There's either I/O errors - or there's logic errors. A | failure with logic should nuke due to the app being in an | inconsistent state; trust is lost. | | nah. in GUI apps for instance you want the failure in the | logic of a sub-sub-function to just tell the error "wops" | when the button that triggered the action was clicked, not | nuke the app (unless you hate your users). e.g. imagine a 3D | software which allows to do mesh operations - user clicks on | the "Smooth the mesh" button somewhere. Programmer forgot to | handle a division by zero in some degenerate case of the | smoothing computation which ends up leading to an exception: | a value becomes zero, someone used unsigned integers for n in | an "n - 1" computation which ends up in a call to | array_of_floats.resize(0xffffffffffffffff) (and a likely | std::bad_alloc being thrown if you're in c++). | | The original mesh is unchanged as the operation waits until | the computation is complete to replace the old mesh with the | new. | | If you ever decide to crash in this situation I am sure you | will have _great_ reviews on 3D modeling software | comparisons. | pcwalton wrote: | > Programmer forgot to handle a division by zero in some | degenerate case of the smoothing computation which ends up | leading to an exception: a value becomes zero, someone used | unsigned integers for n in an "n - 1" computation which | ends up in a call to | array_of_floats.resize(0xffffffffffffffff) (and a likely | std::bad_alloc being thrown if you're in c++). | | To me this is a textbook case of why panic::catch_unwind | exists in Rust. Conceptually, the smoothing algorithm | failed in an unexpected way, so a panic is appropriate: | there is nothing else for the procedure to do but crash. | But crashing would be a very poor user experience, because | the smoothing operation is conceptually _isolated_ from the | application as a whole, so the program shouldn 't be | affected by a single operation crashing. This is why Rust | goes to the trouble to implement unwinding on panic: in | certain domains, software fault isolation is important. | | Another possibility, of course, would be to spawn a | separate process for the smoothing operation, which would | effectively replace software fault isolation with hardware | fault isolation. But this might be awkward (and slow), | which is why few modeling software packages do this. | | ("A complex smoothing operation in a 3D modeling package | fails" is maybe the best example of the need for | panic::catch_unwind I've ever heard of--thanks for offering | it. I wish I had been able to deploy this example back when | we had the debate as to whether catch_unwind should exist | at all. Thankfully, we kept it in.) :) | Snarwin wrote: | The problem with this approach is that, when a panic | actually happens at runtime, there is no way to tell | whether it was caused by an isolated mistake like | forgetting to check for zero, or by memory corruption | writing garbage all over your data structures. If it's the | first, recovering is fine; if it's the second, you risk | exposing the user to data loss and security | vulnerabilities. So the only safe thing to do is crash. | | Now, it may be true that you get better reviews selling | software that corrupts data or gets its users hacked than | selling software that crashes. But unless you hate your | users, those are probably things you want to avoid. | arcticbull wrote: | I would actually want that to crash, yes. That would ensure | it gets caught and resolved by developers during | development. Or at least increase the odds thereof. | Further, if it allocates a few gb, or if it's allocating a | large amount of memory because the size parameter got | smashed? | | If it crashes enough that you'd get terrible reviews it | would _definitely_ be caught during development. | | If it messes up your mesh silently then you'll definitely | still get bad reviews. | | And actually having an std::bad_alloc thrown is even worse | since you have no idea what state it left things after | running some destructors when you weren't planning for it. | jcelerier wrote: | > That would ensure it gets caught and resolved by | developers during development | | i don't know in which reality you live but in mine | there's not much relationship between the existence of | crashes, and them being resolved during development | | > And actually having an std::bad_alloc thrown is even | worse since you have no idea what state it left things | after running some destructors when you weren't planning | for it. | | i don't even know what to say. that's the whole point of | destructors - you _know_ that things will be unwound in | the reverse order from which they were created on | automatic storage. that 's, like, why c++ exists | outworlder wrote: | Only if you have implemented your destructors correctly. | Forgot to add a 'virtual' on one of them? Too bad. Have | some pointers to objects? Better remember to call delete | on them all. You never really 'know' things are | destroyed, that's why tools like Valgrind exist. | | The real fun is really the combination with exceptions | and destructors. So much so, some projects essentially | ban exceptions in C++ code. | stormbrew wrote: | To me the real issue is this is an extremely forced binary and | there's really at least three meaningful categories (especially | in software with a UI of any sort): | | - unactionable invariant violation (poisoned mutex, hard memory | errors): crash immediately, something that should ever happen | happened and there's no way to either handle or present the | error to the user in a meaningful way. | | - unactionable (at the call site) but normal errors (couldn't | open a file, disconnected from the remote end of a connection, | etc): these need to be propagated up to where they can be | turned into actionable information for a user, ideally. This is | _rarely_ a thing the call site where it happened can usefully | do. | | - immediately actionable and normal errors (user input didn't | validate, file user wanted to open doesn't exist, connection | failed but can be retried with a backoff, etc). These need to | be handled at the call site or maybe one or two levels up. | | You _need_ an exception-like mechanism (or at least a process | for emulating one, a la go MRV or C errno) to handle the second | case, you often _want_ it for the third case, but it never | really makes sense to use it for the first. | | That said, I think in non-test rust code you should use expect | instead of unwrap, because sometimes invariants do trip and | that little tiny extra bit of info can make a huge difference | to resolving it. | jcranmer wrote: | The criteria I tend to prefer is "expected" versus "unexpected" | errors. I/O errors, especially network errors, are things that | are going to be expected under reasonable operation, therefore | it make sense that code should handle them. Similarly, user | input resulting in incorrectly formatted code should be | reasonably expected and therefore handled. | | But the same kinds of failures might not be reasonably expected | in other circumstances--I wouldn't expect that the internal | configuration files of an application should occur in | reasonable operation, and therefore it makes sense to panic if | they're corrupted... even if the cause is an I/O operation on a | local disk, or parsing some JSON or TOML or INI or whatnot | file. | | One implication of this is that it needs to be easy for any | error system to promote an "expected error" into an "unexpected | error"--which is what unwrap/expect does. The | recoverable/unrecoverable error suggests that there ought to be | no reason to do this, but there is absolutely a reason to do | so: what category an error falls into is ultimately decided by | the context of the error, not the generation of the error | itself. | taeric wrote: | I'm a fan of, "is this going to a user, or to another part of | the calling program?" If expecting it to go to a user, | exception make sense. Need to bubble it all the way up to the | part of the system that is closest to the user. | vbezhenar wrote: | I write Java and for me it's not exactly "expected" versus | "unexpected". | | There are unexpected errors for sure. For example | "StackOverflowError" which could be thrown from any method | call. | | There should be unexpected errors handler which does some | sane thing in given circumstances. Usually it involves | logging error and its details (stacktrace, may be something | else) and returning some kind of generic error to the caller | (e.g. HTTP 500). | | But the thing is, this handler very often suitable for | handling errors that I expect but I don't want to bother | handling them. For example those errors might be rare enough | that writing code would be a net negative (every line of code | is maintenance burden and error handling code is maintenance | burden power 2). And I'm totally OK with those errors being | handled in generic way. | | Even if it's user input sometimes. For example user could be | me. And I know input format. I don't want to write error | handling for myself, all I need is to prevent data | corruption. Getting HTTP 500 InvalidNumberFormatException is | totally fine in some situations. | | And language should provide means for writing that kind of | code. At least that's my opinion and that's what I truly miss | in those languages with explicit error handling of every | function call. | | You might call it lazy coding. I call it reasonable coding. | | Exceptions for the win! | spion wrote: | Rust does allow you to easily call functions or methods | that may error by using the `?` operator (formerly `try!` | macro) | | https://doc.rust-lang.org/std/macro.try.html | | You can also implement `From` (see `impl From` in the above | doc) to define how you want external error types to be | converted. | | Its one of the rare languages where there isn't much | boilerplate involved in using `Result`-style errors, even | compared to exceptions (Swift is like this too) | | There is even backtrace support for Result types: | https://docs.rs/trace-error/latest/trace_error/ | throwawaymaths wrote: | It's kind of the opposite. If you have a network error, often | you shouldn't bother to try handling it, the best strategy is | often to zero out your state, give up and try again later. | | By contrast, you should give up parsing a JSON if it's a | config file read on startup but probably not if it's user | input. | merb wrote: | Network errors might be retryable/routable differently, but | often (especially when starting out) should probably returned | to the User. I mean if s3 is down you can retry the call but | often it is down then | dymk wrote: | Sort of. At my company, if we removed retries from our | services, our reliability would drop precipitously. | Something like 99.99% of retries succeed on the second try, | if there's not a hard service outage. If there is a hard | outage, well, not much to do about that. | saghm wrote: | > The criteria I tend to prefer is "expected" versus | "unexpected" errors | | The way I've often heard this phrased is "exceptions are for | exceptional behavior", and it's always rubbed me the wrong | way a bit (although maybe this is partially just because I | don't think wordplay is a sufficient argument to do | something; I've made similar arguments in the past to friends | who sung the praises of "no-shave November" and "thirsty | Thursdays"). From digging a bit deeper when I've heard this | opinion espoused, it seems like it mostly boils down to the | fact that exceptions tend not to be as efficient as happy- | path code, so using them for circumstances that are too | common is not going to lead to good performance. I guess I | don't really find this a subtle enough concept to warrant | needing to introduce another abstraction layer into the | discussion, especially one that's much vaguer like "is this | behavior expected?" If there's a performance concern, I think | it's much better addressed directly rather than shifting the | discussion to a proxy. | spion wrote: | In languages with exceptions there typically isn't much of | a distinction between expected and unexpected errors. | hinkley wrote: | Recoverable vs unrecoverable comes down to requirements. | Certain companies known for having software that 'just works' | tend to have both very few unrecoverable errors and very | conservative feature sets to help facilitate that short list. | | It is very clearly a choice, even if many people are deciding | by default. By not tackling an issue, you've chosen to have | that issue. | dcsommer wrote: | I think we have compatible views. Each layer of the software | must decide it's requirements and handle errors appropriately | per requirements. You're right I didn't articulate when to | handle an issue locally vs. pass it up. I think that's where | requirements (and also explicit API guarantees) come into | play. | | I do think that APIs that "overpromise" by not returning the | errors they do not handle to the caller, and instead halt or | throw an exception, do their users a disservice in the long- | run. These just become undocumented cases that bite you later | on. Better libraries have all these conditions baked into the | API itself. | tsimionescu wrote: | But an exception is a part of the API, why are you putting | it at the same level as halting? | dcsommer wrote: | In my experience, APIs that throw rarely define _all_ the | exceptions that can come from it, especially | transitively. I see exceptions as a failed (because | undocumented, but still important for correctness) | attempt at compromising between halting and returning an | error. | hinkley wrote: | I wonder if there's a moral equivalent of borrow | semantics where we more formally define error | propagation. | hinkley wrote: | > You're right I didn't articulate when to handle an issue | locally vs. pass it up. | | I'm saying this is an everybody problem. Most of us just | pass the buck by default. It can take quite a bit of | cajoling to get error passing pushed back down to the layer | that can best cope with it, once it has escaped up the | stack. And when we split teams horizontally, instead of | vertically, this happens practically all the time. | | This is one of the primary reasons I push for feature teams | instead of client/server teams. The distance between, "I | think this is done" and "this is ready to go to production" | is mercifully small, instead of unknowable. | alerighi wrote: | > When invariant violations or mistakes by programmers (aka | bugs) are detected, the program should halt as it is in an | inconsistent state and continuing could be very dangerous | (think privacy/security/data corruption). Otherwise, don't halt | (handle it or have the caller handle it). | | Well it's not always the case. There are situations in which if | you detect errors you want the program to continue running, and | have only that particular functionality to fail. | | I tend to write resilient code, since I work in embedded | systems and what you never want is the system to crash. Halting | a CPU on an invariant violation (i.e. and assert failing) is | something useful for debugging (you trigger the debugger and | you then analyze why it happened), but something you generally | don't want in production. | | Bette to have a ton of checks more and in case of an invariant | violation (that maybe is resulting from a programmer mistake, | but there is always the possibility of hardware memory | corruption errors) to return an error and handle it in some | ways (for example restart the task that returned the error, | trying to go back to the last working state). | mook wrote: | > I tend to write resilient code, since I work in embedded | systems and what you never want is the system to crash. | | That might be okay too depending on what your system is. I've | had a cell phone (the monochrome dumb kind) display an | assertion error at me, that was kind of cool. A screw was | coming loose, so there were hardware issues. Worst case, I'd | have to bring it to the store and get it repaired or | replaced; there was no threat of injury or large monetary | damage. | burntsushi wrote: | > There are situations in which if you detect errors you want | the program to continue running, and have only that | particular functionality to fail. | | Yes, like a web server. If a request handler fails by | panicking, in a Rust program, you catch the panic, respond | with a 500 error and log the panic somewhere. But you | continue serving other requests. | | I talked about this in the blog post. | | The problem with your strategy is that it requires you to be | aware of your own mistakes. That doesn't sound like a robust | strategy, unless you're investing huge resources into | sophisticated tooling and have drastically restricted the | expressivity of your programming environment. That exists and | is fine, and I even addressed that in the blog post too. | marshray wrote: | Great article BTW, loved it! Will certainly become a | classic. | | The web server example scares me. | | Something happened during execution that the programmer | didn't expect. There's a bug in the program. What if the | panic is due to memory corruption (less likely in Rust) or | internal data structure corruption? | | Without knowledge to the contrary, swallowing a panic and | YOLO'ing execution is driving full speed down the road of | very poorly defined behavior. | | If the programmer had sufficient knowledge to conclude it | was safe, they could have just used a Result<> to report | the error. | | So ... don't make panic part of your API, and don't recover | from panics? | ninkendo wrote: | > Without knowledge to the contrary, swallowing a panic | and YOLO'ing execution is driving full speed down the | road of very poorly defined behavior. | | I mean, if you're worried about code "corrupting" things | such that it may not be safe to even serve the next | request, you probably need to worry about successful | requests as well. I mean, if all bets are off, all bets | are truly off. | | At some point, if you decide to colocate code from | multiple "apps" in one server, you gotta have some level | of trust in said apps. You'd obviously never do this for | untrusted code. And you'd never do this for code that | (hand-waving as you have) "corrupts" things in such a way | that it's not safe to handle the next request, panic or | no. | marshray wrote: | > You'd obviously never do this for untrusted code. | | Doesn't panic imply that the program entered a state from | which it was never explicitly designed to recover? | | Since that state may persist across invocations of a | function, doesn't that make the code effectively | 'untrusted' for subsequent execution? | burntsushi wrote: | Doesn't scare me in a memory safe language. Request | handlers tend to be very well isolated. It's standard | practice for web servers written in Go too. | | > So ... don't make panic part of your API, and don't | recover from panics? | | That's the wrong lesson. As my blog says, don't use | panicking for error handling. Making panicking part of | the API is standard practice. Otherwise `slice[i]` | wouldn't exist. (Its API is to panic when `i` is out of | bounds.) | | Lest you think you should just never panic and convert | everything into errors, that's addressed in the blog too. | :) | marshray wrote: | As the sibling post said: | | > I mean, if all bets are off, all bets are truly off. | | We have to expect that panic (e.g. from slice index | operator) exercised an execution path that the library | developer had never even considered, much less validated. | | So then, post-panic, are we left with _any_ guarantees | about the validity of postconditions? | simion314 wrote: | Don't exception also halt your program if you ignore them? | | Also if using a library I don't want a bug in it to bring my | program down, then I am forced to use workarounds like create a | child process to use the library, start the child process from | the main process and check on it to see if it fails or | succeeds, that would be bad for performance and ugly. | renewiltord wrote: | Very neat. Just discovered anyhow and the context stuff from this | post. | ArrayBoundCheck wrote: | Does this mean using C inside of rust is ok? I'm pretty sure the | original team kept hitting memory problems and it was in fact not | ok. Browsers crash often enough that I don't want unwrap making | it worse | TheDong wrote: | > Does this mean using C inside of rust is ok? I'm pretty sure | the original team kept hitting memory problems and it was in | fact not ok | | This reads a little like flamebait. | | Memory unsafety is bad. panic is _not_ a memory unsafe | operation, it does not result in memory problems. | | This post is not related to memory safety in any way. | ArrayBoundCheck wrote: | The third sentence was the point. I don't want browsers being | less reliable and terminating randomly due to unwraps | burntsushi wrote: | And how do you propose they do that? What you're saying is | tautological. It's like saying, "I don't want browsers to | use runtime invariants because they can be broken and thus | causing my browser to terminate." | | Like... yeah, cool. How do you do that exactly? | | Also, please do consider reading the blog post I wrote. I | wrote it to try to clarify a lot of confusing around this | topic. It is nuanced, and it just can't be untangled in a | few sentences in a HN comment. | 0x457 wrote: | I think you missed the point of the article. In fact, the | article specifically covers catching panics, so it doesn't | bring down the entire application. As well as covering when | panic is okay. It doesn't say you should just unwrap | everything and treat panic as error handling. | richardwhiuk wrote: | I think the `expect()` bad examples are something of a strawman. | | The `.expect()` for regex for example would say what the regex is | matching for. | | I think it'd be desirable to have a | `.unwrap_with_context("Context: {}")`, and the you'd get | `Context: Inner Panic Info`. | burntsushi wrote: | So you're saying that the 'expect()' message when a regex | compilation error occurs should be a translation from a terse | domain specific language to bloviating prose? :-) | | What 'expect()' message would you write for this regex? | https://github.com/BurntSushi/ucd-generate/blob/6d3aae3b8005... | | I think 'unwrap()' there is perfectly appropriate. | | > I think it'd be desirable to have a | `.unwrap_with_context("Context: {}")`, and the you'd get | `Context: Inner Panic Info`. | | Why? | richardwhiuk wrote: | .expect("UnicodeData::from_str regex failed to compile") | | With this, even with out backtrace, you can work out what | happened. | | Without it, you just know that some regex somewhere is | invalid. | burntsushi wrote: | Have you ever seen a Regex::new(..).unwrap() fail? It | sounds like maybe not. It also sounds like you haven't seen | an 'unwrap()' fail either. | | > Without it, you just know that some regex somewhere is | invalid. | | That's bologna. As I discuss in the blog post, 'unwrap()' | tells you the line number at which it panicked. There's | even an example showing exactly this. There's even | _another_ example showing what happens when you call | 'Regex::new(..).unwrap()' and it fails[1]: | fn main() { | regex::Regex::new(r"foo\p{glyph}bar").unwrap(); } | | And running it: $ cargo run | Finished dev [unoptimized + debuginfo] target(s) in 0.00s | Running `target/debug/rust-panic` thread 'main' | panicked at 'called `Result::unwrap()` on an `Err` value: | Syntax( ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ regex parse | error: foo\p{glyph}bar ^^^^^^^^^ | error: Unicode property not found ~~~~~~~~~~~~~~~~~ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ~~~ )', main.rs:4:36 note: run with | `RUST_BACKTRACE=1` environment variable to display a | backtrac | | So you get the line number (as is standard with any | 'unwrap()') and you get the actual error message from the | regex. No need to even enable the backtrace. You just don't | need 'expect()' here. | | [1]: https://blog.burntsushi.net/unwrap/#why-not-use- | expect-inste... | xvedejas wrote: | When you unwrap/expect on an error, the inner error value is | already printed. Are you suggesting something more? Reference | the implementation details from the article: | impl<T, E: std::fmt::Debug> Result<T, E> { pub fn | unwrap(self) -> T { match self { | Ok(t) => t, Err(e) => panic!("called | `Result::unwrap()` on an `Err` value: {:?}", e), } | } } | vincenv wrote: | You can get that with anyhow [1] by calling unwrap after | `.with_context`. | | [1]: https://docs.rs/anyhow/latest/anyhow/ | berryton wrote: | The Rust book has a small discussion about his in chapter 9.3 | (https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to- | pa...). | | I think it is good to have some references (the blog post and the | book) for when the horde comes after you. | koala_man wrote: | tl;dr: Author convincingly argues that Rust `unwrap()` (Java | `Optional.get`, Haskell `fromJust`) is fine when you either have | checked that the call will not fail, or when you're in a unit | test or similar where a panic is a helpful result. | | (And, conversely, that it's not fine to use it to avoid doing | real error handling) | OJFord wrote: | Buried in your 'or similar' is the more controversial and | helpful (IMO) use - the infamous 'this should never happen' | exceptions. | marcosdumay wrote: | Well, if you want a definitive answer, it's "it depends". | Those errors are not all alike. | burntsushi wrote: | No, that isn't buried in the "or similar" of the GP. They | mention that explicitly with "checked that the call will not | fail." The "or similar" refers to documentation examples and | prototyping/one-off scripts. | drexlspivey wrote: | Is this supposed to be controversial? Even rust docs say so | https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-pa... | burntsushi wrote: | Nope. But people are thoroughly confused by it. Comes up all | of the time. And a lot of people have more extremist | positions. Banning unwrap. Or banning any panicking branches | at all. That's why my blog post covers an example where we | convert a function that never panics (sensible) to a function | whose signature says it might return an error, but it | actually will never return an error (bonkers). | | People advocate for this. After publishing this article, it | almost seems like people are more confused than I thought. | Idk. | | Anyway, no, this blog is not meant to be controversial. It is | meant to untangle knots. ___________________________________________________________________ (page generated 2022-08-08 23:00 UTC)