[HN Gopher] Bugs that the Rust compiler catches for you
___________________________________________________________________
Bugs that the Rust compiler catches for you

Author : ossusermivami
Score  : 52 points
Date   : 2022-05-05 21:02 UTC (1 hours ago)

(HTM) web link (kerkour.com)
(TXT) w3m dump (kerkour.com)

| [deleted]
| TheDong wrote:
| > Resources leaks
|
| > // defer resp.Body.Close() // DON'T forget this line
|
| This doesn't actually leak the file / connection / etc in go for
| most situations.
|
| When the object gets GC'd, the finalizer runs
| (https://cs.opensource.google/go/go/+/master:src/os/file_unix...).
| That closes it.
| ankrgyl wrote:
| At the very least, that does not work for rolling back
| transactions with the database/sql
| (https://pkg.go.dev/database/sql) package, although it may work
| for other cases. We've had numerous production bugs result from
| this.
| eptcyka wrote:
| Then what's the point of the Close method in the File interface
| (https://pkg.go.dev/io/fs@go1.18.1#File) ?
| Andys wrote:
| ...You might want to close a handle at any time?
| woodruffw wrote:
| I'm not a Go programmer, but I assume it's there for the same
| reason every other GC'd language has the ability to close
| resources manually: sometimes you just want to do it earlier
| (or more explicitly) than the GC would.
| TheDong wrote:
| It lets you check for errors, and have a deterministic time
| at which the file is closed.
|
| Both of these are desirable properties.
|
| The finalizer is there to prevent a subset of resource leaks,
| not to be relied upon.
| tylerhou wrote:
| Does Go's GC have any SLA about when an unreachable object must
| be garbage collected? If not, this is risky:
|
| 1. Will finalizers run on program crash?
|
| 2. Will I run out of resources if the garbage collection
| doesn't run for sufficiently long time? (E.g. # of open file
| descriptors.)
|
| 3. Does the order of finalization matter?
|
| Finalizers in Java were deprecated for the above reasons (and
| more). https://stackoverflow.com/a/56454348
| TheDong wrote:
| I agree it's not something that should be relied upon, nor is
| it elegant, I'm just pointing out that in the average case,
| failing to close a file is not a resource leak in the usual
| sense. It's not like forgetting to close a file descriptor in
| C.
|
| To answer each of your questions:
|
| 1. Generally no, but if your process exits then you
| definitely aren't leaking a file descriptor. The great
| finalizer in the linux kernel gets them then.
|
| 2. Yes absolutely, but that also isn't a leak. If you're
| opening a lot of files, you probably have to handle open
| failing as well anyway.
|
| 3. No order is guaranteed
|
| Most of this is documented on
| https://pkg.go.dev/runtime#SetFinalizer
| tylerhou wrote:
| 1. This can lead to programming errors if, e.g., a write is
| buffered in Golang and Close() flushes the buffer. Then you
| might not correctly write the file. (I know that if you
| really cared, you should use fsync, but lost writes could
| happen in e.g. logging where you don't want the overhead of
| fsync but you would also like to see all log output,
| especially on program crash.)
|
| 2. I think this is a bigger deal than you are making it out
| to be. If open fails, how would you handle it without just
| exiting? I can't see a way of forcing finalizers to run. If
| you're distributing your Go binary to users, you may not
| have permissions to increase the allowed number of file
| descriptors. So your program no longer functions correctly.
|
| Example: A program that processes files in parallel. At any
| given time it might have 2 * num_cores files open, well
| below the default descriptor limit on most systems. If I
| rely on finalizers running, then I might have to exit if
| the time to process each file is sufficiently short. There
| is no way to fix this without instructing the user to
| increase their fd limit.
| This is bad. Alternatively, if I
| explicitly closed files, I would never exit.
| lordnacho wrote:
| All really good things to mention, but the data races thing is a
| big deal and gets just a short reference and where is the bit
| about the borrow checker?
|
| The way I see it the things mentioned are nice appetizers and the
| data race and borrow checker are the main meal.
|
| IME the most frustrating problems are not that you forgot to
| exhaust the switch statement or didn't initialise a new field,
| it's when you get a segfault that's hard to reproduce and you
| have barely a hint about what's caused it.
| tylerhou wrote:
| In addition, most of these checks are provided in other
| languages or linters for those languages. E.g. C++ has had RAII
| before Rust has existed. C++ (with -Werror), TypeScript and
| most FP languages have exhaustive switch checking. Clang can
| catch many (but not all) initialization errors. The places
| where Rust adds value (data races, memory safety, explicit
| unsafe) are not discussed in the article.
| eptcyka wrote:
| I've seen people who state their love for Rust and then fail to
| explain the difference between passing an argument by value,
| reference or mutable reference.
| carlmr wrote:
| I mean those are two separate things. Depending on how you
| code, Rust can be similarly high level as Python, make it
| much easier to design with types than C++ and has great
| package management with cargo.
|
| You can find plenty of reasons to love Rust, without even
| getting to the technical details.
| woodruffw wrote:
| This is pedantic of me, but I think it matters in terms of what
| Rust _actually_ provides:
|
| Rust does not provide any resource leak guarantees. The fact that
| resources tend to be freed when their owning scope closes is a
| "fallout" consequence of ownership semantics, but Rust itself
| does not guarantee that dropping an object necessarily closes or
| disposes any underlying system resources.
| You can prove this to
| yourself by writing a thin wrapper over the nix crate's POSIX
| APIs: you can leak a file descriptor by forgetting to add a Drop
| implementation for it.
|
| Similarly, Rust won't guarantee that all allocated memory is
| freed. `Box::leak` has well-defined lifetime semantics: it turns
| a `T` into a `&'static T` by removing its drop handler and
| leaking the underlying pointer. And this isn't a problem, because
| it doesn't compromise either spatial or temporal memory safety!
| nicoburns wrote:
| It's true that it's not a guarantee. But I feel like this is
| one of those cases where it could be an issue in theory, but
| pretty much never is in practice.
| woodruffw wrote:
| Certainly. Rust has a well-thought-out standard library, and
| sticking to it will (generally) guarantee that the connection
| between resource acquisition and memory safety is maintained.
|
| That being said: it can be a problem in practice,
| particularly in sandboxed or otherwise constrained
| environments. Leaking a file descriptor isn't a problem when
| you have tens of thousands, but it can be one when you've
| constrained the process to just a dozen.
| amelius wrote:
| The difference between theory and practice is exactly where
| security exploits shine.
| toolz wrote:
| Are you suggesting if rust removed "safe in practice"
| features (only keeping theoretically safe features) it
| would lead to less exploitable software? If so I strongly
| disagree with you. Every language is rife with features
| that can be used in unsafe way but in practice increases
| security.
| woodruffw wrote:
| I read this more as a "let's be precise about what's
| actually guaranteed" and not an exhortation to avoid
| Rust.
|
| Rust is my favorite compiled language, and that's why I'd
| like conversations about Rust to be grounded in _formal_
| guarantees and not in incidental properties.
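The point in the subthread above, that safe Rust releases resources only as a side effect of `Drop` running and does not guarantee that it runs, can be sketched with the standard library alone. This is a minimal illustration, not anything from the article: `Guard` and `count_releases` are hypothetical names, and the counter stands in for a real resource such as a file descriptor.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-in for a resource handle. Its "release" is recorded
// by bumping a shared counter when `drop` runs.
struct Guard {
    released: Rc<Cell<u32>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

// Returns the release count observed after each of three steps.
fn count_releases() -> (u32, u32, u32) {
    let released = Rc::new(Cell::new(0));

    // 1. Ordinary scoping: the guard is dropped at the closing brace.
    {
        let _scoped = Guard { released: Rc::clone(&released) };
    }
    let after_scope = released.get();

    // 2. `std::mem::forget` is a safe function; `drop` never runs.
    std::mem::forget(Guard { released: Rc::clone(&released) });
    let after_forget = released.get();

    // 3. `Box::leak` hands back a `&'static mut Guard`; again no `drop`.
    let _leaked: &'static mut Guard =
        Box::leak(Box::new(Guard { released: Rc::clone(&released) }));
    let after_leak = released.get();

    (after_scope, after_forget, after_leak)
}

fn main() {
    let (after_scope, after_forget, after_leak) = count_releases();
    println!("{after_scope} {after_forget} {after_leak}");
}
```

Only the first guard is ever released, and no `unsafe` is involved, which is consistent with the claim that leaking does not compromise memory safety.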
| xedrac wrote:
| Sure, but just because Rust isn't perfect doesn't mean it
| isn't a huge improvement over the status quo.
| Andys wrote:
| If "in practice" is OK, then Go looks pretty good again.
| LAC-Tech wrote:
| Needs a companion blog, "perfectly safe code the rust compiler
| will nag you about". And the contortions rust programmers go
| through to avoid that.
|
| Rust is really impressive in a lot of ways. Type classes and
| pattern are a great fit for systems programming.
|
| But they're fixated on the idea that everything possible should
| be a static analysis error, language ergonomics or usability be
| damned. I'd much rather these be warnings, because no static
| analysis on earth is going to stop you from actually needing
| tests to see if your code works.
| rhn_mk1 wrote:
| > no static analysis on earth is going to stop you from
| actually needing tests to see if your code works.
|
| I might have been convinced if mathematical proofs were not
| expressed in code. If a proof can exhaustively cover the
| problem space, then there's no need for further testing.
|
| https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon...
| jonpalmisc wrote:
| Do you have any examples of the "perfectly safe code the Rust
| compiler will nag you about"? Not trying to start language
| wars, just genuinely curious as someone who writes Rust on
| occasion.
| verdagon wrote:
| Something I find interesting about Rust is that we _can_ do
| those safe patterns, as long as we're willing to lose some
| performance.
|
| The way I think of it: Rust forces us to choose between
| flexibility and zero-cost memory safety.
|
| If we choose zero-cost memory safety (in other words, we don't
| use Rc or unsafe or large Cells) we can't do things like
| dependency injection, basic observers, backreferences, or many
| kinds of custom RAII. But we do get speed.
|
| On the other hand, if we allow e.g.
| Rc into our codebases, we
| can do these patterns just fine, though there is a performance
| hit.
|
| The final challenge in learning Rust (IMO) is to figure out
| when Rc is better, and when we can afford the complexity cost
| of zero-cost memory safety. I've seen a lot of Rust projects
| move mountains to avoid Rc, and ironically end up adding more
| run-time overhead and complexity.
| hu3 wrote:
|     let wordlist_file = File::open("wordlist.txt")?;
|     // do something...
|     // we don't need to close wordlist_file
|     // it will be closed when the variable goes out of scope
|
| What happens if there is an error when closing the file that I
| have written to?
| LegionMammal978 wrote:
| The error will be ignored, per the docs for File [0]:
|
| > Files are automatically closed when they go out of scope.
| Errors detected on closing are ignored by the implementation of
| Drop. Use the method sync_all if these errors must be manually
| handled.
|
| [0] https://doc.rust-lang.org/stable/std/fs/struct.File.html
| josephg wrote:
| That sounds like a weird choice. When do you ever write to a
| file but not care if the write has failed?
|
| Sounds like a source of bugs any time your file system isn't
| 100% reliable.
| eptcyka wrote:
| If you close your files without waiting for fsync to return
| first, do you really care if the data has hit the disk? If
| fsync didn't fail, but close fails, what can you do then?
| Calling close() doesn't imply anything about flushing
| buffers or syncing data to disk or anything like that. It's
| just a signal to the OS that your process is done with this
| particular resource.
| CraigJPerry wrote:
| You should probably notify the user though
| deathanatos wrote:
| I agree it's a bit unfortunate. The rub here would be that
| `Drop` would become fallible, and if it is fallible, then
| ... _how_ does it fail, exactly? (What happens to the
| error?)
|
| There's exceptions, but the downsides to such systems are
| pretty extensively covered.
|
| Nonetheless, the point here is that RAII offers a
| deterministic close compared to other approaches, at least,
| even if the write's success isn't covered. You can get
| that, too, with,
|
|     wordlist_file.flush()?;
|
| or
|
|     wordlist_file.sync_all()?;
|
| depending on desires.
|
| (And again, I agree that requiring the programmer to
| remember code in order to obtain safe behavior is not
| desirable. But this problem manifests in pretty much any
| other language, and typically in worse manners.)
| [deleted]
| verdagon wrote:
| As far as I know, Vale is the only language that can statically
| ensure we handle that error with its Higher RAII [0], a form of
| linear typing.
|
| Basically, File's drop() returns a Result, and the compiler
| enforces that we use it.
|
| I hear linear types might also be coming to Haskell soon, which
| is pretty exciting. Such a thing is unfortunately impossible in
| Rust (though many languages can detect it at run-time).
|
| [0] https://verdagon.dev/blog/higher-raii-7drl
| kitkat_new wrote:
| why is it impossible?
| SCHiM wrote:
| Off topic: what type of error can occur when closing a file? Is
| it somehow possible that the kernel denies your request, and
| forces your handle to stay open?
| colonwqbang wrote:
| One situation is that you close something twice or otherwise
| try to close an invalid fd. Still, it's very common to ignore
| the return value of close.
|
| If you use NFS or other specific drivers you can probably get
| more interesting errors.
| TheDong wrote:
| Quoting from "man 2 close":
| https://man7.org/linux/man-pages/man2/close.2.html
|
| > it is quite possible that errors on a previous write(2)
| operation are reported only on the final close() ... Failing
| to check the return value when closing a file may lead to
| silent loss of data.
|
| > the behavior that occurs on Linux ... the file descriptor
| is guaranteed to be closed.
|
| So yeah, it's always closed on linux, but POSIX doesn't
| guarantee that for EINTR specifically, and there are
| sometimes meaningful errors.
| [deleted]
| hu3 wrote:
| Great question!
|
| Here is a better explanation than I could write:
| https://www.joeshaw.org/dont-defer-close-on-writable-files
|
| In summary, man close(2) gives us the potential errors. This
| is the output for Ubuntu 20.04 LTS:
|
|     EBADF  fd isn't a valid open file descriptor.
|
|     EINTR  The close() call was interrupted by a signal; see
|            signal(7).
|
|     EIO    An I/O error occurred.
|
|     ENOSPC, EDQUOT
|            On NFS, these errors are not normally reported against
|            the first write which exceeds the available storage
|            space, but instead against a subsequent write(2),
|            fsync(2), or close().
| [deleted]
___________________________________________________________________
(page generated 2022-05-05 23:00 UTC)