[HN Gopher] What Is Rust's Unsafe? (2019) ___________________________________________________________________ What Is Rust's Unsafe? (2019) Author : luu Score : 100 points Date : 2022-04-10 17:36 UTC (5 hours ago) (HTM) web link (nora.codes) (TXT) w3m dump (nora.codes) | verdagon wrote: | Interestingly enough, unsafe is the root reason Rust couldn't add | Vale's Seamless Concurrency feature [0], which is basically a way | to add a "parallel" loop that can access any existing data, | without refactoring it or any existing code. | | If Rust didn't have unsafe, it could have that feature. Instead, | the Rust compiler assumes a lot of data is !Sync, such as | anything that might indirectly contain a trait object (unless | explicitly given + Sync) which might contain a Cell or RefCell, | both of which are only possible because of unsafe. | | Without those, without unsafe, shared borrow references would be | true immutable borrow references, and Sync would go away, and we | could have Seamless Concurrency in Rust. | | I often wonder what else would emerge in an unsafe-less Rust! | | Still, given Rust's priorities (low-level development), the | borrow checker's occasional need for workarounds, and the sheer | usefulness of shared mutability, it was a wise decision for Rust | to include unsafe. | | [0] https://verdagon.dev/blog/seamless-fearless-structured- | concu... | ______-_-______ wrote: | I'm not sure I follow this. If Rust wanted the feature in that | blog post, they could restrict it to only accessing data that | is Sync. They wouldn't have to throw out the concept of Sync | entirely; in fact, cases like this are the reason it exists. | Rust just chooses to leave this kind of feature up to libraries | instead of building it into the language. | | And even without unsafe, you still couldn't assume all data is | Sync. Counter-examples include references to data in thread- | local storage, and most data used with ffi.
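As a minimal sketch of the shared-mutability point above: `Cell` (a real std type, internally built on unsafe code) permits mutation through a shared `&` reference, which is why `&T` in Rust means "shared" rather than "truly immutable", and why `Cell` is `!Sync`. The `bump` helper here is made up purely for illustration:

```rust
use std::cell::Cell;

// `Cell::set` mutates through a shared reference, so a shared borrow
// is not an immutable borrow. `Cell` is also `!Sync`, which is what
// keeps this shared mutability confined to a single thread.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

fn main() {
    let c = Cell::new(41);
    bump(&c); // mutation despite only holding &c
    assert_eq!(c.get(), 42);
    println!("{}", c.get());
}
```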
| zozbot234 wrote: | > Without those, without unsafe, shared borrow references would | be true immutable borrow | | Rust devs have thought about implementing "true immutable" | before and found it to be problematic. It would come in quite | handy for almost anything related to FP or referential | transparency/purity, but these things turn out to be very hard | to reconcile with the "systems" orientation of Rust. Perhaps | the answer will reside in some expanded notion of "equality" of | values and objects, which might allow for trivial variation | while verifying that the code you write respects the same | notion of "equality". | celeritascelery wrote: | Rust without unsafe would just be another obscure academic | language like Haskell. The ability to bypass the type system | when the programmer needs to is what makes Rust work. | whateveracct wrote: | Haskell also allows you to bypass the type system plenty. | tialaramex wrote: | Crucially, unsafe is also about the _social contract_. Rust's | compiler can't tell whether you wrote a safety rationale | adequately explaining why this use of unsafe was appropriate; | only other members of the community can decide that. Rust's | compiler doesn't prefer an equally fast _safe_ way to do a thing | over the unsafe way, but the community does. You could imagine a | language with exactly the same technical features but a different | community, where unsafe use is rife and fewer of the benefits | accrue. | | One use of "unsafe" that was not mentioned by Nora but is | important for the embedded community in particular is the use of | "unsafe" to flag things which from the point of view of Rust | itself are fine, but are dangerous enough to be worth having a | human programmer directed away from them unless they know what | they're doing. From Rust's point of view, | "HellPortal::unchecked_open()" is safe, it's thread-safe, it's | memory safe...
but it will summon demons that could destroy | mankind if the warding field isn't up, so that might need an | "unsafe" marker, and we can write a _checked_ version which | verifies that the warding is up and the stand-by wizard is | available to close the portal before we actually open it; the | checked one will be safe. | OtomotO wrote: | When I read posts like yours, I wish unsafe had a different | name like "human invariant" or whatever. | | Something that would make it harder to water down the meaning | of the clearly defined unsafe keyword to suddenly mean | something else. | | Using the unsafe keyword to mark a function as "potentially | dangerous" is just wrong. | | Just prefix your functions with something like | "dangerous_call", but don't misuse unsafe! | pjmlp wrote: | This is a complete misuse of unsafe as a language concept in | high-integrity computing. | vitno wrote: | This person isn't wrong. A lot of serious Rust users don't | agree with what the GP is suggesting. `unsafe` has an | explicit meaning: the user must uphold some invariant or | check something about the environment; otherwise it is memory | unsafe. | | I have several times in code review prevented people from | marking safe interfaces as "unsafe" because they are "special | and concerning"; overloading the usage of unsafe is itself | dangerous. | zozbot234 wrote: | True, but I think allowing potentially UB-invoking code to | not use "unsafe" (e.g. because the use is in the context of | FFI, so the unsafety is thought to be "obvious" and not | worth marking as such) might be even less advisable. This | makes it harder to ensure the "social rule" mentioned by | GP, that every potential UB should be endowed with a | "Safety" annotation describing the conditions for it to be | safe. | ______-_-______ wrote: | Your comment gave me an idea for a lint that might help | prevent those mistakes. Right now rustc flags `unsafe {}` | with an "unused_unsafe" warning.
However it doesn't warn | for `unsafe fn foo() {}`. Maybe it should. | gpm wrote: | I think as described you would end up with a reasonably | high number of false positives, because `unsafe fn foo() | { body that performs no unsafe operations }` can still be | unsafe to call if it interacts with private fields on data | structures used by safe-to-call functions that do perform | unsafe operations. | | For an example, consider Vec::set_len in the standard | library, which only contains safe code but lets you | access uninitialized memory, and memory beyond the length of | your allocation, by modifying the length field of the vector: | https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#1264 | | You might be able to fix this with a lint that looked at | a bit more context, though: `unsafe fn foo()` in a module | (or even crate) with no actually unsafe operations is | very likely wrong. Likewise `unsafe fn foo()` which | performs no unsafe operations and only accesses fields, | statics, functions, and methods that are public. | infogulch wrote: | Real formal verification is clearly a step up from Rust's | meaning of "safe", but I don't think it's wrong to try to add | another rung to the verification ladder at a different | height. Verification technologies have a lot of space to | improve in the UX department; here Rust trades off some | verification guarantees for a practical system that is still | meaningful. | wheelerof4te wrote: | "From Rust's point of view, "HellPortal::unchecked_open()" is | safe, it's thread-safe, it's memory safe... but it will summon | demons that could destroy mankind if the warding field isn't | up, so, that might need an "unsafe" marker and we can write a | checked version which verifies that the warding is up and the | stand-by wizard is available to close the portal before we | actually open it, the checked one will be safe." | | _"The only thing they fear is you"_ | | playing in the background.
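gpm's `Vec::set_len` point can be sketched with a toy type (hypothetical, not the real `Vec` source): an `unsafe fn` whose body performs no unsafe operations at all, yet which can break a private invariant that a genuinely unsafe operation elsewhere relies on:

```rust
pub struct RawBuf {
    data: Vec<u8>,
    len: usize, // invariant: len <= data.len()
}

impl RawBuf {
    pub fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        RawBuf { data, len }
    }

    /// No unsafe operation appears in this body, but it is an
    /// `unsafe fn`: violating `len <= data.len()` would make
    /// `first` below read out of bounds.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
    }

    /// Skips the bounds check, trusting the `len` invariant.
    pub fn first(&self) -> Option<u8> {
        if self.len == 0 {
            None
        } else {
            // Sound only while `len <= data.len()` holds.
            Some(unsafe { *self.data.get_unchecked(0) })
        }
    }

    pub fn len(&self) -> usize {
        self.len
    }
}

fn main() {
    let mut b = RawBuf::new(vec![10, 20, 30]);
    // The caller upholds the invariant here: 2 <= 3.
    unsafe { b.set_len(2) };
    assert_eq!(b.len(), 2);
    assert_eq!(b.first(), Some(10));
}
```

This is exactly why the naive lint would flag `set_len` as an unused `unsafe` even though the marker is load-bearing.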
| furyofantares wrote: | Would you be able to provide a real example of that HellPortal | thing? I'm not really following. | seba_dos1 wrote: | I'm not convinced that it's a great use of Rust's `unsafe`, | but since you want an example... dealing with voltage | regulators maybe? Where an invalid value put into some | register could fry your hardware? There's a ton of such cases | in embedded. | oconnor663 wrote: | You could probably make the case that any function that | might physically destroy your memory is memory unsafe :) | mikepurvis wrote: | Similar to the sibling, stuff where you're dealing with | parallel state in hardware, like talking to a device over i2c | or something, where you know certain things are _supposed_ to | happen but you don't, like, know know. | wheelerof4te wrote: | Imagine the code screaming | | _"Rip and tear, until it's done!"_ | tialaramex wrote: | I'm assuming "real example" means of such unsafe-means- | actually-unsafe behaviour in embedded Rust, as opposed to a | real example of summoning demons? | | For example volatile_register is a crate for representing | some sort of MMIO hardware registers. It will do the actual | MMIO for you: just tell it where your registers are in | "memory" and say whether they're read-write, read-only, or | write-only just once, and it provides the nice Rust interface | to the registers. | | https://docs.rs/volatile- | register/0.2.1/volatile_register/st... | | The low-level stuff it's doing is inherently unsafe, but it | is wrapping that. So when you call register.read() that's | safe, and it will... read the register. However even though | it's a wrapper it chooses to label the register.write() call | as unsafe, reminding you that this is a hardware register and | that's on you. | | In many cases you'd add a further wrapper, e.g.
maybe there's | a register for controlling clock frequency of another part, | you know the part malfunctions below 5kHz and is not | warrantied above 60kHz, so, your wrapper can take a value, | check it's between 5 and 60 inclusive and then do the | arithmetic and set the frequency register using that unsafe | register.write() function. You would probably decide that | your wrapper is now actually safe. | furyofantares wrote: | > I'm assuming "real example" means of such unsafe-means- | actually-unsafe behaviour in embedded Rust, as opposed to a | real example of summoning demons? | | That was what I meant, thanks for the answer! Though if you | have an example of the other thing I'd be open to that too | zozbot234 wrote: | The social contract is manageable precisely because the unsafe | subset of typical Rust codebases is a tiny fraction of the | code. This is also why I'm wary about expanding the use of | `unsafe` beyond things that are actually UB. Something that's | "merely" security sensitive or tricky should use Rust's | existing features (modules, custom types etc.) to ensure that | any use of the raw feature is flagged by the compiler, while | sticking to actual Safe Rust and avoiding the `unsafe` keyword | altogether. | saghm wrote: | I agree with this mindset a lot. One example I like about how | to deal with exposing a "memory and type safe but still | potentially dangerous" API is how rustls supports allowing | custom verification for certificates. By default, no such API | exists, and server certificates will be verified by the | client when connecting, with an error being returned if the | validation fails. However, they expose an optional feature | for the crate called "dangerous_configuration" which allows | writing custom code that inspects a certificate and | determines for itself whether or not the certificate is | valid. 
This is useful because often you might want to test | something locally or in a trusted environment with a self- | signed certificate but not want to actually deploy code that | would potentially allow an untrusted certificate to be | accepted. | EE84M3i wrote: | As mentioned in the article, it's entirely possible to create a | raw pointer to an object in safe rust, you just can't do much | with it. One thing you can do with it, though, is convert it to an | integer and reveal a randomized base address, which isn't | possible in some other languages without their "unsafe" features. | Of course, this follows naturally from Rust's definition of what | is safe, but I remember being kind of surprised by it when I | first learned Rust and didn't understand those definitions yet. | | It would be pretty interesting to me if someone wrote a survey of | what different languages consider to be "unsafe", including | specific operations like this. For example, it looks like | "sizeof" is relegated to the "unsafe" package in Go, which | strikes me as strange.[1] I'd love to read a big comparison. | | [1]: https://pkg.go.dev/unsafe#Sizeof | ridiculous_fish wrote: | Sometimes unsafe is used differently, for example when setting up | a signal handler or a post-fork handler. Process::pre_exec is | unsafe to indicate that some safe code may crash or produce UB | within this function. Only async-signal-safe functions should be | used, and Rust does not model that in its type system. | oxff wrote: | It is a very cool abstraction that makes lots of things possible, | which are (to my understanding) impossible to do in other | languages without the explicit safety contract.
| | Like this: https://www.infoq.com/news/2021/11/rudra-rust-safety/, | and I quote: "In C/C++, getting a confirmation from the | maintainers whether certain behavior is a bug or an intended | behavior is necessary in bug reporting, because there are no | clear distinctions between an API misuse and a bug in the API | itself. In contrast, Rust's safety rule provides an objective | standard to determine whose fault a bug is." | | People bring up `unsafe` Rust as an argument against the | language, but to me it appears to be an argument `for` it. | likeabbas wrote: | My only complaint is I wish they didn't name it `unsafe` and | instead named it something like `compiler_unverifiable` so that | people could more properly understand that we can make safe | abstractions around what the compiler can't verify. | dwohnitmok wrote: | I like the word `unsafe`. It's a nice red flag to newbies | that "you almost certainly shouldn't use this", and once you | have enough experience to use `unsafe` well you'll know its | subtleties well enough. | likeabbas wrote: | I don't think newbies using it would be a problem. There is | an issue with people/companies evaluating Rust and stopping | when they see `unsafe` without looking much further into | it. | mynameisash wrote: | I just saw a tweet from Esteban Kuber[0] the other day that | made me rethink a silly thing I did: | | "I started learning Rust in 2015. I've been using it since | 2016. I've been a compiler team member since 2017. I've | been paid to write Rust code since 2018. I have needed to | use `unsafe` <5 times. _That's_ why Rust's safety | guarantees matter despite the existence of `unsafe`." | | For me, I was being a bit lazy with loading some config on | program startup that would never change, so I used a | `static mut`, which requires `unsafe` to access. Turns out I | was able to figure out a way to pass my data around with an | `Arc<T>`.
I think either way would have worked, but I | figured I should avoid the unsafe approach anyway. | | [0] https://twitter.com/ekuber/status/1511762429226590215 | oconnor663 wrote: | Yeah, if you don't need mutation after initialization is | done, Arc<T> is a good option for sharing. Lazy<T> | (https://docs.rs/once_cell/latest/once_cell/sync/struct.Lazy...) | is also nice, especially if you want to keep treating | it like a global. I believe something like Lazy<T> is | eventually going to be included in the standard library. | shepmaster wrote: | I'll be even lazier (heh) and just straight up leak | memory for true once-set read-only config values. | | https://github.com/shepmaster/stack-overflow- | relay/blob/273a... | kibwen wrote: | If I were to go back and do it again I'd propose to rename | unsafe blocks to something like `promise`, to indicate that | the programmer is promising to uphold some invariants (the | keyword on function declarations would still be `unsafe`). | (Of course the idea of a "promise" means something different | in JavaScript land, but I'm not worried about confusion | there.) | gpm wrote: | I still like `trust_me`, maybe a bit unprofessional in some | contexts though. | ChadNauseam wrote: | While we're at it, I also would have preferred `&uniq` over | `&mut`. | kevincox wrote: | Or just `unchecked`. It is similarly short but holds a | similar meaning. Of course `unchecked_by_the_compiler` would | be more accurate. | kibwen wrote: | That's inaccurate, because Rust still performs all the | usual checks inside of unsafe blocks. The idea that unsafe | blocks are "turning off" some aspect of the language is a | persistent misconception, so it's best to pick a keyword | that doesn't reinforce that. | nyanpasu64 wrote: | Shrug.
I really don't think unchecked is "wrong" in | spirit, since adding an unsafe block makes it trivial to | write code which ignores the usual bounds and pointer | lifetime checks, which is not very different in practice | from turning off checks. Also, unsafe code is usually | wrong, and deeply difficult to write correctly, much more | so than in C. | | Also, I dislike the word "unsafe", since "unsafe code" is | easily (mis)interpreted to mean "invalid/UB code", but | "invalid/UB code" is officially called "unsound" rather | than "unsafe". Unsafe blocks are used to call unsafe | functions etc. (they pass information into unchecked | operations via arguments), and unsafe trait impls are | used by generic/etc. unsafe code like threading | primitives (unsafe trait impls pass information into | unchecked operations via return values or side effects). | | unchecked would make a good keyword name for blocks | calling unchecked operations, but I'm not so sure about | using it for functions or traits. | retrac wrote: | I believe the term "unsafe" in this sense predates Rust. | Haskell's GHC has unsafePerformIO [1], which is very similar | conceptually. Generally it's used to implement an algorithm in a | way where it can be pure/safe in practice, but where this | cannot be proved by the type system. | | [1] https://stackoverflow.com/questions/10529284/is-there- | ever-a... | pjmlp wrote: | The unsafe concept has existed since the early 60's in systems | programming languages. | | The real issue here is Rust fans trying to make Rust into a | dependently typed language while abusing unsafe's original | purpose. | pornel wrote: | hold_my_beer { } | verdagon wrote: | Jon Goodwin's Cone [0] language had a nice idea of calling it | "trust", as in, "Compiler, trust me, I know what I'm doing". | | [0] https://cone.jondgoodwin.com/coneref/reftrust.html | bitbckt wrote: | I like the sound of "trusted".
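A rough sketch of the fn-vs-trait distinction nyanpasu64 draws above (the `TrustedNonEmpty` trait is made up for illustration, not a std API): an `unsafe fn` puts an obligation on each caller, while an `unsafe impl` is a one-time unchecked promise that generic unsafe code may then rely on:

```rust
/// Hypothetical trait: implementors promise that `items()` never
/// returns an empty slice. The compiler cannot check this promise,
/// hence `unsafe trait`.
unsafe trait TrustedNonEmpty {
    fn items(&self) -> &[u8];
}

struct Pair([u8; 2]);

// The promise is made here, at the impl, not at a call site.
unsafe impl TrustedNonEmpty for Pair {
    fn items(&self) -> &[u8] {
        &self.0
    }
}

/// Generic code that relies on the promise to elide a check.
fn first<T: TrustedNonEmpty>(t: &T) -> u8 {
    // Sound because every impl guarantees at least one item.
    unsafe { *t.items().get_unchecked(0) }
}

fn main() {
    let p = Pair([7, 8]);
    assert_eq!(first(&p), 7);
}
```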
| MereInterest wrote: | I think the benefit of the name "unsafe" is that it | immediately tells newer users that the code inside has | deeper magic than they may be comfortable using. Where | "trusted" is what the writer attests to the compiler, | "unsafe" is what the writer warns a future reader about. | [deleted] | seba_dos1 wrote: | > People bring up `unsafe` Rust as an argument against the | language | | Those people usually don't understand what `unsafe` is. | zozbot234 wrote: | This article fails to mention that the true semantics of Rust's | `unsafe` subset have been very much up in the air for a long time. | Nowadays the `miri` interpreter is supposed to be giving us a | workable model for what `unsafe` code will or will not trigger | UB, but many things are still uncertain, sometimes intentionally | so, as some properties may depend on idiosyncratic workings of | the underlying OS, platform and/or hardware. These factors all | make it harder to turn what's currently `unsafe` into future | features of safe Rust, perhaps guarded by some sort of | formalized proof-carrying code. | kibwen wrote: | Progress towards this is always being made, though it's a long | road with much still to determine. I encourage anyone | interested in this to join the Zulip channel for Rust's unsafe | code guidelines working group: https://rust- | lang.zulipchat.com/#narrow/stream/t-lang.2Fwg-u... | | Some recent developments: https://gankra.github.io/blah/tower- | of-weakenings/ | [deleted] ___________________________________________________________________ (page generated 2022-04-10 23:00 UTC)