[HN Gopher] What Is Rust's Unsafe? (2019)
       ___________________________________________________________________
        
       What Is Rust's Unsafe? (2019)
        
       Author : luu
       Score  : 100 points
       Date   : 2022-04-10 17:36 UTC (5 hours ago)
        
 (HTM) web link (nora.codes)
 (TXT) w3m dump (nora.codes)
        
       | verdagon wrote:
       | Interestingly enough, unsafe is the root reason Rust couldn't add
        | Vale's Seamless Concurrency feature [0], which is basically a way
       | to add a "parallel" loop that can access any existing data,
       | without refactoring it or any existing code.
       | 
       | If Rust didn't have unsafe, it could have that feature. Instead,
       | the Rust compiler assumes a lot of data is !Sync, such as
       | anything that might indirectly contain a trait object (unless
       | explicitly given + Sync) which might contain a Cell or RefCell,
       | both of which are only possible because of unsafe.
       | 
       | Without those, without unsafe, shared borrow references would be
       | true immutable borrow references, and Sync would go away, and we
       | could have Seamless Concurrency in Rust.
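        | 
        | (To make that concrete: Cell permits mutation through a
        | *shared* reference, in fully safe code, thanks to its unsafe
        | internals. A minimal sketch, with an invented `bump` helper:
        | 
        |     use std::cell::Cell;
        | 
        |     // Mutation through a shared borrow, no `unsafe`
        |     // in sight, because Cell hides it inside.
        |     fn bump(counter: &Cell<u32>) {
        |         counter.set(counter.get() + 1);
        |     }
        | 
        |     fn main() {
        |         let c = Cell::new(0);
        |         bump(&c);
        |         assert_eq!(c.get(), 1);
        |     }
        | 
        | And since Cell is !Sync, anything that might transitively
        | contain one has to be assumed !Sync too.)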
       | 
       | I often wonder what else would emerge in an unsafe-less Rust!
       | 
        | Still, given Rust's priorities (low-level development), the
        | borrow checker's occasional need for workarounds, and the sheer
        | usefulness of shared mutability, it was a wise decision for Rust
        | to include unsafe.
       | 
       | [0] https://verdagon.dev/blog/seamless-fearless-structured-
       | concu...
        
         | ______-_-______ wrote:
         | I'm not sure I follow this. If Rust wanted the feature in that
         | blog post, they could restrict it to only accessing data that
         | is Sync. They wouldn't have to throw out the concept of Sync
         | entirely, in fact cases like this are the reason it exists.
         | Rust just chooses to leave this kind of feature up to libraries
         | instead of building it into the language.
         | 
         | And even without unsafe, you still couldn't assume all data is
         | Sync. Counter-examples include references to data in thread-
          | local storage, and most data used with FFI.
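          | 
          | For instance, scoped threads (in std since Rust 1.63)
          | already give a Sync-bounded version of "access existing
          | data in parallel". A minimal sketch:
          | 
          |     use std::thread;
          | 
          |     fn main() {
          |         let data = vec![1, 2, 3, 4];
          |         // Borrowing `data` from two threads
          |         // compiles because &[i32] is Sync; a
          |         // RefCell in there would be rejected.
          |         thread::scope(|s| {
          |             s.spawn(|| println!("{:?}", &data[..2]));
          |             s.spawn(|| println!("{:?}", &data[2..]));
          |         });
          |     }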
        
         | zozbot234 wrote:
          | > Without those, without unsafe, shared borrow references would
          | be true immutable borrow references
         | 
         | Rust devs have thought about implementing "true immutable"
         | before and found it to be problematic. It would come in quite
          | handy for almost anything related to FP or referential
         | transparency/purity, but these things turn out to be very hard
         | to reconcile with the "systems" orientation of Rust. Perhaps
         | the answer will reside in some expanded notion of "equality" of
         | values and objects, which might allow for trivial variation
         | while verifying that the code you write respects the same
         | notion of "equality".
        
         | celeritascelery wrote:
         | Rust without unsafe would just be another obscure academic
         | language like Haskell. The ability to bypass the type system
         | when the programmer needs to is what makes Rust work.
        
           | whateveracct wrote:
           | Haskell also allows you to bypass the type system plenty
        
       | tialaramex wrote:
        | Crucially, unsafe is also about the _social contract_. Rust's
       | compiler can't tell whether you wrote a safety rationale
       | adequately explaining why this use of unsafe was appropriate,
       | only other members of the community can decide that. Rust's
       | compiler doesn't prefer an equally fast _safe_ way to do a thing
       | over the unsafe way, but the community does. You could imagine a
       | language with exactly the same technical features but a different
       | community, where unsafe use is rife and fewer of the benefits
       | accrue.
       | 
       | One use of "unsafe" that was not mentioned by Nora but is
       | important for the embedded community in particular is the use of
       | "unsafe" to flag things which from the point of view of Rust
        | itself are fine, but are dangerous enough to be worth steering
        | a human programmer away from them unless they know what
       | they're doing. From Rust's point of view,
       | "HellPortal::unchecked_open()" is safe, it's thread-safe, it's
       | memory safe... but it will summon demons that could destroy
        | mankind if the warding field isn't up. So that might need an
        | "unsafe" marker, and we can write a _checked_ version which
        | verifies that the warding is up and the stand-by wizard is
        | available to close the portal before we actually open it; the
        | checked one will be safe.
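        | 
        | A sketch of the shape (all names invented, of course):
        | 
        |     pub struct HellPortal {
        |         warding_up: bool,
        |         wizard_on_standby: bool,
        |     }
        |     pub struct DemonRisk;
        | 
        |     impl HellPortal {
        |         /// # Safety
        |         /// The warding field must be up and a
        |         /// wizard on stand-by, or else demons.
        |         pub unsafe fn unchecked_open(&mut self) {
        |             /* open the portal */
        |         }
        | 
        |         /// Checked wrapper: verifies the
        |         /// preconditions, then does the unsafe
        |         /// call itself, so this one is safe.
        |         pub fn open(&mut self) -> Result<(), DemonRisk> {
        |             if !self.warding_up || !self.wizard_on_standby {
        |                 return Err(DemonRisk);
        |             }
        |             // SAFETY: warding and wizard checked above.
        |             unsafe { self.unchecked_open() };
        |             Ok(())
        |         }
        |     }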
        
         | OtomotO wrote:
         | When I read posts like yours, I wish unsafe had a different
         | name like "human invariant" or whatever.
         | 
         | Something that would make it harder to water down the meaning
         | of the clearly defined unsafe keyword to suddenly mean
         | something else.
         | 
         | Using the unsafe keyword to mark a function as "potentially
         | dangerous" is just wrong.
         | 
         | Just prefix your functions with something like
         | "dangerous_call", but don't misuse unsafe!
        
         | pjmlp wrote:
          | This is a complete misuse of unsafe as a language concept in
          | high-integrity computing.
        
           | vitno wrote:
           | This person isn't wrong. A lot of serious Rust users don't
           | agree with what the GP is suggesting. `unsafe` has an
           | explicit meaning: the user must uphold some invariant or
           | check something about the environment, otherwise it is memory
           | unsafe.
           | 
            | I have several times in code review prevented people from
            | marking safe interfaces as "unsafe" because they are "special
            | and concerning"; overloading the usage of unsafe is itself
            | dangerous.
        
             | zozbot234 wrote:
             | True, but I think allowing potentially UB-invoking code to
             | not use "unsafe" (e.g. because the use is in the context of
             | FFI, so the unsafety is thought to be "obvious" and not
             | worth marking as such) might be even less advisable. This
              | makes it harder to enforce the "social rule" mentioned by
              | GP, that every potential source of UB should be endowed
              | with a "Safety" annotation describing the conditions for
              | it to be safe.
        
             | ______-_-______ wrote:
             | Your comment gave me an idea for a lint that might help
             | prevent those mistakes. Right now rustc flags `unsafe {}`
             | with an "unused_unsafe" warning. However it doesn't warn
             | for `unsafe fn foo() {}`. Maybe it should.
        
               | gpm wrote:
                | I think as described you would get a reasonably high
                | number of false positives, because `unsafe fn foo() {
                | body that performs no unsafe operations }` can still be
                | unsafe to call if it interacts with private fields on
                | data structures used by safe-to-call functions that do
                | perform unsafe operations.
               | 
                | For example, consider Vec::set_len in the standard
                | library, which only contains safe code but lets you
                | access uninitialized memory, and memory beyond the
                | length of your allocation, by modifying the length
                | field of the vector:
               | https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#1264
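                | 
                | In sketch form (simplified, not the real Vec):
                | 
                |     pub struct RawBuf {
                |         buf: Box<[u8]>,
                |         init_len: usize,
                |     }
                | 
                |     impl RawBuf {
                |         /// # Safety
                |         /// `n` must not exceed the number
                |         /// of initialized bytes in `buf`.
                |         pub unsafe fn set_len(&mut self, n: usize) {
                |             // No unsafe *operations* here, so
                |             // the lint would fire, yet removing
                |             // `unsafe` would let safe callers
                |             // cause UB in the unsafe code that
                |             // trusts `init_len`.
                |             self.init_len = n;
                |         }
                |     }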
               | 
                | You might be able to fix this with a lint that looked at
                | a bit more context, though: `unsafe fn foo()` in a module
               | (or even crate) with no actually unsafe operations is
               | very likely wrong. Likewise `unsafe fn foo()` which
               | performs no unsafe operations and only accesses fields,
               | statics, functions, and methods that are public.
        
           | infogulch wrote:
            | Real formal verification is clearly a step up from Rust's
            | meaning of "safe", but I don't think it's wrong to try to add
            | another rung to the verification ladder at a different
            | height. Verification technologies have a lot of space to
            | improve in the UX department; here Rust trades off some
            | verification guarantees for a practical system that is still
            | meaningful.
        
         | wheelerof4te wrote:
         | "From Rust's point of view, "HellPortal::unchecked_open()" is
         | safe, it's thread-safe, it's memory safe... but it will summon
         | demons that could destroy mankind if the warding field isn't
         | up, so, that might need an "unsafe" marker and we can write a
         | checked version which verifies that the warding is up and the
         | stand-by wizard is available to close the portal before we
         | actually open it, the checked one will be safe."
         | 
         |  _" The only thing they fear is you"_
         | 
         | playing in the background.
        
         | furyofantares wrote:
          | Would you be able to provide a real example of that HellPortal
          | thing? I'm not really following.
        
           | seba_dos1 wrote:
           | I'm not convinced that it's a great use of Rust's `unsafe`,
           | but since you want an example... dealing with voltage
           | regulators maybe? Where an invalid value put into some
           | register could fry your hardware? There's a ton of such cases
           | in embedded.
        
             | oconnor663 wrote:
             | You could probably make the case that any function that
             | might physically destroy your memory is memory unsafe :)
        
           | mikepurvis wrote:
           | Similar to the sibling, stuff where you're dealing with
           | parallel state in hardware, like talking to a device over i2c
           | or something, where you know certain things are _supposed_ to
            | happen but you don't, like, know know.
        
           | wheelerof4te wrote:
           | Imagine the code screaming
           | 
           |  _" Rip and tear, until it's done!"_
        
           | tialaramex wrote:
           | I'm assuming "real example" means of such unsafe-means-
           | actually-unsafe behaviour in embedded Rust, as opposed to a
           | real example of summoning demons?
           | 
            | For example, volatile_register is a crate for representing
            | some sort of MMIO hardware registers. It will do the actual
            | MMIO for you: just tell it once where your registers are in
            | "memory" and whether they're read-write, read-only, or
            | write-only, and it provides a nice Rust interface to the
            | registers.
           | 
           | https://docs.rs/volatile-
           | register/0.2.1/volatile_register/st...
           | 
           | The low-level stuff it's doing is inherently unsafe, but it
           | is wrapping that. So when you call register.read() that's
            | safe, and it will... read the register. However, even though
            | it's a wrapper, it chooses to label the register.write() call
           | as unsafe, reminding you that this is a hardware register and
           | that's on you.
           | 
           | In many cases you'd add a further wrapper, e.g. maybe there's
           | a register for controlling clock frequency of another part,
           | you know the part malfunctions below 5kHz and is not
           | warrantied above 60kHz, so, your wrapper can take a value,
           | check it's between 5 and 60 inclusive and then do the
           | arithmetic and set the frequency register using that unsafe
           | register.write() function. You would probably decide that
           | your wrapper is now actually safe.
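            | 
            | Sketched out (the register layout is invented, the
            | numbers are from the example above):
            | 
            |     use volatile_register::RW;
            | 
            |     #[repr(C)]
            |     pub struct ClockRegs {
            |         freq_khz: RW<u32>,
            |     }
            | 
            |     impl ClockRegs {
            |         /// Safe wrapper: rejects frequencies
            |         /// the part isn't specified for.
            |         pub fn set_freq(&self, khz: u32) -> bool {
            |             if !(5..=60).contains(&khz) {
            |                 return false;
            |             }
            |             // SAFETY: value is within the
            |             // part's specified range, so the
            |             // write can't damage it.
            |             unsafe { self.freq_khz.write(khz) };
            |             true
            |         }
            |     }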
        
             | furyofantares wrote:
             | > I'm assuming "real example" means of such unsafe-means-
             | actually-unsafe behaviour in embedded Rust, as opposed to a
             | real example of summoning demons?
             | 
              | That was what I meant, thanks for the answer! Though if you
              | have an example of the other thing, I'd be open to that too.
        
         | zozbot234 wrote:
         | The social contract is manageable precisely because the unsafe
         | subset of typical Rust codebases is a tiny fraction of the
         | code. This is also why I'm wary about expanding the use of
         | `unsafe` beyond things that are actually UB. Something that's
         | "merely" security sensitive or tricky should use Rust's
         | existing features (modules, custom types etc.) to ensure that
         | any use of the raw feature is flagged by the compiler, while
         | sticking to actual Safe Rust and avoiding the `unsafe` keyword
         | altogether.
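          | 
          | E.g. a token type with a private constructor makes the
          | compiler flag every unchecked use, with no `unsafe`
          | anywhere (a minimal sketch, names invented):
          | 
          |     mod warding {
          |         // Not constructible outside this module.
          |         pub struct Warded(());
          | 
          |         pub fn check() -> Option<Warded> {
          |             // ... real runtime check here ...
          |             Some(Warded(()))
          |         }
          | 
          |         // Dangerous-but-safe API: the token
          |         // forces every caller through `check`.
          |         pub fn open_portal(_proof: Warded) {}
          |     }
          | 
          |     fn main() {
          |         if let Some(token) = warding::check() {
          |             warding::open_portal(token);
          |         }
          |     }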
        
           | saghm wrote:
            | I agree with this mindset a lot. One example I like of how
            | to expose a "memory and type safe but still potentially
            | dangerous" API is how rustls supports custom verification
            | for certificates. By default, no such API
           | exists, and server certificates will be verified by the
           | client when connecting, with an error being returned if the
           | validation fails. However, they expose an optional feature
           | for the crate called "dangerous_configuration" which allows
           | writing custom code that inspects a certificate and
           | determines for itself whether or not the certificate is
           | valid. This is useful because often you might want to test
           | something locally or in a trusted environment with a self-
            | signed certificate but not want to actually deploy code that
           | would potentially allow an untrusted certificate to be
           | accepted.
        
       | EE84M3i wrote:
       | As mentioned in the article, it's entirely possible to create a
        | raw pointer to an object in safe Rust; you just can't do much
       | with it. One thing you can do with it though is convert it to an
       | integer and reveal a randomized base address, which isn't
       | possible in some other languages without their "unsafe" features.
       | Of course, this follows naturally from Rust's definition of what
       | is safe, but I remember being kind of surprised by it when I
       | first learned Rust and didn't understand those definitions yet.
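        | 
        | For reference, none of this needs an `unsafe` block (a
        | minimal sketch):
        | 
        |     static X: i32 = 42;
        | 
        |     fn main() {
        |         // Creating a raw pointer and casting it
        |         // to an integer are both safe; only
        |         // *dereferencing* it would need `unsafe`.
        |         let p: *const i32 = &X;
        |         let addr = p as usize;
        |         // Prints an address offset by the
        |         // randomized image base.
        |         println!("{:#x}", addr);
        |     }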
       | 
       | It would be pretty interesting to me if someone wrote a survey of
       | what different languages consider to be "unsafe", including
       | specific operations like this. For example, it looks like
       | "sizeof" is relegated to the "unsafe" package in Go, which
       | strikes me as strange.[1] I'd love to read a big comparison.
       | 
       | [1]: https://pkg.go.dev/unsafe#Sizeof
        
       | ridiculous_fish wrote:
        | Sometimes unsafe is used differently, for example when setting up
        | a signal handler or a post-fork handler. Command::pre_exec is
        | unsafe to indicate that some safe code may crash or produce UB
        | within this function: only async-signal-safe functions should be
        | used, and Rust does not model that in its type system.
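        | 
        | For example (a Unix-only sketch):
        | 
        |     use std::os::unix::process::CommandExt;
        |     use std::process::Command;
        | 
        |     fn main() -> std::io::Result<()> {
        |         let mut cmd = Command::new("true");
        |         // SAFETY: this closure runs between
        |         // fork() and exec(), so it must only do
        |         // async-signal-safe things; the type
        |         // system cannot check that for us.
        |         unsafe {
        |             cmd.pre_exec(|| Ok(()));
        |         }
        |         cmd.status()?;
        |         Ok(())
        |     }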
        
       | oxff wrote:
        | It is a very cool abstraction that makes possible lots of things
        | that are (to my understanding) impossible to do in other
        | languages, which lack this kind of explicit safety contract.
       | 
       | Like this: https://www.infoq.com/news/2021/11/rudra-rust-safety/,
       | and I quote: "In C/C++, getting a confirmation from the
       | maintainers whether certain behavior is a bug or an intended
       | behavior is necessary in bug reporting, because there are no
       | clear distinctions between an API misuse and a bug in the API
       | itself. In contrast, Rust's safety rule provides an objective
       | standard to determine whose fault a bug is."
       | 
       | People bring up `unsafe` Rust as an argument against the
       | language, but to me it appears to be an argument `for` it.
        
         | likeabbas wrote:
         | My only complaint is I wish they didn't name it `unsafe` and
         | instead named it something like `compiler_unverifiable` so that
         | people could more properly understand that we can make safe
         | abstractions around what the compiler can't verify.
        
           | dwohnitmok wrote:
           | I like the word `unsafe`. It's a nice red flag to newbies
           | that "you almost certainly shouldn't use this" and once you
           | have enough experience to use `unsafe` well you'll know its
           | subtleties well enough.
        
             | likeabbas wrote:
             | I don't think newbies using it would be a problem. There is
             | an issue with people/companies evaluating Rust and stopping
             | when they see `unsafe` without looking much further into
             | it.
        
             | mynameisash wrote:
             | I just saw a tweet from Esteban Kuber[0] the other day that
             | made me rethink a silly thing I did:
             | 
             | "I started learning Rust in 2015. I've been using it since
             | 2016. I've been a compiler team member since 2017. I've
             | been paid to write Rust code since 2018. I have needed to
              | use `unsafe` <5 times. _That's_ why Rust's safety
              | guarantees matter despite the existence of `unsafe`."
             | 
             | For me, I was being a bit lazy with loading some config on
             | program startup that would never change, so I used a
              | `static mut`, which requires `unsafe` to access. Turns out I
             | was able to figure out a way to pass my data around with an
             | `Arc<T>`. I think either way would have worked, but I
             | figured I should avoid the unsafe approach anyway.
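              | 
              | Roughly the shape of the change (`Config` is
              | invented for the sketch):
              | 
              |     use std::sync::Arc;
              |     use std::thread;
              | 
              |     struct Config { retries: u32 }
              | 
              |     fn main() {
              |         // Instead of `static mut CONFIG`,
              |         // which needs `unsafe` on every
              |         // read, share the value via Arc:
              |         let config = Arc::new(Config { retries: 3 });
              |         let cfg = Arc::clone(&config);
              |         thread::spawn(move || {
              |             println!("{}", cfg.retries);
              |         })
              |         .join()
              |         .unwrap();
              |     }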
             | 
             | [0] https://twitter.com/ekuber/status/1511762429226590215
        
               | oconnor663 wrote:
                | Yeah, if you don't need mutation after initialization is
                | done, Arc<T> is a good option for sharing. Lazy<T>
                | (https://docs.rs/once_cell/latest/once_cell/sync/struct.Lazy...)
                | is also nice, especially if you want to keep treating
                | it like a global. I believe something like Lazy<T> is
                | eventually going to be included in the standard library.
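                | 
                | e.g. (a sketch; the env var name is made up):
                | 
                |     use once_cell::sync::Lazy;
                | 
                |     static CONF: Lazy<String> = Lazy::new(|| {
                |         // Runs once, on first access.
                |         std::env::var("APP_CONF")
                |             .unwrap_or_default()
                |     });
                | 
                |     fn main() {
                |         println!("{}", &*CONF);
                |     }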
        
               | shepmaster wrote:
               | I'll be even lazier (heh) and just straight up leak
               | memory for true once-set read-only config values.
               | 
               | https://github.com/shepmaster/stack-overflow-
               | relay/blob/273a...
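                | 
                | I.e. (a sketch):
                | 
                |     struct Config { retries: u32 }
                | 
                |     fn load() -> Config {
                |         Config { retries: 3 }
                |     }
                | 
                |     fn main() {
                |         // Leak one allocation at startup;
                |         // the config lives as long as the
                |         // process, so &'static is honest.
                |         let cfg: &'static Config =
                |             Box::leak(Box::new(load()));
                |         println!("{}", cfg.retries);
                |     }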
        
           | kibwen wrote:
           | If I were to go back and do it again I'd propose to rename
           | unsafe blocks to something like `promise`, to indicate that
           | the programmer is promising to uphold some invariants (the
           | keyword on function declarations would still be `unsafe`).
           | (Of course the idea of a "promise" means something different
           | in JavaScript land, but I'm not worried about confusion
           | there.)
        
             | gpm wrote:
             | I still like `trust_me`, maybe a bit unprofessional in some
             | contexts though.
        
             | ChadNauseam wrote:
             | While we're at it, I also would have preferred `&uniq` over
             | `&mut`.
        
           | kevincox wrote:
            | Or just `unchecked`. It is just as short and holds a similar
            | meaning. Of course `unchecked_by_the_compiler` would be more
            | accurate.
        
             | kibwen wrote:
             | That's inaccurate, because Rust still performs all the
             | usual checks inside of unsafe blocks. The idea that unsafe
             | blocks are "turning off" some aspect of the language is a
             | persistent misconception, so it's best to pick a keyword
             | that doesn't reinforce that.
        
               | nyanpasu64 wrote:
               | Shrug. I really don't think unchecked is "wrong" in
               | spirit, since adding an unsafe block makes it trivial to
               | write code which ignores the usual boundary and pointer
               | lifetime checks, which is not very different in practice
               | from turning off checks. Also unsafe code is usually
               | wrong, and deeply difficult to write correctly, much more
               | so than C.
               | 
               | Also, I dislike the word "unsafe" since "unsafe code" is
               | easily (mis)interpreted to mean "invalid/UB code", but
               | "invalid/UB code" is officially called "unsound" rather
               | than "unsafe". Unsafe blocks are used to call unsafe
               | functions etc. (they pass information into unchecked
               | operations via arguments), and unsafe trait impls are
               | used by generic/etc. unsafe code like threading
               | primitives (unsafe trait impls pass information into
               | unchecked operations via return values or side effects).
               | 
               | unchecked would make a good keyword name for blocks
               | calling unchecked operations, but I'm not so sure about
               | using it for functions or traits.
        
           | retrac wrote:
           | I believe the term "unsafe" in this sense predates Rust.
           | Haskell's GHC has unsafePerformIO [1], which is very similar
            | conceptually. It is generally used to implement an algorithm
            | in a way that is pure/safe in practice, but where this cannot
            | be proved by the type system.
           | 
           | [1] https://stackoverflow.com/questions/10529284/is-there-
           | ever-a...
        
           | pjmlp wrote:
            | The unsafe concept has existed since the early 60s in
            | systems programming languages.
           | 
           | The real issue here is Rust fans trying to make Rust into a
           | type dependent language while abusing unsafe's original
           | purpose.
        
           | pornel wrote:
            | hold_my_beer { }
        
           | verdagon wrote:
           | Jon Goodwin's Cone [0] language had a nice idea of calling it
           | "trust", as in, "Compiler, trust me, I know what I'm doing"
           | 
           | [0] https://cone.jondgoodwin.com/coneref/reftrust.html
        
           | bitbckt wrote:
           | I like the sound of "trusted".
        
             | MereInterest wrote:
             | I think the benefit of the name "unsafe" is that it
             | immediately tells newer users that the code inside has
             | deeper magic than they may be comfortable using. Where
             | "trusted" is what the writer attests to the compiler,
             | "unsafe" is what writer warnings to a future reader.
        
         | [deleted]
        
         | seba_dos1 wrote:
         | > People bring up `unsafe` Rust as an argument against the
         | language
         | 
         | Those people usually don't understand what `unsafe` is.
        
       | zozbot234 wrote:
        | This article fails to mention that the true semantics of Rust's
        | `unsafe` subset have been very much up in the air for a long
        | time. Nowadays the `miri` interpreter is supposed to be giving
        | us a workable model of when `unsafe` code is or is not going to
        | trigger UB, but many things are still uncertain, sometimes
        | intentionally so, as some properties may depend on idiosyncratic
        | workings of the underlying OS, platform and/or hardware. These
       | factors all make it harder to turn what's currently `unsafe` into
       | future features of safe Rust, perhaps guarded by some sort of
       | formalized proof-carrying code.
        
         | kibwen wrote:
         | Progress towards this is always being made, though it's a long
         | road with much still to determine. I encourage anyone
         | interested in this to join the Zulip channel for Rust's unsafe
         | code guidelines working group: https://rust-
         | lang.zulipchat.com/#narrow/stream/t-lang.2Fwg-u...
         | 
         | Some recent developments: https://gankra.github.io/blah/tower-
         | of-weakenings/
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-04-10 23:00 UTC)