[HN Gopher] Exploiting null-dereferences in the Linux kernel
       ___________________________________________________________________
        
       Exploiting null-dereferences in the Linux kernel
        
       Author : kuter
       Score  : 80 points
       Date   : 2023-01-19 18:37 UTC (4 hours ago)
        
 (HTM) web link (googleprojectzero.blogspot.com)
 (TXT) w3m dump (googleprojectzero.blogspot.com)
        
       | high_byte wrote:
       | 8 days to exploit :) pretty neat.
       | 
       | and 2 years on servers? still worth a shot. I bet it can be much
       | faster in certain scenarios.
        
       | azakai wrote:
       | IIUC steps 5-7 in the exploit cause around 2^32 oopses. I don't
       | know much about the Linux kernel - could it perhaps have a limit
       | on the number of oopses before it halts the entire system?
       | 
       | The article explains why it is important to not do that in
       | general, as an oops allows debugging and recovery etc. But 2^32
       | of them seems suspicious.
        
         | roguebantha wrote:
         | Yes, there is now an oops limit, specifically because of this
         | technique - see the conclusion paragraph.
        
           | azakai wrote:
           | Ah, thanks! I should finish reading the entire article before
           | commenting, sorry...
        
       | jeffbee wrote:
       | The article is about the exploitability of the flaw but really
       | the flaw should not exist. Printing /proc/$pid/smaps is not on
       | any conceivable performance-critical hot path. It can stand to
       | have bounds checks and safety. The call to print out smaps should
       | be well-encapsulated in some non-C language.
        
         | [deleted]
        
         | nix0n wrote:
         | > Printing /proc/$pid/smaps is not on any conceivable
         | performance-critical hot path.
         | 
         | I disagree, for profiling memory usage it's useful to get
         | memory map data multiple times per second.
        
           | jeffbee wrote:
           | If you think smaps is performance-critical, that raises the
           | question of its ridiculous textual format. Clearly, it would
           | be vastly more efficient to pass the information as a
           | protobuf or whatever. Believe me, as the person who had to
           | refactor the smaps-reading library on cost/efficiency grounds
           | at Google, this issue is nearer and dearer to me than to
           | probably anyone else.
        
         | chc4 wrote:
         | Yes, in an ideal world your kernel shouldn't have any bugs. We
         | don't live in an ideal world.
         | 
         | Security engineering is the field of _practical_ mitigations -
         | given that there are, in fact, null pointer dereferences in the
         | kernel, mmap_min_addr and a count limit on kernel oopses
         | provide defense in depth to help prevent them from being
         | exploitable.
        
         | roguebantha wrote:
         | Thankfully this isolated flaw was quite easy to fix. And _yes_
         | this code isn't likely to be on any hot paths, and code can
         | always stand to have bounds/sanity checks (and it always
         | should). But unfortunately encapsulating all non-hot-paths in
         | the Linux kernel that might have these sorts of bugs in a
         | memory-safe language is at best a very long-term goal and at
         | worst a pipe dream. The real goal of the blog post was not to
         | push for any sort of rewrite, but rather to note how even the
         | simplest and most innocuous of bugs can lead to
         | security-relevant primitives. And also to make sure kernel
         | developers and bug fixers have strategies like this in mind
         | when they evaluate other bugs in the future.
         | 
         | TLDR: However honorable the end-goal is, this blog post is not
         | the ammo you need to push for a big rewrite of various
         | kernel<->userland interfaces into memory safe languages.
        
         | tedunangst wrote:
         | What does your safe language do when it accesses a null object?
         | Does it oops?
        
           | deathanatos wrote:
           | My safe language doesn't have "null", more or less.
           | 
           | What it has is Option<T>, and I cannot turn that into a T
           | without handling the failure case: there is literally no way
           | to construct the code otherwise[1]. One _must_ handle the
           | failure path. (That might be by way of an explicit
           | panic/abort/oops, but it's then right there in the code: that
           | branch _will_ panic ... and safely.)
           | 
           | [1] (This example is using safe Rust. There's unsafe Rust too
           | and there I can chase the null pointer all I want with that,
           | but the parent's point is that we should be sticking to safe
           | interfaces for stuff like this. And I'm using Rust as an
           | example, but Option is hardly unique to Rust; heck, Rust
           | stole the idea from its predecessors.)
        
             | tedunangst wrote:
             | What happens when I call unwrap?
        
               | SpaghettiCthulu wrote:
               | You shouldn't be allowed to in the kernel
        
               | monocasa wrote:
               | A panic there absolutely could bail out to the read call
               | at the root of this kernel stack with an EIO or some
               | such.
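A minimal userland sketch of the idea monocasa describes: a panic raised by unwrap() is stopped at a boundary and turned into an error code for the caller. It relies on `std::panic::catch_unwind`, which only helps when panics unwind. The names `read_handler` and `EIO` are hypothetical, chosen for illustration, and nothing here shows how in-kernel Rust actually behaves.

```rust
use std::panic;

// Hypothetical errno-style constant (EIO is 5 on Linux), for illustration only.
const EIO: i32 = 5;

// Hypothetical handler at the "root of the stack": any panic raised below it
// is caught here and reported to the caller as an error code.
fn read_handler(value: Option<u64>) -> Result<u64, i32> {
    // `unwrap()` panics on None; `catch_unwind` stops the unwinding here.
    panic::catch_unwind(move || value.unwrap()).map_err(|_| EIO)
}
```

Calling `read_handler(None)` yields `Err(EIO)` instead of terminating the program, provided the build uses unwinding panics (the default for ordinary Rust programs).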
        
               | deathanatos wrote:
               | That's the explicit handling of the None case I mentioned
               | in the comment: it causes an explicit, and safe, abort.
               | By "explicit", I mean the .unwrap() call will be right
               | there, in the method that needs to turn an Option<T> into
               | a T, and visible to a code reviewer. In the larger
               | context here of kernel code, it should raise the eyebrow
               | of the reviewer: "wait, this function _shouldn't_ abort,
               | it needs to handle the edge cases!".
               | 
               | (But for some userland app, aborting might be acceptable.
               | The kernel is in a bit of a bind, since an abort -- a
               | kernel panic -- means the user loses the computer until
               | they reboot, and the work along with it.)
               | 
               | Vs. a C pointer ... all uses are more or less equally
               | suspect; for any given use, you hope the code has done its
               | homework to ensure the pointer is not NULL, and if it is,
               | the consequence is UB. (And in Rust, and in the languages
               | Rust steals the idea of Option from, you're only
               | using/passing Options where "None"/null/nil is a
               | possibility. If it's not, or you've verified or handled
               | that at some outer stack frame, then you just pass a
               | reference to a T, which is statically guaranteed to point
               | to a valid object[1].)
               | 
               | [1] Again, barring buggy code using unsafe Rust, in the
               | example of Rust, or calling into C code that fails to
               | maintain its invariants, etc.
               | 
               | Take the example in the article, where the code does,
               | 
               |     priv->mm->mmap->vm_start
               | 
               | while trying to generate the output for smaps_rollup.
               | That's compilable, but buggy, C, because mmap can be
               | null, but we failed to check for it.
               | 
               | Vs., if mmap were an Option<T>, where T is whatever type
               | that pointer points to. Let's say our coder attempts to
               | write,
               | 
               |     priv->mm->mmap->vm_start
               | 
               | (In some imaginary language, because C doesn't have
               | Option, AFAIK.) The compiler would say, no, you can't
               | "->vm_start", because "mmap" _could_ be None (whatever
               | you call the "nothing here" value/variant; I'm going to
               | call it None, to distinguish it from the null pointer).
               | 
               | In the case of unwrap, the coder could do something like
               | (this is pseudo-code)
               | 
               |     (priv->mm->mmap).unwrap().vm_start
               | 
               | It would then be obvious there is an abort there. Their
               | reviewer would not be pleased with that, I suspect: we
               | don't want kernel panics or oops or aborts while
               | generating a file in /proc. And likely our imaginary
               | coder would know this too, and when the compiler errored
               | the first time, saying, "hey, mmap is an Option", they'd
               | raise an eyebrow, say something like, "wait, it is? When
               | would mmap be None?" and then proceed to properly handle
               | that case. (E.g., by treating it as if it were the empty
               | list.)
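A rough Rust rendering of the pseudo-code in the comment above, assuming `mmap` is modelled as an `Option`. The structs `Priv`, `Mm`, and `Vma` are hypothetical stand-ins that only mirror the field names mentioned in the thread; they are not the kernel's real definitions.

```rust
// Hypothetical stand-ins for the structures named in the thread.
struct Vma {
    vm_start: u64,
}

struct Mm {
    // None plays the role of the NULL `mmap` pointer.
    mmap: Option<Box<Vma>>,
}

struct Priv {
    mm: Mm,
}

#[derive(Debug, PartialEq)]
enum RollupError {
    NoMappings,
}

// Writing `p.mm.mmap.vm_start` would not compile, because `mmap` is an
// Option; the None case has to be spelled out before the field is reachable.
fn first_vm_start(p: &Priv) -> Result<u64, RollupError> {
    match p.mm.mmap.as_deref() {
        Some(vma) => Ok(vma.vm_start),
        None => Err(RollupError::NoMappings),
    }
}
```

Returning an error is only one possible choice here; treating None as an empty list, as the comment suggests, would work the same way with a different `None` arm.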
        
               | tedunangst wrote:
               | Ah, I see now, abort is better than oops. Why hasn't
               | anyone told the Linux developers they should use abort
               | instead of oops?
        
               | loeg wrote:
               | These juvenile and facile retorts do not elevate the
               | discourse, nor are they a good look. As a long-time
               | OpenBSD developer you have a lot of smart things to say
               | about technical subjects, including kernels specifically,
               | and I wish you would leave the dumb snarky comments
               | unwritten.
        
               | Dylan16807 wrote:
               | If an abort function still triggers cleanup, then yes it
               | is better. C doesn't have such a thing, so your sarcasm
               | about 'telling them' is unwarranted.
               | 
               | If an abort function doesn't trigger cleanup, then you
               | can block it at compile time to prevent this kind of bug.
               | But before you can even think about doing that, you need
               | to split pointers into nullable and non-nullable. And the
               | kernel devs already know about that idea, and how hard it
               | is to implement in C.
               | 
               | Nobody is naively suggesting "hey kernel devs do this
               | thing!" as if there isn't decades of momentum behind the
               | current codebase. It's just a look at how C is bad at
               | this particular kind of bug.
        
               | chc4 wrote:
               | The root cause of the bug isn't the kernel dereferencing
               | a NULL and causing UB. It's the kernel _doing error
               | handling_ and attempting to kill the oopsing task and
               | continue. If the semantics of Rust panics also did a
               | kernel oops, unwrap() would trigger the exact same bug
               | described in the blog post with regards to reference
               | count rollover (if Rust in the kernel doesn't do stack
               | unwinding).
        
               | jeffbee wrote:
               | Yes, but fundamentally the kernel's inability to handle
               | exceptional cases is all in deference to performance.
               | Non-performance-critical sections, which in a fair
               | analysis would be 99.9% of the kernel at least, should be
               | written in a language and style that provides structured
               | unwinding, not just jumping to the error case and whoops
               | I accidentally jumped beyond all the unlocks and
               | reference count decrements and deallocations. That's the
               | issue.
        
               | Dylan16807 wrote:
               | You wouldn't allow code that aborts without cleanup in
               | these areas of the kernel, or in the kernel at all.
               | 
               | You can't make a similar rule against null dereferences,
               | because those happen by accident. (Unless you wrap every
               | single pointer dereference, which is not happening.)
               | 
               | If you don't allow aborting, then the compiler makes you
               | write an error-handling path that returns, and the
               | cleanup code will not be skipped.
        
               | deathanatos wrote:
               | There are two bugs discussed in the article. One is the
               | one this thread started with, which _was_ the kernel
               | deref'ing a NULL.
               | 
               | You're right that this is separate from the handling of
               | the oops, which is the main exploitability that TFA is
               | getting at, and certainly fixing one deref leading to an
               | oops (the proc file chasing NULL) doesn't fix the other
               | bug of "any oops can be further exploited".
               | 
               | But the context of this subthread is the implication that
               | you must have some null, and some thing must happen when
               | it is chased. That assumption is wrong, that's what the
               | core of the comment I'm making is getting at: you can't
               | follow a null if you don't have the possibility of them
               | in the first place. (Or where you must have an Option<T>,
               | you can build safe interfaces for handling that fact.)
               | 
               | > _unwrap() would trigger the exact same bug described in
               | the blog post with regards to reference count rollover_
               | 
               | If we consider this instead as "an unwrap occurring
               | during the oops handling", _maybe_ , but it's not
               | guaranteed that that is the case. Other aspects of Rust
               | could similarly prevent that bug. I haven't fully grokked
               | the latter half of the article, but I didn't think it
               | would be necessary for the comment, as, a. the chain was
               | about "Printing /proc/$pid/smaps is not on any
               | conceivable performance-critical hot path." and b.
               | followed by the question about null.
               | 
               | Ref-counting in Rust is often dealt with via RAII, and is
               | safe through that, both in that RAII means the refcount
               | is managed correctly and without input from the coder,
               | but also Rc (and I presume Arc) will abort on overflow. I
               | don't _know_ if that would fully translate to kernel
               | code, given that we might be taking refs due to the
               | actions of userland, and that might be happening near the
               | userland/kernel boundary and be reasonably subject to
               | unsafe code that could very well fall prey to the same
               | problems.
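To make the "reference count rollover" point concrete, here is a small sketch contrasting a 32-bit counter that wraps silently with one that refuses to overflow. `RefCount` and its methods are hypothetical, written only for illustration; this is neither the kernel's refcount code nor the implementation of Rust's `Rc`.

```rust
// Hypothetical 32-bit reference counter, for illustration only.
struct RefCount(u32);

impl RefCount {
    // Takes a reference with no overflow check: after enough leaked
    // increments the counter wraps around to a small value.
    fn get_wrapping(&mut self) {
        self.0 = self.0.wrapping_add(1);
    }

    // Takes a reference only if the counter would not overflow.
    fn get_checked(&mut self) -> Result<(), &'static str> {
        match self.0.checked_add(1) {
            Some(n) => {
                self.0 = n;
                Ok(())
            }
            None => Err("refcount overflow"),
        }
    }
}
```

With the wrapping variant, a counter sitting at `u32::MAX` becomes 0 on the next increment, which is the kind of state a later decrement-and-free can turn into a use-after-free. The checked variant leaves the counter untouched and reports the failure instead.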
        
               | aseipp wrote:
               | The thread panics because the value is None, and it stops
               | execution before the attacker can take control of EIP.
               | 
               | Was that supposed to be a hard question?
        
               | avgcorrection wrote:
               | It shouldn't be a question that tedunangst should even
               | have to ask, given that he has had a chip on his shoulder
               | about Rust and its safety guarantees for years and thus
               | should have learned _that_ much about it by now.
        
               | btown wrote:
               | One of the missing pieces IMO is that your language needs
               | the right kind of shorthand to make it easy to say "I'm
               | calling things that return Options or Results, I myself
               | return an Option or Result, any time I unwrap let's
               | pipeline any null values or failures into returning a
               | failure immediately."
               | 
               | In Rust this is the ? operator:
               | https://doc.rust-lang.org/reference/expressions/operator-exp...
               | 
               | The whole idea is that there should never be a way to
               | call unwrap() if you as the caller cannot handle it
               | gracefully. And if you do this at every step up until the
               | UI layer, which can handle any failure as an error to be
               | displayed in the UI, then the job is complete!
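A short sketch of the `?` shorthand described above: each fallible step either produces a value or returns the failure to the caller immediately. The function `parse_range`, the `ParseError` enum, and the "start-end" hex input format are assumptions made up for this example.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    MissingDash,
    BadHex,
}

// Parses a hypothetical "start-end" field of hexadecimal addresses.
fn parse_range(field: &str) -> Result<(u64, u64), ParseError> {
    // `?` returns early with the error if the dash is missing.
    let (start, end) = field.split_once('-').ok_or(ParseError::MissingDash)?;
    // Each conversion failure is likewise passed straight up to the caller.
    let start = u64::from_str_radix(start, 16).map_err(|_| ParseError::BadHex)?;
    let end = u64::from_str_radix(end, 16).map_err(|_| ParseError::BadHex)?;
    Ok((start, end))
}
```

No call to unwrap() appears anywhere, so there is no panic path; the caller decides what a `ParseError` means at its own level.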
        
               | Y_Y wrote:
               | The Maybe Monad
        
           | wyldfire wrote:
           | In order to have the safe language I believe you would need
           | to decompose the code shown here in show_smaps_rollup(). If
           | the null deref occurred in the unsafe portion it would likely
           | still do an oops. If the null deref occurred in the safe
           | portion it would likely exit safely and cause the syscall to
           | return some errno that describes a kernel fault.
        
           | monocasa wrote:
           | Ideally it has an iterator construct built in so it views an
           | empty linked list chain truly as an empty list without
           | dereferencing the first (null) item preemptively.
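One way the iterator idea could look in Rust, as a sketch: the list head is an `Option`, so an empty list simply yields nothing and no first element is ever touched. The `Vma` node type and `iter_vmas` are hypothetical names for illustration, not kernel code.

```rust
// Hypothetical singly linked list node, for illustration only.
struct Vma {
    vm_start: u64,
    next: Option<Box<Vma>>,
}

// Walks the list from `head`; a None head produces an empty iterator.
fn iter_vmas<'a>(head: Option<&'a Vma>) -> impl Iterator<Item = &'a Vma> {
    std::iter::successors(head, |vma| vma.next.as_deref())
}
```

For example, `iter_vmas(None).count()` is 0, and summing or printing over the iterator needs no special case for the empty list.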
        
         | david2ndaccount wrote:
         | C could support the concept of nullable vs non-null pointers.
         | Clang even already has this as an extension:
         | https://clang.llvm.org/docs/AttributeReference.html#nullabil...
         | 
         | There is also an associated nullability sanitizer.
         | 
         | I use this in my own C code all the time and null pointer
         | errors vanish if you faithfully annotate every pointer. There's
         | also a pragma to make non null pointers the default in a file.
         | 
         | GCC devs would have to be convinced to add this to GCC and then
         | nullability annotations would need to be added to the kernel.
         | You can then do static analysis/compile error if you do an
         | unguarded dereference of a nullable pointer.
        
       ___________________________________________________________________
       (page generated 2023-01-19 23:00 UTC)