[HN Gopher] Debugging a Linux network stack crash via a single r...
       ___________________________________________________________________
        
       Debugging a Linux network stack crash via a single register value
        
       Author : jgrahamc
       Score  : 201 points
       Date   : 2021-11-17 23:54 UTC (1 days ago)
        
 (HTM) web link (blog.cloudflare.com)
 (TXT) w3m dump (blog.cloudflare.com)
        
       | ndesaulniers wrote:
       | I had a similar experience recently when we were trying to get
       | AMDGPU Linux kernel drivers to run without panic'ing.
       | 
       | ./scripts/decodecode is what produces the disassembly of the code
       | trace from the panic. (Seeing its output converted to Intel
       | syntax in this post is...heresy)
       | 
       | For AMDGPU, the issue was that the x86 Linux kernel doesn't use
       | 16B stack alignment (it uses 8B stack alignment), yet just the
       | AMDGPU driver was forcing the stack alignment for itself back to
       | 16B. The AMDGPU driver uses SSE2 instructions that require 16B
       | stack alignment. From the trace, seeing RSP as a multiple of 8
       | and not 16 was the smoking gun (i.e. a single register).
       | 
       | The fix was to use the same stack alignment (8B) consistently in
       | the driver when using SSE2 (except for old GCC versions), see
       | this series:
       | https://lore.kernel.org/lkml/20191016230209.39663-2-ndesauln....
       | Stack alignment has obvious ABI implications.
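
The alignment check described in the comment above can be sketched in a few lines. This is only an illustrative sketch: the `is_aligned` helper and the RSP value are made up for the example and are not taken from the actual AMDGPU panic trace.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


# Hypothetical RSP value, invented for illustration only.
rsp = 0xFFFFC90000123AB8

print(is_aligned(rsp, 8))   # True: fine for the kernel's 8B stack alignment
print(is_aligned(rsp, 16))  # False: too loose for code that assumes 16B
```

An address whose last hex digit is 8 is a multiple of 8 but not of 16. Aligned SSE loads and stores such as `movaps` fault on a memory operand in that state, which is why one register value can be enough to spot this class of bug.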
        
       | uvdn7 wrote:
       | Brilliant!! What's really cool is that Jakub approached the crash
       | systemically. There are hard bugs in life e.g. cache
       | inconsistencies, kernel bugs. What's more important than fixing
       | them with a one-off solution is to come up with a systemic approach
       | to them. Great job!
       | 
       | Cloudflare is indeed a really cool place.
        
       | stuff4ben wrote:
       | Wow, that was a journey and very well written technical article!
       | I didn't understand a good half of it since it's been 20 years
       | since my single class in assembly language. But it makes me feel
       | good that people like this exist!
        
         | kingcharles wrote:
         | This was my thought too. I can follow the x86 and C code, but
         | the networking and kernel debugging was way above my pay grade.
         | 
         | Thank FSM for cleverer people than myself. Another case of
         | feeling like an imposter...
        
       | [deleted]
        
       | Hikikomori wrote:
       | Reminds me of
       | https://quickview.cloudapps.cisco.com/quickview/bug/CSCso053...
        
       | abainbridge wrote:
       | Impressive bug hunting skills.
       | 
       | When was this bug introduced? Does anyone maintain a list of when
       | all the known Linux kernel bugs were introduced? I'd love to know
       | how many bugs are added to the kernel each year, and if the rate
       | is changing.
       | 
       | I'm not trying to troll. I think the Linux kernel is an amazing
       | piece of software engineering. I just think this would be an
       | interesting metric.
        
         | LukeShu wrote:
         | The kernel folks do a pretty good job of keeping track of which
         | past commits a new commit "fixes", which they put in the commit
         | message. For example, the patch linked in the article says:
         |     Fixes: bf296b125b21 ("tcp: Add GRO support")
         |     Fixes: f993bc25e519 ("net: core: handle encapsulation offloads when computing segment lengths")
         |     Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
         | 
         | That said, using this to track how many bugs are introduced
         | each year is problematic. It's often the case that commit A
         | introduces a bug, commit B aims to fix it and says "Fixes: A"
         | but turns out to only be a partial fix, and then commit C
         | completes the fix and says "Fixes: B". Naively, based on these
         | annotations it would make sense to say "B introduced a bug",
         | but as in my example, this isn't always the case.
         | 
         | Greg KH discusses this in his talk "CVEs are dead" (video:
         | https://www.youtube.com/watch?v=HeeoTE9jLjM slides:
         | https://github.com/gregkh/presentation-cve-is-dead/blob/mast...
         | ).
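
The pitfall LukeShu describes can be illustrated with a small sketch. It parses `Fixes:` tags from the commit message quoted above, then compares a naive tally with one that follows fix chains back to their origin. The A/B/C history, the regular expression, and the helper names (`parse_fixes`, `origin`) are assumptions made for illustration, not kernel tooling.

```python
import re
from collections import Counter

# Matches lines such as: Fixes: bf296b125b21 ("tcp: Add GRO support")
FIXES_RE = re.compile(
    r'^Fixes:[ \t]+([0-9a-f]{8,40})[ \t]+\("(.*)"\)[ \t]*$', re.MULTILINE
)


def parse_fixes(message):
    """Return (abbreviated hash, subject) pairs for every Fixes: tag."""
    return FIXES_RE.findall(message)


def origin(commit, fixes):
    """Follow "X fixes Y" links until reaching a commit that fixes nothing."""
    seen = set()  # guards against accidental cycles in the mapping
    while commit in fixes and commit not in seen:
        seen.add(commit)
        commit = fixes[commit]
    return commit


message = '''\
Fixes: bf296b125b21 ("tcp: Add GRO support")
Fixes: f993bc25e519 ("net: core: handle encapsulation offloads when computing segment lengths")
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
'''
for sha, subject in parse_fixes(message):
    print(sha, subject)

# Hypothetical history from the comment: B says "Fixes: A", C says "Fixes: B".
fixes = {"B": "A", "C": "B"}

naive = Counter(fixes.values())
chained = Counter(origin(target, fixes) for target in fixes.values())

print(dict(naive))    # {'A': 1, 'B': 1}  (B is blamed for a bug it only half-fixed)
print(dict(chained))  # {'A': 2}          (both fixes are attributed to A)
```

Following the chain is itself only a heuristic: a partial fix can also introduce a genuinely new bug, in which case the naive count would be the right one for that commit.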
        
       | abridgett wrote:
       | Superb article:
       | 
       | - shows how to debug a kernel oops
       | - shows use of extra tools (bpf, scapy, kasan)
       | - dives deep into more esoteric bits of networking, explaining
       |   everything from the basics to how the kernel implementation works
       | - demonstrates proof of the theory in multiple ways
       | 
       | Yes, it is a big "come work here on interesting stuff with
       | fantastic people" (and show off your own skills and learn
       | something neat) - and it's done without bragging and I'm sure
       | will help others to debug their next OOPS :)
       | 
       | Kudos to Jakub (and presumably reviewers)
        
       | megous wrote:
       | "One register value", and the article starts with a full stack
       | dump even with source code references available. :)
        
         | jerjerjer wrote:
         | Yes, excellent article but clickbaity title. Honestly, all
         | memory access errors eventually boil down to a "one register
         | value" and many other error types probably too. I mean, what
         | else is there if we go down far enough, really?
        
         | AlphENsign_Tech wrote:
         | Exactly the kind of articles we need, FTW.
        
       | Agingcoder wrote:
       | Brilliant article, thanks for posting.
       | 
       | This is typically the kind of post which makes me want to
       | actually apply, mostly because I can relate to it: it's not
       | magical, doesn't require gazillions of TPUs or petabytes of
       | storage; it's plain old, excellent engineering.
       | 
       | I wonder how often these issues creep up in practice, and how
       | long it took the author to sort it out though! I've had my share
       | of compiler bugs/kernel bugs, and they're usually quite expensive
       | to understand, and it takes a long time to convince yourself that
       | the bug is not in your code (granted, there's an oops here!)
        
         | babelfish wrote:
         | https://cloudflare.com/careers :)
        
       | diskzero wrote:
       | What a fun read! Was the bug fixed as a result of this
       | investigation, or was the fix already in a patch that these
       | machines didn't have? I wasn't able to figure that out by reading
       | the article.
        
         | azinman2 wrote:
         | If you click the link to the patch you see it was submitted by
         | Cloudflare.
        
         | xbar wrote:
         | It was fixed as a result of the article. Jakub opens with
         | "About a year ago...." and closes with a link to the fix thread
         | where he gets it accepted into the kernel back in July 2021.
         | 
         | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
        
       | quotemstr wrote:
       | Nobody should have to decode register values, look up source
       | files versions, or reconstruct a stack trace by hand. Crashes,
       | both in user- and kernel-space, should produce neat, tidy, self-
       | contained dumps that include not only the entire machine state,
       | but globally unique build IDs for all binaries involved in the
       | crash. And the debugger ought to be able to load one of these
       | crash dumps and find the debug symbols and source files
       | _automatically_.
       | 
       | Windows has been able to do this for decades. Why, in Unix-land,
       | are we still reading text reports about crashes and puzzling over
       | specific register values?
        
         | bitcharmer wrote:
         | How on earth are you going to map a particular spot in the code
         | without access to the sources?
         | 
         | I'd love you to explain how hunting kernel bugs is easier on
         | Windows.
        
       | holonomically wrote:
       | Microsoft is working on a project to formalize various parts of
       | the web stack and it would be interesting if their work also
       | carried over to the lower parts of the networking stack like in
       | this article. [1] I suspect this bug would have been caught if
       | the segment handling logic was implemented in a language with a
       | formal specification for the segment headers.
       | 
       | Is Cloudflare working on any formalization efforts like
       | Microsoft's Project Everest?
       | 
       | 1: https://www.microsoft.com/en-us/research/project/project-
       | eve...
        
       ___________________________________________________________________
       (page generated 2021-11-19 23:00 UTC)