[HN Gopher] Uncovering a 24-year-old bug in the Linux Kernel (2021)
       ___________________________________________________________________
        
       Uncovering a 24-year-old bug in the Linux Kernel (2021)
        
       Author : endorphine
       Score  : 399 points
       Date   : 2022-10-15 13:08 UTC (9 hours ago)
        
 (HTM) web link (engineering.skroutz.gr)
 (TXT) w3m dump (engineering.skroutz.gr)
        
       | sponaugle wrote:
       | This was a cool example of a class of bugs that are both hard to
       | find with no active example, and hard to prevent in complex
       | systems. The optimization that was added many years ago for
       | performance skipped an update to a value that, in a very small
       | number of circumstances, still needed to be kept up to date.
       | 
       | It is an interesting thought experiment to consider what kind of
       | tool or automated detection could have found this. Some type of
       | dependency linking between variables might have shed some light,
       | but I'm not sure that would have really highlighted this kind of
       | issue.
       | 
       | Great description of both the bug and the path to the solution!
        
         | gizmo686 wrote:
         | Probably the only way to prevent this type of issue in an
         | automated fashion is to change your perspective from proving
         | that a bug exists, to proving that it doesn't exist. That is,
         | you define some properties that your program must satisfy to be
         | considered correct. Then, when you make optimizations such as
         | bulk receiver fast-path, you must prove (to the static analysis
         | tool) that your optimizations do not break any of the required
         | properties. You also need to properly specify the required
         | properties in a way that they are actually useful for what
         | people want the code to do.
         | 
         | All of this is incredibly difficult, and an open area of
         | research. Probably the biggest example of this approach is the
         | seL4 microkernel. To put the difficulty in perspective, I
         | checked out some of the seL4 repositories and did a quick line
         | count.
         | 
         | The repository for the microkernel itself [0] has 276,541 lines.
         | 
         | The testsuite [1] has 26,397 lines.
         | 
         | The formal verification repo [2] has 1,583,410 lines, over 5
         | times as much as the source code.
         | 
         | That is not to say that formal verification takes 5x the work.
         | You also have to write your source-code in such a way that it
         | is amenable to being formally verified, which makes it more
         | difficult to write, and limits what you can reasonably do.
         | 
         | Having said that, this approach can be done in a less severe
         | way. For instance, type systems are essentially a simple form
         | of formal verification. There are entire classes of bugs that
         | are simply impossible in properly typed programs; and more
         | advanced type systems can eliminate a larger class of bugs.
         | Although, to get the full benefit, you still need to go out of
         | your way to encode some invariant into the type system. You
         | also find that mainstream languages that try to go in this
         | direction always contain some sort of escape hatch to let the
         | programmer assert a portion of code is correct without needing
         | to convince the verifier.
         | 
         | [0] https://github.com/seL4/seL4
         | 
         | [1] https://github.com/seL4/sel4test
         | 
         | [2] https://github.com/seL4/l4v
        
           | xani_ wrote:
           | > That is not to say that formal verification takes 5x the
           | work. You also have to write your source-code in such a way
           | that it is amenable to being formally verified, which makes
           | it more difficult to write, and limits what you can
           | reasonably do.
           | 
           | Also hire significantly more skilled people. Put formal
           | verification in the job requirements and the pool of
           | candidates will shrink massively.
           | 
           | Explains why it is so rare really. "Spend 5-10x on developers
           | to have some bugs not happen" is not a great sell.
        
         | simtel20 wrote:
         | It's a great question! Thinking back...
         | 
         | At the time this bug was introduced it would probably have been
         | cost prohibitive to create a test case. We were proud of
         | 100mbit networks, had flaky nics the vendors didn't help
         | maintain much of the time (and which were often broken in
         | hardware) and the filesystem max file size was something like
         | 2tb, and most drives were in the handful of gbs. Conceiving
         | of testing for something like this would have been expensive.
         | And none of the big system vendors took Linux seriously then.
         | 
         | Though perhaps flooding zeros across a TCP socket could work, I
         | really think that a kernel hacker would have found a lot of
         | other hardware and driver issues before ever being able to
         | trigger this.
        
       | aposm wrote:
       | Awesome breakdown - as someone who is fairly familiar with TCP
       | theoretically but not with the details of the TCP implementation
       | in the Linux kernel, this was just the right balance of detail.
       | Great technical writing IMO!
        
       | myself248 wrote:
       | Okay so I got to the wrap-up at the end, about "why did nobody
       | else find this", the author sets up some logical dominoes but
       | doesn't knock them down. Allow me to try:
       | 
       | Earlier in the article, the author mentions that they recently
       | upgraded some network hardware, and the problem seemed to become
       | more frequent after that.
       | 
       | Packet loss or other network issues would force the stack to fall
       | out of fast-path and update the counter, avoiding the bug.
       | 
       | Running over ssh would avoid the bug. The only time you'd run
       | rsync not over ssh would be within your own network.
       | 
       | So it sounds like (this is my conjecture here) this would only
       | appear to someone running rsync internally, over a high-
       | performance network with no packet loss, and upgrading the
       | switches might've finally gotten the network good enough to
       | expose the bug?
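
A minimal sketch of why a counter that is never refreshed can eventually break a comparison. It assumes the counter in question is a 32-bit TCP sequence number compared with wraparound arithmetic, in the style of the kernel's before()/after() helpers; the comment does not say which field is involved, and `stale_marker` and `demo` are hypothetical names used only for illustration.

```python
def to_s32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def before(seq1: int, seq2: int) -> bool:
    """True if seq1 precedes seq2 in 32-bit wraparound order."""
    return to_s32(seq1 - seq2) < 0


def after(seq1: int, seq2: int) -> bool:
    """True if seq1 follows seq2 in 32-bit wraparound order."""
    return before(seq2, seq1)


def demo(bytes_received: int) -> bool:
    """Does the current sequence number still look newer than a stale marker?"""
    stale_marker = 1000  # hypothetical value that is never refreshed
    current = (stale_marker + bytes_received) & 0xFFFFFFFF
    return after(current, stale_marker)
```

With 1 MiB transferred, `demo(1 << 20)` reports that the current value is newer, as expected. Once more than 2 GiB has gone by without the marker being refreshed, `demo((1 << 31) + 1)` reports the opposite, which is the kind of rare, volume-dependent failure described here.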
        
         | cryptonector wrote:
         | One might expect this to have been hit by HPN (high performance
         | networking) users, but perhaps if they are storage I/O bound
         | rather than CPU or network I/O bound, then probably not.
        
         | ardel95 wrote:
         | That sounds plausible. But also, most software (browsers, web
         | service SDKs, RPC frameworks) treat TCP connections as fallible
         | by setting read/write timeouts and aggressively reopening
         | broken connections. So, I'm totally not surprised this issue
         | went unnoticed for this many years.
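
The defensive pattern described here can be sketched roughly as follows. This is an illustrative sketch, not how any particular browser or RPC framework does it; `read_with_reconnect` and its `connect` callback are hypothetical names.

```python
import socket
from typing import Callable


def read_with_reconnect(connect: Callable[[], socket.socket],
                        nbytes: int,
                        timeout_s: float = 5.0,
                        attempts: int = 3) -> bytes:
    """Read up to nbytes, abandoning and reopening connections that stall."""
    for _ in range(attempts):
        sock = connect()
        try:
            sock.settimeout(timeout_s)  # recv() now raises instead of blocking forever
            return sock.recv(nbytes)
        except socket.timeout:
            continue  # peer went quiet: treat the connection as broken and retry
        finally:
            sock.close()
    raise TimeoutError(f"no data after {attempts} attempts")
```

A client written this way would drop a stuck connection after `timeout_s` seconds and open a new one, so a stall like the one in the article would show up as a retry rather than a hang.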
        
       | verisimilitudes wrote:
       | _How is it possible for a TCP bug that leads to stuck connections
       | to go unnoticed for 24 years?_
       | 
       | It's because the fools responsible never rewrite their code, use
       | a broken language, and don't even try to prove half of the broken
       | garbage they write. Then, when it turns out to have been broken
       | for decades, they chuckle and shove another finger into another
       | crack, never understanding how they misuse computers.
        
       | layer8 wrote:
       | This is a good case for formal verification.
        
         | mdaniel wrote:
         | I struggle because I want to upvote these comments, because
         | that's the world I want to live in. But the opposite side of
         | that coin is who is going to author the incredibly arcane
         | _specification_ of TCP against which any such implementation is
         | formally verified?
         | 
         | Maybe TCP stacks are one of the few cases where that makes
         | sense, but I'd suspect if it was "worth the cost" it would have
         | already been done
        
           | layer8 wrote:
           | There are certain guarantees you want such a formal
           | specification to give, like for example not getting
           | permanently stuck in some state as with the present bug. You
           | can formalize the proofs for those guarantees and have their
           | correctness machine-checked. Something like TLA+/PlusCal is
           | likely suitable for that.
           | 
           | A formal specification is less ambiguous than a prose
           | specification. Formalizing the TCP specification will, if
           | anything, expose aspects where the specification is unclear,
           | or corner cases where the specification actually leads to
           | unwanted behavior and doesn't provide the desired guarantees.
           | 
           | So, while you can't prove that the formal specification
           | matches the prose specification 100%, you _can_ prove that
           | it provides all the guarantees the original prose
           | specification was aiming for (once you've formalized those
           | desired guarantees), which is something you can't do for the
           | prose specification.
        
       | sneak wrote:
       | > _These snapshots are updated daily through a pipeline that
       | involves taking an LVM snapshot of production data, anonymizing
       | the dataset by stripping all personal data, and transferring it
       | via rsync to the development database servers._
       | 
       | I don't know what sort of data these people process, but most
       | datasets about people are not anonymized by simply removing the
       | PII.
        
         | abraae wrote:
         | Yes they are. Any information that can be used to identify a
         | person by definition is PII.
         | 
         | Once all the PII is removed, by definition the dataset is
         | anonymized.
        
           | capitol_ wrote:
           | This is obviously true, as you are stating an axiom. But what
           | I think the grandparent is trying to say is that databases
           | stripped of PII can often be deanonymized by looking at the
           | other data that isn't obviously PII.
           | 
           | Take for example a database of all mobile phone positions
           | over time, this can be 'anonymized' by removing all
           | connections from the phones to information on who owns the
           | phones.
           | 
           | But it can still be trivially deanonymized by analyzing where
           | the phones are at night and during office hours, since few
           | people both work in the same building and sleep in the same
           | house.
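
The phone-position example can be sketched in a few lines. This is a toy illustration under assumed inputs: the `(phone_id, hour, location)` tuple layout, the hour ranges for "night" and "office hours", and both function names are hypothetical.

```python
from collections import Counter, defaultdict


def home_work_pairs(pings):
    """Map each pseudonymous phone ID to its most common (night, daytime) locations.

    pings: iterable of (phone_id, hour_of_day, location) tuples.
    """
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for phone_id, hour, location in pings:
        if hour >= 22 or hour < 6:      # assumed sleeping hours
            night[phone_id][location] += 1
        elif 9 <= hour < 17:            # assumed office hours
            day[phone_id][location] += 1
    return {
        pid: (night[pid].most_common(1)[0][0], day[pid].most_common(1)[0][0])
        for pid in night.keys() & day.keys()
    }


def unique_pairs(pairs):
    """Return the phone IDs whose (home, work) pair is shared with nobody else."""
    counts = Counter(pairs.values())
    return {pid for pid, pair in pairs.items() if counts[pair] == 1}
```

Any ID returned by `unique_pairs` corresponds to a single home and workplace combination, which is usually enough to look up a name from other sources even though no owner information is stored in the data itself.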
        
       | omginternets wrote:
       | Which kernel version has this patch?
        
       | MatthiasPortzel wrote:
       | I remember when this was originally posted, but I voted it up
       | again because I think it's such an excellent story, and excellent
       | programming. We need more people and companies like this, who are
       | willing to go beyond "oh it fails randomly sometimes" and track
       | down the underlying issues.
       | 
       | => https://news.ycombinator.com/item?id=26102241 Previous
       | Discussion (497 points - 41 comments)
        
         | [deleted]
        
         | c0mptonFP wrote:
         | > We need more people and companies like this, who are willing
         | to go beyond "oh it fails randomly sometimes" and track down
         | the underlying issues.
         | 
         | I absolutely disagree. Most capable engineers I know have this
         | urge to go down rabbit holes and fix any issue, this is nothing
         | special.
         | 
         | Everyone wants to be the hero that found a bug deep in the
         | stack, make a glorious pull request, and be celebrated in the
         | community.
         | 
         | I much more value people who have enough self-control to pick
         | meaningful battles, and follow the right priorities.
        
           | jackmott wrote:
        
           | black_puppydog wrote:
           | Eh, right, many bugs we have don't really matter.
           | 
           | Oh what is that you say, security vulnerabilities are also
           | just bugs that get exploited? Oh well...
        
             | [deleted]
        
           | [deleted]
        
           | rrss wrote:
           | In my experience, the "oh it fails randomly sometimes" bugs
           | are often in some random dull legacy infrastructure component
           | where there is zero attention or celebration for fixing them,
           | and so engineers tend to tolerate losing a bit of time once a
           | week due to them for years rather than someone spending half
           | a day to fix it for everyone.
        
           | robertlagrant wrote:
           | Exactly. I could fix any complex bug. I just choose not to.
        
           | KolmogorovComp wrote:
           | At the company level, it is indeed more expensive to fix
           | upstream rather than work around it, but on a macro scale it
           | is much more beneficial.
           | 
           | In my opinion fixing upstream whenever possible even if not
           | the best short-term solution should be considered the price
           | to pay for using OSS.
        
           | CSSer wrote:
           | GP's comment is also odd because the article notes they took
           | your approach. They documented the problem when they first
           | noticed it happening infrequently and moved on to higher
           | priorities. When it started happening every single day it
           | became mission critical to investigate.
        
           | HenrikB wrote:
           | I think this was well prioritized; they struggled with the
           | issue at times, found a temporary workaround, but when that
           | workaround stopped being effective and the bug hit them
           | every day, they decided to track down the source. Then they
           | reported upstream, it was reproduced, and someone patched it,
           | and rolled out new, fixed kernels.
           | 
           | That is a perfect example of how things work and should
           | work. They contributed to the community. I think it was a
           | great prioritization.
           | 
           | I'm certain there were lots of other people hitting this bug
           | and killing processes or rebooting to get around it. The
           | troubleshooting and reporting done here silently saved a lot
           | of other people a lot of effort - now and in the future.
           | I don't think they were after it to be heroes; they just
           | shared their story, which I'm sure will encourage others to
           | maybe do the same one day.
        
           | freedomben wrote:
           | This opinion is a popular one these days (particularly since
           | it complements the demands of business nicely by maximizing
           | personal/company profit), but it is a big part of the reason
           | why the majority of software these days is so unreliable and
           | buggy. It results in hacks on top of hacks to paper over
           | problems in the lower levels of the abstraction tower that is
           | modern software, and it results in tons of "WTF" bugs that
           | are just accepted and never fixed.
        
           | trasz wrote:
           | This _is_ the meaningful stuff. Engineers might have the
           | urge, but most don't have the opportunity, because they need
           | to focus on the currently fashionable framework.
           | 
           | A good rule of thumb regarding meaningful battles is to
           | ignore everything promoted by companies like Google or
           | Facebook - everything they do is either going to be abandoned
           | in five years, or makes sense only in the context of solving
           | problems nobody else has.
        
             | stjohnswarts wrote:
             | seems like something an engineer might fix on their own
             | time if they were feeling feisty about the matter.
             | Something tells me if it went on for 20 years it was an
             | edge case that only very rarely came up and was mostly a
             | non-issue.
        
               | trasz wrote:
               | I suspect it was definitely an issue, it's just that most
               | companies like Google don't care about reliability, only
               | availability, and it might just not show up in their
               | stats.
        
         | digiou wrote:
         | For the record, this is one of the top Greek employers. This is
         | Greece's Amazon essentially. The C-team are intact since day-1
         | and AFAIK still writing (some) code.
         | 
         | It is not unheard of to have 4-day weeks and developer-first
         | mindset at that place.
        
       | charcoalhobo wrote:
       | Love deep dive troubleshooting like this. I haven't heard of
       | systemtap before; looks nice. When I had to troubleshoot a kernel
       | bug [1] I used perf [2] probes which are also really nice for
       | this kind of debugging.
       | 
       | [1] https://www.spinics.net/lists/xdp-newbies/msg01231.html
       | 
       | [2] https://www.brendangregg.com/perf.html
        
       | thow232329 wrote:
       | "This setup has worked rather well for the better part of a
       | decade and has managed to scale from 15 developers to 150"
       | 
       | LOL
        
         | dang wrote:
         | Could you please stop creating accounts for every few comments
         | you post? We ban accounts that do that. This is in the site
         | guidelines: https://news.ycombinator.com/newsguidelines.html.
         | 
         | You needn't use your real name, of course, but for HN to be a
         | community, users need some identity for other users to relate
         | to. Otherwise we may as well have no usernames and no
         | community, and that would be a different kind of forum.
         | https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
         | 
         | Also, could you please stop posting unsubstantive and/or snarky
         | and/or flamebait comments? It's not what this site is for, and
         | it destroys what it is for. If you wouldn't mind reviewing
         | https://news.ycombinator.com/newsguidelines.html and taking the
         | intended spirit of the site more to heart, we'd be grateful.
        
       | halukakin wrote:
       | Could someone provide link(s) on how regular snapshots of
       | databases can be taken like this? (Googling didn't help much,
       | maybe I'm googling for the wrong keywords.) For me, backing up
       | the database is a few-hour-long process. Restoring it for a
       | developer is again a few-hour process. I read about snapshots
       | before but haven't realized they could be this effective.
        
         | rrdharan wrote:
         | It's the lack of clarity on how they manage access control for
         | what should be regulated data that surprises me, more than the
         | technology achievement.
        
         | nick__m wrote:
         | for mariadb :
         | 
         | 0) make sure the database data volume is on lvm or zfs
         | 
         | in a sql prompt:
         | 
         | 1) BACKUP STAGE START; BACKUP STAGE BLOCK_COMMIT;
         | 
         | 2) \! the shell command to take the snapshot
         | 
         | 3) BACKUP STAGE END;
         | 
         | you can now mount your snapshot, copy it offsite and delete it.
         | The restore procedure is left as an exercise!
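
The steps above can be sketched as a small helper that writes out the statements for a single client session. This is a sketch under assumptions: the volume group, logical volume, snapshot name and size in `LVM_SNAPSHOT` are hypothetical placeholders, and the helper only builds the text, it does not connect to a database. Everything is kept in one session because the shell escape has to run while the backup stage is still held.

```python
def snapshot_script(snapshot_cmd: str) -> str:
    """Build the statements to feed to the MariaDB command-line client."""
    return "\n".join([
        "BACKUP STAGE START;",
        "BACKUP STAGE BLOCK_COMMIT;",
        f"\\! {snapshot_cmd}",   # \! runs a shell command from the client
        "BACKUP STAGE END;",
    ]) + "\n"


# Hypothetical volume group, logical volume, snapshot name and size.
LVM_SNAPSHOT = "lvcreate --snapshot --size 10G --name dbsnap /dev/vg0/mariadb"
```

Printing `snapshot_script(LVM_SNAPSHOT)` gives the four lines to paste or pipe into the client; on ZFS the snapshot command would be a `zfs snapshot` invocation instead.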
        
           | halukakin wrote:
           | Very helpful. Thank you!
        
         | ClumsyPilot wrote:
         | can't most COW filesystems like BTRFS or ZFS take a snapshot at
         | a point in time instantly?
        
           | abdulocracy wrote:
           | LVM does the same but at the block level.
           | 
           | https://wiki.archlinux.org/title/Create_root_filesystem_snap...
        
         | mauvehaus wrote:
         | Because it isn't a backup. They put the database into a
         | quiescent state on disk, take a file system snapshot, let the
         | dbms resume working, and send the snapshot data via rsync.
         | 
         | This requires the cooperation of the dbms software to get the
         | on-disk data quiesced. Then your snapshot has to go fast enough
         | that the dbms doesn't end up with too many spinning plates
         | before you let it start writing normally.
        
           | halukakin wrote:
           | Got it. Thank you!
        
       | justin_oaks wrote:
       | I love when you're using open source software and can find the
       | bug yourself, even if it's deep down the stack.
       | 
       | Imagine if this bug were somewhere in closed source software.
       | You'd have to reach out to the software's customer support team.
       | Every time I reach out to customer support I expect to have an
       | unpleasant experience. It is rarely otherwise.
        
         | [deleted]
        
         | xani_ wrote:
         | Kinda why I'm not a fan of cloud, same black box problem.
        
         | perth wrote:
         | And even if you did reach out to customer support, it would
         | rarely ever get dev attention unless most people have the
         | issue. Even in that case, it sometimes still gets a fat
         | wontfix, like the famous OneDrive file corruption bug.
        
           | themoonisachees wrote:
           | Raising this bug in Windows (how? Microsoft sells support,
           | barely, but you can't talk to the IPv4 stack dev anyway) would
           | get you laughed out of the chat room because it can't possibly
           | be the IP stack's fault.
        
       | didgetmaster wrote:
       | As someone who thrives on tracking down rare but annoying bugs in
       | a debugger, I love stories like this. It is not just bugs that
       | cause real failures which can be headaches; but also bugs that
       | just slow things down unexpectedly. They can sometimes go
       | undetected for decades like this one.
       | 
       | I wrote an article this past year that talks about silent bugs
       | that slowly eat resources and collectively can be very expensive
       | in terms of wasted time and energy:
       | https://didgets.substack.com/p/finding-and-fixing-a-billion-...
        
         | xani_ wrote:
         | > As someone who thrives on tracking down rare but annoying
         | bugs in a debugger,
         | 
         | As someone that is cursed to inevitably find some obscure bug
         | the second I start using some piece of software I'm happy I'm
         | not the only one
         | 
         | > I wrote an article this past year that talks about silent
         | bugs that slowly eat resources and collectively can be very
         | expensive in terms of wasted time and energy
         | 
         | "Using JS for backend is ecoterrorism" lmao
        
         | myself248 wrote:
         | Okay but where's the bug story? Did I miss the story?
        
           | didgetmaster wrote:
           | I wrote the article right after I fixed a huge inefficiency
           | problem in a function within my own project. I neglected to
           | give the specifics in the article, but here they are since
           | you asked.
           | 
           | My Didgets tool lets you create pivot tables against
           | relational database tables, even very large ones. For the
           | pivot values, you can choose to just count the occurrence of
           | each value or if it is a number type you can add them up. You
           | can also add up the values in a separate number column. Here
           | is a quick demo video:
           | https://www.youtube.com/watch?v=2ScBd-71OLQ
           | 
           | When adding up numbers in a separate column, I had just a few
           | lines of unnecessary code that ended up being called
           | exponentially. For smaller tables it was barely noticeable,
           | but for tables with 30 million+ rows it really bogged down.
           | 
           | A simple fix to the affected lines caused a certain test
           | against a large table to go from over 10 minutes down to
           | under 20 seconds. The effects of just a few lines of code
           | when applied to a big enough data set can really impact
           | performance. It is the old Einstein equation E=mc2 in effect
           | which is discussed here:
           | https://didgets.substack.com/p/musings-from-an-old-programme...
        
             | shurane wrote:
             | I guess there is a lost art of writing for optimal
             | code/memory/execution time, especially as our resources
             | increase.
             | 
             | I think the idea here is to write code quickly that's
             | inefficient, and re-write it to be efficient if the
             | performance is required down the line. For companies where
             | there's bigger fish to fry, i.e. customer acquisition, it's
             | more useful to pump out more features (even at the expense
             | of bugs) because that draws customers.
             | 
             | But in places where performance is important, you do see
             | developers squeeze out more cycles/memory. I.e. kernel/OS
             | development, database servers, video games. It's just that
             | most developers aren't in those areas of specialty anymore.
             | 
             | Btw, have you heard of https://handmade.network/ and
             | https://en.wikipedia.org/wiki/Demoscene ? Wondering what
             | your thoughts are in those areas. There are probably more
             | communities like the ones I mentioned, where developers are
             | interested in writing the kind of code that you are talking
             | about.
        
         | pvillano wrote:
         | > but also bugs that just slow things down unexpectedly. They
         | can sometimes go undetected for decades like this one.
         | 
         | Reminds me of the GTA Online quadratic time JSON parsing bug
        
       | itismetheidiot wrote:
       | how odd to see a write-up from the skroutz.gr blog on the front
       | page of HN...
        
         | dang wrote:
         | Also these!
         | 
         |  _Speeding Up Our Build Pipelines_ -
         | https://news.ycombinator.com/item?id=20775297 - Aug 2019 (24
         | comments)
         | 
         |  _The infrastructure behind one of the most popular sites in
         | Greece_ - https://news.ycombinator.com/item?id=9982361 - July
         | 2015 (5 comments)
         | 
         |  _Working with the ELK stack_ -
         | https://news.ycombinator.com/item?id=9008119 - Feb 2015 (35
         | comments)
        
         | NKosmatos wrote:
         | Yeap, it's a bit strange, but the post was very well written,
         | with a nice breakdown and easily understandable steps that can
         | be followed by most software engineers.
         | 
         | There have been some sporadic posts from Skroutz in the past,
         | but nothing that gained so much attention.
         | 
         | For those that don't know it, Skroutz is the biggest Greek
         | online price aggregator/e-commerce market/price comparison
         | site.
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-10-15 23:00 UTC)