[HN Gopher] Software bug made Bombardier planes turn the wrong way
       ___________________________________________________________________
        
       Software bug made Bombardier planes turn the wrong way
        
       Author : sohkamyung
       Score  : 54 points
       Date   : 2020-05-29 09:30 UTC (13 hours ago)
        
 (HTM) web link (www.theregister.co.uk)
 (TXT) w3m dump (www.theregister.co.uk)
        
       | trhway wrote:
       | >Most bugs in airliners tend to be unforeseen memory overflows
       | 
       | the 21st century, planet Earth.
        
       | parkovski wrote:
       | This reminds me of a meetup I attended last fall, they were
       | talking about the Spectre/Meltdown issues. I asked the presenters
       | if anything in chip manufacturing/verification processes had
       | changed as a result of that and they seemed surprised.
       | 
       | To me, when a software bug shows up in a critical system, that
       | means you actually have a logistics bug. Airplane control
       | software should not be allowed to have bugs. CPUs should not be
       | allowed to have bugs. And OS's should not be allowed to crash
       | (looking at you Microsoft).
       | 
       | When one of these things happens, in my opinion the correct
       | response is _not_ to just release fixes and workarounds and then
       | say "we'll try really hard to not let it happen again." You do
       | that, sure. But the first time you see airplane software
       | malfunction, that means you need to change the way the software
       | is written and released so that the whole class of issues will
       | not ever happen again. You don't stop at a public apology, you
       | don't fire the person that unintentionally wrote the bug. If you
       | have to hire mathematicians to formally prove the critical paths
       | of the software, you do that. If it costs 10x more to release
       | bug-free software, oh well, you do that.
       | 
       | All of these corporate people thinking they can save money by
       | spending less on quality are extremely naive. You can do a
       | financial analysis of this, but they're doing it wrong. Did you
       | ever consider what the cost of a whole generation just not
       | trusting air travel at all would be?
        
         | ashtonkem wrote:
         | On one hand, I understand your sentiment, on the other hand
         | even with these bugs air travel is as safe as it's ever been.
         | We've reached a point where fewer people die in air travel per
         | year than at any other point in the history of air travel, and
         | that's _before_ you account for the number of miles travelled.
         | It's almost ridiculous how safe air travel is on average.
        
           | martinald wrote:
           | That was true until 737 MAX, which statistically must have
           | been one of the most dangerous planes (or jets at least) in
           | history. Very few miles and 2 complete hull loss incidents
           | very close together. These bugs really do matter. You can
           | have quite a lot of minor issues and get away with it, but
           | when you hit a serious failure like the MAX had, even one that
           | is only triggered on 1 in 10,000 flights ends up causing an
           | awful lot of casualties.
        
             | CamperBob2 wrote:
             | The MAX problems weren't so much software bugs as
             | specification bugs. The software did exactly what it was
             | told to do by criminally-negligent engineering and
             | management personnel.
        
             | phire wrote:
             | MCAS was not a bug. The software behaved exactly as
             | specified.
             | 
             | The issue was the specification itself, which assumed
             | pilots would reliably catch the uncommanded trim down,
             | diagnose it and disable the whole electric trim subsystem
             | within seconds of the problem behavior arising.
             | 
             | That assumption turned out to be massively flawed.
        
               | jacquesm wrote:
               | Your comment implicitly - and probably unintentionally -
               | appears to assign part of the blame to the pilots, which
               | I think is a very bad thing to do in this particular
               | case.
        
               | shreyansj wrote:
               | I think specification here refers to the type
               | specification of the aircraft. It's not putting the
               | burden on the pilots but rather on the lack of pilot
               | training due to Boeing and airlines not wanting to bear
               | the cost of training pilots to a new aircraft type.
        
               | heavenlyblue wrote:
               | Then it means that they had to formally verify the
               | specification itself.
               | 
               | It's not that hard by the way. And they did that, but
               | handwaved the critique - the typical approach of "my guts
               | are probably more correct than maths".
        
           | londons_explore wrote:
           | _commercial_ air travel.
           | 
           | Private planes and industrial planes still have an awful
           | safety record.
           | 
           | Most stats also exclude 'unrelated' deaths which happen
           | during a flight (even though there is a good chance the
           | changes in air pressure, stress, lack of medical care, and
           | cramped conditions at least contributed to the death).
           | 
           | Stats also often exclude terrorist or war shootdowns of
           | commercial planes, which are starting to become significant.
        
         | cryptonector wrote:
         | This is not about saving money! You can't simply shut down
         | manufacturing of Intel or other chips that have
         | Spectre/Meltdown issues because that would leave us with
         | essentially no usable CPUs for new computers!
         | 
         | The Spectre/Meltdown issues are deep and architectural, not
         | simple to fix. It's not just a batch of CPUs that's the
         | problem, but _all_ of them.
         | 
         | Besides, if a CPU ships with a bug that can be fixed via a
         | microcode patch, then it would be a tremendous economic waste
         | for all humanity to throw those CPUs out.
         | 
         | Even when new CPUs come out that can be shown not to have
         | Spectre/Meltdown issues, it will take a long time to replace
         | the installed base of those that do because it's not a matter
         | of a little bit of money, but a matter of a great deal of money
         | and opportunity costs.
         | 
         | So microcode patches and software mitigations are all there is.
         | Absolutist attitudes don't help.
        
         | nickff wrote:
         | Are you willing to pay 10x more for the product with that
         | supposed extra reliability (100% vs 99.99966%)? Before you
         | answer, you must remember that perfection cannot be proven ex-
         | ante, it can only be assured.
         | 
         | You should also keep in mind that real systems have fault modes
         | aside from software bugs and hardware glitches, such as
         | unanticipated edge cases and user error, which may dominate
         | your actual failure statistics.
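To put the 99.99966% figure above in more concrete terms, here is a minimal sketch that converts it into failures per million uses. The variable names are illustrative only; the percentage is the one quoted in the comment.

```python
# Reliability figure quoted in the comment above (99.99966 %).
reliability = 0.9999966

# Expected failures per one million independent uses.
failures_per_million = (1 - reliability) * 1_000_000
print(round(failures_per_million, 2))  # 3.4
```

So the question being asked is roughly whether removing the last 3.4 failures per million is worth a 10x price.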
        
         | na85 wrote:
         | >But the first time you see airplane software malfunction, that
         | means you need to change the way the software is written and
         | released so that the whole class of issues will not ever happen
         | again.
         | 
         | This is pretty good intuition but often a systemic change is
         | not economically feasible. For avionics software at least, a
         | rewrite of the software would likely have to be recertified
         | from scratch before it would be allowed to fly.
         | 
         | We do, however, have several different quality assurance
         | programs in Aerospace that are supposed to address this sort of
         | thing.
         | 
         | Once you identify the root cause, the process found to be
         | deficient is supposed to have a Process Owner who is required
         | to create a preventive and corrective action plan to prevent a
         | recurrence, with more severe problems requiring more robust
         | action plans. Done right, the process owner is supposed to be
         | empowered to make the changes that need to be made.
         | 
         | These systems tend to be evolutions of ISO 9000 as pioneered by
         | Toyota (IIRC). They are highly bureaucratic and soul-sucking,
         | but they are also the least-shitty solution that's been tried.
        
         | LifeLiverTransp wrote:
         | Naive does not even begin to describe it. You do not save money
         | by writing software cheaply. You are borrowing it from the
         | future as tech debt & hidden bugs. All debts are owed and paid
         | for - one way or another.
         | 
         | Managers who claim to have overcome this - are paying one
         | credit card with two new.
        
         | jacquesm wrote:
         | You are about 100% right on the mark here. There is only one
         | slight problem: people don't want to pay for very high quality
         | software except in a very limited number of fields.
         | 
         | In a way every real software improvement (not fancy language
         | flavor 'x' of the year but entirely new ways of developing
         | software) has always had the main goal of writing software
         | with fewer bugs, faster.
         | 
         | That's the whole reason we have abstractions, compilers, syntax
         | checkers, static analyzers and so on. In spite of all those,
         | software still has bugs and budgets are still not sufficient to
         | write bug free software.
         | 
         | On another note: this problem is getting worse over time. As
         | tools improved codebases got larger and the number of users
         | multiplied at an astounding rate resulting in many more live
         | instances of bugs popping up. After all, software that contains
         | bugs but that is never run is harmless, only when you run buggy
         | software many times does the price of those bugs really add up.
         | 
         | Somewhere we took a wrong turn and we decided that more of the
         | same is a better way to compete than to have one of each that
         | is perfected and honed until the bugs have been (mostly...)
         | ironed out.
        
       | 908B64B197 wrote:
       | Turning off the feature doesn't sound so bad considering the
       | CRJ-200 first flew in 1991 and it took 26 years to identify the
       | bug, so I assume the feature is not used frequently at all.
        
       | thePunisher wrote:
       | I keep noticing that more and more aviation and space missions
       | fail because of software problems. It seems to me that the new
       | generation of engineers are generally less competent or companies
       | see software as an afterthought which can be outsourced to lower-
       | wage countries.
       | 
       | The Boeing MAX and Starliner come to mind, but the failed Moon
       | missions by Israel and India are also examples of this trend.
       | 
       | Cost cutting in software development is costing companies dearly.
       | Boeing may even go bankrupt because of this.
        
         | londons_explore wrote:
         | Most modern systems aim to move all 'hard' bits to software.
         | 
         | There's no surprise that's where most of the failures occur.
        
         | aidenn0 wrote:
         | Don't underestimate the fact that there is a higher volume of
         | software though.
         | 
         | Many things that used to be done by analog computers or
         | manually done by the pilot are now done in software.
         | 
         | In addition, the Boeing MAX was largely a system design issue;
         | the software was operating as-designed, and had it been
         | implemented in hardware, it would have likely failed in the
         | same manner.
        
         | Jtsummers wrote:
         | It's a hiring problem. They let go of the good and/or
         | experienced engineers in the 00s, then replaced them primarily
         | with EEs (as a computer scientist I was told I couldn't write
         | code, only test it, at my first job; I did not stay long) with
         | minimal programming experience. These were very compliant
         | people, happy to do 60 hours or more per week (work harder, not
         | smarter). They lacked the historical context of the systems
         | they were maintaining/developing, and the experience to
         | properly model the systems under development [0].
         | 
         | This hiring problem is compounded by the oversight problem. The
         | program managers are similarly inexperienced. Or they came from
         | strictly a testing side with no concept of what software
         | development itself entails (I've seen this a lot). So they
         | aren't _bad_ at managing requirements, they may actually be
         | really good at it, but they absolutely fail to understand that
         | software is a hard problem (especially when dozens of
         | subcontractors are involved) that extends beyond just the
         | technical problem to the communication and coordination
         | problem. That's assuming they're experienced; USAF program
         | managers for software (IME) are straight out of college history
         | majors. DoD programs are scary.
         | 
         | [0] Most avionics systems, in my experience, boil down to
         | rather straightforward state machines. Understood this way they
         | become much simpler to write and test. The hard part is hitting
         | your timing constraints, but that's easier to achieve with
         | correct-but-slow-and-maintainable code than with incorrect-but-
         | fast-and-unmaintainable code. Inexperienced developers won't
         | see this possibility, either by failing to spend time studying
         | the requirements or failing to understand how to implement
         | state machines at all.
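As an illustration of the "straightforward state machine" view in footnote [0], here is a minimal table-driven sketch. The states and events are hypothetical and are not taken from any real avionics system; the point is only that every legal transition is listed in one table and everything else is rejected.

```python
from enum import Enum, auto


class State(Enum):
    APPROACH = auto()
    CLIMB = auto()
    TURN = auto()
    HOLD = auto()


# (state, event) -> next state; any pair not listed here is rejected.
# Both the states and the event names are made up for illustration.
TRANSITIONS = {
    (State.APPROACH, "go_around"): State.CLIMB,
    (State.CLIMB, "altitude_reached"): State.TURN,
    (State.TURN, "heading_reached"): State.HOLD,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in {state.name}") from None


state = State.APPROACH
for event in ["go_around", "altitude_reached", "heading_reached"]:
    state = step(state, event)
print(state.name)  # HOLD
```

Because the whole behavior lives in the TRANSITIONS table, a test can enumerate every (state, event) pair, which is part of why this style is easier to test than ad hoc branching.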
        
         | cryptonector wrote:
         | (The MAX was not so much a software issue as an architecture
         | issue (starting with insufficient redundancy). So that's not a
         | good example of software causing problems for airliners.)
         | 
         | There are two reasons why you can expect software to be more
         | and more the cause of airliner safety issues:
         | 
         | - software is eating the world
         | 
         | - software is getting more complicated
         | 
         | The first is a long-term trend now. Look under the hood of any
         | automobile from before the 80s: no computer to be found. Look
         | under the hood of any automobile from the past 30 years:
         | computers abound. The reason for this is that many problems are
         | easier to address in software than in hardware. Of course, you
         | go from N hardware problems to some possibly smaller set of
         | possibly simpler hardware problems at the cost of gaining a set
         | of software problems -- but this trade-off usually pays off. In
         | some cases this trade-off enables functionality that would be
         | infeasible to create otherwise.
         | 
         | The second problem is also a long-term trend: CPUs, systems,
         | operating systems, and applications have all tended to get more
         | complex. In embedded systems the trend has been less strongly
         | towards ever-increasing complexity, but even in embedded
         | systems things have gotten more complex.
         | 
         | Whether the problem is less competence among today's
         | programmers is hard to establish here. First, we need much more
         | software, which means we need many more programmers, which
         | means the quality of programmers you get probably does
         | decrease, though then again, we do have more programmers
         | overall as more people (competent and otherwise) are attracted
         | to the industry. But more importantly, the increase in
         | complexity of today's systems could very well be enough to make
         | yesteryear's competent programmers incompetent today -- you
         | can't really compare software development 40 years ago to
         | software development today.
         | 
         | (I object to this idea that lower-wage programmers necessarily
         | can't be competent, though that isn't quite what you wrote.
         | It's true that a lax process for outsourcing can mean you get
         | less competent programmers, and it's probably true that higher
         | GDP/capita correlates with availability of competent
         | programmers. But it doesn't follow that there are no competent
         | lower-wage programmers in India, say.)
        
       | redis_mlc wrote:
       | I understand the bug, and it's one of the worst imaginable.
       | 
       | Doing an unauthorized departure turn can impact terrain. At
       | night, it may not even be noticed by the pilots.
       | 
       | Source: commercially-rated pilot.
        
         | speeder wrote:
         | It wasn't because of a software bug, but this is how "Mamonas
         | Assassinas" died: during a missed approach the pilot turned in
         | the wrong direction (but with the correct radius and all) and
         | crashed into a very tall hill near the airport.
        
           | pravda wrote:
           | Ah yes. That was a dark day. Well, night actually.
           | 
           | https://www.youtube.com/watch?v=PwXplj3ssNs
        
         | zomglings wrote:
         | Could you explain the bug here?
         | 
         | It wasn't clear to me what the exact problem was from reading
         | the article, just that it occurred under a very specific and
         | uncommon set of circumstances.
        
           | NikolaeVarius wrote:
           | > "This issue will occur in departures and missed approaches
           | where the shortest turn direction is different than the
           | required turn direction onto the next leg if the crew edits
           | the 'Climb to' altitude field."
           | 
           | > The FMS may change the planned database turn direction to
           | an incorrect turn direction when the altitude climb field is
           | edited."
        
             | zomglings wrote:
             | Thanks. The summary is actually very useful.
             | 
             | What I meant, though, was that I didn't understand what the
             | bug was - not how it manifested.
        
               | NikolaeVarius wrote:
               | The bug is that the flight computer could set an
               | incorrect turn direction for a go around abort
               | maneuver/take off procedure.
               | 
               | https://portal.rockwellcollins.com/documents/796122/0/OPSB+R...
        
               | mopsi wrote:
               | Airports have landing procedures, essentially a list of
               | waypoints and altitudes. A landing aircraft has to fly
               | that route down to the runway.
               | 
               | Each list has a Missed Approach Point, at which the list
               | branches into two: landing or abort.
               | 
               | If you are not ready to land at that point (too fast,
               | can't see runway, previous aircraft still on the runway,
               | etc), then you fly the abort part. Usually it tells you to
               | climb to a certain safe altitude and turn towards a
               | waiting area for another landing attempt.
               | 
               | These procedures can be flown manually, or activated for
               | autopilot to fly. A bug made the autopilot turn in
               | opposite direction to what's in the abort section.
               | 
               | Here's a landing procedure for Helena regional airport:
               | https://flightaware.com/resources/airport/HLN/IAP/ILS+OR+LOC...
               | 
               | The narrowing beam is the instrument landing system that
               | guides you towards the runway. If you reach 4850 feet
               | minimum altitude during approach, but can't see the
               | runway, you must fly the abort procedure, which is drawn
               | with dashed lines: climb to 4700 feet on current heading,
               | turn to heading 021 while climbing to 9000 feet, then
               | proceed north via 336 radial.
               | 
               | This bug could cause the aircraft to turn southwest (towards
               | mountains) instead of northeast (valley).
               | 
               | At Helena, the bug would not reveal itself because the
               | right turn is 114 degrees. If the procedure required a turn
               | of more than 180 degrees right, for example, 200 or so
               | degrees right towards SWEDD, the aircraft would make a
               | left (shortest) turn instead. Green _should_ be flown,
               | red _would_ be flown: https://i.imgur.com/ojShQa2.png
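The shortest-turn arithmetic behind the Helena example can be sketched as follows. This is only an illustration: the starting heading of 267 is an assumption chosen so that the right turn to heading 021 comes out at the 114 degrees mentioned above, and the target heading of 107 is a hypothetical stand-in for a procedure that needs a 200-degree right turn.

```python
def shortest_turn(current_heading: float, target_heading: float) -> tuple[str, float]:
    """Return the shorter turn as (direction, degrees); headings are in degrees."""
    right = (target_heading - current_heading) % 360  # degrees if turning right
    left = 360 - right                                # degrees if turning left
    if right <= left:
        return ("right", right)
    return ("left", left)


# Assumed starting heading of 267: the right turn to 021 is 114 degrees,
# so the shortest turn and the required turn agree.
print(shortest_turn(267, 21))   # ('right', 114)

# Hypothetical procedure that needs a 200-degree right turn (to heading 107):
# the shortest turn is 160 degrees to the left instead.
print(shortest_turn(267, 107))  # ('left', 160)
```

Any required turn of more than 180 degrees has a shorter alternative in the opposite direction, which is the only case where the two disagree.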
        
               | mjg59 wrote:
               | There are two components in a turn - the desired heading at
               | the end of the turn, and the direction you should turn to
               | get there. Setting the "Climb to" altitude appears to
               | have cleared the turn direction information. In the
               | absence of that, the computer will turn in whichever
               | direction results in a shorter turn to the desired
               | heading. This is usually what you want, but not always,
               | so I can understand it taking a while before anyone
               | noticed it.
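A toy model of the behavior described above, under loud assumptions: this is not the actual FMS code, and the Leg record, its field names and both edit functions are hypothetical. It only shows how an edit that rebuilds a record can silently drop an optional field, after which a shortest-turn fallback takes over.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Leg:
    target_heading: int
    turn_direction: Optional[str]  # "left", "right", or None if unspecified
    climb_to_ft: int


def flown_direction(current_heading: int, leg: Leg) -> str:
    """Use the procedure's direction if present, else fall back to the shortest turn."""
    if leg.turn_direction is not None:
        return leg.turn_direction
    right = (leg.target_heading - current_heading) % 360
    return "right" if right <= 180 else "left"


def edit_climb_to_buggy(leg: Leg, new_altitude_ft: int) -> Leg:
    """Models the described symptom: the edit rebuilds the leg and drops the direction."""
    return Leg(leg.target_heading, None, new_altitude_ft)


def edit_climb_to_fixed(leg: Leg, new_altitude_ft: int) -> Leg:
    """Changes only the altitude and keeps every other field."""
    return replace(leg, climb_to_ft=new_altitude_ft)


# Hypothetical leg: a 200-degree right turn from heading 267 to heading 107.
leg = Leg(target_heading=107, turn_direction="right", climb_to_ft=9000)
print(flown_direction(267, leg))                             # right
print(flown_direction(267, edit_climb_to_buggy(leg, 8000)))  # left
print(flown_direction(267, edit_climb_to_fixed(leg, 8000)))  # right
```

With a turn of less than 180 degrees the buggy and fixed versions give the same answer, which fits the point that the problem could go unnoticed for a long time.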
        
       | kohtatsu wrote:
       | I just flew out of this airport yesterday, only 5 passengers
       | onboard a Bombardier Q400.
       | 
       | Thankfully there are no nearby hills for this bug to kill anyone
       | there.
       | 
       | Unrelated, but how many carbon offsets do I buy?
        
         | ashtonkem wrote:
         | For a mostly empty flight? A lot.
        
         | markvdb wrote:
         | You'd have consumed about 0.45 l of kerosene per flight km
         | per passenger. Calculation based upon [0]. That means you'd
         | need to offset CO2 emissions of about 1.11 kg CO2 per flight km
         | [1] for yourself.
         | 
         | [0] https://www.flyradius.com/bombardier-q400/fuel-burn-consumpt...
         | 
         | [1] https://www.engineeringtoolbox.com/co2-emission-fuels-d_1085...
        
           | kohtatsu wrote:
           | Wow that's great, thank you for finding these and doing those
           | calculations. I'm glad to read it's one of the more fuel
           | efficient aircraft.
           | 
           | Flightaware puts the actual distance at 760km, so it'd be
           | ~844kg of CO2.
           | 
           | I found this great guide by the David Suzuki Foundation which
           | assesses different carbon offset vendors with a few different
           | measures:
           | 
           | https://davidsuzuki.org/wp-content/uploads/2019/10/purchasin...
           | 
           | Pages 42-49 go into the different criteria, Page 50 is the
           | table scoring them all.
           | 
           | I decided to go with https://carbonzero.ca, it was $19.89CAD
           | for 0.88t. Thank you for your help :)
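The offset arithmetic in this sub-thread can be checked with a few lines. All numbers are the ones quoted in the comments above; only the variable names are made up.

```python
KG_CO2_PER_KM = 1.11   # per passenger, from the estimate above
DISTANCE_KM = 760      # Flightaware distance quoted above

total_kg = KG_CO2_PER_KM * DISTANCE_KM
print(f"{total_kg:.0f} kg CO2")  # 844 kg CO2

PRICE_CAD = 19.89      # price paid for the offset
OFFSET_TONNES = 0.88   # amount purchased

print(f"{PRICE_CAD / OFFSET_TONNES:.2f} CAD per tonne")  # 22.60 CAD per tonne
print(OFFSET_TONNES * 1000 >= total_kg)                  # True: purchase covers the estimate
```

The 0.88 t purchase leaves a small margin over the roughly 844 kg estimate.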
        
       ___________________________________________________________________
       (page generated 2020-05-29 23:01 UTC)