[HN Gopher] Software bug made Bombardier planes turn the wrong way
___________________________________________________________________
Software bug made Bombardier planes turn the wrong way

Author : sohkamyung
Score : 54 points
Date : 2020-05-29 09:30 UTC (13 hours ago)

(HTM) web link (www.theregister.co.uk)
(TXT) w3m dump (www.theregister.co.uk)

| [deleted]
| trhway wrote:
| >Most bugs in airliners tend to be unforeseen memory overflows
|
| the 21st century, planet Earth.
| parkovski wrote:
| This reminds me of a meetup I attended last fall, they were
| talking about the Spectre/Meltdown issues. I asked the presenters
| if anything in chip manufacturing/verification processes had
| changed as a result of that and they seemed surprised.
|
| To me, when a software bug shows up in a critical system, that
| means you actually have a logistics bug. Airplane control
| software should not be allowed to have bugs. CPUs should not be
| allowed to have bugs. And OS's should not be allowed to crash
| (looking at you Microsoft).
|
| When one of these things happens, in my opinion the correct
| response is _not_ to just release fixes and workarounds and then
| say "we'll try really hard to not let it happen again." You do
| that, sure. But the first time you see airplane software
| malfunction, that means you need to change the way the software
| is written and released so that the whole class of issues will
| not ever happen again. You don't stop at a public apology, you
| don't fire the person that unintentionally wrote the bug. If you
| have to hire mathematicians to formally prove the critical paths
| of the software, you do that. If it costs 10x more to release
| bug-free software, oh well, you do that.
|
| All of these corporate people thinking they can save money by
| spending less on quality are extremely naive. You can do a
| financial analysis of this, but they're doing it wrong.
| Did you ever consider what the cost of a whole generation just not
| trusting air travel at all would be?
| ashtonkem wrote:
| On one hand, I understand your sentiment, on the other hand
| even with these bugs air travel is as safe as it's ever been.
| We've reached a point where fewer people die in air travel per
| year than at any other point in the history of air travel, and
| that's _before_ you account for the number of miles travelled.
| It's almost ridiculous how safe air travel is on average.
| martinald wrote:
| That was true until 737 MAX, which statistically must have
| been one of the most dangerous planes (or jets at least) in
| history. Very few miles and 2 complete hull loss incidents
| very close together. These bugs really do matter. You can
| have quite a lot of minor issues and get away with it, but
| when you hit a serious failure like the MAX had, even if it is
| only triggered in 1 in 10,000 flights, it ends up with an awful
| lot of casualties.
| CamperBob2 wrote:
| The MAX problems weren't so much software bugs as
| specification bugs. The software did exactly what it was
| told to do by criminally-negligent engineering and
| management personnel.
| phire wrote:
| MCAS was not a bug. The software behaved exactly as
| specified.
|
| The issue was the specification itself, which assumed
| pilots would reliably catch the uncommanded trim down,
| diagnose it and disable the whole electric trim subsystem
| within seconds of the problem behavior arising.
|
| That assumption turned out to be massively flawed.
| jacquesm wrote:
| Your comment implicitly - and probably unintentionally -
| appears to assign part of the blame to the pilots, which
| I think is a very bad thing to do in this particular
| case.
| shreyansj wrote:
| I think specification here refers to the type
| specification of the aircraft.
| It's not putting the burden on the pilots but rather on the
| lack of pilot training due to Boeing and airlines not wanting
| to bear the cost of training pilots to a new aircraft type.
| [deleted]
| heavenlyblue wrote:
| Then it means that they had to formally verify the
| specification itself.
|
| It's not that hard by the way. And they did that, but
| handwaved the critique - the typical approach of "my guts
| are probably more correct than maths".
| londons_explore wrote:
| _commercial_ air travel.
|
| Private planes and industrial planes still have an awful
| safety record.
|
| Most stats also exclude 'unrelated' deaths which happen
| during a flight (even though there is a good chance the
| changes in air pressure, stress, lack of medical care, and
| cramped conditions at least contributed to the death).
|
| Stats also often exclude terrorist or war shootdowns of
| commercial planes, which are starting to become significant.
| cryptonector wrote:
| This is not about saving money! You can't simply shut down
| manufacturing of Intel or other chips that have
| Spectre/Meltdown issues because that would leave us with
| essentially no usable CPUs for new computers!
|
| The Spectre/Meltdown issues are deep and architectural, not
| simple to fix. It's not just a batch of CPUs that's the
| problem, but _all_ of them.
|
| Besides, if a CPU ships with a bug that can be fixed via a
| microcode patch, then it would be a tremendous economic waste
| for all humanity to throw those CPUs out.
|
| Even when new CPUs come out that can be shown not to have
| Spectre/Meltdown issues, it will take a long time to replace
| the installed base of those that do because it's not a matter
| of a little bit of money, but a matter of a great deal of money
| and opportunity costs.
|
| So microcode patches and software mitigations are all there is.
| Absolutist attitudes don't help.
| nickff wrote:
| Are you willing to pay 10x more for the product with that
| supposed extra reliability (100% vs 99.99966%)? Before you
| answer, you must remember that perfection cannot be proven
| ex-ante, it can only be assured.
|
| You should also keep in mind that real systems have fault modes
| aside from software bugs and hardware glitches, such as
| unanticipated edge cases and user error, which may dominate
| your actual failure statistics.
| na85 wrote:
| >But the first time you see airplane software malfunction, that
| means you need to change the way the software is written and
| released so that the whole class of issues will not ever happen
| again.
|
| This is pretty good intuition but often a systemic change is
| not economically feasible. For avionics software at least, a
| rewrite of the software would likely have to be recertified
| from scratch before it would be allowed to fly.
|
| We do, however, have several different quality assurance
| programs in Aerospace that are supposed to address this sort of
| thing.
|
| Once you identify the root cause, the process found to be
| deficient is supposed to have a Process Owner who is required
| to create a preventive and corrective action plan to prevent a
| recurrence, with more severe problems requiring more robust
| action plans. Done right, the process owner is supposed to be
| empowered to make the changes that need to be made.
|
| These systems tend to be evolutions of ISO 9000 as pioneered by
| Toyota (IIRC). They are highly bureaucratic and soul-sucking,
| but they are also the least-shitty solution that's been tried.
| LifeLiverTransp wrote:
| Naive does not even begin to describe it. You do not save money
| by writing software cheap. You are borrowing it from the
| future as tech debt & hidden bugs. All debts are owed and paid
| for - one way or another.
|
| Managers who claim to have overcome this - are paying one
| credit card with two new.
| jacquesm wrote:
| You are about 100% right on the mark here. There is only one
| slight problem: people don't want to pay for very high quality
| software except in a very limited number of fields.
|
| In a way every real software improvement (not fancy language
| flavor 'x' of the year but entirely new ways of developing
| software) has always had the main goal of writing
| software with fewer bugs faster.
|
| That's the whole reason we have abstractions, compilers, syntax
| checkers, static analyzers and so on. In spite of all those,
| software still has bugs and budgets are still not sufficient to
| write bug free software.
|
| On another note: this problem is getting worse over time. As
| tools improved codebases got larger and the number of users
| multiplied at an astounding rate resulting in many more live
| instances of bugs popping up. After all, software that contains
| bugs but that is never run is harmless, only when you run buggy
| software many times does the price of those bugs really add up.
|
| Somewhere we took a wrong turn and we decided that more of the
| same is a better way to compete than to have one of each that
| is perfected and honed until the bugs have been (mostly...)
| ironed out.
| 908B64B197 wrote:
| Turning off the feature doesn't sound so bad considering the
| CRJ-200 first flew in 1991, it took 26 years to identify the bug
| so I assume it's not used frequently at all.
| thePunisher wrote:
| I keep noticing that more and more aviation and space missions
| fail because of software problems. It seems to me that the new
| generation of engineers are generally less competent or companies
| see software as an afterthought which can be outsourced to
| lower-wage countries.
|
| The Boeing MAX and Starliner come to mind, but the failed Moon
| missions by Israel and India are also examples of this trend.
|
| Cost cutting in software development is costing companies dearly.
| Boeing may even go bankrupt because of this.
| londons_explore wrote:
| Most modern systems aim to move all 'hard' bits to software.
|
| There's no surprise that's where most of the failures occur.
| aidenn0 wrote:
| Don't underestimate the fact that there is a higher volume of
| software though.
|
| Many things that used to be done by analog computers or
| manually done by the pilot are now done in software.
|
| In addition, the Boeing MAX was largely a system design issue;
| the software was operating as-designed, and had it been
| implemented in hardware, it would have likely failed in the
| same manner.
| Jtsummers wrote:
| It's a hiring problem. They let go of the good and/or
| experienced engineers in the 00s, then replaced them primarily
| with EEs (as a computer scientist I was told I couldn't write
| code, only test it, at my first job, I did not stay long) with
| minimal programming experience. These were very compliant
| people, happy to do 60 hours or more per week (work harder, not
| smarter). They lacked the historical context of the systems
| they were maintaining/developing, and the experience to
| properly model the systems under development [0].
|
| This hiring problem is compounded by the oversight problem. The
| program managers are similarly inexperienced. Or they came from
| strictly a testing side with no concept of what software
| development itself entails (I've seen this a lot). So they
| aren't _bad_ at managing requirements, they may actually be
| really good at it, but they absolutely fail to understand that
| software is a hard problem (especially when dozens of
| subcontractors are involved) that extends beyond just the
| technical problem, and to the communication and coordination
| problem. That's assuming they're experienced, USAF program
| managers for software (IME) are straight out of college history
| majors. DoD programs are scary.
|
| [0] Most avionics systems, in my experience, boil down to
| rather straightforward state machines.
| Understood this way they become much simpler to write and test.
| The hard part is hitting your timing constraints, but that's
| easier to achieve with correct-but-slow-and-maintainable code
| than with incorrect-but-fast-and-unmaintainable code.
| Inexperienced developers won't
| see this possibility, either by failing to spend time studying
| the requirements or failing to understand how to implement
| state machines at all.
| cryptonector wrote:
| (The MAX was not so much a software issue as an architecture
| issue (starting with insufficient redundancy). So that's not a
| good example of software causing problems for airliners.)
|
| There are two reasons why you can expect software to be more
| and more the cause of airliner safety issues:
|
| - software is eating the world
|
| - software is getting more complicated
|
| The first is a long-term trend now. Look under the hood of any
| automobile from before the 80s: no computer to be found. Look
| under the hood of any automobile from the past 30 years:
| computers abound. The reason for this is that many problems are
| easier to address in software than in hardware. Of course, you
| go from N hardware problems to some possibly smaller set of
| possibly simpler hardware problems at the cost of gaining a set
| of software problems -- but this trade-off usually pays off. In
| some cases this trade-off enables functionality that would be
| infeasible to create otherwise.
|
| The second problem is also a long-term trend: CPUs, systems,
| operating systems, and applications have all tended to get more
| complex. In embedded systems the trend has been less strongly
| towards ever-increasing complexity, but even in embedded
| systems things have gotten more complex.
|
| Whether the problem is less competence among today's
| programmers is hard to establish here.
| First, we need much more software, which means we need many more
| programmers, which means the quality of programmers you get
| probably does decrease, though then again, we do have more
| programmers overall as more people (competent and otherwise) are
| attracted to the industry. But more importantly, the increase in
| complexity of today's systems could very well be enough to make
| yesteryear's competent programmers incompetent today -- you
| can't really compare software development 40 years ago to
| software development today.
|
| (I object to this idea that lower-wage programmers necessarily
| can't be competent, though that isn't quite what you wrote.
| It's true that a lax process for outsourcing can mean you get
| less competent programmers, and it's probably true that higher
| GDP/capita correlates with availability of competent
| programmers. But it doesn't follow that there are no competent
| lower-wage programmers in India, say.)
| redis_mlc wrote:
| I understand the bug, and it's one of the worst imaginable.
|
| Doing an unauthorized departure turn can impact terrain. At
| night, it may not even be noticed by the pilots.
|
| Source: commercially-rated pilot.
| speeder wrote:
| It wasn't because of a software bug, but this is how "Mamonas
| Assassinas" died, basically during a missed approach the pilot
| turned the wrong direction (but with the correct radius and
| all) and crashed into a very tall hill near the airport.
| pravda wrote:
| Ah yes. That was a dark day. Well, night actually.
|
| https://www.youtube.com/watch?v=PwXplj3ssNs
| zomglings wrote:
| Could you explain the bug here?
|
| It wasn't clear to me what the exact problem was from reading
| the article, just that it occurred under a very specific and
| uncommon set of circumstances.
| NikolaeVarius wrote:
| > "This issue will occur in departures and missed approaches
| where the shortest turn direction is different than the
| required turn direction onto the next leg if the crew edits
| the 'Climb to' altitude field."
|
| > "The FMS may change the planned database turn direction to
| an incorrect turn direction when the altitude climb field is
| edited."
| zomglings wrote:
| Thanks. The summary is actually very useful.
|
| What I meant, though, was I didn't understand what the bug
| was - not how it manifested.
| NikolaeVarius wrote:
| The bug is that the flight computer could set an
| incorrect turn direction for a go around abort
| maneuver/take off procedure.
|
| https://portal.rockwellcollins.com/documents/796122/0/OPSB+R...
| mopsi wrote:
| Airports have landing procedures, essentially a list of
| waypoints and altitudes. A landing aircraft has to fly
| that route down to the runway.
|
| Each list has a Missed Approach Point, at which the list
| branches into two: landing or abort.
|
| If you are not ready to land at that point (too fast,
| can't see runway, previous aircraft still on the runway,
| etc), then you fly the abort part. Usually it tells you to
| climb to a certain safe altitude and turn towards a
| waiting area for another landing attempt.
|
| These procedures can be flown manually, or activated for
| autopilot to fly. A bug made the autopilot turn in
| opposite direction to what's in the abort section.
|
| Here's a landing procedure for Helena regional airport:
| https://flightaware.com/resources/airport/HLN/IAP/ILS+OR+LOC...
|
| The narrowing beam is the instrument landing system that
| guides you towards the runway. If you reach 4850 feet
| minimum altitude during approach, but can't see the
| runway, you must fly the abort procedure, which is drawn
| with dashed lines: climb to 4700 feet on current heading,
| turn to heading 021 while climbing to 9000 feet, then
| proceed north via 336 radial.
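
The condition quoted from the notice earlier in this thread ("the shortest turn direction is different than the required turn direction") can be illustrated with a small sketch. This is not the FMS logic, which the thread does not show; the function names, the "L"/"R" encoding, and the example headings are all hypothetical and chosen only for illustration.

```python
from typing import Optional


def shortest_turn(current_hdg: float, target_hdg: float) -> str:
    """Return "R" or "L" for the shorter way round to the target heading.

    Headings are in degrees. A turn of exactly 180 degrees is ambiguous;
    this sketch arbitrarily resolves it to the right.
    """
    clockwise = (target_hdg - current_hdg) % 360.0
    return "R" if clockwise <= 180.0 else "L"


def flown_turn(current_hdg: float, target_hdg: float,
               required: Optional[str]) -> str:
    """Direction actually flown (assumed behaviour, not the real FMS).

    If the procedure's required direction is available, honour it.
    If it is missing (None), fall back to the shortest turn.
    """
    if required is None:
        return shortest_turn(current_hdg, target_hdg)
    return required


def turn_is_at_risk(current_hdg: float, target_hdg: float,
                    required: str) -> bool:
    """True when falling back to the shortest turn would go the wrong way."""
    return shortest_turn(current_hdg, target_hdg) != required


# Hypothetical leg: flying heading 360, procedure demands a RIGHT turn
# to heading 200 (a 200 degree turn, so the shorter way round is left).
print(turn_is_at_risk(360.0, 200.0, "R"))   # the two directions disagree
print(flown_turn(360.0, 200.0, None))       # fallback picks the short way
```

The point of separating `flown_turn` from `turn_is_at_risk` is that only legs where the two directions disagree are exposed; on every other leg a lost direction would go unnoticed because the fallback happens to match.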
|
| This bug could cause the aircraft to turn southwest (towards
| mountains) instead of northeast (valley).
|
| At Helena, the bug would not reveal itself because the
| right turn is 114 degrees. If the procedure required a
| turn of more than 180 degrees right, for example, 200 or so
| degrees right towards SWEDD, the aircraft would make a
| left (shortest) turn instead. Green _should_ be flown,
| red _would_ be flown: https://i.imgur.com/ojShQa2.png
| mjg59 wrote:
| There are two components in a turn - the desired heading at
| the end of the turn, and the direction you should turn to
| get there. Setting the "Climb to" altitude appears to
| have cleared the turn direction information. In the
| absence of that, the computer will turn in whichever
| direction results in a shorter turn to the desired
| heading. This is usually what you want, but not always,
| so I can understand it taking a while before anyone
| noticed it.
| kohtatsu wrote:
| I just flew out of this airport yesterday, only 5 passengers
| onboard a Bombardier Q400.
|
| Thankfully there are no nearby hills for this bug to kill anyone
| there.
|
| Unrelated, but how many carbon offsets do I buy?
| ashtonkem wrote:
| For a mostly empty flight? A lot.
| markvdb wrote:
| You'd have consumed about 0.45 l of kerosene per flight km
| per passenger. Calculation based upon [0]. That means you'd
| need to offset CO2 emissions of about 1.11 kg CO2 per flight km
| [1] for yourself.
|
| [0] https://www.flyradius.com/bombardier-q400/fuel-burn-consumpt...
|
| [1] https://www.engineeringtoolbox.com/co2-emission-fuels-d_1085...
| kohtatsu wrote:
| Wow that's great, thank you for finding these and doing those
| calculations. I'm glad to read it's one of the more fuel
| efficient aircraft.
|
| Flightaware puts the actual distance at 760km, so it'd be
| ~844kg of CO2.
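
The estimate above is a single multiplication, and a few lines make it easy to re-run with other inputs. This sketch only reuses the two figures quoted in the thread (1.11 kg CO2 per flight km per passenger, 760 km); the function and constant names are made up for illustration, and the per-km figure itself is taken on trust from the comment that computed it.

```python
# Figures quoted in the thread; not independently verified here.
KG_CO2_PER_PASSENGER_KM = 1.11
DISTANCE_KM = 760.0


def flight_co2_kg(distance_km: float,
                  kg_per_km: float = KG_CO2_PER_PASSENGER_KM) -> float:
    """Per-passenger CO2 in kg for a flight of the given length."""
    return distance_km * kg_per_km


co2_kg = flight_co2_kg(DISTANCE_KM)
# Roughly 843.6 kg, which rounds to the ~844 kg quoted above.
print(f"{co2_kg:.1f} kg CO2, or {co2_kg / 1000:.2f} t")
```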
|
| I found this great guide by the David Suzuki Foundation which
| assesses different carbon offset vendors with a few different
| measures:
|
| https://davidsuzuki.org/wp-content/uploads/2019/10/purchasin...
|
| Pages 42-49 go into the different criteria, Page 50 is the
| table scoring them all.
|
| I decided to go with https://carbonzero.ca, it was $19.89CAD
| for 0.88t. Thank you for your help :)
___________________________________________________________________
(page generated 2020-05-29 23:01 UTC)