[HN Gopher] Why CVE-2022-3602 was not detected by fuzz testing ___________________________________________________________________ Why CVE-2022-3602 was not detected by fuzz testing Author : pjmlp Score : 171 points Date : 2022-11-21 15:48 UTC (7 hours ago) (HTM) web link (allsoftwaresucks.blogspot.com) (TXT) w3m dump (allsoftwaresucks.blogspot.com) | alkonaut wrote: | Adding more and better fuzzing instead of trying to fix the issue | (potentially malicious user input inside a C library) seems like | the wrong way to address the problem. Buffer overruns just | shouldn't be a concern of the developer or test suite but of the | compiler or language runtime. | artariel wrote: | Trusting the fuzzer and not examining its coverage seems to be | the main problem here. | | I fail to see what is problematic about giving the control over | the entire flow of the program to the developer. Quite the | contrary, I am more concerned about the paradigm shift towards | higher level system programming languages that hide more and | more control from the developer while putting more burden on | the perfection of the optimizer. | alkonaut wrote: | Absent a high performing systems language that still offers | some safety guarantees, the right call should be to use | whatever the second best is. It could be a higher level | language with runtime overhead, sandboxing, formal | verification etc. In some cases constraints won't allow this, | and obviously replacing even parts of infrastructure code is | never easy. Nor should the perfect be the enemy of the good - | adding better testing doesn't sound like a bad idea even for | a piece of code being sunset. What I'm objecting to is | the (apparent, or my perceived!) idea that "if only the | fuzzing was good enough, this code would be acceptable for | use forever." | DistractionRect wrote: | There are two problems. The CVE, and the fact that the current | fuzzing harness did not find it.
The CVE is getting fixed, | but obviously the fuzzer needs work too because it exists to | find these kinds of issues before they get used in the wild. | | It's being handled how it should be: this happened, so let's | handle it and work out how to better address future | problems. | sitkack wrote: | Open-loop fuzz testing catches only the most shallow of bugs. It | is like genetic optimization with no fitness function. | | Why are people still using parsers for untrusted input in C? That | is the real flaw here, not how the fuzzing was done. | not2b wrote: | But modern fuzzers aren't open-loop, they are coverage- | directed, adjusting their inputs to increase coverage. As the | article points out, this works best if leaf functions are | fuzzed; difficult-to-reach corners still might not be found. | [deleted] | halpmeh wrote: | Because there isn't a good way of distributing pre-compiled | cross-platform C libraries. So if you want to use a parsing | library written in Rust, for example, you'd need to add Rust to | your toolchain, which is a pain. | | One solution to this problem would be to write an LLVM backend | that outputs C. Maybe such a thing already exists. | ralphb wrote: | I'm confused and very far from an expert here. What is wrong | with parsers, and what is the alternative? | jcims wrote: | A specific class of parsers | | >parsers for untrusted input in C | thaumasiotes wrote: | That didn't answer anything. If you want to do anything | with your input, you have to run it through a parser. It | doesn't matter if it's untrusted or not. Your only options | are ignoring the input, echoing it somewhere, or parsing | it. | Diggsey wrote: | They're saying don't write such a parser in C. Use | something else (memory safe language, parser generator, | whatever). | lazide wrote: | And then do what with it? Throw it away? | | If it hands it to a C program, that C program needs to | parse (in some form!) those values!
| | How is a C program expected to ever do anything if it | can't safely handle input? | nicoburns wrote: | Right, but you do have the option of writing that parser | in a language other than C. And given how often severe | security issues are caused by such parsers written in C, | one probably ought to choose a different language, or at | least use C functions and string types that store a | length rather than relying on null termination. | thaumasiotes wrote: | >>>>> Why are people still using parsers for untrusted | input in C? | | No matter what the parser itself is written in, if you're | writing in C you'll be using the parser in C. | wongarsu wrote: | If you have the input in a buffer of known length in C, | hand it off to a (dynamic or static) library written in a | safe language, and get back trusted parsed output, then | there's much less attack surface in your C code. | lazide wrote: | The issue in many of these cases is there appears to be | no canonical safe way to know the length of the input in | C, and people apparently screw up keeping track of the | lengths of the buffers all the time. | harshreality wrote: | Even if application constraints mean you can't write a | parser in another language that's linkable to C, why | couldn't you use a parser generator that outputs C? | dllthomas wrote: | I agree that the original statement encourages that | interpretation, but I think it admits the interpretation | that the parser itself is in C and I think that is what | was intended. | nicoburns wrote: | 1. Well, don't write in C then if your program is security | critical or going to be exposed over a network. Sure, | there are some targets that require C, but that's not the | case for the vast majority of platforms running OpenSSL. | | 2. That's still less of a problem, as the C will then be | handling trusted data validated by the safe language.
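The length-carrying string types suggested above can be sketched in a few lines of C. This is an illustrative sketch only, not code from OpenSSL; the `slice` type and `slice_get` helper are hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>

/* A minimal length-carrying "slice" type, as an alternative to
 * NUL-terminated C strings. Names here are illustrative only. */
typedef struct {
    const unsigned char *data;
    size_t len;
} slice;

/* Bounds-checked read: fails cleanly instead of reading past the end. */
static bool slice_get(slice s, size_t i, unsigned char *out) {
    if (i >= s.len)
        return false;  /* out of range: caller must handle the error */
    *out = s.data[i];
    return true;
}
```

A parser built only on accessors like `slice_get` cannot walk off the end of the buffer, whatever the input contains; malformed input becomes a `false` return rather than an out-of-bounds read.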
| jstimpfle wrote: | If you make argument 2), could you explain how writing a | parser is more security critical than any other code that | has a (direct or indirect) interaction with the network? | At least recursive descent parsers are close to trivial. | I usually start by writing a "next_byte" function and | then "next_token". You'll have to look very hard to find | any pointer code there. It's close to impossible to get | this wrong and I don't see how the fact that it's a | parser would make it any more dangerous. | thaumasiotes wrote: | > It's close to impossible to get this wrong and I don't | see how the fact that it's a parser would make it any | more dangerous. | | I can answer that one. The parser is more dangerous | because a parser, essentially by definition, takes | untrusted input. | | Nothing the parser _does_ is any more dangerous than the | rest of the code; it's all about the parser's position | in the data flow. | nicoburns wrote: | Well, if you're dealing with a struct then the compiler | will provide type safety if, say, you try to access a field | that doesn't exist. You don't get the same safeguards | when dealing with raw bytes. Admittedly in C you can also | run into these hazards with arrays and strings, which is | why I suggest using non-standard array and string types | which actually store the length if you insist on using C. | lazide wrote: | Pretty much all input is untrusted unless it originated | (exclusively!) from something with more permissions that is | trustworthy. | | The kernel is written in C. | | So that pretty much means _all_ parsers written in C and | every other language should consider all input | untrustworthy, no? | Gigachad wrote: | Linux is probably the most carefully constructed C | codebase in existence and still falls into C pitfalls | semi-regularly. Every other project has no hope of safely | using C. It's looking more and more like Linux should be | carefully rewritten in Rust.
It's a monstrous task but I | can see it happening over the next decade. | er4hn wrote: | The problem here was that the coverage of the fuzz testing was | not being examined. | | Using parsers for untrusted input in C is a legacy of when this | was written. Requiring the parsing portion (or any version of | OpenSSL) to be rewritten in Rust or whatever new language is a | massive change given the length of time the OpenSSL project has | been around. | sitkack wrote: | "Massive" 338 line file. | | https://github.com/openssl/openssl/blob/openssl-3.0.6/crypto. | .. | duped wrote: | It's not that big of a change, in the grand scheme of things. | But it's also not the only thing you can do. The memory safe | subset of C++ is also an option. | | Shipping a CVE in critical infrastructure because of a | trivial memory safety bug is borderline negligence in 2022. | This is why people get upset over new code being written in | C. The cost of writing new portions of the software with | memory safety in mind dwarfs the cost of writing in C because | it's more convenient for the build tooling. | | The bigger question is why hasn't OpenSSL bitten the bullet | and adopted some memory safety guarantees in their tooling, | given the knowledge of the sources of these bugs and the | prevalent literature and tools for avoiding them! | coder543 wrote: | > The memory safe subset of C++ is also an option. | | This does not actually exist, as far as I'm aware. There | are certain things people propose doing in C++ that | eliminate a small number of issues, but I haven't seen | anyone clearly define and propose a subset of C++ that is | reasonably described as memory safe. Even if such a subset | existed, you would still need some way to statically | enforce that people _only_ use that. | | Even just writing the parsers in Lua should be a safer | choice than writing them in C, but I think now is as good of | a time as any to start writing critical code paths in Rust.
| If the Linux kernel is beginning to allow Rust for kernel | modules, then it is high time that OpenSSL looked more | seriously at Rust too. | | As others have pointed out, parser generators could be a | useful intermediate option for some of this. | duped wrote: | You can write memory safe C++ more easily than memory safe C. | Statically verifying that the C++ actually is memory safe, | given the numerous ways to write unsafe code in C++, is a | different goalpost. | | My point is that there isn't a compelling reason to write | new code in C for something where safety is critical. | pjmlp wrote: | It is called C++ Core Guidelines. | zasdffaa wrote: | Rewriting the parser portion of anything is not 'massive'. | Boring as anything, but not difficult and not very time- | consuming. | gizmo686 wrote: | Parser generators are one of the oldest ideas in computer | science. YACC was written in the 70s, and had to be ported to | C because its original implementation was written in B. | | The idea of not writing parsers directly was well established | by the time OpenSSL started in the late 90s. | stcredzero wrote: | Given that so many of these bugs are parsing bugs, maybe we | should have a new emphasis on compiler-compilers which | generate fast, provably correct code? (Fast, because most | of the bugs and exploits are accompanied by some form of | optimization.) | nyrikki wrote: | I think you may run into the halting problem here. | stcredzero wrote: | You don't have to solve the halting problem every time it | presents itself. Otherwise, things like Valgrind and | fuzzing wouldn't be valid at all. You just have to | improve your odds. | | EDIT: An important note to newbs: The Halting Problem is | correct. However, a problem which maps to the halting | problem can still be solved often enough in practice to | make it worthwhile. In fact, entire industries have been | born of heuristic solutions to such problems.
| gizmo686 wrote: | Fast and provably correct are more or less solved | problems (at least for a large class of languages). | | The main drawback is that it is difficult to get good | error messages when a parse fails. | hardware2win wrote: | Industry prefers hand-written parsers for a reason. | | Parser generators feel like an academic dream | spockz wrote: | Why is this? I come from academia and I have yet to | encounter a good argument for not using parser | combinators in new applications. Can you please point to | some reason? | mkeedlinger wrote: | What alternatives are there to parsers? Genuine question from | the ignorant. | chowells wrote: | There are no alternatives to parsers. There are alternatives | to "in C". | kazinator wrote: | You can easily write a robust parser in C. Just don't write a | clump of code that interleaves pointer manipulation for | scanning the input, writing the output and doing the parsing | _per se_. | | * Have a stream-like abstraction for getting or peeking at the | next symbol (and pushing back, if necessary). Make it | impervious to abuse; under no circumstances will it access | memory beyond the end of a string or whatever. | | * Have some safe primitives for producing whatever output the | parser produces. | | * Work only with the primitives, and check all the cases of | their return values. | draw_down wrote: | docandrew wrote: | As powerful as fuzzing is, this is a good reminder why it's not a | substitute for formal verification in high-integrity or critical | systems. | er4hn wrote: | I would argue the issue was not checking the coverage of new | code vs what was being tested. | naasking wrote: | The OP is pointing out that what "the issue" is depends on | whether you want high confidence that your code has few bugs, | or you want certainty that your code contains no bugs. | fulafel wrote: | .. or more pragmatically, safer languages where errors aren't | exploitable to get remote code execution.
| | (I guess that semantics can also be seen as a formally verified | property) | ludovicianul wrote: | Safer languages cannot protect from bad design. Many | libraries have implicit behaviour which is not always | visible. It's a hard tradeoff to make. You want safety, but | at the same time enough customisation and features. I worked | recently with an HTTP client library which forbade sending | special characters in headers. I understand that this is | a safety feature, but I really wanted to send weird | characters (building a fuzzing tool). | manbash wrote: | OK but we can mitigate these types of exploits (buffer | overflow etc.) using memory-safe languages. | | Bad design is a universal orthogonal problem. | Gigachad wrote: | Seatbelts cannot save you from bad driving, but they | certainly help mitigate the effects. | marginalia_nu wrote: | Given log4shell happened in one of the more aggressively | sandboxed languages with mainstream adoption, the outlook | isn't great. | yakubin wrote: | I'd classify runtime reflection as an unsafe language | feature, to be honest. | chowells wrote: | Sandboxing is mostly irrelevant to the log4j error. You'd | have to tell the sandbox to turn off reflection, which | isn't really feasible in Java. And that's because Java is | so poorly designed that big libraries are all designed to | use reflection to present an API they consider usable. | | Compare that to a language designed well enough that | reflection isn't necessary for good APIs, for instance. | marginalia_nu wrote: | Dunno if I agree that libraries need reflection. Some do, | but primarily in the dependency injection and testing | space. | | That's not really where you'd expect RCE-problems. | chowells wrote: | Yeah, I should say where developers don't _think_ they | need to use reflection. | | Like, the log4j thing came from (among other design | errors) choosing to use reflection to look up filters for | processing data during logging.
Why would log4j's | developers possibly think reflection is an appropriate | tool for making filters available? Because it's the easy | option in Java. Because it's the easy option, people are | already comfortable with it in other libraries. Because | it's easy and comfortable, it's what gets done. | | Some languages make reflection much more difficult (or | nearly impossible) and other APIs much easier. It's _far_ | more difficult to make that class of error in languages | like that. | strbean wrote: | > big libraries are all designed to use reflection to | present an API they consider usable. | | _whistles in python_ | fulafel wrote: | Code executing in the JVM isn't sandboxed. Sandboxing could | have indeed mitigated log4shell. Log4shell was a design | where a too powerful embedded DSL was exposed to untrusted | data in a daft way - the log("format here...", arg1, arg2) | call would interpret DSL expressions in the args carrying | logged data. One can even imagine it passing formal | verification depending on the specification. | | But more broadly the thing is that eliminating these low | level language footguns would allow people to focus on the | logic and design errors. | pjmlp wrote: | That belongs to the 30% of exploits that we are left with, | after removing the 70% that come from C. | marginalia_nu wrote: | I think you are correct, but I do not think the average | severity of these exploits is necessarily the same. | pjmlp wrote: | The US agency for cyber security thinks otherwise. | cryptonector wrote: | Upvote for log4shell. That's pretty funny. | | Yes, a safer language is not enough, but it is a huge leap | forward, so I'll take it. | jeffbee wrote: | tl;dr: because ossl_a2ulabel had no unit tests until a few days | ago, the fuzzer could not have reached it through any combination | of other tests. | | That fuzzing is tricky was not the problem here. The problem is | the culture that allowed ossl_a2ulabel to exist without unit | tests.
And before some weird nerd jumps in to say that openssl is | so old we can't apply modern standards of project health, please | note that the vulnerable function was committed from scratch in | August 2020. Without unit tests. | MuffinFlavored wrote: | > because ossl_a2ulabel had no unit tests until a few days ago | | it's not realistic to enforce unit test coverage % with a | project at the scale of OpenSSL, right? | pca006132 wrote: | It should be realistic, line coverage isn't really that hard. | The hard thing is that high line coverage alone is usually | not enough for numerical stuff... | jeffbee wrote: | It is trivial to enforce that new functions have new unit | tests and fuzz tests. You are the reviewer of | https://github.com/openssl/openssl/pull/9654 and you just say | "Please add unit tests and fuzz tests for foo and bar" and | you don't approve it. | | I don't know what the deal is with their testing culture but | in year 27 of the project they demonstrably haven't learned | this lesson. It's nice that they added integration tests | (testing given encoded certs) but as the article points out | that was insufficient. | bluGill wrote: | Last week I got a "please add unit tests for this code" | comment. The person who wrote that comment wasn't aware that this | was a refactoring where the functionality was well tested. | | There is no substitute for reviewers who really understand | the code in question. The problem is they are the ones | writing the code and so are biased and not able to give a | good review. | dralley wrote: | IMO, one of the biggest benefits of "modern" systems | languages like Rust, D, Zig is how much easier they make | writing and running tests compared to C and C++. Yes, you | can write tests for those languages, but it's nowhere near | as trivial. And that makes a difference. | pjmlp wrote: | I was writing unit tests for C in 1996; naturally, the term | hadn't been coined yet, so we just called them | automatic tests.
| | It was part of our data structures and algorithms | project; failure to execute the automatic tests meant no | admission to the final exam. | | We had three sets of tests: those provided initially at | the beginning of the term, those that we were expected to | write ourselves, and a surprise set on the integration | week at the end of the semester. | fisf wrote: | I am not buying it. | | Writing unit tests for C/C++ is trivial. There are | perfectly fine test frameworks, used by developers every | day, integrated in any major IDE or runnable as one-liners | from the command line. | | This is absolutely a cultural problem. | masklinn wrote: | > it's not realistic to enforce unit test coverage % with a | project at the scale of OpenSSL, right? | | Why not? | | You can enforce that all new files should be covered (at the | very least line-covered). It requires some setup effort | (collecting code coverage and either sending it to a tool | which performs the correlation or correlating yourself), but | once that's done... it does its thing. | | Then you can work on increasing coverage for existing files, | and ratcheting requirements. | stefan_ wrote: | No one is asking for that. But this is code that did one | thing: punycode decoding, with millions of well-documented | test vectors. The code had zero dependencies on anything | OpenSSL related. It is a very simple "text in, text out" | problem, the most trivial thing to write unit tests for. At | the same time, it's code that _parses_ externally provided | buffers and has to deal with things like unicode in C - | _there should be a massive red flashing warning light in | every developer's head here_. | duped wrote: | It's realistic to reject a CR for a new parsing function | without proof that it works, which usually comes in the form | of a unit test | planede wrote: | Nit: a unit test never proves that it works. At best it can | prove that it doesn't work. | | Otherwise, I agree. | hardware2win wrote: | Nit: "never"?
Even if you unit test all cases? | bluGill wrote: | How do you know you covered all cases? You can verify you | cover all cases the code handles easily enough. However, | does the code actually handle all the cases that could | exist? A formal proof can bring to your attention a case | that you didn't handle at all. | | Formal proofs also have limits. Donald Knuth once | famously wrote "beware of bugs in the above code; I | proved it correct but never ran it". Which is why I think | we should write tests for code as well as formally prove | it. (On the latter, I've never figured out how to prove my | code - writing C++ I'm not sure if it is possible, but I'd | like to.) | planede wrote: | Touché | tz18 wrote: | Yes, never. You are assuming that the implementations of | all unit test cases are themselves correct (that they | would fail if there was any error in the case they | cover). In fact unit tests are often wrong. In that | context a unit test can't even prove code incorrect, | unless we know that the unit test is correct. | | IMO to prove that code is correct requires a proof; a | unit test can only provide evidence suggestive of | correctness. | planede wrote: | Mistakes in proofs are just as probable as mistakes in | exhaustive tests. | | An exhaustive test is just one type of a machine-verified | proof. | nicoburns wrote: | > An exhaustive test is just one type of a machine- | verified proof. | | Not entirely sure I agree with this. A proof by | construction is a very different beast to empirical unit | tests that only cover a subset of inputs. The equivalent | would be unit tests that cover every single possible | input. | asveikau wrote: | It's worth noting that coverage can be a deceptive metric | sometimes. | | You can have coverage on code that divides - it won't tell | you if you ever divide by zero. | | You can have coverage on code that follows a pointer - it | won't tell you if you ever pass a bad pointer.
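The point about coverage being deceptive can be made concrete in a couple of lines. This is a toy illustration with made-up function names: a single test gives the first function 100% line coverage, yet says nothing about the divide-by-zero lurking on other inputs.

```c
/* One call such as scale(10, 2) executes every line of this function,
 * so line coverage reports 100%. Coverage never asked what happens
 * when parts == 0, which is undefined behaviour in C. */
int scale(int total, int parts) {
    return total / parts;
}

/* The checked variant makes the missing case explicit:
 * returns 0 on success, -1 on a zero divisor. */
int scale_checked(int total, int parts, int *out) {
    if (parts == 0)
        return -1;  /* the case that coverage never forced us to write */
    *out = total / parts;
    return 0;
}
```

This is why coverage is best read as a necessary condition (uncovered code is definitely untested) rather than a sufficient one: it counts lines executed, not input values exercised.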
| adql wrote: | yeah but there isn't even a try here | asveikau wrote: | Not saying it isn't worth an attempt, just that the real | meaning shouldn't be lost. | adql wrote: | Seems OpenSSL as an organization is just irreparably broken if | they _still_ haven't learned the lesson | Someone1234 wrote: | OpenSSL still receives minimal funding. Until Heartbleed they | had nearly no funding, and now it is two full time people. | | https://en.wikipedia.org/wiki/Core_Infrastructure_Initiative. | .. | pizza234 wrote: | I'm of the opinion that in cases like this, it'd be better | for the organization to close, and allow the gap to be | filled naturally. | | If the current OpenSSL maintainers closed the project, | given its importance, there would be a rush to follow up | maintenance. Chances are, it'd be better funded; even in the | worst case, it'll hardly be assigned less than two devs. | | This is a case of the general dynamic where a barely- | sufficient-but-arguably-insufficient solution prevents | actors from finding and executing a proper one. | SoftTalker wrote: | For a project like OpenSSL, it's not just having "enough" | developers (whatever that is), it's having _qualified_ | developers. Writing good crypto code requires deep | expertise. There aren't a lot of people with such | expertise whose time is not already fully committed. | AdamJacobMuller wrote: | You don't even have to close. You can just refuse to | merge code which does not include 100% test coverage. If | someone wants the feature badly enough, they will figure | out a way to fill the gap. Alternatively, someone can | always fork the code and release an "OpenSSL-but-with-lots- | of-untested-code" variant. | coder543 wrote: | I think at this point we've established that it's C which is | just irreparably broken.
| | Blaming the OpenSSL developers for writing bad C is just a | "no true Scotsman" at this point, since there is no large, | popular C codebase in existence that I'm aware of that avoids | running into vulnerabilities like this; vulnerabilities that | just about every other language (mainly excluding C++) would | have prevented from becoming an RCE, and likely prevented | from even being a DoS. Memory safe languages obviously can't | prevent _all_ vulnerabilities, since the developer can still | intentionally or unintentionally write code that simply does | the wrong thing, but memory safe languages can prevent a lot | of dumb vulnerabilities, including this one. | | No feasible amount of funding would have prevented this, | since it continues to happen to much better funded projects | also written in C. | | On the other hand, I guess we _could_ blame the OpenSSL | developers for writing C at all, being unwilling to start | writing new code in a memory safe language of some kind, and | ideally rewriting particularly risky code paths like parsers | as well. We've learned this lesson the hard way a thousand | times. C isn't going away any time soon (unfortunately), but | that doesn't mean we have to _continue_ writing new | vulnerabilities like this one, which was written in the last | two years. | tredre3 wrote: | > Blaming the OpenSSL developers for writing bad C is just | a "no true Scotsman" at this point, since there is no | large, popular C codebase in existence that I'm aware of | that avoids running into vulnerabilities like this; | vulnerabilities that just about every other language | (mainly excluding C++) would have prevented from becoming | an RCE | | No, this whole thing is about the lack of testing. Adding a | parser without matching tests is just absurd regardless of | the language it's implemented with. If only for a basic | correctness check, you want a test.
| | Not all vulnerabilities or bugs are memory-related; | vulnerabilities are bound to surface in any language with | that kind of organizational culture. | jabart wrote: | Keep in mind that Ubuntu compiled OpenSSL using a gcc flag | that turns this one-byte overflow into a crash instead of a | memory leak/corruption because it has a way to do that | already. It's very risky, and a very long term project, to | rewrite something with this level of history into a | completely new language. | coder543 wrote: | > It's very risky, and a very long term project to | rewrite something with this level of history into a | completely new language. | | I _didn't suggest_ a complete rewrite of the project. | However, they _could_ choose to only write new code in | something else, and they could rewrite certain critical | paths too. The bulk of the code would continue to be a | liability written in C. | | I agree that it would be nearly impossible to rewrite | OpenSSL as-is. It would take huge amounts of funding and | time. In general, people with that much funding are | probably better off starting from scratch and focusing on | only the most commonly used functionality, as well as | designing the public interface to be more ergonomic / | harder to misuse. | cryptonector wrote: | The OpenSSL team is actually very good and they do very good | work, and they even have funding. The problem is that legacy | never goes away, and OpenSSL is a huge pile of legacy code, | and it will take a long long time to a) fix all the issues | (e.g., code coverage), b) migrate OpenSSL to not-C or the | industry to not-OpenSSL. | KingLancelot wrote: | OpenSSL is known to be broken. | | All this bickering over language misses the real problem. | | The actual problem is that open source, widely used code is a | target for hackers. | | By using one library used everywhere for everything, you're | painting a target on your own back.
| | The real solution is we need the software ecosystem to have more | competition and decentralization. | | Use alternative crypto libraries. | | If you want a drop-in replacement, use LibreSSL, which was forked | and cleaned up by the OpenBSD guys due to Heartbleed. | | But the long-term solution is more competition, by using smaller, | more specialized libraries, or even writing your own. | nicoburns wrote: | > The actual problem is that open source, widely used code is | a target for hackers. | | The long term solution is likely using languages which are A. | Memory safe, and B. make formal verification viable. Being | widely used and open source isn't an issue if there are no | exploitable bugs in the code. | KingLancelot wrote: | I see a lot of people pushing memory safe languages, far more | people than actually write systems code. | | What is your primary language? | pjmlp wrote: | Many of those people have experience writing systems | programming code from before UNIX and C got widespread outside | Bell Labs. | | Mac OS was originally written in Object Pascal + Assembly, | just to cite one example from several. | KingLancelot wrote: | Nice deflection, but you still didn't answer the | question. | | Have you ever written in a memory safe language like | you're telling others to do? | pjmlp wrote: | Yes, my dear, plenty of times since 1986. | | Turbo Pascal, Turbo Basic, Modula-2, Ada, Oberon, ... | hn_acc_2 wrote: | There are a limited number of people willing to spend a limited | amount of time fuzzing, reviewing, and scrutinizing crypto | libraries. The more libraries exist, the more their efforts are | divided, and the total scrutiny each library receives | decreases. How would this help the problem? | sramsay wrote: | I don't get it. Why doesn't everyone just use the battle- | hardened, fully-compliant Rust implementation of OpenSSL?
| yumjum wrote: | Loads of bugs aren't detected by fuzz testing, as this technique | exhibits stochastic behaviour, where you'll most likely find bugs | overall, but have varying chances (including none at all) of | uncovering specific bugs. | | Which is great news for those of us who approach such research by | gaining a deep understanding of the code and the systems it | exists in, and figuring out vulnerabilities from that | perspective. An overreliance on fuzzing keeps us employed. | Diggsey wrote: | Fuzz testing has a very high chance of detecting bugs, | especially these kind, but you do need to at least check that | the fuzzer is reaching the relevant code! | fulafel wrote: | This is reasoning backwards in a misleading way. The point is | not changing the fuzzing setup to find this specific bug that | we now know with hindsight was there. There are a zillion | paths and you would need to be ensuring that fuzzing reaches | all vulnerable code with values that trigger all vulnerable | dynamic behaviours. | m463 wrote: | isn't that what code coverage does? | Diggsey wrote: | It's not backwards: you run the fuzzer, you look at the | code coverage, and you compare that against what you expect | to be tested. Then you update the fuzzing harness to allow | it to find missing code paths. | | It's far more doable than you are suggesting: fuzzing | automatically covers most branches anyway, so you just need | to manually deal with the exceptions (which are easy to | locate from the code coverage). | | I used fuzzing to test an implementation of Raft, and with | only a little help, the fuzzer was able to execute every | major code path, including dynamic cluster membership | changes, network failure and delays. The Raft safety | invariants are checked after each step. Does this guarantee | that there are no bugs? Of course not. It did however find | some very difficult to reproduce issues that would never | have been caught during manual testing. 
And this is with a
| project not even particularly well suited to fuzzing! A
| parser is the dream scenario for a fuzzer; you just have to
| actually run it...
| fulafel wrote:
| Yep, code coverage can tell you that code is definitely
| entirely untested, but it doesn't tell you that you are
| covering the input space well enough to have high
| assurance that there aren't vulnerabilities.
|
| Coverage might have helped here (or not), but it doesn't
| fix the general problem of fuzzing being stochastic and
| only testing some behaviours of the covered code.
| PhearTheCeal wrote:
| I wonder if one possible solution is making things more
| "the Unix way", or like microservices. Then instead of
| depending on some super specific input to reach deep into
| some code branch, you can just send input directly to that
| piece and fuzz it. Even if fuzzers only catch shallow bugs,
| if everything is spread out enough then each part will be
| simple and shallow.
| fulafel wrote:
| This is the flip side of fuzzing, the approach called
| property testing. It's legit, but it involves
| unit-test-style manual creation of lots of tests for
| various components of the system, a lot of specification
| of the contracts between components, and aligning the
| property testing to those contracts.
| eklitzke wrote:
| Fuzzers can already do this. When you set up a fuzzer, you
| specify which functions it's going to call and how it
| should generate inputs to those functions. So you can fuzz
| the X.509 parsing code and hope it hits the punycode
| parsing paths, but you can also fuzz the punycode parsing
| routines directly.
| skybrian wrote:
| Fuzz tests can take a seed corpus of test vectors. If the test
| framework tries them first, it can guarantee that it will find
| _those_ bugs in any test run. For anything beyond that, it
| depends on chance.
| spockz wrote:
| Or, seen the other way around: by applying fuzzing to find the
| "silly" type of bugs, you can spend your artistic efforts on
| finding the other bugs.
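Fuzzing a leaf routine directly, as eklitzke describes, is mostly a matter of writing a tiny harness for it. Here is a minimal sketch using the libFuzzer entry-point convention; `toy_decode` is a hypothetical stand-in with roughly the shape of a punycode decoder, not OpenSSL's actual code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical leaf routine under test -- NOT OpenSSL's actual
 * punycode decoder, just a stand-in with the same shape: it writes
 * decoded output into a fixed-size caller-supplied buffer. */
static int toy_decode(const uint8_t *in, size_t in_len,
                      char *out, size_t out_cap)
{
    size_t written = 0;
    for (size_t i = 0; i < in_len; i++) {
        /* The bounds check the fuzzer should exercise: refuse to
         * write past the end of the output buffer. */
        if (written + 1 >= out_cap)
            return -1;
        out[written++] = (char)(in[i] ^ 0x20);
    }
    out[written] = '\0';
    return 0;
}

/* libFuzzer entry point: arbitrary bytes go straight into the leaf
 * routine, so coverage no longer depends on first constructing a
 * valid X.509 certificate that happens to reach this code path. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char out[32];
    (void)toy_decode(data, size, out, sizeof out);
    return 0;
}
```

Built with something like `clang -g -fsanitize=fuzzer,address harness.c`, this gives the fuzzer a direct line to the routine, and ASan turns any out-of-bounds write into an immediate crash rather than silent corruption.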
| ludovicianul wrote:
| I think this is the main reason fuzzing exists: leave the boring
| part to the tool and focus on the more creative work.
| w_for_wumbo wrote:
| I'm not familiar enough with C to know the answer, but I'm trying
| to think how anything goes from untrusted input -> trusted input
| safely. To sanitize the data, you're putting the input into
| memory to perform logic on it; isn't that itself an attack
| vector? I would think that any language would need to do this.
|
| Is anyone able to explain this to me?
| oconnor663 wrote:
| There are a lot of different issues that can come up, but in
| practice ~80% of those (my made-up number) are out-of-bounds
| issues. So for example, say you're parsing a JSON string
| literal. What happens if the close-quote is missing from the
| end of the string? You might have a loop that iterates forward
| looking for the close-quote until it reaches the end of the
| input. What that code _should_ do is then return an error like
| "unclosed string". If you write that check, your code will be
| fine in any language. What if you forget that check? In most
| languages you'll get an exception like "tried to read element
| X+1 in an array of length X". That's not a great error message,
| but it's invalid JSON anyway, so maybe we don't care super
| much. However, in C, array accesses aren't bounds-checked, so
| your loop plows forward into random memory, and you get a CVE
| roughly like this one.
|
| In short, the issue is that you forgot a check, and your code
| effectively "trusted" that the input would close all its
| strings. If you never make mistakes like that, you can validate
| input in C just like in any other language. But the
| consequences of making that mistake in C are really nasty.
| zwkrt wrote:
| Just because something is in memory doesn't mean that it is
| realistically executable. That's why you can download a virus
| to look at its code without it installing itself.
|
| You aren't wrong that even downloading untrusted data is less
| secure than not downloading it. But to actually exploit a
| machine that is actively sanitizing unsafe data, you need
| either (A) an attack vector for executing code at an arbitrary
| location in memory, or (B) a known OOB bug in the code that you
| can exploit to read your malicious data, by ensuring your data
| sits right after the data affected by the OOB bug.
| bluGill wrote:
| > To sanitize the data, you're putting the input into memory
| to perform logic on it
|
| Sure, but memory isn't normally executed.
|
| One of the more common problems was not checking length. Many C
| functions assume sanitized data, so they don't check. There are
| functions for reading data that don't check length (gets is the
| most famous, but there are others) - thus if someone supplies
| more data than you have room for, the rest of the data will
| just keep going off the end - and it turns out that in many
| cases you can predict where "off the end" is, and then craft
| that data to be something the computer will run.
|
| One common variation: C assumes that many strings end with a
| null character. There are a number of ways to get a string to
| not end with that null, and if the user can force that, those
| functions will read/write past the end of the data, which is
| sometimes something you can exploit.
|
| So long as your C code carefully checks the length of
| everything, you are fine. One common variation of this is
| checking length but miscounting by one character. It is very
| hard to get this right every single time, and if you mess it up
| just once you are open to something unknown in the future.
|
| (Note: there are also memory issues with malloc that I didn't
| cover, but that is something else C makes hard to get right.)
| planede wrote:
| Would it be reasonable to have fuzz testing around reasonably
| sized units, in addition to e2e?
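The forgotten-check mistake described in the comments above (oconnor663's unclosed-string example, and bluGill's point about length checks) can be sketched in a few lines of C. This is a hypothetical scanner for illustration, not code from any real parser; assume the caller has already consumed the opening quote:

```c
#include <stddef.h>

/* Hypothetical scanner for the body of a JSON-style string literal:
 * returns the index of the closing quote, or -1 if the input ends
 * first.  The `i < len` test is the check whose omission turns
 * malformed input (no close quote) into an out-of-bounds read in C. */
static long find_close_quote(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {  /* bounds check: stop at end of input */
        if (buf[i] == '"')
            return (long)i;
        if (buf[i] == '\\')             /* skip the escaped character */
            i++;
    }
    return -1;  /* unclosed string: report an error instead of reading on */
}
```

Remove the `i < len` test and the loop walks off the end of the buffer on malformed input; in a bounds-checked language the same omission raises an exception rather than reading arbitrary memory.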
| dllthomas wrote:
| Yes, but see also "property testing" like QuickCheck and
| Hypothesis; the line is blurry.
| bawolff wrote:
| > I think we should give the developers the benefit of the doubt
| and assume they were acting in good faith, and try to see what
| could be improved.
|
| I feel like there is this trend of assuming any harsh criticism
| is bad faith. Asking why industry-standard $SECURITY_CONTROL
| didn't work, immediately after an issue happened that should have
| been caught by $SECURITY_CONTROL, is hardly a bad-faith question.
| aidenn0 wrote:
| Questions themselves are not good-faith or bad-faith. People
| asking questions are doing so in either good faith or bad
| faith.
|
| Someone pushing hard on legitimate criticisms with the intent
| of attacking a project or members thereof is acting in bad
| faith, while someone ignorant with a totally bogus criticism
| could be acting in good faith. Many bad-faith actors hide
| behind a veneer of legitimacy by disguising their motivations
| or shifting the gaze away from them.
| bawolff wrote:
| Umm, I disagree.
|
| Bad/good faith is about whether you are being misleading or
| dishonest in asking the question.
|
| You can intend to attack a project in good faith, as long
| as you are not being misleading about your intentions.
|
| For example, a movie critic who pans a film is not acting in
| bad faith, since they aren't being misleading about their
| intentions.
| aidenn0 wrote:
| A movie critic who pans a film they think sucked is acting
| in good faith; a movie critic who pans a film _specifically
| with the intent of attacking the film_ (whether as
| clickbait or because they don't like someone involved with
| the film, or whatever) is acting in bad faith.
|
| We might actually be in agreement with each other, because a
| critic who leads with "The director slept with my wife, so
| I'm only going to say all the bad things about the film and
| you should probably ignore this review" would have
| significantly blunted their attack by leading with it, and
| is arguably not acting in bad faith.
| germandiago wrote:
| I really find OpenSSL's function-call interfaces infuriating,
| all the more so considering that this is a security library.
|
| I think interfaces in Botan, to give an example, are way easier
| to use.
|
| The OpenSSL API looks to me like a minefield.
___________________________________________________________________
(page generated 2022-11-21 23:00 UTC)