[HN Gopher] Go Fuzzing
___________________________________________________________________
Go Fuzzing

Author : 0xedb
Score  : 149 points
Date   : 2022-01-01 18:27 UTC (4 hours ago)

(HTM) web link (tip.golang.org)
(TXT) w3m dump (tip.golang.org)

| staticassertion wrote:
| Great to see fuzzing becoming more mainstream. Ultimately we have absurd program states, with even a trivial program's number of possible states vastly exceeding the number of particles in the universe. We need to start finding order-of-magnitude-better approaches for testing.
|
| I almost always write generated tests at this point with unit tests being a fallback for slow code or niche cases. What I _don't_ generally write though is fuzz tests, which would really be a 'next step'. In Rust it's not very hard to do so, but it hasn't quite hit the "trivial" mark yet for me, whereas quickcheck is virtually the same amount of work to use as to not use.
|
| Languages like Go adopting and mainstreaming these practices will be a benefit to everyone.
|
| I'm curious if there's documentation on:
|
| a) The coverage approach taken
|
| b) The mutation approach taken
|
| Can you configure these? Plug in different fuzzing backends?
| fpopa wrote:
| What kind of generated tests are you writing?
|
| Is it more similar to 'golden files'? Generate expected output and assert versus current implementation output?
| adamgordonbell wrote:
| Go stdlib has property testing built in. It's not as powerful as some QuickCheck frameworks, but it's built right in. I wrote an article on it.
|
| https://earthly.dev/blog/property-based-testing/
| staticassertion wrote:
| Just taking what would normally be a unit test and having the input values be generated. Some examples:
|
| 1. I have a test for encryption/decryption functions. The data that's provided for the plaintext, additional data, and key, is generated.
| The assertions are:
|
| assert_ne!(plaintext, encrypted_data);
|
| assert_eq!(plaintext, decrypted_data);
|
| assert_eq!(aad, decrypted_aad);
|
| etc
|
| 2. I have some generated integration tests. For example, in our product, there are certain properties that should always hold for a given database entry. I generate a new entry on every test and have the fields for that entry provided by quickcheck, then I perform the operation, query the database, and assert that properties on those values hold.
|
| So to answer your question, yes. Sometimes you want to check a concrete output (ie: "this base64 encoded string should always equal this other value") for sanity, but in general property tests give me more confidence.
|
| I find it works particularly well with a 'given, when, then' approach, personally.
|
| edit: I'll also note that for the base64 case I'd suggest:
|
| a) A hardcoded suite of values.
|
| b) Generate property tests.
|
| assert_eq!(value, base64decode(base64encode(value)));
|
| As well as things like "contains only these characters" and "ends with [=a-zA-Z]" etc.
|
| c) Oracle tests against a "known good" implementation.
| krobelus wrote:
| Sounds like a sensible mix. There is really no single silver bullet.
|
| We at https://symflower.com/ are working on a product to generate unit tests. Unlike quickcheck/proptest we promise to find errors, even if they are unlikely (for example [this input](https://github.com/AltSysrq/proptest/blob/master/proptest/RE...) would be trivial for Symflower). Also, unlike fuzzing our technology is deterministic.
|
| Here's one of our blog posts that explains the approach: https://symflower.com/en/company/blog/2021/symflower-finds-m...
| 2OEH8eoCRo0 wrote:
| > Great to see fuzzing becoming more mainstream.
|
| Agreed. I wrote a fuzzer at my last job and it found a bunch of bugs right before a release.
| Nobody knew what fuzzing was so I was attacked by the program owner for trying to break the software and given an insulting performance review for it. Then the program owner deleted all my fuzzing results and coredumps out of their directories so the release looked immaculate. Defense software ftw
| staticassertion wrote:
| Yikes, sounds pretty toxic on their part, but good on you for taking a strong approach to software stability.
|
| Also, writing fuzzers is super fun.
| masklinn wrote:
| > I almost always write generated tests at this point with unit tests being a fallback for slow code or niche cases. What I don't generally write though is fuzz tests, which would really be a 'next step'. In Rust it's not very hard to do so, but it hasn't quite hit the "trivial" mark yet for me, whereas quickcheck is virtually the same amount of work to use as to not use.
|
| Did you mean genera_tive_ tests? You're talking about quickcheck and that's what it does.
|
| "Generated tests" would usually be interpreted as codegen'd tests which you commit.
| staticassertion wrote:
| Tests with generated input. Call it what you like.
| cinntaile wrote:
| What would you say are the main differences between a fuzzer and QuickCheck? The authors of quickcheck don't call it a fuzzer so I assume there is some difference but both seem to randomize inputs?
| ryanschneider wrote:
| Anyone seen good articles on converting go-fuzz tests to native fuzzing? Specifics on the new corpus format and a converter from go-fuzz would be really useful.
|
| It's great to hear that the fuzzer is built on go-fuzz so hopefully the conversion process won't be too bad: https://github.com/dvyukov/go-fuzz/issues/329
| aleksi wrote:
| https://pkg.go.dev/golang.org/x/tools@v0.1.8/cmd/file2fuzz
| morelisp wrote:
| I've pre-emptively migrated a couple projects and found that loading the old corpus files wherever you already had them and then `Add`ing them as whatever new appropriate type was the easiest way. The inclusion of types necessitates at least a minor migration. I did not find any official documentation on the format, though it's trivial to read, e.g.:
|
|     go test fuzz v1
|     string("\xff0")
|
| Overall while the API (and of course tooling) is a huge step forward, corpus management feels like a small step backwards compared to go-fuzz - I didn't find a way to pull non-crashers into an in-repo corpus other than manually copying them out of my cache directory. And one-file-per-case still blows up a lot of repo management tools.
| dang wrote:
| Past related thread:
|
| _Go: Fuzzing Is Beta Ready_ - https://news.ycombinator.com/item?id=27391048 - June 2021 (53 comments)
| xiaq wrote:
| Fuzzing is awesome. I just discovered an accidental O(2^n) code path in my project with fuzzing and fixed it: https://github.com/elves/elvish/commit/9cda3f643efafce2df567...
|
| Edit: shortly after I wrote this comment, fuzzing discovered another pathological input - and that was fixed in https://github.com/elves/elvish/commit/04173ee8ab3c7fc4a9e79...
|
| (In case people are curious, the project is a Unix shell, Elvish: https://elv.sh)
| damagednoob wrote:
| I will never understand why this has been included in the standard library instead of as a standalone library available for download. Now it's locked to the Go release cycle and has the potential to languish because of backward compatibility concerns.
|
| The decision to include it is perplexing when other language ecosystems have chosen to keep this kind of functionality out of the standard lib, e.g. requests in python[1]. To quote Kenneth Reitz: "...the standard library is where a library goes to die."
|
| [1] https://github.com/psf/requests/issues/2424
| cube2222 wrote:
| Seems fairly standard for Go.
|
| You mentioned requests - the Go net/http library is widely used, even though it's in the standard library. It didn't languish, it didn't die. The interfaces are also used in most 3rd party libraries and work well.
|
| Moreover, Go's quality standard library is often cited as one of its main strengths.
|
| Thus, the inclusion of fuzzing in the stdlib isn't surprising to me. Not saying the other way around would be bad. It's just not surprising, and I don't think it's a bad choice, looking at Go historically.
| dainiusse wrote:
| I think it is great in general. OTOH - nobody is prevented from using whatever third-party library they want. Third-party libraries also die, like https://github.com/go-check/check
| Scaevolus wrote:
| Python had an 18 month release cycle for most of its life (now 12 month), while Go has a 6 month release cycle.
|
| Many Python devs use the OS packaged Python versions, while Go devs tend to use the latest release.
|
| Integrating this in the stdlib means that more people will use basic fuzzing functionality. There's nothing preventing third party fuzzers from continuing to develop.
| smasher164 wrote:
| The Go fuzzing tool takes advantage of compiler instrumentation. It can also work with built-in Go types, in comparison to traditional fuzzing tools that just work with bytes. Additionally, integrating it into the testing tool allows it to be as easy to write as a unit test. This can help provide a batteries-included fuzzing experience.
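For readers who have not used the native fuzzer, the points above (typed inputs rather than raw bytes only, and a fuzz test that reads like a unit test) can be sketched roughly as follows. This is a minimal illustration and not code from the thread: the `roundTrip` helper, the `FuzzBase64RoundTrip` name, the seed values, and the choice of `package main` are all assumptions made for the example. It checks the base64 round-trip property mentioned earlier in the thread, using only `testing.F`, `F.Add`, and `F.Fuzz` from the standard library (Go 1.18 or later).

```go
// Hypothetical file name: roundtrip_test.go. The package name is arbitrary;
// in a real project it would match the package under test.
package main

import (
	"bytes"
	"encoding/base64"
	"testing"
)

// roundTrip is an illustrative helper: it base64-encodes data and
// immediately decodes the result again.
func roundTrip(data []byte) ([]byte, error) {
	encoded := base64.StdEncoding.EncodeToString(data)
	return base64.StdEncoding.DecodeString(encoded)
}

// FuzzBase64RoundTrip checks the property value == decode(encode(value)).
func FuzzBase64RoundTrip(f *testing.F) {
	// Seed corpus: the fuzzer mutates these typed values to find new inputs.
	f.Add([]byte("hello"))
	f.Add([]byte{})
	f.Add([]byte{0xff, 0x30})

	// The fuzz target's extra parameters must match the types passed to f.Add.
	f.Fuzz(func(t *testing.T, data []byte) {
		got, err := roundTrip(data)
		if err != nil {
			t.Fatalf("decode failed for %q: %v", data, err)
		}
		if !bytes.Equal(got, data) {
			t.Errorf("round trip changed data: got %q, want %q", got, data)
		}
	})
}
```

Saved in a file whose name ends in `_test.go`, this would be run with `go test -fuzz=FuzzBase64RoundTrip`; a plain `go test` only replays the seed corpus as ordinary test cases.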
| staticassertion wrote:
| This is just a workaround for:
|
| a) A lack of strong typing
|
| b) A custom compiler toolchain
|
| llvm already has instrumentation/coverage support and generics make it easy to work with higher level constructs than bytes, although you generally do want to just work with bytes when fuzzing imo.
|
| The language is weak, therefore the language has to add more and more batteries-included because extending it is purposefully difficult.
| masklinn wrote:
| > This is just a workaround for:
|
| > [...]
|
| > llvm already has instrumentation/coverage support
|
| I mean, that supports the idea of having fuzzing support in whatever core you have.
| staticassertion wrote:
| My point is that between Go's custom compiler backend and inexpressive typing there's much more need to build things like this in directly vs what other languages can do by just using llvm/gcc. Like if Go developers want sanitizers equivalent to what llvm packages they'll have to build that themselves, although that won't have the same issue with inexpressive types.
| aleksi wrote:
| > Like if Go developers want sanitizers equivalent to what llvm packages they'll have to build that themselves
|
| Go has used LLVM's ThreadSanitizer since 1.1.
| staticassertion wrote:
| I don't think that really addresses my point, it just shows that they did the work for one sanitizer already.
| smasher164 wrote:
| > llvm already has instrumentation
|
| The Go toolchain actually supports emitting instrumentation for LLVM's libFuzzer with -gcflags=all=-d=libfuzzer.
|
| > therefore the language has to add more
|
| Okay, and so? If this ends up making fuzzing more popular and easy-to-use, I frankly don't care if it was added as a library or deeply integrated into the toolchain.
| staticassertion wrote:
| > The Go toolchain actually supports emitting instrumentation for LLVM's libFuzzer with -gcflags=all=-d=libfuzzer.
|
| Sweet, that's a smart approach.
|
| > Okay, and so? If this ends up making fuzzing more popular and easy-to-use, I frankly don't care if it was added as a library or deeply integrated into the toolchain.
|
| I don't care either because I don't write Go, so just the fact that it's supported is nice for me since it encourages this in languages I do care about.
|
| But if I were a Go developer I might care a lot about how my language evolves, what's built in, what's a library, what the capabilities are, what tools I can integrate with, etc.
|
| It sounds like they've done a pretty good job with regards to this implementation though, happy to see it.
| morelisp wrote:
| > The Go fuzzing tool takes advantage of compiler instrumentation.
|
| This is the main benefit. I've been using go-fuzz for years and compiler upgrades (especially any changes related to modules/GOROOT/GOPATH) were a pain because it always behaved slightly differently.
|
| > It can also work with built-in Go types, in comparison to traditional fuzzing tools that just work with bytes.
|
| This could have been done just as efficiently without upstream integration.
| throwaway894345 wrote:
| Maybe, but I wouldn't support that argument by holding up Python and its HTTP situation as exemplary. The standard HTTP libraries are a nightmare, requests proves that Python packages can and will languish even outside of the standard library, and writing even a simple HTTP script in Python means you now need to choose between using the standard HTTP libraries and tackling dependency management and multi-file deployment issues.
|
| By contrast Go ships with a high quality standard HTTP library that has lasted a decade and no "requests" equivalent has risen up to challenge it.
|
| Note also that Go's testing situation in general is much nicer than many other languages precisely because things are baked into the standard library--no need to quibble over which test framework to use or to memorize each framework's equivalent for "run tests that match this pattern" and so on.
| staticassertion wrote:
| The fact that requests has "languished" (not sure how tbh) doesn't really change the fact that the Python stdlib is a bit of a hilarious disaster from afar. Tons of cases of "that shouldn't be in std" where libraries have quirky, locked in behaviors, or an entire major breaking release with decades of work to migrate has to be made to clean up the mistakes.
|
| Python should be a case study in the many ways not to build a language.
|
| > By contrast Go ships with a high quality standard HTTP library that has lasted a decade and no "requests" equivalent has risen up to challenge it.
|
| Yeah this also means that you need to update your compiler when there's a vulnerability instead of just a single point release in a library. This happens with some frequency.
|
| The parent poster is right, in my opinion.
| cube2222 wrote:
| > Yeah this also means that you need to update your compiler when there's a vulnerability instead of just a single point release in a library.
|
| Has updating the Go compiler actually been an issue for you in the past? To me, with Go's stability, it's never been more disruptive than updating a library in practice, so I don't see much of a difference.
| staticassertion wrote:
| I don't know that I'd hate it if I were a go dev, it would just be a bit annoying for a number of reasons.
|
| For one thing I update libraries all the time so it's a very fast, simple, well worn operation. Updating the compiler is a bit more of a chore and I'm going to worry a bit more about the impact (since it's global to all code vs local to one package).
|
| For another, I would want to make sure I had tooling that could tell me "is this library in use by service X". I don't know Go's story there, but I would hope it's trivial to do so for a library but I suspect if it's part of the standard library that may be trickier. If not, nbd.
|
| It's a bad smell to me, but if I were a Go developer it wouldn't break me.
|
| Perhaps ironically, until this native fuzzing package, upgrading the compiler if you had fuzz tests would be one case where things would likely break.
| throwaway894345 wrote:
| > Updating the compiler is a bit more of a chore and I'm going to worry a bit more about the impact (since it's global to all code vs local to one package).
|
| This is indeed a chore in other languages. In Go, the compiler is trivially installed. Typically this just means bumping the version in your Dockerfile and "gvm use $newVersion --default".
|
| > For another, I would want to make sure I had tooling that could tell me "is this library in use by service X". I don't know Go's story there, but I would hope it's trivial to do so for a library but I suspect if it's part of the standard library that may be trickier. If not, nbd.
|
| This is supported out of the box by Go's tooling. `go mod graph` is what you're looking for.
| staticassertion wrote:
| > This is indeed a chore in other languages. In Go, the compiler is trivially installed. Typically this just means bumping the version in your Dockerfile and "gvm use $newVersion --default".
|
| The issue isn't with installing the new compiler, that's trivial in our use case as well (for Rust at least, Python's a disaster, but I accept that). The issue is ensuring compatibility, ensuring no new bugs are introduced, etc. It's just a much heavier change to your produced binary vs changing a package.
|
| > This is supported out of the box by Go's tooling. `go mod graph` is what you're looking for.
|
| Cool, thanks.
| rat9988 wrote:
| It is for me, I can't just go build in the new version. So I'm keeping the software with the old compiler.
| TheDong wrote:
| > Has updating the Go compiler actually been an issue for you in the past? To me, with Go's stability, it's never been more disruptive than updating a library in practice
|
| I've run into issues with several go version updates.
|
| Off the top of my head, all of the following caused breakages:
|
| 1. go 1.4 making directories named 'internal' special and un-importable. Cross-package imports that used to work would no longer compile.
|
| 2. go 1.9 adding monotonic clock readings in a breaking way, i.e. this program changed output from 1.8 to 1.9: https://go.dev/play/p/Mi6cGCPd0rS (I know it looks contrived, but I'm not digging up the actual code that broke)
|
| 3. The change of the http.Server default to serving http2 instead of http/1.1 broke stuff. Of course it did. How can that possibly _not_ break stuff?
|
| 4. The changes in 'GO111MODULE' defaults broke many imports which had either malformed or incorrect go.mod files. This one was quite painful for the whole ecosystem.
|
| 5. go1.17 switched to silently truncating a lot of query strings. Of course that broke stuff, how could it not? https://go.dev/play/p/azODBvkb-zK
|
| Those are all intentional breaking changes which were not fixed upstream (i.e. are "working as intended"). The unintentional breaking changes, from changing error messages to cause string-based error detection to fail (because so many stdlib errors aren't exported so you have to do string matching), to just plain dumb bugs in the stdlib.... those are vastly more common. Those usually do get fixed in point releases. Take a gander at those release notes, many of the issues highlighted in those changelogs come from pain people hit during upgrades.
|
| I think the majority of go version upgrades have had some amount of pain, and most of them have been far more disruptive than updating a well-built library.
|
| I would much rather update just my fuzz-testing library in a commit, and be confident that it's only used in tests so CI is good enough to validate it, than have to update that and my http package and my tls package and my os package all at once and have to look for bugs _everywhere_.
| cube2222 wrote:
| I admit I wasn't bitten by these changes and had a much better experience overall. Thank you for the long write-up.
|
| However, I think you only mentioned changes in major releases, whereas in this scenario (vulnerability fix) a minor release would suffice (the parent mentioned updating to a point release of a library). Did you also have issues with minor releases?
| staticassertion wrote:
| > a minor release would suffice
|
| Does the Go compiler have LTS releases? Like if I'm on 1.0, but 1.5 is out, are they going to release a 1.0.1 for a vuln that impacts 1.0+ ?
|
| It seems unlikely but I'd be curious to know.
|
| Libraries release patches more frequently, and it's also generally easier to apply a patch yourself if you need to.
|
| Otherwise a point release may still imply a major release.
| cube2222 wrote:
| The most recent 2 major releases get the fix in case of security issues[0]. This means you can be up to 6 months behind the newest release to never be forced to do a major version update under time pressure.
|
| [0]:https://github.com/golang/go/wiki/MinorReleases
| throwaway894345 wrote:
| > The fact that requests has "languished" (not sure how tbh) doesn't really change the fact that the Python stdlib is a bit of a hilarious disaster from afar.
|
| Agreed that the Python stdlib is a disaster, but my point was that the OP contradicts himself by arguing that stability guarantees hold the standard library back while pointing to requests which itself hasn't made many/any intrepid breaking changes or even sensible non-breaking changes a la async support. Note that "stability is bad" is the OP's point of view and not mine.
|
| > Tons of cases of "that shouldn't be in std" where libraries have quirky, locked in behaviors, or an entire major breaking release with decades of work to migrate has to be made to clean up the mistakes.
|
| But the parent pointed to the requests library which is not in the stdlib. Note also that Go has been around for a decade and has needed no such major migration initiative.
|
| > Yeah this also means that you need to update your compiler when there's a vulnerability instead of just a single point release in a library. This happens with some frequency.
|
| The frequency is very low and updating the compiler is minimally risky due to Go's strong compatibility guarantees (precisely the kind of stability the parent opposes). This is a much lesser problem than dependency management in Python (I have 15 years of experience in Python and 10 in Go).
| makapuf wrote:
| Note that Python was already almost two decades old when v3 came out and is now three decades old.
| masklinn wrote:
| FWIW older design documents have (some) reasoning for integrating fuzzing natively:
|
| - https://docs.google.com/document/d/1N-12_6YBPpF9o4_Zys_E_ZQn...
|
| - https://go.googlesource.com/proposal/+/master/design/draft-f...
|
| One of the original proposals (https://docs.google.com/document/u/1/d/1zXR-TFL3BfnceEAWytV8...)
| further explains why, apparently written by the go-fuzz maintainers (per issue 329[0], none of them seems in any way broken-hearted about the idea of deprecating go-fuzz eventually):
|
| > go-fuzz suffers from several problems:
|
| > - It breaks multiple times per Go release because it's tied to the way go build works, std lib package structure and dependencies, etc. It broke due to internal packages (multiple times), vendoring (multiple times), changed dependencies in std lib, etc.
|
| > - It tries to do compiler work regarding coverage instrumentation without compiler help. This leads to build breakages on corner case code; poor performance; suboptimal quality of coverage instrumentation (missed edges).
|
| > - Considerable difficulty in integrating it into other build systems and non-standard contexts as it uses source pre-processing.
|
| > Goal of this proposal is to make fuzzing as easy to use as unit testing.
|
| [0] https://github.com/dvyukov/go-fuzz/issues/329
| [deleted]
___________________________________________________________________
(page generated 2022-01-01 23:00 UTC)