[HN Gopher] Go Fuzzing
       ___________________________________________________________________
        
       Go Fuzzing
        
       Author : 0xedb
       Score  : 149 points
       Date   : 2022-01-01 18:27 UTC (4 hours ago)
        
 (HTM) web link (tip.golang.org)
 (TXT) w3m dump (tip.golang.org)
        
       | staticassertion wrote:
       | Great to see fuzzing becoming more mainstream. Ultimately we have
       | absurd program states, with even a trivial program's state vastly
       | exceeding the number of particles in the universe. We need to
       | start finding order-of-magnitude-better approaches for testing.
       | 
       | I almost always write generated tests at this point with unit
       | tests being a fallback for slow code or niche cases. What I
       | _don't_ generally write though is fuzz tests, which would really
       | be a 'next step'. In Rust it's not very hard to do so, but it
       | hasn't quite hit the "trivial" mark yet for me, whereas
       | quickcheck is virtually the same amount of work to use as to not
       | use.
       | 
       | Languages like Go adopting and mainstreaming these practices will
       | be a benefit to everyone.
       | 
       | I'm curious if there's documentation on:
       | 
       | a) The coverage approach taken
       | 
       | b) The mutation approach taken
       | 
       | Can you configure these? Plug in different fuzzing backends?
        
         | fpopa wrote:
         | What kind of generated tests are you writing?
         | 
         | Is it more similar to 'golden files'? Generate expected output
         | and assert versus current implementation output?
        
           | adamgordonbell wrote:
           | Go stdlib has property testing built in. It's not as powerful
           | as some quick check frameworks, but it's built right in. I
           | wrote an article on it.
           | 
           | https://earthly.dev/blog/property-based-testing/
        
           | staticassertion wrote:
           | Just taking what would normally be a unit test and having the
           | input values be generated. Some examples:
           | 
           | 1. I have a test for encryption/decryption functions. The
           | data that's provided for the plaintext, additional data, and
           | key, is generated. The assertions are:
           | 
           | assert_ne!(plaintext, encrypted_data);
           | 
           | assert_eq!(plaintext, decrypted_data);
           | 
           | assert_eq!(aad, decrypted_aad);
           | 
           | etc
           | 
           | 2. I have some generated integration tests. For example, in
           | our product, there are certain properties that should always
           | hold for a given database entry. I generate a new entry on
           | every test and have the fields for that entry provided by
           | quickcheck, then I perform the operation, query the database,
           | and assert that properties on those values hold.
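The round-trip assertions in example 1 carry over to Go fairly directly. The sketch below uses `testing/quick`, the standard-library property testing package mentioned earlier in the thread, with AES-GCM standing in for whatever cipher the commenter actually used. The name `sealOpenRoundTrips`, the 32-byte key, and the wrong-additional-data check (used in place of `assert_eq!(aad, decrypted_aad)`, since GCM does not return the additional data) are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"testing"
	"testing/quick"
)

// sealOpenRoundTrips is a hypothetical property: for any key, plaintext and
// additional data, decrypting the ciphertext must return the plaintext.
func sealOpenRoundTrips(key [32]byte, plaintext, aad []byte) bool {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return false
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return false
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return false
	}

	ciphertext := gcm.Seal(nil, nonce, plaintext, aad)
	// Mirrors assert_ne!(plaintext, encrypted_data).
	if bytes.Equal(plaintext, ciphertext) {
		return false
	}
	// Opening with different additional data must fail authentication.
	wrongAAD := append(append([]byte{}, aad...), 0x01)
	if _, err := gcm.Open(nil, nonce, ciphertext, wrongAAD); err == nil {
		return false
	}
	decrypted, err := gcm.Open(nil, nonce, ciphertext, aad)
	// Mirrors assert_eq!(plaintext, decrypted_data).
	return err == nil && bytes.Equal(plaintext, decrypted)
}

// TestSealOpenRoundTrip lets testing/quick generate key, plaintext and aad.
func TestSealOpenRoundTrip(t *testing.T) {
	if err := quick.Check(sealOpenRoundTrips, nil); err != nil {
		t.Error(err)
	}
}
```

Keeping the property as a plain function returning `bool` means the same check can be reused from a table-driven unit test or a fuzz target.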
           | 
           | So to answer your question, yes. Sometimes you want to check
           | a concrete output (i.e. "this base64 encoded string should
           | always equal this other value") for sanity, but in general
           | property tests give me more confidence.
           | 
           | I find it works particularly well with a 'given, when, then'
           | approach, personally.
           | 
           | edit: I'll also note that for the base64 case I'd suggest:
           | 
           | a) A hardcoded suite of values.
           | 
           | b) Generate property tests.
           | 
           | assert_eq!(value, base64decode(base64encode(value)));
           | 
           | As well as things like "contains only these characters" and
           | "ends with [=a-zA-Z]" etc.
           | 
           | c) Oracle tests against a "known good" implementation.
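As a rough Go counterpart to points (a) and (b), the sketch below uses the native fuzzing support from the linked article: seed values registered with `f.Add` play the role of the hardcoded suite, and `f.Fuzz` supplies generated inputs. The helper `checkBase64`, the constant `stdAlphabet`, and the specific properties checked are illustrative assumptions, not anything taken from the commenter's code.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"strings"
	"testing"
)

// stdAlphabet is the character set of standard base64, plus '=' padding.
const stdAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="

// checkBase64 returns an empty string if every property holds for value,
// or a short description of the first property that failed.
func checkBase64(value []byte) string {
	encoded := base64.StdEncoding.EncodeToString(value)
	// "contains only these characters"
	for _, c := range encoded {
		if !strings.ContainsRune(stdAlphabet, c) {
			return "unexpected character in output"
		}
	}
	if len(encoded)%4 != 0 {
		return "padded output length is not a multiple of 4"
	}
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "decode failed: " + err.Error()
	}
	// value == base64decode(base64encode(value))
	if !bytes.Equal(value, decoded) {
		return "round trip changed the value"
	}
	return ""
}

func FuzzBase64RoundTrip(f *testing.F) {
	// (a) Hardcoded seed values; these also run under plain `go test`.
	f.Add([]byte(""))
	f.Add([]byte("hello"))
	f.Add([]byte{0x00, 0xff})
	// (b) Generated inputs.
	f.Fuzz(func(t *testing.T, value []byte) {
		if msg := checkBase64(value); msg != "" {
			t.Errorf("%q: %s", value, msg)
		}
	})
}
```

With Go 1.18 or later, `go test -fuzz=FuzzBase64RoundTrip` starts generating inputs; without `-fuzz`, only the seed values are exercised.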
        
             | krobelus wrote:
             | Sounds like a sensible mix. There is really no single
             | silver bullet.
             | 
             | We at https://symflower.com/ are working on a product to
             | generate unit tests. Unlike quickcheck/proptest we promise
             | to find errors, even if they are unlikely (for example
             | [this input](https://github.com/AltSysrq/proptest/blob/master/proptest/RE...)
             | would be trivial for Symflower). Also, unlike fuzzing, our
             | technology is deterministic.
             | 
             | Here's one of our blog posts that explains the approach:
             | https://symflower.com/en/company/blog/2021/symflower-finds-m...
        
         | 2OEH8eoCRo0 wrote:
         | > Great to see fuzzing becoming more mainstream.
         | 
         | Agreed. I wrote a fuzzer at my last job and it found a bunch of
         | bugs right before a release. Nobody knew what fuzzing was so I
         | was attacked by the program owner for trying to break the
         | software and given an insulting performance review for it. Then
         | I had all the fuzzing results and coredumps deleted out of
         | their directories by the program owner so the release looked
         | immaculate. Defense software ftw
        
           | staticassertion wrote:
           | Yikes, sounds pretty toxic on their part, but good on you for
           | taking a strong approach to software stability.
           | 
           | Also, writing fuzzers is super fun.
        
         | masklinn wrote:
         | > I almost always write generated tests at this point with unit
         | tests being a fallback for slow code or niche cases. What I
         | don't generally write though is fuzz tests, which would really
         | be a 'next step'. In Rust it's not very hard to do so, but it
         | hasn't quite hit the "trivial" mark yet for me, whereas
         | quickcheck is virtually the same amount of work to use as to
         | not use.
         | 
         | Did you mean genera_tive_ tests? You're talking about
         | quickcheck and that's what it does.
         | 
         | "Generated tests" would usually be interpreted as codegen'd
         | test which you commit.
        
           | staticassertion wrote:
           | Tests with generated input. Call it what you like.
        
       | cinntaile wrote:
       | What would you say are the main differences between a fuzzer and
       | QuickCheck? The authors of quickcheck don't call it a fuzzer so I
       | assume there is some difference but both seem to randomize
       | inputs?
        
       | ryanschneider wrote:
       | Anyone seen good articles on converting go-fuzz tests to native
       | fuzzing? Specifics on the new corpus format and a converter from
       | go-fuzz would be really useful.
       | 
       | It's great to hear that the fuzzer is built on go-fuzz so
       | hopefully the conversion process won't be too bad:
       | https://github.com/dvyukov/go-fuzz/issues/329
        
         | aleksi wrote:
         | https://pkg.go.dev/golang.org/x/tools@v0.1.8/cmd/file2fuzz
        
         | morelisp wrote:
         | I've pre-emptively migrated a couple projects and found that
         | loading the old corpus files wherever you already had them and
         | then `Add`ing them as whatever new appropriate type was the
         | easiest way. The inclusion of types necessitates at least a
         | minor migration. I did not find any official documentation on
         | the format, though it's trivial to read, e.g.:
         | 
         |     go test fuzz v1
         |     string("\xff0")
         | 
         | Overall while the API (and of course tooling) is a huge step
         | forward, corpus management feels like a small step backwards
         | compared to go-fuzz - I didn't find a way to pull non-crashers
         | into an in-repo corpus other than manually copying them out of
         | my cache directory. And one-file-per-case still blows up a lot
         | of repo management tools.
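The migration described above might look roughly like the sketch below: old inputs are fed to `f.Add` as a typed value, and a small helper renders one string input in the `go test fuzz v1` layout quoted in the comment. Both `corpusEntry` and `legacyInputs` are hypothetical names; how the old go-fuzz files are actually loaded is left out, and the `file2fuzz` tool linked earlier in the thread covers the conversion of raw files.

```go
package main

import (
	"fmt"
	"testing"
)

// corpusEntry renders a single string input in the "go test fuzz v1" layout:
// a version header, then one Go-syntax value per line.
func corpusEntry(input string) string {
	return fmt.Sprintf("go test fuzz v1\nstring(%q)\n", input)
}

// legacyInputs stands in for inputs read from an old go-fuzz corpus
// directory (hypothetical; load them however suits the project).
var legacyInputs = []string{"\xff0", "hello"}

func FuzzParse(f *testing.F) {
	// Re-register each old input as a seed of the new, typed kind.
	for _, in := range legacyInputs {
		f.Add(in)
	}
	f.Fuzz(func(t *testing.T, s string) {
		_ = len(s) // call the code under test here
	})
}
```

Files in this layout placed under `testdata/fuzz/FuzzParse/` are picked up as seed corpus entries for the fuzz target of that name.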
        
       | dang wrote:
       | Past related thread:
       | 
       |  _Go: Fuzzing Is Beta Ready_ -
       | https://news.ycombinator.com/item?id=27391048 - June 2021 (53
       | comments)
        
       | xiaq wrote:
       | Fuzzing is awesome. I just discovered an accidental O(2^n) code
       | path in my project with fuzzing and fixed it:
       | https://github.com/elves/elvish/commit/9cda3f643efafce2df567...
       | 
       | Edit: shortly after I wrote this comment, fuzzing discovered
       | another pathological input - and that was fixed in
       | https://github.com/elves/elvish/commit/04173ee8ab3c7fc4a9e79...
       | 
       | (In case people are curious, the project is a Unix shell, Elvish:
       | https://elv.sh)
        
       | damagednoob wrote:
       | I will never understand why this has been included in the
       | standard library instead of as a standalone library available for
       | download. Now it's locked to the Go release cycle and has the
       | potential to languish because of backward compatibility concerns.
       | 
       | The decision to include it is perplexing when other language
       | ecosystems have chosen to keep this kind of functionality out of
       | the standard lib, e.g. requests in python[1]. To quote Kenneth
       | Reitz: "...the standard library is where a library goes to die."
       | 
       | [1] https://github.com/psf/requests/issues/2424
        
         | cube2222 wrote:
         | Seems fairly standard for Go.
         | 
         | You mentioned requests - the Go net/http library is widely
         | used, even though it's in the standard library. It didn't
         | languish, it didn't die. The interfaces are also used in most
         | 3rd party libraries and work well.
         | 
         | Moreover, Go's quality standard library is often cited as one
         | of its main strengths.
         | 
         | Thus, the inclusion of fuzzing in the stdlib isn't surprising
         | to me. Not saying the other way around would be bad. It's just
         | not surprising, and I don't think it's a bad choice, looking at
         | Go historically.
        
         | dainiusse wrote:
         | I think it is great in general. OTOH, nobody is prohibited from
         | using a third-party library if they want to. Third-party
         | libraries also die, e.g. https://github.com/go-check/check
        
         | Scaevolus wrote:
         | Python had an 18 month release cycle for most of its life (now
         | 12 month), while Go has a 6 month release cycle.
         | 
         | Many Python devs use the OS packaged Python versions, while Go
         | devs tend to use the latest release.
         | 
         | Integrating this in the stdlib means that more people will use
         | basic fuzzing functionality. There's nothing preventing third
         | party fuzzers from continuing to develop.
        
         | smasher164 wrote:
         | The Go fuzzing tool takes advantage of compiler
         | instrumentation. It can also work with built-in Go types, in
         | comparison to traditional fuzzing tools that just work with
         | bytes. Additionally, integrating it into the testing tool
         | allows it to be as easy to write as a unit test. This can help
         | provide a batteries-included fuzzing experience.
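To illustrate the point about built-in types, here is a minimal sketch of a fuzz target that takes a `string` and an `int` directly instead of decoding them from a byte slice. The function under test, `truncateRunes`, and the properties checked are made up for this example.

```go
package main

import (
	"strings"
	"testing"
	"unicode/utf8"
)

// truncateRunes is a hypothetical function under test: it returns at most
// max runes from the start of s.
func truncateRunes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == max {
			return s[:i]
		}
		count++
	}
	return s
}

func FuzzTruncateRunes(f *testing.F) {
	// Seed values must match the fuzz function's argument types.
	f.Add("héllo", 3)
	f.Fuzz(func(t *testing.T, s string, max int) {
		out := truncateRunes(s, max)
		if !strings.HasPrefix(s, out) {
			t.Errorf("%q is not a prefix of %q", out, s)
		}
		if max >= 0 && utf8.RuneCountInString(out) > max {
			t.Errorf("%q has more than %d runes", out, max)
		}
		if utf8.ValidString(s) && !utf8.ValidString(out) {
			t.Errorf("valid input %q produced invalid output %q", s, out)
		}
	})
}
```

The shape is the same as a regular `TestXxx` function plus a callback, which is what makes it about as cheap to write as a unit test.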
        
           | staticassertion wrote:
           | This is just a work around for:
           | 
           | a) A lack of strong typing
           | 
           | b) A custom compiler toolchain
           | 
           | llvm already has instrumentation/coverage support and
           | generics make it easy to work with higher level constructs
           | than bytes, although you generally do want to just work with
           | bytes when fuzzing imo.
           | 
           | The language is weak, therefore the language has to add more
           | and more batteries-included because extending it is
           | purposefully difficult.
        
             | masklinn wrote:
             | > This is just a work around for:
             | 
             | > [...]
             | 
             | > llvm already has instrumentation/coverage support
             | 
             | I mean, that supports the idea of having fuzzing support in
             | whatever core you have.
        
               | staticassertion wrote:
               | My point is that between Go's custom compiler backend and
               | inexpressive typing there's much more need to build
               | things like this in directly vs what other languages can
               | do by just using llvm/gcc. Like if Go developers want
               | sanitizers equivalent to what llvm packages they'll have
               | to build that themselves, although that won't have the
               | same issue with inexpressive types.
        
               | aleksi wrote:
               | > Like if Go developers want sanitizers equivalent to
               | what llvm packages they'll have to build that themselves
               | 
               | Go has used LLVM's ThreadSanitizer since 1.1.
        
               | staticassertion wrote:
               | I don't think that really addresses my point, it just
               | shows that they did the work for one sanitizer already.
        
             | smasher164 wrote:
             | > llvm already has instrumentation
             | 
             | The Go toolchain actually supports emitting instrumentation
             | for LLVM's libFuzzer with -gcflags=all=-d=libfuzzer.
             | 
             | > therefore the language has to add more
             | 
             | Okay, and so? If this ends up making fuzzing more popular
             | and easy-to-use, I frankly don't care if it was added as a
             | library or deeply integrated into the toolchain.
        
               | staticassertion wrote:
               | > The Go toolchain actually supports emitting
               | instrumentation for LLVM's libFuzzer with
               | -gcflags=all=-d=libfuzzer.
               | 
               | Sweet, that's a smart approach.
               | 
               | > Okay, and so? If this ends up making fuzzing more
               | popular and easy-to-use, I frankly don't care if it was
               | added as a library or deeply integrated into the
               | toolchain.
               | 
               | I don't care either because I don't write Go, so just the
               | fact that it's supported is nice for me since it
               | encourages this in languages I do care about.
               | 
               | But if I were a go developer I might care a lot about how
               | my language evolves, what's built in, what's a library,
               | what the capabilities are, what tools I can integrate
               | with, etc.
               | 
               | It sounds like they've done a pretty good job with
               | regards to this implementation though, happy to see it.
        
           | morelisp wrote:
           | > The Go fuzzing tool takes advantage of compiler
           | instrumentation.
           | 
           | This is the main benefit. I've been using go-fuzz for years
           | and compiler upgrades (especially any changes related to
           | modules/GOROOT/GOPATH) were a pain because it always behaved
           | slightly differently.
           | 
           | > It can also work with built-in Go types, in comparison to
           | traditional fuzzing tools that just work with bytes.
           | 
           | This could have been done just as efficiently without
           | upstream integration.
        
         | throwaway894345 wrote:
         | Maybe, but I wouldn't support that argument by holding
         | up Python and its HTTP situation as exemplary. The standard
         | HTTP libraries are a nightmare, requests proves that Python
         | packages can and will languish even outside of the standard
         | library, and writing even a simple HTTP script in Python means
         | you now need to choose between the standard HTTP libraries or
         | tackle dependency management and multi-file deployment issues.
         | 
         | By contrast Go ships with a high quality standard HTTP library
         | that has lasted a decade and no "requests" equivalent has risen
         | up to challenge it.
         | 
         | Note also that Go's testing situation in general is much nicer
         | than many other languages precisely because things are baked
         | into the standard library--no need to quibble over which test
         | framework to use or to memorize each framework's equivalent for
         | "run tests that match this pattern" and so on.
        
           | staticassertion wrote:
           | The fact that requests has "languished" (not sure how tbh)
           | doesn't really change the fact that the Python stdlib is a
           | bit of a hilarious disaster from afar. Tons of cases of "that
           | shouldn't be in std" where libraries have quirky, locked in
           | behaviors, or an entire major breaking release with decades
           | of work to migrate has to be made to clean up the mistakes.
           | 
           | Python should be a case study in the many ways not to build a
           | language.
           | 
           | > By contrast Go ships with a high quality standard HTTP
           | library that has lasted a decade and no "requests" equivalent
           | has risen up to challenge it.
           | 
           | Yeah this also means that you need to update your compiler
           | when there's a vulnerability instead of just a single point
           | release in a library. This happens with some frequency.
           | 
           | The parent poster is right, in my opinion.
        
             | cube2222 wrote:
             | > Yeah this also means that you need to update your
             | compiler when there's a vulnerability instead of just a
             | single point release in a library.
             | 
             | Has updating the Go compiler actually been an issue for you
             | in the past? To me, with Go's stability, it's never been
             | more disruptive than updating a library in practice, so I
             | don't see much of a difference.
        
               | staticassertion wrote:
               | I don't know that I'd hate it if I were a go dev, it
               | would just be a bit annoying for a number of reasons.
               | 
               | For one thing I update libraries all the time so it's a
               | very fast, simple, well worn operation. Updating the
               | compiler is a bit more of a chore and I'm going to worry
               | a bit more about the impact (since it's global to all
               | code vs local to one package).
               | 
               | For another, I would want to make sure I had tooling that
               | could tell me "is this library in use by service X". I
               | don't know Go's story there, but I would hope it's
               | trivial to do so for a library but I suspect if it's part
               | of the standard library that may be trickier. If not,
               | nbd.
               | 
               | It's a bad smell to me, but if I were a Go developer it
               | wouldn't break me.
               | 
               | Perhaps ironically, until this native fuzzing package,
               | upgrading the compiler if you had fuzz tests would be one
               | case where things would likely break.
        
               | throwaway894345 wrote:
               | > Updating the compiler is a bit more of a chore and I'm
               | going to worry a bit more about the impact (since it's
               | global to all code vs local to one package).
               | 
               | This is indeed a chore in other languages. In Go, the
               | compiler is trivially installed. Typically this just
               | means bumping the version in your Dockerfile and "gvm use
               | $newVersion --default".
               | 
               | > For another, I would want to make sure I had tooling
               | that could tell me "is this library in use by service X".
               | I don't know Go's story there, but I would hope it's
               | trivial to do so for a library but I suspect if it's part
               | of the standard library that may be trickier. If not,
               | nbd.
               | 
               | This is supported out of the box by Go's tooling. `go mod
               | graph` is what you're looking for.
        
               | staticassertion wrote:
               | > This is indeed a chore in other languages. In Go, the
               | compiler is trivially installed. Typically this just
               | means bumping the version in your Dockerfile and "gvm use
               | $newVersion --default".
               | 
               | The issue isn't with installing the new compiler, that's
               | trivial in our use case as well (for Rust at least,
               | Python's a disaster, but I accept that). The issue is
               | ensuring compatibility, ensuring no new bugs are
               | introduced, etc. It's just a much heavier change to your
               | produced binary vs changing a package.
               | 
               | > This is supported out of the box by Go's tooling. `go
               | mod graph` is what you're looking for.
               | 
               | Cool, thanks.
        
               | rat9988 wrote:
               | It is for me, I can't just go build in the new version.
               | So I'm keeping the software with the old compiler.
        
               | TheDong wrote:
               | > Has updating the Go compiler actually been an issue for
               | you in the past? To me, with Go's stability, it's never
               | been more disruptive than updating a library in practice
               | 
               | I've run into issues with several go version updates.
               | 
               | Off the top of my head, all of the following caused
               | breakages:
               | 
               | 1. go 1.4 making directories named 'internal' special and
               | un-importable. Cross-package imports that used to work no
               | longer would compile with a compiler error.
               | 
               | 2. go 1.9 adding monotonic clock readings in a breaking
               | way, i.e. this program changed output from 1.8 to 1.9:
               | https://go.dev/play/p/Mi6cGCPd0rS (I know it looks
               | contrived, but I'm not digging up the actual code that
               | broke)
               | 
               | 3. The change of the http.Server default to serving http2
               | instead of http/1.1 broke stuff. Of course it did. How
               | can that possibly _not_ break stuff?
               | 
               | 4. The changes in 'GO111MODULE' defaults broke many
               | imports which had either malformed or incorrect go.mod
               | files. This one was quite painful for the whole
               | ecosystem.
               | 
               | 5. go1.17 switched to silently truncating a lot of query
               | strings. Of course that broke stuff, how could it not?
               | https://go.dev/play/p/azODBvkb-zK
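One way a change like item 2 can surface, sketched under the assumption that the affected code compared `time.Time` values with `==`: since Go 1.9, `time.Now()` carries a monotonic clock reading that `Round(0)` strips, so two values for the same instant can differ as structs. This is not the program behind the playground link; `compareInstants` is a made-up name.

```go
package main

import "time"

// compareInstants reports how == and Equal judge two Time values for the
// same instant, one with and one without a monotonic clock reading.
func compareInstants() (operatorEqual, methodEqual bool) {
	withMono := time.Now()        // Go 1.9+: carries a monotonic reading
	wallOnly := withMono.Round(0) // Round(0) strips the monotonic reading
	return withMono == wallOnly, withMono.Equal(wallOnly)
}
```

Here `==` reports a difference while `Equal` reports the same instant, which is why the `time` package documentation recommends `Equal` for comparisons.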
               | 
               | Those are all intentional breaking changes which were not
               | fixed upstream (i.e. are "working as intended"). The
               | unintentional breaking changes, from changing error
               | messages to cause string-based error detection to fail
               | (because so many stdlib errors aren't exported so you
               | have to do string matching), to just plain dumb bugs in
               | the stdlib.... those are vastly more common. Those
               | usually do get fixed in point releases. Take a gander at
               | those release notes, many of the issues highlighted in
               | those changelogs come from pain people hit during
               | upgrades.
               | 
               | I think the majority of go version upgrades have had some
               | amount of pain, and most of them have been far more
               | disruptive than updating a well-built library.
               | 
               | I would much rather update just my fuzz-testing library
               | in a commit, and be confident that it's only used in
               | tests so CI is good enough to validate it, than have to
               | update that and my http package and my tls package and my
               | os package all at once and have to look for bugs
               | _everywhere_.
        
               | cube2222 wrote:
               | I admit I wasn't bit by these changes and had a much
               | better experience overall. Thank you for the long
               | write-up.
               | 
               | However, I think you only mentioned changes in major
               | releases, whereas in this scenario (vulnerability fix) a
               | minor release would suffice (the parent mentioned
               | updating to a point release of a library). Did you also
               | have issues with minor releases?
        
               | staticassertion wrote:
               | > a minor release would suffice
               | 
               | Does the Go compiler have LTS releases? Like if I'm on
               | 1.0, but 1.5 is out, are they going to release a 1.0.1
               | for a vuln that impacts 1.0+ ?
               | 
               | It seems unlikely but I'd be curious to know.
               | 
               | Libraries release patches more frequently, and it's also
               | generally easier to apply a patch yourself if you need
               | to.
               | 
               | Otherwise a point release may still imply a major
               | release.
        
               | cube2222 wrote:
               | The most recent 2 major releases get the fix in case of
               | security issues[0]. This means you can be up to 6 months
               | behind the newest release to never be forced to do a
               | major version update under time pressure.
               | 
               | [0]:https://github.com/golang/go/wiki/MinorReleases
        
             | throwaway894345 wrote:
             | > The fact that requests has "languished" (not sure how
             | tbh) doesn't really change the fact that the Python stdlib
             | is a bit of a hilarious disaster from afar.
             | 
             | Agreed that the Python stdlib is a disaster, but my point
             | was that the OP contradicts himself by arguing that
             | stability guarantees hold the standard library back while
             | pointing to requests which itself hasn't made many/any
             | intrepid breaking changes or even sensible non-breaking
             | changes a la async support. Note that "stability is bad" is
             | the OP's point of view and not mine.
             | 
             | > Tons of cases of "that shouldn't be in std" where
             | libraries have quirky, locked in behaviors, or an entire
             | major breaking release with decades of work to migrate has
             | to be made to clean up the mistakes.
             | 
             | But the parent pointed to the requests library which is not
             | in the stdlib. Note also that Go has been around for a
             | decade and has needed no such major migration initiative.
             | 
             | > Yeah this also means that you need to update your
             | compiler when there's a vulnerability instead of just a
             | single point release in a library. This happens with some
             | frequency.
             | 
             | The frequency is very low and updating the compiler is
             | minimally risky due to Go's strong compatibility guarantees
             | (precisely the kind of stability the parent opposes). This
             | is a much lesser problem than dependency management in
             | Python (I have 15 years of experience in Python and 10 in
             | Go).
        
               | makapuf wrote:
               | Note that Python was already almost two decades old when v3
               | came out and is now three decades old.
        
         | masklinn wrote:
         | FWIW older design documents have (some) reasoning for
         | integrating fuzzing natively:
         | 
         | - https://docs.google.com/document/d/1N-12_6YBPpF9o4_Zys_E_ZQn...
         | 
         | - https://go.googlesource.com/proposal/+/master/design/draft-f...
         | 
         | One of the original proposals
         | (https://docs.google.com/document/u/1/d/1zXR-TFL3BfnceEAWytV8...)
         | further explains why; it was apparently written by the go-fuzz
         | maintainers (per issue 329[0], none of them seems in any way
         | broken-hearted about the idea of eventually deprecating
         | go-fuzz):
         | 
         | > go-fuzz suffers from several problems:
         | 
         | > - It breaks multiple times per Go release because it's tied
         | to the way go build works, std lib package structure and
         | dependencies, etc. It broke due to internal packages (multiple
         | times), vendoring (multiple times), changed dependencies in std
         | lib, etc.
         | 
         | > - It tries to do compiler work regarding coverage
         | instrumentation without compiler help. This leads to build
         | breakages on corner case code; poor performance; suboptimal
         | quality of coverage instrumentation (missed edges).
         | 
         | > - Considerable difficulty in integrating it into other build
         | systems and non-standard contexts as it uses source
         | pre-processing.
         | 
         | > Goal of this proposal is to make fuzzing as easy to use as
         | unit testing.
         | 
         | [0] https://github.com/dvyukov/go-fuzz/issues/329
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-01-01 23:00 UTC)