[HN Gopher] FuzzOS - an operating system which is designed speci...
       FuzzOS - an operating system which is designed specifically for
       Author : URfejk
       Score  : 29 points
       Date   : 2020-12-07 20:00 UTC (2 hours ago)
 (HTM) web link (gamozolabs.github.io)
 (TXT) w3m dump (gamozolabs.github.io)
       | codetrotter wrote:
       | I've been watching this guy stream on Twitch and I can tell you
       | that he is legit.
       | Also his streams are often insanely long, going 7 to 13 hours. So
       | I only ever watch his streams live for a while and then I catch
       | the remainder on VOD.
       | He also has a YouTube page with an archive of past streams beyond
       | the retention of Twitch.
       | https://www.youtube.com/channel/UC17ewSS9f2EnkCyMztCdoKA
       | Honestly the knowledge he shares is so interesting that I
       | selfishly did not want other people to even know about it. But
       | realistically speaking I am not going to have time to make real
       | use of the knowledge myself anytime soon.
       | He puts out quality content and he deserves all the attention he
       | can get. And even if competition for security bugs and bounties
       | gets so fierce that neither you nor I ever manage to find one and
       | claim some money, the products we all use will be more secure the
       | more people in the world work on finding and reporting these
       | bugs.
         | albntomat0 wrote:
         | Do you have any thoughts on how to approach the videos? I have
         | a good OS and fuzzing background, but an 8 hour video seems
         | like an ordeal and harder to extract value from than something
         | written
           | codetrotter wrote:
           | I think my best advice would be that you tune in on one of
           | the streams on Twitch when it is live and ask him about it.
           | Then maybe the two of you can figure out what is most
           | relevant to you of his content compared to what you already
           | know?
       | yjftsjthsd-h wrote:
       | Huh. So my initial response was, "why on earth would you need a
       | whole OS for that", but memory snapshotting and improved virtual
       | memory performance might actually be a good justification. Linux
       | does have CRIU which might be made to work for such a purpose,
       | but I could see a reasonable person preferring to do it from a
       | clean slate. On the other hand, if you need qemu to run
       | applications (which I'm really unclear about; I can't tell if the
       | plan is to run stuff natively on this OS or just to provide
       | enough system to run qemu and then run apps on linux on qemu)
       | then I'm surprised that it's not easier to just make qemu do what
       | you want (again, I'm pretty sure qemu already has its own memory
       | snapshotting features to build on).
       | Of course, writing an OS can be its own reward, too:)
         | gamozolabs wrote:
         | Oooh, wasn't really expecting this to make it to HN cause it
         | was meant to be more of an announcement than a description.
         | But yes, I've done about 7 or 8 operating systems for fuzzing
         | in the past and it's a massive performance (and cleanliness)
         | cleanup. This one is going to be like an operating system I
         | wrote 2-3 years ago for my vectorized emulation work.
         | To answer your QEMU questions, the goal is to effectively build
         | QEMU with MUSL (just to make it static so I don't need a
         | dynamic loader), and modify MUSL to turn all syscalls to `call`
         | instructions. This means a "syscall" is just a call to another
         | area, which will be my Rust Linux emulator. I'll implement the
         | bare minimum syscalls (and enum variants to those syscalls) to
         | get QEMU to work, nothing more. The goal is not to run Linux
         | applications, but run a QEMU+MUSL combination which may be
         | modified lightly if it means a lower emulation burden (eg.
         | getting rid of threading in QEMU [if possible] so we can avoid
         | fork())
         | The main point of this isn't performance, it's determinism;
         | performance is just a side effect. A normal syscall instruction
         | involves a
         | context switch to the kernel, potentially cr3 swaps depending
         | on CPU mitigation configuration, and the same to return back.
         | This can easily be hundreds of cycles. A `call` instruction to
         | something that handles the syscall is on the order of 1-4
         | cycles.
         | While for syscalls this isn't a huge deal, it's even more
         | emphasized when it comes to KVM hypercalls. Transitions to a
         | hypervisor are very expensive, and in this case, the kernel,
         | the hypervisor, and QEMU (eg. device emulation) will all be
         | running at the same privilege level and there won't be a weird
         | QEMU -> OS -> KVM -> other guest OS device -> KVM -> OS -> QEMU
         | transition every device interaction.
         | But then again, it's mainly for determinism. By emulating Linux
         | deterministically (eg. not providing entropy through times or
         | other syscall returns), we can ensure that QEMU has no source
         | of external entropy, and thus, will always do the same thing.
         | Even if it uses a random-seeded hash table, the seed would be
         | derived from syscalls, and thus, will be the same every time.
         | This determinism means the guest always will do the same thing,
         | to the instruction. Interrupts happen on the same instructions,
         | context switches do, etc. This means any bug, regardless of how
         | complex, will reproduce every time.
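A minimal sketch of the determinism idea described above, not the author's implementation. It assumes a hypothetical `DeterministicKernel` class standing in for the emulated syscall layer: time comes from a virtual counter and "random" bytes come from a fixed-seed generator, so a toy guest that seeds a hash from those values produces the same trace on every run. All names here are made up for illustration.

```python
import random


class DeterministicKernel:
    """Hypothetical stand-in for an emulated syscall layer with no external entropy."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)  # fixed seed: identical "entropy" every run
        self._ticks = 0                  # virtual clock, advanced only by calls

    def clock_gettime(self):
        # Time advances by a fixed step per call instead of reading a real clock.
        self._ticks += 1000
        return self._ticks

    def getrandom(self, n):
        # "Random" bytes are drawn from the seeded generator, never from the host.
        return bytes(self._rng.getrandbits(8) for _ in range(n))


def guest(kernel):
    """Toy guest whose behavior depends on 'random' and 'time' values."""
    hash_seed = int.from_bytes(kernel.getrandom(8), "little")
    trace = []
    for key in ("a", "b", "c"):
        bucket = (hash_seed ^ ord(key)) % 16  # stand-in for a random-seeded hash table
        trace.append((key, bucket, kernel.clock_gettime()))
    return trace


def run_once(seed=0):
    """Run the guest against a fresh kernel and return its full trace."""
    return guest(DeterministicKernel(seed))
```

Calling `run_once()` twice and comparing the results shows the point of the paragraph: with every source of entropy routed through the emulated layer, two runs are identical down to each recorded step.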
         | All of this syscall emulation + determinism I have also done
         | before, in a tool called tkofuzz that I wrote for Microsoft.
         | That used Linux emulation + Bochs, and it was written in
         | userspace. This has proven incredibly successful and it's what
         | most researchers are using at Microsoft now. That being said,
         | Bochs is about 100x slower than native execution, and now that
         | people have gotten a good hold of snapshot fuzzing (there's a
         | steep learning curve), it's time to get a more performant
         | implementation. With QEMU we get this with a JIT, which at
         | least gets us a 2-5x improvement over Bochs while still
         | "emulating", but even more value could be found if we get the
         | KVM emulation working and can use a hypervisor. That being
         | said, I do plan to support a "mode" where guests which do not
         | touch devices (or more specifically, snapshots which are taken
         | after device I/O has occurred) will be able to run without QEMU
         | at all. We're really only using QEMU for device emulation +
         | interrupt control, thus, if you take a snapshot to a function
         | that just parses everything in one thread, without process IPC
         | or device access (it's rare, when you "read" from a disk,
         | you're likely just hitting OS RAM caches, and thus not
         | devices), we can cut out all the "bloat" of QEMU and run in a
         | very very thin hypervisor instead.
         | In fuzzing it's critical to have ways to quickly map and unmap
         | memory as most fuzz cases last for hundreds of microseconds.
         | This means after a few hundred microseconds, I want to restore
         | all memory back to the state "before I handled user input" and
         | continue again. This is extremely slow in every conventional
         | operating system, and there's really no way around it. It's of
         | course possible to make a driver or use CRIU, but these are
         | still not exactly the solution that is needed here. I'd rather
         | just make an OS that trivially runs in KVM/Hyper-V/Xen, and
         | thus can run in a VM to get the cross-platform support, rather
         | than writing a driver for every OS I plan to use this on.
         | Stay cute, ~gamozo
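To illustrate the "restore all memory after each fuzz case" step from the comment above, here is a rough sketch in user-space Python. It assumes dirty-page tracking as the reset strategy, which the comment does not spell out, and the `SnapshotMemory` class and its methods are hypothetical. A real implementation would work on page tables rather than a `bytearray`.

```python
PAGE_SIZE = 4096


class SnapshotMemory:
    """Toy guest memory that can be reset to a snapshot by copying back dirty pages."""

    def __init__(self, image: bytes):
        if len(image) % PAGE_SIZE:
            raise ValueError("image must be a whole number of pages")
        self._snapshot = bytes(image)  # immutable copy of the "before input" state
        self.mem = bytearray(image)    # working memory the fuzz case mutates
        self._dirty = set()            # page numbers written since the last restore

    def write(self, addr: int, data: bytes):
        if addr < 0 or addr + len(data) > len(self.mem):
            raise IndexError("write outside guest memory")
        self.mem[addr:addr + len(data)] = data
        if data:
            first = addr // PAGE_SIZE
            last = (addr + len(data) - 1) // PAGE_SIZE
            self._dirty.update(range(first, last + 1))

    def restore(self) -> int:
        """Copy back only the pages that were written; return how many were reset."""
        restored = len(self._dirty)
        for page in self._dirty:
            start = page * PAGE_SIZE
            self.mem[start:start + PAGE_SIZE] = self._snapshot[start:start + PAGE_SIZE]
        self._dirty.clear()
        return restored
```

The cost of `restore()` scales with the number of pages a fuzz case touched rather than with total memory size, which is the property that matters when each case only runs for a few hundred microseconds.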
       | AgloeDreams wrote:
       | Can someone tell me what the living heck is `Fuzzing`?
       | I read this twice and I really don't have a single clue other
       | than it having something to do with or requiring fast memory?
         | lambda_obrien wrote:
         | Fuzzing: give a program structured random garbage as input and
         | see what happens, then fix the resulting bugs.
         | Forge36 wrote:
         | Originally: for each terminal program, pass every file as
         | input. If crash results: document it.
         | Effectively: random inputs to achieve unexpected results. It's
         | now come to mean "random data testing of an API"
         | mehrdadn wrote:
         | Wikipedia explains it: https://en.wikipedia.org/wiki/Fuzzing
         | SAI_Peregrinus wrote:
         | Testing code via semi-random inputs[1]. The most common
         | fuzzers, AFL-Fuzz[2] and libFuzzer[3], are coverage-guided: they
         | compile the program with special instrumentation to determine
         | code coverage, then call the program repeatedly, changing the
         | inputs via genetic algorithm to try to maximize the code paths
         | executed. When unexpected behavior is observed (typically the
         | test harness crashing) the fuzzer saves the test's input for
         | future use.
         | Basically automatic generation of test case inputs. It's
         | non-deterministic, so it won't always find problems, but it can
         | save a lot of manual effort.
         | [1] https://en.wikipedia.org/wiki/Fuzzing
         | [2] https://lcamtuf.coredump.cx/afl/
         | [3] https://www.llvm.org/docs/LibFuzzer.html
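The coverage-guided loop described in the comment above can be sketched roughly as follows. This is a toy, not how AFL or libFuzzer are implemented: the `target` function reports its own coverage by hand instead of being compiled with instrumentation, and the mutator only inserts or replaces a single random byte rather than applying a genetic algorithm. All function names are hypothetical.

```python
import random


def target(data: bytes, coverage: set):
    """Toy program under test; records which branches ran and 'crashes' on b'FUZ...'."""
    coverage.add("entry")
    if len(data) > 0 and data[0] == ord("F"):
        coverage.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            coverage.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("crash")


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Insert a random byte or overwrite an existing one."""
    out = bytearray(data)
    if not out or rng.random() < 0.5:
        out.insert(rng.randrange(len(out) + 1), rng.randrange(256))
    else:
        out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def fuzz(iterations=100_000, seed=1):
    """Return (corpus, crashes) after running the mutate/execute/keep loop."""
    rng = random.Random(seed)
    corpus = [b""]  # inputs that reached new coverage
    seen = set()    # every branch observed so far
    crashes = []    # inputs that made the target raise
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        cov = set()
        try:
            target(candidate, cov)
        except RuntimeError:
            crashes.append(candidate)  # save the crashing input for later triage
            continue
        if not cov <= seen:            # new branch reached: keep this input
            seen |= cov
            corpus.append(candidate)
    return corpus, crashes
```

The feedback step is what separates this from pure random testing: inputs that reach a new branch are kept and mutated further, so the fuzzer builds up the `F`, `U`, `Z` prefix one byte at a time instead of having to guess all three bytes at once.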
           | davidw wrote:
           | For an interesting, similar idea, see also:
           | https://en.wikipedia.org/wiki/QuickCheck
       (page generated 2020-12-07 23:00 UTC)