[HN Gopher] QEMU Internals
___________________________________________________________________
QEMU Internals

Author : Nusyne
Score  : 258 points
Date   : 2021-04-26 12:21 UTC (10 hours ago)

(HTM) web link (airbus-seclab.github.io)
(TXT) w3m dump (airbus-seclab.github.io)

| pthreads wrote:
| Thank you.
|
| On the same subject can someone recommend a book or any other
| resource to learn about virtual machine internals? My goal is to
| try to build a toy clone of VirtualBox/VMWare.
|
| So far I have found one -- Virtual Machines by James E. Smith and
| Ravi Nair.
| ahefner wrote:
| "KVM host in a few lines of code"
| (https://zserge.com/posts/kvm/) is a fun article to get started
| with.
| tkhattra wrote:
| Hardware and Software Support for Virtualization Synthesis
| Lectures on Computer Architecture (2017)
|
| https://www.morganclaypool.com/doi/abs/10.2200/S00754ED1V01Y...
|
| Bringing Virtualization to the x86 Architecture with the
| Original VMware Workstation (2012)
|
| https://dl.acm.org/doi/abs/10.1145/2382553.2382554
| hag wrote:
| I've always been intrigued by virtual machines and emulation as
| well. I've always wanted to try and make an emulator of some
| kind. I don't know much about the internals of VirtualBox, but
| my suggestion would be to start "easy" with one CPU/Computer
| System/Game Console and go from there. That's what I finally
| did with the 6502 and Commodore 64.
| pizza234 wrote:
| Conventionally, one starts from the CHIP-8, which is indeed a
| virtual machine rather than a system in a strict sense.
|
| What I've found difficult is the step beyond that. NES and
| GameBoy are typical steps, however, I've been very frustrated
| by the confusing documentation of the GameBoy. There are 3/4
| references, but one of them has significant mistakes, while
| another is incomplete. On the other hand, the Pan Docs should
| be complete and accurate.
|
| I'm not sure if there is an easy middle ground, that, at the
| same time, is also well documented.
|
| The Atari 2600 is architecturally simpler but less
| documented, and also requires very accurate timings. I've
| read somebody suggesting systems like Channel F, Astrocade
| and Odyssey2, but I'm not sure they're well documented.
|
| I personally lost interest once I found that building an
| emulator was essentially fighting specifications rather than
| actually building something.
| toast0 wrote:
| I built about a third of a NES emulator. The nesdev wiki is
| mostly decent, although there's a fair number of things
| where it seems like the first people to figure things out
| got stuff kind of backwards, and if you flip it, it's a lot
| easier. That's the sort of fighting the specifications I
| think you're talking about.
|
| All that said, emulating the CPU was pretty fun. There's a
| CPU test ROM out there you can run with tracing and compare
| to the published results. I also got the background tiling
| from the PPU done, but the foreground processing has a lot
| of steps, so I've paused indefinitely. Also, I had
| amazingly poor performance, so I wasn't super motivated to
| continue.
|
| The 2600 has a very similar CPU, but the very limited
| Stella output chip means most games are very timing
| dependent, which means you have to be super accurate, which
| adds difficulty. I think you should try to be cycle
| accurate anyway, but it's easy to mess that up, and having
| some freedom would be nice.
| bambataa wrote:
| I did a GameBoy and similarly found the CPU enjoyable and
| the PPU a huge pain. Perhaps if I understood graphics
| better, I would have enjoyed it more, but like you say it
| just felt like a lot of steps.
| andrewf wrote:
| A subset of CP/M calls is a pretty simple "rest of the
| system" to implement on top of an 8080/Z80 CPU emulation.
| (It's a bit of a cheat, like qemu's "Linux user mode
| emulation" or early versions of DOSBox: because you restrict
| software to interacting with a high-level software
| interface, there are no lower-level details to aim for
| fidelity with.)
| teleforce wrote:
| The sibling comment's book recommendation, "Hardware and
| Software Support for Virtualization", is on point, and it's
| written by one of the co-founders of VMware.
|
| Another book on Libvirt will be handy since it is the de facto
| API for most virtualization including VMs and containers[1].
|
| [1]https://www.amazon.com/Foundations-Libvirt-Development-
| Maint...
| alert0 wrote:
| Fuzz week shows how to make a snapshot / resettable
| jitting hypervisor.
|
| https://m.youtube.com/playlist?list=PLSkhUfcCXvqHsOy2VUxuoAf...
| vitno wrote:
| I work on virtual machines at Google. I usually suggest
| "Hardware and Software Support for Virtualization" [1] to new
| team members without a virtualization background.
|
| [1] https://www.amazon.com/Hardware-Software-Virtualization-
| Synt...
| [deleted]
| DarmokJalad1701 wrote:
| For a really simple emulator project (not quite the level of
| VirtualBox), check the "IntCode" challenges from AdventOfCode
| 2019.
| sammorrowdrums wrote:
| Those were so fun! I loved my little VM as it progressed and
| played pong, and commanded robots and rendered the output
| etc.
|
| It's a really great fun way to learn the key concepts.
| junon wrote:
| This is very well organized, wow.
| whoisburbansky wrote:
| I don't mean this to disparage Airbus in any way but after
| Boeing's issues with the 737 MAX I'd assumed a fairly poor
| culture of software at airplane manufacturers in general. Super
| glad to see work like this coming out of Airbus, really makes me
| rethink my earlier assumptions about software competence in the
| field.
| Glawen wrote:
| Is "move fast and break things" a good culture for an airplane
| manufacturer?
| Airbus is known for making good software; they earned their
| reputation by releasing the first fly-by-wire airliner (A320)
| in 84, which forced Boeing to go this route with the 777.
|
| Making safety critical software is a totally different world
| than what is seen on HN. The culture needed is safety culture
| and it is all about doing boring code, following strict coding
| rules, doing tons of documentation and analysis prior to coding,
| and doing tons of review of tests. I don't think it will
| arouse interest here.
| Veserv wrote:
| That is such a bizarre viewpoint from my perspective. The
| absolute deathtrap that is the 737 MAX had two software-related
| critical failures in 400,000 flights. That constitutes a
| whole-system per-flight software failure rate of 2 in ~400,000,
| or ~99.9995% reliability, 5 9s. Obviously that is still
| unacceptable as that is far below the software standard amongst
| all commercial airplanes, where software has not been implicated
| in a crash for at least the last 10 years except for the 737
| MAX. Even if we include the two 737 MAX crashes in the
| statistics, the whole-system per-flight software failure rate of
| all commercial airplanes over the last decade is at most 2 in
| ~100,000,000, or ~99.999998% reliability, 7 9s. The standard in
| airplane software is literally 5000x more reliable than AWS SLA
| guarantees and 500x the holy grail in server software of 5 9s.
| Even the 737 MAX is 20x better than the AWS guarantee and 2x
| more reliable than 5 9s. Airplane software is not bad; we just
| rightfully expect a lot from systems that lives depend on, so
| even systems that are better than best-in-class non-safety
| software are completely unacceptable, which may give the
| impression that they are bad in absolute terms as they fail to
| live up to our expectations.
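
The ratios in the comment above can be checked with a few lines of arithmetic. This is only a sketch under the commenter's own rough inputs (2 failures in ~400,000 flights for the 737 MAX, 2 in ~100,000,000 for all commercial flights, a 99.99% SLA, and "five nines"); the function and constant names are made up for illustration and none of the figures are audited data.

```python
# Rough check of the per-flight reliability figures quoted in the thread.
# Every input is the commenter's approximation, not measured data.


def failure_rate(failures: int, trials: int) -> float:
    """Fraction of trials that ended in a critical failure."""
    return failures / trials


def reliability_pct(rate: float) -> float:
    """Reliability as a percentage, given a failure rate."""
    return 100.0 * (1.0 - rate)


def times_better(baseline_rate: float, rate: float) -> float:
    """How many times lower `rate` is than `baseline_rate`."""
    return baseline_rate / rate


# Hypothetical names for the four rates discussed in the comment.
MAX_RATE = failure_rate(2, 400_000)          # 737 MAX, per flight
FLEET_RATE = failure_rate(2, 100_000_000)    # all commercial flights
FOUR_NINES_RATE = 1e-4                       # a 99.99% uptime guarantee
FIVE_NINES_RATE = 1e-5                       # "five nines"

if __name__ == "__main__":
    print(f"737 MAX reliability: {reliability_pct(MAX_RATE):.4f}%")
    print(f"Fleet reliability:   {reliability_pct(FLEET_RATE):.6f}%")
    print(f"Fleet vs 99.99%:     {times_better(FOUR_NINES_RATE, FLEET_RATE):.0f}x")
    print(f"Fleet vs five nines: {times_better(FIVE_NINES_RATE, FLEET_RATE):.0f}x")
    print(f"MAX vs 99.99%:       {times_better(FOUR_NINES_RATE, MAX_RATE):.0f}x")
    print(f"MAX vs five nines:   {times_better(FIVE_NINES_RATE, MAX_RATE):.0f}x")
```

Note that this compares a per-flight failure rate with a time-based uptime percentage, which is the same simplification the comment makes; the reply from elteto further down the thread questions exactly that.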
| zaphirplane wrote:
| That's an interesting way to look at uptime, no pun intended.
|
| Though I wouldn't buy a Toyota that exploded every 400,000
| trips worldwide, or bank with a bank that lost all my money
| every 400,000 transactions worldwide.
| Glawen wrote:
| Well, Toyota had the sticking gas pedal issue 10 years ago:
| they did not implement a brake override when the gas pedal
| was stuck. This was a feature recommended by European
| manufacturers when they introduced the electronic throttle;
| apparently Toyota didn't get the memo.
|
| Although I find the GM ignition key issue way worse than
| Toyota's, which was an oversight.
| Veserv wrote:
| Indeed, a Toyota with a critical fatality-inducing safety
| defect every 200,000 trips would be rightfully viewed as a
| deathtrap. Given that the average trip is probably
| somewhere around ~30 miles that would be a fatality per 6M
| miles versus the standard of ~60M miles in the US, or about
| 10x more dangerous. However, when comparing a car versus
| airplanes, given that they both fulfill the niche of
| transportation and are to some degree substitutable, a more
| reasonable analysis would be fatalities/person-hour or
| fatalities/person-mile. For fatalities/person-hour the
| average flight is something like ~2 hours. In the same
| amount of time 200,000 cars for 2 hours at an average of 40
| mph would be ~16M miles, so the 737 MAX is ~4x more
| dangerous on a person-hour basis than cars. If we go by
| distance the average flight is ~500 miles, so the 737 MAX
| had a fatality per 100M person-miles or is ~1.6x _safer_
| than driving. That is just how high our standards are with
| planes: a plane that is viewed as an absolute death
| machine that is totally unfit for use is safer than its
| primary alternative for an equivalent distance. A plane
| that is 100x worse than any other commercial plane is still
| better than the non-plane alternative on a per-distance
| basis.
|
| Obviously, this does not excuse their actions as they still
| made a system at least 100x more dangerous than the
| standard, but it should give perspective on the difficulty
| of the problems actually being solved. It is not a bunch of
| amateurs or below-average engineers who need to adopt basic
| practices. It is a bunch of highly-skilled professionals
| developing systems with a level of reliability far beyond
| what most software developers even think is possible. Even
| the abysmal processes of the 737 MAX that are far below the
| standard in the airplane industry would, relative to most
| software, be very good. It is just that the problems they
| need to solve are very, very, very hard and very good does
| not cut it when lives, not data, are at stake.
| elteto wrote:
| Apples to oranges? The scale between AWS and 737s is several
| orders of magnitude different. Boeing has a critical issue
| every 200k flights, or let's say 3.8M hours of flight time
| (assuming all flights are 19h, which they are not). Assume
| AWS has 1M CPUs total (they have way more than that), if AWS
| saw a critical CPU bug every 3.8M hours of CPU time they
| would be having a 737 MAX crisis level every 3.8 hours.
| Veserv wrote:
| One failure per 3.8M hours would be once per 433 CPU-years,
| so they probably actually do have somewhere between 10-100x
| that failure rate for their CPUs given that expected CPU
| lifetime is probably around 20-30 years. Even using a much
| more reasonable 2 hours per flight that is still ~45
| CPU-years so still within the likely range of expected CPU
| errors. Also that is a comparison against a system so
| dangerous that it is unfit for use instead of the actual
| standard which is once per 50,000,000 flights or ~250x
| better.
|
| Even ignoring that, I am discussing the uptime of a system
| using AWS which only guarantees 99.99% uptime for AWS
| service in any given AWS region and only a 10% refund
| (which is less than their profit margin) as long as they
| keep your system up more than 99% of the time. Downtime for
| a system due to AWS downtime in a region constitutes a
| critical failure of AWS to deliver expected service. That
| their lack of service does not result in deaths unlike an
| airplane is immaterial to a reliability analysis, it only
| tells us if their critical failures matter and what level
| of reliability we should require/demand when making
| reliability-cost tradeoffs. In other words, the probability
| and costs of failure are not actually related. It is just
| that costly failures result in more effort being spent on
| developing mitigations. In the case of airplanes, critical
| failure in the form of a crash is very costly, so they take
| great pains to minimize the whole-system risk of that
| failure mode.
| pjerem wrote:
| Airbus is known to be excellent in airplane software
| development.
|
| However, this is probably not about the airplane part of
| Airbus. Like Boeing, Airbus also have huge defense and space
| divisions.
| hhh wrote:
| Airbus also has the Airbus Defense and Space group as well,
| it's not just all airplanes :)
| [deleted]
___________________________________________________________________
(page generated 2021-04-26 23:01 UTC)