[HN Gopher] Writing an OS in Rust: Async/Await ___________________________________________________________________ Writing an OS in Rust: Async/Await Author : phil-opp Score : 317 points Date : 2020-03-30 13:52 UTC (9 hours ago) (HTM) web link (os.phil-opp.com) (TXT) w3m dump (os.phil-opp.com) | amelius wrote: | An OS with async/await sounds awfully similar to Windows 3.1 with | its cooperative multitasking model. | | What's old is new again ... | steveklabnik wrote: | Well, there's a big difference: | | > Using async/await, we now have basic support for cooperative | multitasking in our kernel. While cooperative multitasking is | very efficient, it leads to latency problems when individual | tasks keep running for too long and thus prevent other tasks from | running. For this reason, it makes sense to also add support for | preemptive multitasking to our kernel. | | > In the next post, we will introduce threads as the most | common form of preemptive multitasking. In addition to | resolving the problem of long running tasks, threads will also | prepare us for utilizing multiple CPU cores and running | untrusted user programs in the future. | | It seems the plan here is to use this for _internal_ work, and | still give a preemptive multitasking API to userspace. | jklehm wrote: | Isn't that going against the lessons learned from the past, | though? 3rd-party drivers can't be trusted to behave, and with | cooperative multitasking a misbehaving driver hangs the kernel. | steveklabnik wrote: | I am not sure what Phil's overall kernel design is; you're | assuming a monolithic kernel, but not every kernel has | drivers in kernel space. | jklehm wrote: | Fair point, looking forward to following along :) | eximius wrote: | Toy OS, so he has less to worry about. But also maybe just | be more careful around kernel modules, if those eventually | exist? | mtnygard wrote: | Phil's series is meant for learning rather than production | use. 
So sometimes he takes a "shortcut" in one post and then | removes the restrictions or workarounds later. | Koshkin wrote: | GP is not wrong, then: with this implementation, async/await | still amounts to the good old cooperative multitasking and | does not (yet?) take advantage of threads. (In C#, for | example, async/await is indeed different from this, being | based on TAP - Task-based Asynchronous Pattern, where tasks | are executed "asynchronously on a thread pool thread rather | than synchronously on the main application thread.") | steveklabnik wrote: | Async/await lets you build tasks, but is completely | orthogonal to threads. You could set up tasks with a 1:1 or | N:M or even single-threaded model. | | And yes, the GP is not wrong that if this were the only | mechanism provided, it would be similar. It does not seem | like the plan is to expose this externally whatsoever, | though. | fortran77 wrote: | Or MacOS up to and including OS9 | dfox wrote: | There is one significant difference: Windows did context | switches in the blocking calls and did not rely on the program | code having "the right structure" needed for straight | coroutines to work. | | The difference between preemptive and cooperative multitasking | is not whether you do full context switches, but whether there | is a way to do a context switch at a point where the process does | not expect it (i.e. by treating some kind of timer interrupt as a | scheduling opportunity). | amelius wrote: | Does Rust allow a computational for-loop to be interrupted | somehow? | | Computation can also be viewed as a blocking operation. | steveklabnik wrote: | The only yield points are .await points. If there's an | .await in a loop, then sure, but otherwise no. | hope-striker wrote: | Didn't most old operating systems use cooperative multitasking? | I remember, at least, that classic Mac OS (i.e. pre-OSX) didn't | use preemptive multitasking, either. 
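The point above about .await points being the only places a task can yield can be sketched directly. A minimal example (YieldNow is hand-rolled here for illustration; it is not a std API, though executors such as tokio provide an equivalent):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A future that returns Pending on the first poll and Ready on the
/// second: a hand-rolled cooperative yield point.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // ask the executor to poll us again
            Poll::Pending
        }
    }
}

/// Without the .await inside, this loop could never be descheduled
/// cooperatively; the executor only regains control at the yield point.
async fn sum_with_yields(data: &[u64]) -> u64 {
    let mut total = 0;
    for &x in data {
        total += x; // pure computation: no yield possible here
        YieldNow { yielded: false }.await; // the only yield point
    }
    total
}
```

Driving `sum_with_yields` with any executor shows the loop being suspended and resumed once per element, while a loop without the .await runs to completion in a single poll.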
| | Anyway, this SO answer[0] explains why early Linux, much like | the hobby kernel in this article, used cooperative scheduling | inside the kernel, and only preempted user-space stuff. | | [0]: https://stackoverflow.com/a/16766562 | saagarjha wrote: | Mac OS badly needed preemptive multitasking (and memory | protection!), and got it when the NeXT bits were transformed | into Mac OS X. | bestouff wrote: | MacOS may have been a little late. Windows NT, OS/2, and Linux | had preemptive multitasking from the beginning of the '90s; even | AmigaOS had it in the '80s. | AnIdiotOnTheNet wrote: | Well, I'm not well versed in the details, but I do wonder if | cooperative multitasking isn't a better idea nowadays in the | age of ubiquitous multicore. | bluejekyll wrote: | As Steve's sibling comment points out, this is only for the | kernel. The article mentions preemptive threading for the | user space as still being desirable. | AnIdiotOnTheNet wrote: | I'm not convinced that's actually true though, is what I | mean. The big drawback of cooperative is just that a single | process can monopolize resources, which in the single-core | days meant the system became unresponsive and you had no way | to recover from it. That isn't really true now with | multicore. As long as the OS has a core to work with, it can | terminate or otherwise deal with misbehaving processes. | | Again, this is based on my admittedly limited | understanding, but I haven't yet seen good reasoning why | pre-emptive is still preferred. | steveklabnik wrote: | My understanding is that you've now moved the monopolized | case from 1 to N, where N is the number of cores. N is | usually pretty small. | | The laptop I'm writing this on has four cores, and the | Chrome instance has 17 threads running... 
| Bigpet wrote: | > Error: you cannot open another Chrome tab because all | your cores are already used up by Slack and VSCode | | You would probably still want to preempt, because you're | not going to rewrite all the widely used software to | actually yield. | | Because that's the thing about cooperative | multithreading, the participating parties need to | cooperate. | | And if you look at the number of threads that some | software opens, it's just crazy. Looking on my machine, the | top 5 are: | | 1. 260 threads System (ok, that's basically the OS that | could be changed). | | 2. 145 threads Dropbox.exe (that's enough to lock all | cores on any single consumer CPU). | | 3. 87 threads SearchIndexer.exe (also OS, could be | rectified). | | 4. 74 threads EpicGamesLauncher (looks like an i9 is no | longer enough) | | 5. 70 threads MemoryCompression (also OS) | | I know not all of those threads are active at all times, and | I guess all the blocking threads could be said to be | cooperating, but even Dropbox alone regularly has 3-5 | threads running. | | There are still so many 2- and 4-core consumer machines out | there that cooperative multi-tasking with the current | software ecosystem would be a disaster. | | Not only would applications keep each other from making | regular steady progress, but even single applications | would regularly hang themselves with all the threads they | themselves created, if there were no pre-emptive yielding | beyond using any OS call as a yield point. | [deleted] | zackmorris wrote: | I've shied away from async/await because I haven't seen a good | writeup on how to make it deterministic. Come to think of it, all | of the times I've encountered it, there was no way to analyze the | codepaths and prove that every exceptional situation and failure | mode was handled. Maybe I missed something along the way? | | So my feeling about it is that it may turn out to be an | evolutionary dead end, or anti-pattern at the least. 
My gut | feeling is that async/await is functionally equivalent to the | synchronous/blocking/coroutine/channel system of other languages | like Go. Could we write a thin wrapper that converts async/await | to that or vice versa? | | This is the primary reason why I've stuck to synchronous/blocking | PHP with all of its flaws instead of Node.js. I think this is a | fundamental thing that shouldn't be glossed over and accepted so | readily into other languages. | jcranmer wrote: | There are two important things that are often conflated with | async/await: the programming model, and the executor model. | | The programming model of async/await is essentially syntactic | sugar for a state machine, where each of the states are | determined implicitly by the site of the call to await, and the | ancillary storage of the state machine is managed implicitly by | the compiler. It's basically no different from a generator | pattern (and is usually implemented on top of the same | infrastructure) [1]. | | Since the result of calling an async function is a task object | that doesn't do anything until you call methods on it | (including implicitly via await), most languages integrate | these task objects with their event model. This model is | completely orthogonal to the feature, and often exists (at | least in the library, if not the language) before async/await | is added to the language. JS has a built-in event loop that | predates async/await by several years, whether programming in | client-side browsers or using something like Node.js. Rust is | unusual in that it prescribes absolutely nothing for the | executor model; all it defines in its standard library is the | interface to drive the tasks manually. | | I doubt async/await is an evolutionary dead end. It's the third | kind of coroutine to hit it big in mainstream languages, the | first two being the zero-cost exception handling mechanism | (i.e., try/catch) and the generator (i.e. yield). 
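The state-machine desugaring described above can be sketched by hand. Names here are hypothetical (the compiler-generated type is anonymous and more involved, e.g. around pinning), but an async fn like `async fn example(a: u32) -> u32 { a + fut.await }` becomes, roughly, an enum with one variant per await point that implements Future:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// Hand-written equivalent of
///     async fn example(a: u32) -> u32 { a + fut.await }
/// with one enum variant per state of the generated state machine.
enum Example<F: Future<Output = u32>> {
    /// State before/at the single .await point; `a` is the live local.
    WaitingOnFut { a: u32, fut: F },
    /// Terminal state after the inner future has completed.
    Done,
}

impl<F: Future<Output = u32> + Unpin> Future for Example<F> {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        // Take the current state out, try to advance it, and either
        // transition to Done or restore it.
        match std::mem::replace(this, Example::Done) {
            Example::WaitingOnFut { a, mut fut } => match Pin::new(&mut fut).poll(cx) {
                Poll::Ready(b) => Poll::Ready(a + b), // transition to Done
                Poll::Pending => {
                    // Inner future not ready: restore the state and wait.
                    *this = Example::WaitingOnFut { a, fut };
                    Poll::Pending
                }
            },
            Example::Done => panic!("future polled after completion"),
        }
    }
}
```

The `F: Unpin` bound keeps the sketch simple; the real compiler-generated state machines are self-referential and rely on Pin instead.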
The biggest | criticism of zero-cost exceptions essentially amounts to their | types not being made explicit, but the design of async/await | makes their effects on the type of coroutine quite apparent, so | it's not going to run into the same problems. | | [1] There is a difference in the pattern, which is that you're | probably going to get back different kinds of objects | implementing different interfaces, which is a bigger deal if | you manually drive the interfaces. Structurally--that is, in | terms of the actual machine code running--there is likely to be | no difference. | eximius wrote: | What do you mean by functionally equivalent? | | Channels are just queues and passing methods. Rust-style | futures are closer to continuations. Both are powerful and can | accomplish neat tasks, but they're not ABI-compatible or even | used the same way. | michael_j_ward wrote: | > Channels are just queues and passing methods. | | My understanding is channels in `go` are 0-sized by default, | and thus act as "rendezvous channels" that can be used to | synchronize progress between threads. | | This article compares Go and Rust threads and includes an | example of how Go uses channels to synchronize. [0] | | [0] - https://medium.com/@deckarep/paradigms-of-rust-for-the-go-de... | steveklabnik wrote: | What do you mean by "deterministic" here? | | > Come to think of it, all of the times I've encountered it, | there was no way to analyze the codepaths and prove that every | exceptional situation and failure mode was handled. | | There is nothing special about async/await with regards to this | in Rust, at least if I'm understanding you correctly. Async | functions can return Results like any other function for | recoverable errors, and can panic like any other function for | unrecoverable errors. | | > My gut feeling is that async/await is functionally equivalent | to the synchronous/blocking/coroutine/channel system of | other languages like Go. 
| | It depends on what level of abstraction you're talking about. | For Rust, which cares a lot about details, they're not. I gave | a talk comparing and contrasting all this stuff here: | https://www.infoq.com/presentations/rust-2019/ | animalnewbie wrote: | I am no expert and I wish there were a start-from-zero guide with | examples not involving external crates, but my sense is that you | can write your own executor and do whatever you want. (Maybe | I'm wrong?) | steveklabnik wrote: | You're not wrong: that is basically what this article is, and | it includes writing an executor! | nindalf wrote: | The linked article finally helped me understand how Rust | implements async. Give it a shot. | ww520 wrote: | Async or parallel systems are deterministic when the multiple | paths don't cross, and, when they do cross, the crossing points | have well-defined deterministic steps and interactions. | ccktlmazeltov wrote: | This is a huge issue when trying to fuzz async code also. | staticassertion wrote: | Dropbox does, essentially, fuzzing of Rust async code and it | is deterministic. | | I couldn't find a post about it, sadly, but I know the | information is out there somewhere. | m0th87 wrote: | Is it possible to context switch between userspace processes | directly, without going through the kernel, i.e. a kind of fast, | inter-process cooperative multitasking? I know earlier operating | systems used inter-process cooperative multitasking, but I'm | guessing they still went through the scheduler? | | I was trying to figure out if QNX does this with `MsgSend`, as | QNX is renowned for being a fast microkernel, but it wasn't clear | to me. According to Animats, "There's no scheduling delay; the | control transfer happens immediately, almost like a coroutine. | There's no CPU switch, so the data that's being sent is in the | cache the service process will need." 
[1] According to Wikipedia, | "If the receiving process is waiting for the message, control of | the CPU is transferred at the same time, without a pass through | the CPU scheduler." [2] | | It seems like `MsgSend` circumvents the scheduler, but does it | circumvent context switching to the kernel entirely too? | | 1: https://news.ycombinator.com/item?id=9872640 | | 2: https://en.wikipedia.org/wiki/QNX#Technology | bregma wrote: | MsgSend() in QNX Neutrino does not circumvent the kernel, just | the scheduler. If you're clever and lucky you can get an entire | round trip to and from the service in a single timeslice. On | the other hand, MsgSend() always blocks, so the call could be | preempted depending on service priority or if there is an I/O | wait or whatever, and that would mean the scheduler comes into | play. | cwzwarich wrote: | No, it doesn't circumvent context switching to the kernel. With | memory protection and separate address spaces, you generally | need privileged execution to change page tables. However, the | raw CPU cost of a syscall exception to the kernel followed by | an exception return to userspace isn't actually all that high | compared to all of the work that an OS usually does on top of | it. | retrac wrote: | It's not possible without specific hardware support. | | Context-switching on most current platforms with virtual memory | requires fiddling with things like page mappings, and that | simply must be done from supervisor mode. | | The ability to switch tasks in hardware has existed on some | historical machines. For example, the GEC 4000 series basically | implemented a microkernel, including the scheduler, in | hardware. Switching to another process was done with a single | hardware instruction. | | I don't think anything current has such features besides the | long-obsolete and unused task-switching feature on x86 | processors. | throwlaplace wrote: | great project for this lull. 
| | i've been going through http://craftinginterpreters.com/ in rust | (in order to learn rust) and it's a fantastic learning | experience. the language is straightforward after you understand | the basics (smart pointers and generics) and if you have a good | IDE with completion (CLion). lifetimes/borrow checker aren't as | painful as people would have you believe if you use heap | allocation (i.e. Box, Rc<RefCell>). now obviously heap allocation | isn't optimal, but for getting started it enables a smooth | progression. | steveklabnik wrote: | This is also just a fantastic introduction to async/await in | Rust, regardless of the OS bits. Another amazing article by Phil. | | > The only requirement is that we use at least nightly 2020-03-25 | of Rust because async/await was not no_std compatible before. | | There's some fun stuff here that's omitted (which makes sense, of | course). It was always a design constraint of async/await in Rust | that you could use it without the standard library. However, the | initial implementation required thread-local storage. The Future | trait's poll method took a &mut Context, and generators didn't | support passing context _in_ to them when resuming. This meant | that the context would be placed in TLS and pulled back | out when needed. Generators are unstable, partially for this | reason. However, this was fixed, and that means TLS is no longer | a requirement here! See https://github.com/rust-lang/rust/pull/68524 | for more details. | jerf wrote: | Is async/await a good idea for an OS kernel, even a toy one? | Cooperative multitasking tends to break down at scale, because | the probability that all the "threads" you're cooperating with | are playing nice goes to zero as the number of threads increases. | An OS will tend to have a concentrated number of the pathological | cases in it as it deals with hardware and all the other hardest | timing & concurrency problems. 
| | It's a viable option for user-space programs because you can far | more tightly characterize them and program them from top to | bottom to work with that paradigm nicely. Embedding islands of | cooperative multitasking in a sea of pre-emptive multitasking | seems to make a lot more sense than the other way around. | | However, this post is a question, not a statement. If, for | example, a Linux kernel developer posts "nah, it's no biggie, the | Linux kernel is effectively structured the same", I | would not quibble. | jcranmer wrote: | There are a few different pieces to your question. | | Async/await is a programming paradigm that moves the state | management of an asynchronous program into the compiler rather | than being explicitly managed by the user. The paradigm is | almost universally a win: if you don't want to write | asynchronous code, just don't use it; it doesn't impact your | code at all. | | A second question is how the language actually drives the | execution of asynchronous tasks. Most of the languages that | have added async/await have incorporated it into their | execution model, so you are forced to use the particular | semantics of the language. There are subtle, but very | important, differences here: for example, when you actually | provide the answer that resolves the await, do you execute the | waiting code immediately, or schedule it for a future | iteration? This is what can cause the most problems with | async/await, but Rust does not define any executors [1], so | it's not an issue here. | | The final question is whether asynchronous code at all makes sense | for an OS kernel. I'd argue that the answer is very much yes: | communication with devices is almost invariably asynchronous | (you set some memory lines to tell the device to do something, | and get an interrupt telling you the result has arrived). 
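That interrupt-driven shape maps directly onto Rust's Waker mechanism. A minimal sketch (all names here are made up; an ordinary function stands in for the real interrupt handler, and a Mutex stands in for whatever synchronization a kernel would actually use):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};

/// State shared between the "driver" future and the (hypothetical)
/// interrupt handler.
#[derive(Default)]
struct IrqState {
    data: Option<u8>,
    waker: Option<Waker>,
}

/// Future that completes once the interrupt handler has delivered a byte.
struct ReadByte(Arc<Mutex<IrqState>>);

impl Future for ReadByte {
    type Output = u8;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u8> {
        let mut st = self.0.lock().unwrap();
        if let Some(b) = st.data.take() {
            Poll::Ready(b)
        } else {
            // Register interest: the interrupt handler will wake us later.
            st.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Stand-in for an interrupt handler: store the byte, wake the task.
fn on_interrupt(state: &Arc<Mutex<IrqState>>, byte: u8) {
    let mut st = state.lock().unwrap();
    st.data = Some(byte);
    if let Some(w) = st.waker.take() {
        w.wake();
    }
}
```

The first poll parks the task (storing the Waker); the "interrupt" stores the byte and wakes it; the next poll returns Ready with the byte.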
Particularly | for the filesystem, where there are both synchronous and | asynchronous system calls for the same code, a good executor | for asynchronous code could well be beneficial for simplifying | code paths. (Note that filesystems also generally have some | degree of task dependency requirements--you need to ensure that | the various requests happen in a particular partial order.) | | Now do note that this requires a far more complex executor | model than most async/await toolkits (including this tutorial) | provide. | | [1] Logically, it would make sense to provide a utility | function that synchronously runs an async function, but this | isn't implemented yet. | jerf wrote: | "Now do note that this requires a far more complex executor | model than most async/await toolkits (including this | tutorial) provide." | | I'm assuming the async/await language support in Rust is | intrinsically tied to a cooperative approach, which is the | part I'm questioning. Obviously a kernel needs to be able to | generically work in an asynchronous fashion... the question | is, is _this_ asynchronous fashion appropriate? | | If the Rust async/await support is indeed so flexible that | with some work you could tie it together with a pre-emptive | runtime of your own custom creation (as would make sense for | an OS kernel), or some intermediate support that may not be | "pre-emptive" but would be fully immune, by construction, to | the sort of issues I'm worried about, then, yes, by all means | use it. | | But a fundamentally cooperative kernel would really worry me. | You'd be looking at freezing the whole kernel, and likely the | whole machine, if _anything_, anywhere, goes wrong with the | cooperation. It won't take a lot of scale before that's a | problem. | | And that's the worst case. Kernels strike me as very likely | to also have to deal with the starvation situations that | cooperative multitasking can have, where one "task" never | gives up control. 
Even if this doesn't freeze the machine, | this can cause significant, human-visible latency issues to | emerge. | kbenson wrote: | > If the Rust async/await support is indeed so flexible | that with some work you could tie it together with a | pre-emptive runtime of your own custom creation | | I think that's exactly what it is. AIUI it's basically an | API for async/await (and maybe a default runtime?) that | different back-ends can be plugged into. At least that's | what I've come away with over the months of reading random | bits about it here. | Gladdyu wrote: | As described in the blog post, the Rust compiler | generates a state machine for every async function and | generates the appropriate poll methods for the state | transitions. This is fundamentally at odds with | preemption, which would have to introduce new | intermediate states into the state machine, which it can't | do at runtime. | kbenson wrote: | But if Rust _is_ the operating system/kernel, whenever | it decides to schedule something is preemption for | anything downstream, right? | | I mean, you don't actually use preemption in the kernel, | right? Don't you have to handle all that yourself, since | there's nothing higher level than you to handle it for | you? In that respect, doesn't plugging in a Futures | runtime that looks for device flags and stuff as | appropriate and flags tasks to wake up/finish accomplish | the same thing? (those are actual questions, not leading | statements) | Gladdyu wrote: | If you wrote a basic scheduler, at some point you'd | have to await the userspace code, but you wouldn't have | any way to force it to stop running. If the userspace | code entered an infinite loop, it would hold the | kernel thread forever. Within a constrained environment, | e.g. 
the kernel itself (and even that's sufficiently | complex with loadable drivers that you might end up with | bad interactions), I could see some use for async/await, | but you'd still need to be able to preempt untrusted | code. | yazaddaruvala wrote: | At a high level, your concern is that OSes should not use | cooperative multitasking for OS processes, i.e. using | Rust's async-await as the only technology within the OS | scheduler would be a mistake. | | The article isn't discussing schedulers, but async-await | within the OS codebase in general. Forgetting the | scheduler, for example, interrupt handling and driver code | is generally async in nature, and at a base level needs to | be cooperative: while let Some(byte) = | keyboard.interrupt().await { ... } | | In general, certain drivers also need pre-emption; however, | given that all of the code is written and curated by the kernel | devs, they can ensure the code cooperates well. | jcranmer wrote: | The Rust async/await amounts to the following: | | * Magic keywords (i.e., async and await) to manage the | state machine aspect of the function for you. | | * A standard interface for describing an asynchronous task | (core::future::Future). The magic keywords create an | implementation of this interface. | | * An interface for saying that task progress can be made | (core::task::Waker). | | That's it. There is no runtime provided, not even | optionally, in the standard library (let alone core). A | fair amount of this post is giving instructions for how to | build the runtime to execute asynchronous tasks. | | One important thing to note is that Rust uses a polling-based | approach to implement tasks. That is, the core | interface to a task is to run it and have it return the | result or indicate that it is waiting on something else. 
In | the case of the latter, there is a callback mechanism (the | Waker object) that the task is responsible for calling when | it has more information to make progress--and the caller is | responsible for calling the task again when that happens. | Many (most?) other implementations instead use a resolution | model, where the caller provides a function that is called | with the result, and when the function gets called | (particularly in the case of tasks that can execute | quickly) differs from implementation to implementation. | Rust's model definitely maps much more easily to a kernel, | since the dominant pattern will be to drain a buffer, poke | a device, and then wait for an interrupt to happen (the | interrupt handler calling the wake function). | toast0 wrote: | > Is async/await a good idea for an OS kernel, even a toy one? | Cooperative multitasking tends to break down at scale, | | > Embedding islands of cooperative multitasking in a sea of | pre-emptive multitasking seems to make a lot more sense than | the other way around. | | In my reading of the post, the proposal is to do cooperative | multitasking within the kernel, and preemptive multitasking | between the kernel and userspace. I think this is tenable; | within the kernel, you pretty much need to play nice with the | rest of the kernel anyway. Most kernels don't have effective | protection for one part of the kernel causing problems with the | rest, although some do have limited protection against drivers | that crash or loop. | geogriffin wrote: | well.. there is CONFIG_PREEMPT and the even broader | CONFIG_PREEMPT_RT in Linux [1]. | | [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/ | lin... | Matthias247 wrote: | > Embedding islands of cooperative multitasking in a sea of | pre-emptive multitasking seems to make a lot more sense than | the other way around. | | Pretty much my opinion. 
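For concreteness, the poll-driven model described above can be demonstrated with a toy executor, similar in spirit to the one the article builds, though not Phil's exact code. It busy-polls and ignores wakers; a real executor would instead sleep until a Waker fires:

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Busy-polling executor: repeatedly polls each queued task until all
/// of them complete.
struct SimpleExecutor {
    queue: VecDeque<Task>,
}

impl SimpleExecutor {
    fn new() -> Self {
        SimpleExecutor { queue: VecDeque::new() }
    }

    fn spawn(&mut self, fut: impl Future<Output = ()> + 'static) {
        self.queue.push_back(Box::pin(fut));
    }

    fn run(&mut self) {
        let waker = noop_waker();
        let mut cx = Context::from_waker(&waker);
        while let Some(mut task) = self.queue.pop_front() {
            match task.as_mut().poll(&mut cx) {
                Poll::Ready(()) => {}                        // task finished
                Poll::Pending => self.queue.push_back(task), // requeue it
            }
        }
    }
}

/// A Waker that does nothing; acceptable only because run() busy-polls.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}
```

A kernel-grade executor would replace the noop Waker with one that re-queues the woken task (and would halt the CPU while the queue is empty), but the spawn/poll/requeue skeleton is the same.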
I think a preemptive threaded | environment where some threads might run multiple async tasks | (in order to boost efficiency) can make a lot of sense. | | Pure cooperative environments seem too error-prone for bigger | systems, because they lack the necessary isolation between | tasks. Any bad task can degrade the performance of all | other tasks by blocking the execution thread. | pjmlp wrote: | Yes, Active Oberon already had it in the mid-'90s. | | Symbian also had it; both called them active objects. | bitwize wrote: | All I/O is asynchronous. Or rather, synchronous I/O is a subset | of asynchronous. Often, in kernel I/O systems, you are | communicating with, say, a disk controller that will take some | time to process your request. You will _want_ to do other stuff | while the device finishes and process the results only when the | device signals success or failure. | jfkebwjsbx wrote: | Async and sync I/O are not subsets of each other, although | you can use one to implement the other. | | The issue OP is discussing is cooperative vs. preemptive, not | async vs. sync. | [deleted] | ww520 wrote: | An OS has many aspects to it. Scheduling tasks is only one of | the aspects. Async/await is just the interface/mechanism for | dealing with the tasks, in this case cooperative tasks. | | Interacting with hardware, dealing with interrupts, memory | mapping/managing, task isolation, etc. are all other aspects of | an OS that are apart from task scheduling but still needed. | | Cooperative multitasking works fine as long as the | users/developers understand the limitation. | jerf wrote: | "Cooperative multitasking works fine as long as the | users/developers understand the limitation." | | The problem is, we basically _already know_ that is not the | case. We've got years of experience with what a "cooperatively | multitasked" OS looks like, and it was not a "works fine" | situation. 
You can't understand the limitations at scale; | they interact with each other far, far beyond the human | mind's capacity to understand. | monocasa wrote: | Cooperatively scheduling across your other isolation boundaries | is a pain point, but it's a great architecture for within a | single context (i.e. within the kernel, or within a single | process). | | Linux has a lot of cruft and isn't as async as some would like, | but NT is pretty async underneath it all (although they still | allow you to do synchronous work as well). | CuriousSkeptic wrote: | Probably not at all related to the OP. But I found the thought | entertaining, so offering it as such. | | One way to address your concern would be to guarantee that tasks | always terminate in a bounded number of instructions. | | One way to solve that would be to offer a limited buffer in which | the program instructions for the task might be expressed, and a | language to express them in that is strongly normalizing (think | something like Morte). | briangold wrote: | True for a general-purpose OS kernel. But in an embedded system | or unikernel it could make a lot of sense. | Rusky wrote: | I'm not a Linux kernel developer, but it does have a few | similar mechanisms AFAIU. If a driver needs to perform more | work in response to an interrupt than would fit into the actual | handler, it can use cooperatively scheduled "tasklets" or | "softirqs." ___________________________________________________________________ (page generated 2020-03-30 23:00 UTC)