[HN Gopher] Virtual Threads: New Foundations for High-Scale Java... ___________________________________________________________________ Virtual Threads: New Foundations for High-Scale Java Applications Author : axelfontaine Score : 85 points Date : 2022-09-29 18:03 UTC (4 hours ago) (HTM) web link (www.infoq.com) (TXT) w3m dump (www.infoq.com) | samsquire wrote: | This is good. | | I implemented a userspace 1:M:N timeslicing thread, kernel thread | to lightweight thread multiplexer in Java, Rust and C. | | I preempt hot for and while loops by setting the looping variable | to the limit from the kernel multiplexing thread. | | It means threads cannot have resource starvation. | | https://github.com/samsquire/preemptible-thread | | The design is simple. But having native support as in Loom is | really useful. | mattgreenrocks wrote: | I like it! Do you have any sense for what the perf hit is for | making those loops less hot to enable pre-emption? | samsquire wrote: | There is no if statement in the hot loop or in the kernel | thread so there is no performance cost there. | | The multiplexing thread is separate from the kernel thread so | you could say it's 1:M:N thread scheduling. I should have | been clearer in my comment. There are 3 types of threads. | | The multiplexing thread timeslices the preemption of the | lightweight threads and kernel threads every 10 milliseconds. | That is, it stops all the loops in the running lightweight | thread and causes the next lightweight thread to execute.
| | So there is no overhead except for a structure variable | retrieval in the loop body. | | Rather than
|
|     for (int i = 0; i < 1000000; i++) {
|     }
|
| we have
|
|     register_loop(thread_loops, 0, 0, 1000000);
|     for (; thread_loops[0].index < thread_loops[0].limit; thread_loops[0].index++) {
|     }
|     handle_virtual_interrupt();
|
| And in the thread multiplexer scheduler, we do this:
|
|     thread_loops[0].index = thread_loops[0].limit;
|
| lenkite wrote: | Really hope this makes it to Android. (probably need to wait for | a decade or two though) | Blackthorn wrote: | So happy this is finally coming out! After years of using the | library that inspired this (fibers), I'm so stoked this is coming | to the wide outside world of Java. There's just no comparison in | how understandable and easy to program and debug this is compared | to callback and event based programming. | jeffbee wrote: | " Operating systems typically allocate thread stacks as | monolithic blocks of memory at thread creation time that cannot | be resized later. This means that threads carry with them | megabyte-scale chunks of memory to manage the native and Java | call stacks." | | This extremely common misconception is not true of Linux or | Windows. Both Windows and Linux have demand-paged thread stacks | whose real size ("committed memory" in Windows) is minimal | initially and grows when needed. | uluyol wrote: | Do they shrink too? How many threads can be created before | address space is exhausted (even if the memory isn't backed by | pages, the address space is still reserved)? | jeffbee wrote: | You'll run out of physical memory for the first page of the | stack long before you run out of room in the virtual address | space. | tedunangst wrote: | The stack for any thread other than the first is just memory | like any other allocation. You can free it, resize it, copy | it elsewhere, whatever you want to do. Literally just a | pointer in a register.
People work up weird mythologies about | it, but the stack can be anything you want if you're willing | to write code to manage it. | ccooffee wrote: | This is a great writeup, and reignites my interest in Java. (I've | long considered "Java Concurrency in Practice" to be the _best_ | Java book ever written.) | | I haven't been able to figure out how the "unmount" of a virtual | thread works. As stated in this article: | | > Nearly all blocking points in the JDK have been adapted so that | when encountering a blocking operation on a virtual thread, the | virtual thread is unmounted from its carrier instead of blocking. | | How would I implement this logic in my own libraries? The | underlying JEP 425[0] doesn't seem to list any explicit APIs for | that, but it does give other details not in the OP writeup. | | [0] https://openjdk.org/jeps/425 | ecshafer wrote: | Java Concurrency in Practice is a fantastic book. I had DL as a | professor for about a half dozen courses in undergrad, | including Concurrent and Parallel Programming. Absolutely | fantastic professor, with a lot of insight into how parallel | programming really works at the language level. One of the best | courses I've taken. | anonymousDan wrote: | Yeah Java gets a lot of grief, but I learned a lot about | concurrent programming from making sure I really understood | every line of code in this book. | pron wrote: | > How would I implement this logic in my own libraries? | | There's no need to if your code is in Java. We had to change | low-level I/O in the JDK because it drops down to native. | | That's not to say every Java library is virtual-thread- | friendly. For one, there's the issue of pinning (see the JEP) | that might require small changes (right now the problem is most | common in JDBC drivers, but they're already working on | addressing it). 
The bigger issue, mostly in low-level | frameworks, is implicit assumptions about a small number of | shared threads, whereas virtual threads are plentiful and are | never pooled, so they're never shared. An example of such an | issue is in Netty, where they allocate very large _native_ | buffers and cache them in ThreadLocals, which assumes that the | number of threads is low, and that they're reused by lots of | tasks. | _benedict wrote: | Conversely, some applications would like a leaky abstraction | they have some control over. Some caching will likely remain | beneficial to link to a carrier thread. | | As a member of the Cassandra community I'm super excited to | get my hands on virtual threads come the next LTS (and | Cassandra's upgrade cycle), as it will permit us to solve | many outstanding problems much more cheaply. | | I hope by then we'll also have facilities for controlling the | scheduling of virtual threads on carrier threads. I would | rather not wait another LTS cycle to be able to make proper | use of them. | Nzen wrote: | I don't know how they did it, but you could use that JEP id as | a query in the JDK issue tracker [0], and then use the issue | tracker id to find the corresponding GitHub issue [1]. (I had | hoped for commits with that prefix, but there don't seem to be | any for that issue.) | | [0] | https://bugs.openjdk.org/browse/JDK-8277131?jql=issuetype%20... | | [1] https://github.com/openjdk/jdk/pull/8787 | chrisseaton wrote: | > I haven't been able to figure out how the "unmount" of a | virtual thread works. | | The native stack is just memory like any other, pointed to by | the stack pointer. You can unmount one stack and mount another | by changing the stack pointer. You can also do it by copying | the stack out to a backing store, and copying the new thread's | stack back in. I think the JVM does the latter, but not an | expert. | galaxyLogic wrote: | Seems like a good development.
I've been doing Node.js for the last | few years after letting go of Java. But there's something | uneasy about async/await. For one thing it's difficult to debug | how the async functions interact. | geodel wrote: | Well one way is to replace "synchronized" blocks with | ReentrantLocks wherever you can. | cvoss wrote: | I would guess that LockSupport.park() and friends have also | been adapted to support virtual thread unmounting. | cypressious wrote: | Does your library use any of the JDK's blocking APIs like | Thread.sleep, Socket or FileInputStream directly or | transitively? If so, it is already compatible. The only thing | you should check is if you're using monitors for | synchronization, which are currently causing the carrier thread | to get pinned. The recommendation is to use locks instead. | mikece wrote: | How does this compare to Processes in Elixir/Erlang -- is Java | now as lightweight and performant? | geodel wrote: | I think it is a really important development in the Java space. One | reason I plan to use it soon is because it does not bring in the | complex programming model of the "reactive world" and hence dependency | on tons of reactive libraries. | | I tried moving a plain old Tomcat based service to a scalable netty | based reactive stack but it turned out to be too much work and an | alien programming model. With Loom/virtual threads, the only thing | I will be looking for is a server supporting virtual threads natively. | Helidon Nima would fit the bill here, as all other frameworks/app | servers have so far just been slapping virtual threads onto their thread | pool based systems. And unsurprisingly that is not leading to the great | perf expected from a virtual-thread-based system. | RcouF1uZ4gsC wrote: | > How long until OS vendors introduce abstractions to make this | easier? Why aren't there OS-native green threads, or at the | very least user-space scheduling affordances for runtimes that | want to implement them without overhead in calling blocking | code?
| | Windows has had Fibers[0] for decades (IIRC since 1996 with | Windows NT 4.0) | | 0. https://learn.microsoft.com/en-us/windows/win32/procthread/f... | anonymousDan wrote: | Copying virtual stacks on a context switch sounds kind of | expensive. Any performance numbers available? Maybe for very deep | stacks there are optimizations whereby you only copy in deeper | frames lazily under the assumption they won't be used yet? Also, | what is the story with preemption - if a virtual thread spins in | an infinite loop, will it effectively hog the carrier thread or | can it be descheduled? Finally, I would be really interested to | see the impact on debuggability. I did some related work where we | were trying to get the JVM to run on top of a library operating | system and a libc that contained a user level threading library. | Debugging anything concurrency related became a complete | nightmare since all the gdb tooling only really understood the | underlying carrier threads. | | Having said all that, this sounds super cool and I think is 100% | the way to go for Java. Would be interesting to revisit the | implementation of something like Akka in light of this. | thom wrote: | So right now it seems like you can replace the thread pool | Clojure uses for futures etc with virtual threads and go ham. You | could even write an alternative go macro to replace the bits of | core.async where you're not supposed to block. Feels like Clojure | could be poised to benefit the most here, and what a delight it | is to have such a language on a modern runtime that still gets | shiny new features! | gigatexal wrote: | Reading through the source code examples has me rethinking my | dislike for Java. It sure seems far less verbose and kinda nice | actually. | marginalia_nu wrote: | Modern Java is _a lot_ less boilerplaty than old enterprise | Java. | mgraczyk wrote: | The section "What about async/await?", which compares these | virtual threads to async/await, is very weak.
After reading this | article, I came away with the impression that this is a | dramatically worse way to solve this problem than async/await. | The only benefit I see is that this will be simpler to use for | the (increasingly rare) programmers who are not used to async | programming. | | The first objection in the article is that with async/await you | may forget to use an async operation and could instead use a | synchronous operation. This is not a real problem. Languages like | JavaScript do not have any synchronous operations so you can't | use them by mistake. Languages like Python and C# solve this with | simple lint rules that tell you if you make this mistake. | | The second objection is that you have to reimplement all library | functions to support await. This is a bad objection because you | also have to do this for virtual threads. Based on how long it | took to add virtual threads to Java vs adding async/await to | other languages, it seems like virtual threads were much more | complicated to implement. | | The programming model here sounds analogous to using gevent with | Python vs Python async/await. My opinion is that the gevent | approach will die out completely as async/await becomes better | supported and programmers become more familiar. | | EDIT: Looking more at the "Related Work" section at the bottom. I | think I understand the problem here. The "Structured Concurrency" | examples are unergonomic versions of async/await. I'm not sure | what I'm missing but this seems like a strictly worse way to | write structured concurrent code. | | Java example:
|
|     Response handle() throws ExecutionException, InterruptedException {
|         try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
|             Future<String> user = scope.fork(() -> findUser());
|             Future<Integer> order = scope.fork(() -> fetchOrder());
|             scope.join();           // Join both forks
|             scope.throwIfFailed();  // ... and propagate errors
|             // Here, both forks have succeeded, so compose their results
|             return new Response(user.resultNow(), order.resultNow());
|         }
|     }
|
| Python equivalent:
|
|     async def handle() -> Response:
|         # scope is implicit, throwing on failure is implicit.
|         user, order = await asyncio.gather(findUser(), findOrder())
|         return Response(user, order)
|
| You could probably implement a similar abstraction in Java, but | you would need to pass around and manage the scope object, | which seems cumbersome. | wtetzner wrote: | I can see you having objections to their arguments against | async/await, but what makes you say async/await is somehow the | better solution? | mgraczyk wrote: | There are a few reasons. | | async/await allows you to do multiple things in parallel. I | don't see how you can do that in the virtual threading model, | although I haven't used it and only read this article. You | would have to spin up threads and wait for them to finish, | which IMO is much more complicated and hard to read. | | javascript:
|
|     async function doTwoThings() {
|         await Promise.all([
|             doThingOne(),
|             doThingTwo(),
|         ]);
|     }
|
| python:
|
|     async def do_two_things():
|         await asyncio.gather(
|             do_thing_one(),
|             do_thing_two(),
|         )
|
| Another issue is building abstractions on top of this. For | example how do you implement "debounce" using virtual | threads? You end up unnaturally reimplementing async/await | anyway. | | Finally it's generally much easier to implement new libraries | with a promise/future based async/await system than with a | system based on threads, but I'm not familiar enough with | Java to know whether this is actually a good objection. It's | possible they make it really easy. | Jtsummers wrote: | > async/await allows you to do multiple things in parallel. | I don't see how you can do that in the virtual threading | model, although I haven't used it and only read this | article.
| | The description of this is that the virtual threads can | move between platform threads, quoting from the article: | | > The operating system only knows about platform threads, | which remain the unit of scheduling. To run code in a | virtual thread, the Java runtime arranges for it to run by | mounting it on some platform thread, called a carrier | thread. Mounting a virtual thread means temporarily copying | the needed stack frames from the heap to the stack of the | carrier thread, and borrowing the carrier's stack while it | is mounted. | | > When code running in a virtual thread would otherwise | block for IO, locking, or other resource availability, it | can be unmounted from the carrier thread, and any modified | stack frames are copied back to the heap, freeing the | carrier thread for something else (such as running another | virtual thread). Nearly all blocking points in the JDK have | been adapted so that when encountering a blocking operation | on a virtual thread, the virtual thread is unmounted from | its carrier instead of blocking. | | This allows for parallelism so long as the system is | multicore and the JVM has access to multiple parallel | threads to distribute the virtual threads across. | mgraczyk wrote: | Two separate threads run in parallel, but one thread | cannot do two subtasks in parallel without submitting | parallel jobs to an executor or a StructuredTaskScope | subtask manager. It's basically forcing the developer to | do all the hard work and boilerplate that async/await | saves you. | pron wrote: | > It's basically forcing the developer to do all the hard | work and boilerplate that async/await saves you. | | It doesn't. Both require the exact same kind of | invocation by the user. Neither automatically | parallelises operations that aren't explicitly marked for | parallelisation. | e63f67dd-065b wrote: | > async/await allows you to do multiple things in parallel.
| I don't see how you can do that in the virtual threading | model, although I haven't used it and only read this | article. You would have to spin up threads and wait for | them to finish, which IMO is much more complicated and hard | to read | | I think there's a fundamental point of confusion here. In | both Python and JS, you can't do _anything_ in parallel, | since node/v8 and cpython are single-threaded (yes, if you | dip down into C you can spawn threads to your heart's | content). You can only do them concurrently, since only | when a virtual thread blocks can you move on and schedule | another thread in your runtime. | | In C++ (idk the Java syntax, imagine these are runtime | threads):
|
|     std::thread t1(doThingOne, arg1);
|     std::thread t2(doThingTwo, arg2);
|     t1.join();
|     t2.join();  // boost has a join_all
|
| I'm sure there's some kind of `join_all` function in Java | somewhere. Imo this is even more clear than your async | await example: we have a main thread, it spawns two | children, and then waits until they're done before | proceeding. | | The traditional problem with async/await is that it forces | an "are your functions red or blue" decision up-front (see | the classic essay | https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...). | | > Finally it's generally much easier to implement new | libraries with a promise/future based async/await system | than with a system based on threads | | How so? Having written a bunch of libraries myself, I have | to say that not worrying about marking functions as async | or not is a great boon to development. Just let the runtime | handle it. | mgraczyk wrote: | The first part is semantics, yes I understand that Python | is running one OS thread at a time with a GIL (for now). | Just pretend I used the word "concurrent" instead of | parallel in all the places necessary to remove the | semantic disagreement. | | Whether threads and joining vs async/await is clearer is | a matter of taste and familiarity.
I find async/await | much more clear because that's what I am more used to. | Others will disagree, that's fine. I suspect more people | will prefer async/await as time goes on but that's my | opinion. | | > not worrying about marking functions as async or not is | a great boon to development. | | I don't really see why this is a big deal. You can change | the function and callers can change their callsite. There | are automated lint steps for this in Python and | JavaScript that I use all the time. It's not any | different to me than adding an argument or changing a | function name. | gbear605 wrote: | Part of the difference with Java is that a lot of | libraries haven't changed in twenty years because they | already work. Adding async/await would probably mean | writing an entirely new library and scrapping the old | already working code, while green threads allow the old | libraries to silently become better. | spullara wrote: | The only difference between how virtual threads work and | how async/await work is that you don't need to use await | and don't need to declare async. Just call .get() on a | Future when you need a value - that is basically "await".
|
|     void doTwoThings() {
|         var f1 = doThingOne();
|         var f2 = doThingTwo();
|         var thingOne = f1.get();
|         var thingTwo = f2.get();
|     }
|
| mgraczyk wrote: | How do you implement doThingOne? | | You should read the "structured concurrency" link in the | article. You have to explicitly wrap the call to | doThingOne in a future under a structured concurrency | scope. The code example you wrote is not going to be | possible in Java without implementing doThingOne in a | complicated way. | merb wrote: | great, now you have futures AND virtual threads. soo much | better! | Scarbutt wrote: | _async/await allows you to do multiple things in parallel. | I don't see how you can do that in the virtual threading | model_ | | When comparing to JS, it is the other way around.
Unless | you are talking about IO bound tasks only where nodejs | delegates to a thread pool (libuv). | mgraczyk wrote: | With virtual threads, you need to write fork/join code to | do two subtasks. With async await, you call two async | functions and await them. So the virtual threading model | ends up requiring something that looks like a worse | version of async await to me. | Jtsummers wrote:
|
|     t1 = async(Task1)
|     t2 = async(Task2)
|     await t1
|     await t2
|
|     t1 = fork(Task1)
|     t2 = fork(Task2)
|     t1.join()
|     t2.join()
|
| What's the difference? | mgraczyk wrote: | If Java adds some nice standardized helpers like this, | they will look equivalent. The current proposal is not | this clean but that doesn't mean it won't be possible. | The key difference is that async/await implies | cooperative multitasking. Nothing else happens on the | thread until you call await. I find that an easier model | to think about, and I opt into multithreading when I need | it. | | Anyway Rust does this using roughly the syntax | you described (except no need to call "fork"). Languages | that use async/await do not require you to say "async" at | the call site. | Eduard wrote: | > Languages like JavaScript do not have any synchronous | operations so you can't use them by mistake. | | Can you explain what you mean by this? Isn't it the opposite - | JavaScript has a synchronous execution model? | vlovich123 wrote: | Aside from NodeJS-specific APIs, JS as a whole does not | generally have any synchronous I/O, locks, threads etc. | SharedArrayBuffer is probably the notable exception as it can | be used to build synchronous APIs that implement that | functionality if I'm not mistaken. | | Unless by synchronous you meant single threaded in which case | JS is indeed single threaded normally (unless you're using | things like Web Workers). | mgraczyk wrote: | I mean what the article calls a "synchronous blocking | method", which JavaScript (mostly) does not have.
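[Editor's note: the fork/join pseudocode above maps directly onto Java's existing executor API. A minimal, hedged sketch, where findUser/fetchOrder are hypothetical stand-ins for real blocking calls; on Java 19+ the fixed pool could be swapped for Executors.newVirtualThreadPerTaskExecutor() so each submitted task gets its own cheap virtual thread, with no other change to the code.]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ForkJoinSketch {
    // Hypothetical subtasks standing in for real blocking work.
    static String findUser() { return "alice"; }
    static int fetchOrder() { return 42; }

    static String handle() throws Exception {
        // On Java 19+, Executors.newVirtualThreadPerTaskExecutor()
        // would give one virtual thread per submitted task instead.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> user = pool.submit(ForkJoinSketch::findUser);    // "fork"
            Future<Integer> order = pool.submit(ForkJoinSketch::fetchOrder); // "fork"
            return user.get() + ":" + order.get();                           // "join"/"await"
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle());  // alice:42
    }
}
```

The `Future.get()` calls play the role of `await`: the caller blocks, and under virtual threads that block is cheap because the carrier thread is released.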
| pron wrote: | async/await requires yet another world that's parallel to the | "thread" world but requires its own "colour" and set of APIs. | So now you have two kinds of threads, two kinds of respective | APIs, and two kinds of the same concept that has to be known by | all of your tools (debuggers, profilers, stacktraces). | | > This is a bad objection because you also have to do this for | virtual threads | | No. We had to change _a bit_ of the implementation -- at the | very bottom -- but none of the APIs, as there is no viral async | colour that requires doubling all the APIs. | | You're right that implementing user-mode threads is much more | work than async/await, which could be done in the frontend | compiler if you don't care about tool support (although we very | much do), but the result dominates async/await in languages | that already have threads (there are different considerations | in JS) as you keep all your APIs and don't need a duplicate | set, and a lot of existing code and tools just work (with | relatively easy changes to accommodate a very high number | of threads). | | > The "Structured Concurrency" examples are unergonomic | versions of async/await. | | They're very similar, actually. | | We've made the Java example very explicit, but that code would | normally be written as:
|
|     Response handle() throws ExecutionException, InterruptedException {
|         try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
|             var user = scope.fork(() -> findUser());
|             var order = scope.fork(() -> fetchOrder());
|             scope.join().throwIfFailed();
|             return new Response(user.resultNow(), order.resultNow());
|         }
|     }
|
| But when the operations are homogeneous, i.e. all of the same | type rather than different types as in the example above (Java | is typed), you'll do something like:
|
|     try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
|         var fs = myTasks.stream().map(scope::fork).toList();
|         scope.join().throwIfFailed();
|         return fs.stream().map(Future::resultNow).toList();
|     }
|
| Of course, you can wrap this in a higher level `gather` | operation, but we wanted to supply the basic building blocks in | the JDK. You're comparing a high-level library to built-in JDK | primitives. | | Work is underway to simplify the simple cases further so that | you can just use the stream API without an explicit scope. | mgraczyk wrote: | This makes sense, especially the bit about tooling. I'm | unfamiliar with the state of Java tooling besides very simple | tasks. | | On the other hand using things like debuggers and reading | stack traces in python/js "just work" for me. Maybe because | the tooling and the language have evolved together over a | longer period of time. | | I also feel like the reimplementation of all functions to | support async is not a big deal because the actual pattern is | generally very simple. You can start by awaiting every async | function at the call site. New libraries can be async only. | pron wrote: | > On the other hand using things like debuggers and reading | stack traces in python/js "just work" for me. Maybe because | the tooling and the language have evolved together over a | longer period of time. | | Well, Python and JS don't have threads, so async/await are | their only concurrency construct, and it's supported by | tools. But Java has had tooling that works with threads for | a very long time. Adding async/await would have required | teaching all of them about this new construct, not to | mention the need for duplicate APIs. | | > I also feel like the reimplementation of all functions to | support async is not a big deal because the actual pattern is | generally very simple.
You can start by awaiting every | async function at the call site. | | First, you'd still need to duplicate existing APIs. Second, | the async/await (cooperative) model is inherently inferior | to the thread model (non-cooperative) because scheduling | points must be statically known. This means that adding a | blocking (i.e. async) operation to an existing subroutine | requires changing all of its callers, who might be | implicitly assuming there can't be a scheduling point. The | non-cooperative model is much more composable, because any | subroutine can enforce its own assumptions on scheduling: | If it requires mutual exclusion, it can use some kind of | mutex without affecting any of the subroutines it calls or | any that call it. | | Of course, locks have their own composability issues, but | they're not as bad as async/await (which correspond to a | single global lock everywhere except around blocking, i.e. | async, calls) | | So when is async/await more useful than threads? When you | add it to an existing language that didn't have threads | before, and so already had an implicit assumption of no | scheduling points anywhere. That is the case of JavaScript. | | > New libraries can be async only. | | But why if you already have threads? New libraries get to | enjoy high-scale concurrency and old libraries too! | mgraczyk wrote: | I agree with your point that for CPU bound tasks, the | threading model is going to result in better performing | code with less work. | | As for the point about locks, I think this one is also a | question of IO-bound vs CPU bound work. For work that is | CPU bottlenecked, there is a performance advantage to | using threads vs async/await. | | As for the tooling stuff, I'm still not really convinced. | Python has almost always had threads and I've worked on | multimillion line codebases that were in the process of | migrating from thread based concurrency to async/await. | Now JS also has threads (workers). 
I also use coroutines | in C++ where threads have existed for a long time. I've | never had a problem debugging async/await code in these | languages, even with multiple threads. I guess I just | have had good experiences with tooling but it doesn't | seem that hard to retrofit a threaded language like | C++/Python. | pron wrote: | > I guess I just have had good experiences with tooling | but it doesn't seem that hard to retrofit a threaded | language like C++/Python. | | But why would you want to if you can make threads | lightweight (which, BTW, is not the case for C++)? By | adding async/await on top of threads you're getting | another incompatible and disjoint world that provides -- | at best -- the same abstraction as the one you already | have. | mgraczyk wrote: | I think the async/await debugging experience is easier to | understand. For example in the structured concurrency | example, it seems like it would require a lot of tooling | support to get a readable stack trace for something like | this (in Python) | | Code:
|
|     import asyncio
|
|     async def right(directions):
|         await call_tree(directions)
|
|     async def left(directions):
|         await call_tree(directions)
|
|     async def call_tree(directions):
|         if len(directions) == 0:
|             raise Exception("call stack");
|         if directions[0]:
|             await left(directions[1:])
|         else:
|             await right(directions[1:])
|
|     directions = [0, 1, 0, 0, 1]
|     asyncio.run(call_tree(directions))
|
| Trace:
|
|     Traceback (most recent call last):
|       File "/Users/mgraczyk/tmp/test.py", line 19, in <module>
|         asyncio.run(call_tree(directions))
|       File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/runners.py", line 44, in run
|         return loop.run_until_complete(main)
|       File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
|         return future.result()
|       File "/Users/mgraczyk/tmp/test.py", line 16, in call_tree
|         await right(directions[1:])
|       File "/Users/mgraczyk/tmp/test.py", line 4, in right
|         await call_tree(directions)
|       File "/Users/mgraczyk/tmp/test.py", line 14, in call_tree
|         await left(directions[1:])
|       File "/Users/mgraczyk/tmp/test.py", line 7, in left
|         await call_tree(directions)
|       File "/Users/mgraczyk/tmp/test.py", line 16, in call_tree
|         await right(directions[1:])
|       File "/Users/mgraczyk/tmp/test.py", line 4, in right
|         await call_tree(directions)
|       File "/Users/mgraczyk/tmp/test.py", line 16, in call_tree
|         await right(directions[1:])
|       File "/Users/mgraczyk/tmp/test.py", line 4, in right
|         await call_tree(directions)
|       File "/Users/mgraczyk/tmp/test.py", line 14, in call_tree
|         await left(directions[1:])
|       File "/Users/mgraczyk/tmp/test.py", line 7, in left
|         await call_tree(directions)
|       File "/Users/mgraczyk/tmp/test.py", line 11, in call_tree
|         raise Exception("call stack");
|     Exception: call stack
|
| pron wrote: | No, the existing tooling will give you such a stack trace | already (and you don't need any `async` or `await` | boilerplate, and you can even run code written and | compiled 25 years ago in a virtual thread). But you do | realise that async/await and threads are virtually the | same abstraction. What makes you think implementing | tooling for one would be harder than for the other? | mgraczyk wrote: | How does the tooling know to hide the call to "fork" in | the scoped task example? | smasher164 wrote: | After all the hoopla surrounding concurrency models, it seems | that languages are conceding that green threads are more | ergonomic to work with. Go and Java have it, and now .NET is even | experimenting with it. | | How long until OS vendors introduce abstractions to make this | easier? Why aren't there OS-native green threads, or at the very | least user-space scheduling affordances for runtimes that want to | implement them without overhead in calling blocking code?
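[Editor's note: the ergonomics being conceded here amount to plain blocking code that still scales. A rough sketch under stated assumptions -- the task count and sleep time are arbitrary, and the cached pool is a stand-in that runs on any modern JDK; on Java 19+ it could be replaced with Executors.newVirtualThreadPerTaskExecutor() so every blocking task gets its own virtual thread.]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockingScaleSketch {
    public static int runBlockingTasks(int n) throws InterruptedException {
        // Stand-in for Executors.newVirtualThreadPerTaskExecutor() (Java 19+):
        // the task code is written in the ordinary blocking style either way.
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(20);  // ordinary blocking call, no async/await
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runBlockingTasks(100));  // 100
    }
}
```

With virtual threads the same shape is claimed to scale to very large task counts, because a blocked virtual thread unmounts from its carrier instead of tying up an OS thread.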
 | Jtsummers wrote:
 | > Why aren't there OS-native green threads, or at the very
 | least user-space scheduling affordances for runtimes that
 | want to implement them without overhead in calling blocking
 | code?
 |
 | Green threads are, definitionally, _not_ OS threads; they
 | are user-space threads. So you will _never_ see OS-native
 | green threads, as it's an oxymoron. The way many green-
 | thread systems work is to either lie to you (you really
 | only have one OS thread; the green threads exist to write
 | concurrent code, which can be much simpler, but not
 | _parallel_ code, using Pike's distinction), or to introduce
 | multiple OS threads ("carrier threads" in the terms of this
 | article) which green threads are distributed across (this
 | is what Java is doing here, what Go has done for a long
 | time, BEAM languages for a long time, and many others).
 |
 | EDIT:
 |
 | To extend this, many people think of "green threads" as
 | lightweight threading mechanisms. That's kind of accurate
 | for many systems, but not always true. If that's the sense
 | that's meant, then OS-native lightweight threads are
 | certainly possible in the future. But there's probably not
 | much reason to add them when user-space lightweight
 | concurrency mechanisms already exist, and there's no
 | consensus on _which_ ones are "best" (by whatever metric).
 | smasher164 wrote:
 | > If that's the sense that's meant
 |
 | Yeah, that's what I meant: a lightweight threading
 | mechanism provided by the OS.
 |
 | > there's probably not much reason to add them when user
 | space lightweight concurrency mechanisms already exist
 |
 | Yeah... I don't think there's consensus on that. It seems
 | that many people find OS threads to be an understandable
 | concurrency model, but find them too heavyweight. So the
 | languages end up introducing other abstractions at either
 | the type level (which has other benefits, mind you!) or the
 | runtime to compensate.
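[Editor's note: the "carrier threads" arrangement Jtsummers describes is what the JDK now exposes directly. A minimal sketch (illustrative, assuming JDK 21, where `Executors.newVirtualThreadPerTaskExecutor` is standard):]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class CarrierDemo {
    public static void main(String[] args) throws Exception {
        AtomicInteger done = new AtomicInteger();
        // One virtual thread per task; the JDK multiplexes all of them over
        // a small pool of carrier (platform) threads, by default sized to
        // the number of available processors.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                exec.submit(() -> {
                    Thread.sleep(10); // parks the virtual thread; its carrier is freed
                    done.incrementAndGet();
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        System.out.println(done.get());
    }
}
```

Blocking calls like `Thread.sleep` park only the virtual thread, so 10,000 concurrently "sleeping" tasks tie up only a handful of OS threads -- the M:N distribution described above.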
 | Sakos wrote:
 | > To extend this, many people think of "green threads" as
 | lightweight threading mechanisms. That's kind of accurate
 | for many systems, but not always true. If that's the sense
 | that's meant, then OS-native lightweight threads are
 | certainly possible in the future. But there's probably not
 | much reason to add them when user space lightweight
 | concurrency mechanisms already exist, and there's no
 | consensus on which ones are "best" (by whatever metric).
 |
 | Wouldn't it make sense to implement them kernel-side, given
 | how every programming language seems to have to reinvent
 | the wheel regarding green threads?
 | Jtsummers wrote:
 | Green threads (today) aren't a singular thing; the
 | definition is that they're in user space, not kernel space.
 | They are implemented in a variety of ways:
 |
 | https://en.wikipedia.org/wiki/Green_thread
 |
 | Do you imitate a more traditional OS-thread style with
 | preemption, do you use cooperating tasks, coroutines, what?
 | Since there is no singular _best_ or consensus model, there
 | is little reason for an OS to adopt wholesale one of these
 | variations at this time.
 |
 | The original green threads (from that page) shared one OS
 | thread and used cooperative multitasking (most coroutine
 | approaches would be analogous to this). But today, as with
 | Go and BEAM languages, they're distributed across real OS
 | threads to get parallelism. Which approach should an OS
 | adopt? And if it did, would other languages/runtimes
 | abandon their own models if it were significantly
 | different?
 | smasher164 wrote:
 | Preemptive threads with growable stacks. There was some
 | discussion around getting segmented stacks into the kernel,
 | but I'm not sure that's the best approach. There might have
 | to be some novel work done in making contiguous stacks work
 | in a shared address space.
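[Editor's note: "preemptive threads with growable stacks" is the property the JDK implements in user space -- per JEP 444, virtual thread stacks live on the heap and grow and shrink as needed. An illustrative sketch (assuming JDK 21) that starts a million virtual threads, far more than would fit with the roughly megabyte-scale stacks typically reserved per platform thread:]

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;

public class MillionThreads {
    public static void main(String[] args) throws InterruptedException {
        int n = 1_000_000;
        CountDownLatch latch = new CountDownLatch(n);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            // Each virtual thread's stack starts tiny and is resized on the
            // heap, so a million of them fit in ordinary heap memory.
            Thread.ofVirtual().start(latch::countDown);
        }
        latch.await();
        System.out.printf("started %d virtual threads in %d ms%n",
                n, Duration.ofNanos(System.nanoTime() - start).toMillis());
    }
}
```

Whether a kernel could offer the same cheaply is exactly the open question in this subthread; the JVM can do it because it controls stack layout and safepoints, which the kernel does not for arbitrary native code.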
 | wtetzner wrote:
 | I think the reason green threads can work in languages is
 | that the runtime understands the language semantics and can
 | take advantage of them. The OS doesn't understand the
 | language and its concurrency semantics, and only has a blob
 | of machine code to work with.
 | smasher164 wrote:
 | Not really, tbh. The Go runtime has a work-stealing
 | scheduler and does a lot of work to provide the same
 | abstractions that pthreads have, but for goroutines.
 | zozbot234 wrote:
 | > How long until OS vendors introduce abstractions to make
 | this easier?
 |
 | The OS-level abstraction is called M:N threads. It has
 | always been supported by Java on Solaris. But it's not
 | really popular elsewhere.
___________________________________________________________________
(page generated 2022-09-29 23:00 UTC)