[HN Gopher] Ruby 3.0 and the new FiberScheduler interface
___________________________________________________________________
Ruby 3.0 and the new FiberScheduler interface
Author : WJW
Score : 121 points
Date : 2020-12-28 17:11 UTC (1 day ago)
(HTM) web link (www.wjwh.eu)
(TXT) w3m dump (www.wjwh.eu)
| mojuba wrote:
| Could someone explain if FiberScheduler can use all the available
| CPU cores?
| dochtman wrote:
| Hey WJW, nice blog you have there! Have been following along,
| good stuff.
| faebi wrote:
| Are there any good libraries for fiber pools, so that not too
| many fibers run at once? I am interested in something like the
| concurrent-ruby threadpools.
| WJW wrote:
| Only one fiber per thread runs at a time, so that would
| naturally limit the concurrency of fibers to the number of
| threads?
| faebi wrote:
| As I understand it, a fiber will yield back once it hits IO. But
| if you have something like a web server with a timeout of 60s,
| you won't be able to open 10'000 connections at once to it. Some
| or all will probably run into the timeout before Ruby could
| process the whole request. Therefore I would try to use some
| form of rate limiting via a FiberPool with a backlog.
| cogman10 wrote:
| Use a semaphore instead. That's what it is built for.
|
| That way you don't need to make a complex fiber scheme just for
| resource management. Spin up as many as you need and the runtime
| will do the task pooling for you.
|
| https://ruby-concurrency.github.io/concurrent-ruby/1.1.4/Con...
| [deleted]
| elcritch wrote:
| As an outsider, why doesn't/can't Ruby implement a full actor
| system like Erlang/BEAM? Ruby is already based on message
| passing, so it seems like it should be possible. Granted, it'd
| likely incur a large performance hit since presumably every
| object would need to be locked or have a message queue.
| rubyn00bie wrote:
| Well...
| the big reason almost no runtime/VM provides what Erlang does is
| because BEAM will preempt a running actor if it takes too long
| (uses up its operation budget, more or less). Ruby only provides
| mechanisms for cooperatively scheduled execution. That is to
| say, and this is the big problem with async most everywhere
| except Erlang/BEAM, a task will block everything else from
| executing until it's finished and _yields_ to another
| fiber/thread.
|
| _Shrug_ Having tried like every possible method of async I/O
| with Ruby to eke out moar good perfz back in the day...
| including various actor implementations (sup, JRuby and
| Akka)... nothing compares to just using BEAM if you want the
| actor paradigm. The cooperative scheduling problem is a really
| enormous pain in the ass most of the time.
| elcritch wrote:
| That's part of my question (to me at least): why couldn't they
| implement preemptive actors? At least at the granularity of
| individual methods, which in Ruby is pretty much any action,
| right? Async and the whole red/blue function problem seems like
| a pain.
| holstvoogd wrote:
| There are different fully concurrent Ruby implementations:
| TruffleRuby, Rubinius, JRuby... they all suck at what Ruby is
| most used for, running Rails. Last time I tried, CRuby was
| still 10x faster.
|
| So in practice we gladly throw more memory and processes at the
| problem.
| cortesoft wrote:
| I wonder what percentage of Ruby is Rails... I have been using
| Ruby pretty much every day for work since 2005, and haven't
| done any Rails since 2009. Wonder if I am that much of an
| anomaly.
| freedomben wrote:
| What are you using Ruby for?
|
| I have used Ruby since 2013 and started on Rails but quit using
| Rails in 2015. I still use Ruby all the time though, mostly for
| scripts/automation that violate my rule on bash v. other langs
| for scripts (basically: do I need arrays, maps, or to parse
| JSON beyond simple extractions that jq is great at?).
| I do have a couple of Sinatra services now though that I
| maintain. Sinatra is wonderful for simple needs like mine.
|
| I don't know how much of an anomaly I am though; interested in
| hearing from others as well.
|
| Edit: not looking to debate bash vs other langs here. Of course
| you're free to do so below, but I won't be engaging (got to
| focus on work, and I've had the debate many times and I don't
| think any of us will ever convince the other. It's become
| religion at this point).
|
| Also, I use Elixir/Phoenix these days for use cases that used
| to be Rails.
| mediaman wrote:
| Anything you miss from Rails now that you're primarily using
| Phoenix for those use cases? Elixir/Phoenix does look pretty
| intriguing to me, though the support universe is smaller.
|
| What held me back so far is that a lot of simple B2B,
| low-volume stuff doesn't seem to benefit from the high
| parallelism that BEAM brings.
|
| But I'm wondering if it brings enough other advantages that I
| shouldn't view it that simplistically.
| realusername wrote:
| > But I'm wondering if it brings enough other advantages that
| I shouldn't view it that simplistically.
|
| So to start with, Phoenix LiveView really is a game changer;
| you should have a look at Elixir just for that. This is the
| talk which really blew me away when I first saw it:
| https://www.youtube.com/watch?v=MZvmYaFkNJI
|
| As for the other upsides, I like the way everything is
| architected in Phoenix a little better; there's much less magic
| and it's easier to follow the data flow and what is going on.
| lawik wrote:
| Not the previous commenter. But enthusiastic :)
|
| I would recommend trying it and seeing how you like it. I've
| basically dropped Python, which was my daily language, in
| favor of Elixir.
| I find I get a higher top bound on what I can do: high-level,
| expressive but less magical code, and a bunch of capabilities
| that aren't typically feasible with other runtimes (state
| handling, resiliency and such). The parallelism I get for free
| with Phoenix whether I try or not.
|
| I'd watch Sasa Juric's talk on the heart of Elixir and Erlang
| to get more of the technical advantages laid out quite well.
| cortesoft wrote:
| A lot of services written with Sinatra, some developer tooling,
| scripts to maintain our systems.
| cpuguy83 wrote:
| What are you doing that CRuby is faster?
|
| I've run both Rubinius and JRuby (with Rails); both gave me
| significant performance gains.
| chrisseaton wrote:
| > As an outsider, why doesn't/can't Ruby implement a full actor
| system like Erlang/BEAM?
|
| Do you really want an actor system?
|
| Actors are non-deterministic, extremely prone to
| difficult-to-debug race conditions, and inherently stateful.
| They're a classic concurrency foot-gun from the dark old ways
| of doing things.
|
| Don't we want to be moving away from these models that we know
| trip people up? Can't we do better than this for Ruby?
| elcritch wrote:
| > They're a classic concurrency foot-gun from the dark old
| ways of doing things.
|
| Sure you're not thinking of _threads_? Actors have only really
| been done in Erlang, and more recently in Akka and Orleans.
| AFAIK, there aren't any paradigms other than CSP or threads
| for concurrency.
|
| Actors can have race conditions, same as any other concurrent
| system, but I've personally never actually had any given the
| design patterns in OTP and Elixir. It's really helpful (to me)
| that each actor is always deterministic in itself and doesn't
| share data, only messages. It also maps nicely onto multicore.
| chrisseaton wrote:
| > Sure you're not thinking of threads?
|
| No, I'm thinking of actors - stateful, non-deterministic,
| racy actors.
| A minefield of classic concurrency bugs!
|
| > Actors can have race conditions, same as any other
| concurrent system
|
| Deterministic systems like fork-join don't have race
| conditions.
|
| > is always deterministic in itself and doesn't share data
|
| But the _global_ state of all the actors is shared. If an
| actor you send a message to responds differently due to its
| state, then let's be honest: you're implicitly sharing that
| state.
|
| > It also maps nicely onto multicore.
|
| It maps directly to multicore, right. Don't we want something
| higher-level and safer than directly mapping to our hardware?
| dnautics wrote:
| You shouldn't use actor systems as a code organization
| pattern, only in places where async is necessary, and in those
| cases you're going to have to debug race conditions anyway.
| When that happens you will want an actor system, because it
| will make it easier to understand the data flowing through
| your system and, especially, effortlessly (0 loc) clean up
| dangling resources from error conditions and prevent leaks.
| Moreover, Elixir provides you fantastic tools to drive unit
| and integration testing around your async systems.
| semiquaver wrote:
| Ractors, also introduced in Ruby 3.0, are basically this.
| Ractors work by passing messages or immutable objects.
| https://github.com/ruby/ruby/blob/master/doc/ractor.md
| claudiug wrote:
| Will this fiber scheduler + Ractor make Ruby faster?
| nateberkopec wrote:
| Both of those changes increase parallelism and concurrency;
| they do not decrease latency by themselves.
| cplanas wrote:
| Not directly, but they will facilitate using more performant
| paradigms.
| freedomben wrote:
| I'm excited to see where this goes, but I must admit I'm
| conflicted on the idea of seeing async Ruby code. Async is
| great technically, but one of the things I love about Ruby is
| the elegance and clarity. That tends to go out the window when
| async is introduced in many languages.
| async/await has helped a ton though, so hopefully Ruby will
| get something like that as we go.
|
| I would just suggest that sometimes performance isn't worth
| sacrificing clarity. Obviously sometimes you have to (Big O can
| be an unforgiving beast), but not always.
| [deleted]
| ljm wrote:
| Time will tell, but this could be beneficial for existing
| libraries like concurrent-ruby.
|
| And maybe it becomes more useful for using Ruby in smaller
| services outside of the Rails context, especially when the
| go-to solution for a bunch of problems is to run a separate
| copy of your server in 'worker mode'. Even with Rails, not
| relying on Delayed Job or Sidekiq by default would be nice. Of
| course, I'm thinking about Ractor there too.
| Lio wrote:
| It's worth remembering that Ruby code already handles async,
| and threads. For example, ActiveRecord database requests have
| been non-blocking for many years.
|
| This new scheduler interface just gives us a nicer and
| lighter-weight abstraction for handling it rather than, for
| example, using OS threads.
| jashmatthews wrote:
| > ActiveRecord database requests have been non-blocking for
| many years.
|
| That's blocking IO with multi-threading. The executing thread
| stops and waits.
| nitrogen wrote:
| Before I learned Rails I used EventMachine, which was very
| "async". Everything was a chained callback. So Ruby has had
| the option of this complexity for a while.
| dorianmariefr wrote:
| Could somebody tell me what's wrong with threads, e.g.
|
|     require "open-uri"
|     require "json"
|
|     results = []
|     (0..100).each_slice(10) do |slice|
|       slice.map do |i|
|         Thread.new do
|           results << JSON.parse(
|             URI.open("https://httpbin.org/get?i=#{i}").read
|           )["args"]
|         end
|       end.each(&:join)
|     end
|     p results
| kyledrake wrote:
| FYI, this domain is flagged as "dangerous" by McAfee's
| malicious site detection right now. CenturyLink's nameservers
| are blocking it. You might want to talk with them about that.
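[Editor's note: to make the cooperative-scheduling model discussed in this thread concrete, here is a minimal plain-Ruby sketch. It uses no scheduler gem; the fiber names and the toy round-robin loop are purely illustrative. Only one fiber runs at a time on a thread, and a fiber that never yields would starve the others.]

```ruby
# Two fibers sharing one thread. Each runs only while the other has
# yielded; a fiber that never calls Fiber.yield would block its sibling.
log = []

a = Fiber.new do
  log << "a1"
  Fiber.yield        # hand control back so "b" can run
  log << "a2"
end

b = Fiber.new do
  log << "b1"
  Fiber.yield
  log << "b2"
end

# A toy round-robin "scheduler": resume each fiber in turn until done.
[a, b].cycle.take(4).each { |f| f.resume if f.alive? }

p log  # => ["a1", "b1", "a2", "b2"]
```

With Ruby 3.0's scheduler hook, the same interleaving happens automatically: a fiber created with Fiber.schedule (under a scheduler installed via Fiber.set_scheduler) is suspended whenever it hits blocking I/O, rather than at explicit Fiber.yield calls.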
| rdw wrote:
| > all relevant standard library methods have been patched to
| yield to the scheduler whenever they encounter a situation
| where they will block the current fiber
|
| This is huge. They solved the function-coloring problem by
| deciding that all functions are now potentially async. It
| makes it more likely that the ecosystem as a whole actually
| becomes async-compatible. I wish Python had taken this
| approach, though I understand why they didn't.
| Rapzid wrote:
| I don't see how this solves the "coloring" problem. "Colored"
| functions have a different return type: a future of some sort.
| If you want to return a future to a caller, or have a method
| that blocks until the work is done, you still need a way to
| differentiate that.
|
| Golang does essentially the same thing and uses channels
| instead of "futures". However, some APIs will have functions
| that return a channel that you receive a message on later...
| which is a lot like a future. And if you provide a simpler
| non-channel API alongside (instead of making the consumer wrap
| your calls in a goroutine or always use a channel), you now
| have "colored" functions.
|
| I believe this is just a solution to a scheduling problem
| (pre-emptive vs cooperative).
| rdw wrote:
| It solves the coloring problem by not having a coloring
| problem in the first place. Maybe I should have said "avoids
| the function-coloring problem".
| chrisseaton wrote:
| > "Colored" functions have a different return type; a future
| of some sort.
|
| This is an aside and possibly pedantic, but everyone seems to
| have forgotten that the _whole_ idea of futures was originally
| that they _weren't_ some different type. They were the same
| type, but blocked transparently on demand and had no `get`
| operation.
|
| The construct that came before futures, called eventual
| values, had the different-type thing.
| The _only_ difference of futures, the _big idea_, was they
| said 'maybe we can get rid of that part'.
|
| > These "futures" greatly resemble the "eventual values" of
| Hibbard's Algol 68 [39]. The principal difference is that
| eventual values are declared as a separate data type,
| distinct from the type of the value they take on, whereas
| futures cannot be distinguished from the type of the value
| they take on.
|
| (Halstead, 1985)
|
| But now people write types like `Future[T]`... which
| _completely_ misses the point! If you start to write
| `Future[...` then think to yourself... this isn't a future.
| And that's how you get this colouring problem you describe...
| it was already fixed and people have forgotten all about it.
| WJW wrote:
| (Author here)
|
| I kinda agree, but yielding only on I/O is not enough IMO.
| Sooner or later we'll want to do CPU-intensive work in fibers
| as well, which in the current implementation will block all
| other fibers on the same thread. Like I mentioned in the
| article, I'd like to see work stealing between different
| threads, as that would allow fibers to migrate away from
| threads stuck in CPU-intensive work. An alternative way could
| be to adopt a model similar to Haskell's lightweight threads,
| where the runtime forces a `yield` after N milliseconds
| (configurable). That would make sure that CPU-intensive work
| would not block other fibers "too" much.
| nesarkvechnep wrote:
| This is not only the way Haskell's scheduler works but also
| the Erlang VM's. Not completely the same, because the BEAM
| scheduler uses reductions instead of time, but both
| scheduling schemes can be classified as preemptive. Since
| fibers are cooperatively scheduled, I don't believe the core
| developers of Ruby would just agree to switch the scheduling
| scheme.
| anonacct38 wrote:
| Go's journey here has been interesting.
| Early on it was possible (though rarely seen in practice) to
| end up with a CPU-bound thread not yielding because it didn't
| hit a yield point like I/O.
|
| Then they added a guarantee that if your loop called a
| function, the scheduler would be able to make the goroutine
| yield. https://golang.org/doc/go1.2#preemption
|
| This mostly worked, although you could still have a CPU-bound
| thread not making any function calls. I also personally ran
| into a pathological issue where the scheduler was being
| invoked, but a heuristic kept the current goroutine running,
| so others were still starved.
|
| Finally they added true pre-emption (not yielding) in 1.14:
| https://golang.org/doc/go1.14#runtime - it looks like it just
| sends signals and saves state.
|
| One nice thing is that, if I understand the Go runtime
| correctly, work stealing by scheduling goroutines on
| different threads has been a thing for a long time.
| orf wrote:
| > They solved the function-coloring problem
|
| Not really, they've just made it super implicit. Any FFI
| calls are now implicitly coloured, same with anything
| CPU-heavy.
|
| Like their approach to type annotations, I think this will be
| a mistake in hindsight.
| lalaithion wrote:
| In a high-level dynamic language with exceptions, there are
| already so many implicit "colors" to a function that I think
| this is still the right choice.
| rdw wrote:
| This is gonna sound pedantic, but I think it's not a coloring
| problem. Coloring is when syntactically a function has to
| change its own signature just because it started calling an
| inner function with the new color.
|
| That said, you're right in that one may have to make some
| changes to code to get around some new problems. The problem
| with FFI/CPU-heavy functions is that they prevent _other_
| fibers from being scheduled while they run.
|
| If it becomes possible to implement work-stealing, then that
| would mitigate the problem.
| It would also be solvable by sprinkling "yield to scheduler"
| calls throughout such functions. Annoying and not always
| possible, but, since in these cases the signature would not
| change, technically not coloring.
| gpderetta wrote:
| It is not pedantic at all; you are completely right. If this
| implementation were to be classified as coloring, then
| everything would be, including classic threads.
| cogman10 wrote:
| This is essentially the approach Java is taking with the
| upcoming Loom project.
___________________________________________________________________
(page generated 2020-12-29 23:00 UTC)