[HN Gopher] Ruby 3.0 and the new FiberScheduler interface
       ___________________________________________________________________
        
       Ruby 3.0 and the new FiberScheduler interface
        
       Author : WJW
       Score  : 121 points
       Date   : 2020-12-28 17:11 UTC (1 day ago)
        
 (HTM) web link (www.wjwh.eu)
 (TXT) w3m dump (www.wjwh.eu)
        
       | mojuba wrote:
       | Could someone explain if FiberScheduler can use all the available
       | CPU cores?
        
       | dochtman wrote:
       | Hey WJW, nice blog you have there! Have been following along,
       | good stuff.
        
       | faebi wrote:
        | Are there any good libraries for fiber pools, so that not too
        | many fibers run at once? I am interested in something like the
        | concurrent-ruby thread pools.
        
         | WJW wrote:
         | Only one fiber per thread runs at a time, so that would
         | naturally limit the concurrency of fibers to the number of
         | threads?
        
           | faebi wrote:
            | As I understand it, a fiber will yield back once it hits IO.
            | But if you have something like a web server with a 60s
            | timeout, you won't be able to open 10,000 connections to it
            | at once. Some or all will probably run into the timeout
            | before Ruby can process the whole request. Therefore I would
            | try to use some form of rate limiting via a FiberPool with a
            | backlog.
        
             | cogman10 wrote:
             | Use a semaphore instead. That's what it is built for.
             | 
             | That way you don't need to make a complex fiber scheme just
              | for resource management. Spin up as many as you need and
              | the runtime will do the task pooling for you.
             | 
             | https://ruby-concurrency.github.io/concurrent-
             | ruby/1.1.4/Con...
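              | 
              | A rough sketch of what that could look like (assuming a
              | fiber scheduler is already installed via
              | Fiber.set_scheduler and that concurrent-ruby's
              | Concurrent::Semaphore cooperates with it; `handle` and
              | `connections` are hypothetical stand-ins):
              | 
              |     require "concurrent"
              | 
              |     LIMIT = 100  # hypothetical cap on in-flight work
              |     semaphore = Concurrent::Semaphore.new(LIMIT)
              | 
              |     connections.each do |conn|
              |       Fiber.schedule do
              |         semaphore.acquire
              |         begin
              |           handle(conn)  # blocking IO yields to the scheduler
              |         ensure
              |           semaphore.release
              |         end
              |       end
              |     end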
        
       | [deleted]
        
       | elcritch wrote:
       | As an outsider, why doesn't/can't Ruby implement a full actor
       | system like Erlang/BEAM? Ruby is already based on message passing
       | so it seems like it should be possible. Granted it'd likely
       | induce a large performance hit since presumably every object
       | would need to be locked or have a message queue.
        
         | rubyn00bie wrote:
         | Well... the big reason almost no runtime/vm provides what
         | erlang does is because BEAM will preempt a running actor if it
         | takes too long (uses up its operation budget more or less).
         | Ruby only provides mechanisms for cooperatively scheduled
         | execution. That is to say, and this is the big problem with
         | async most everywhere except Erlang/BEAM, a task will block
          | everything else from executing until it's finished and _yields_
          | to another fiber/thread.
         | 
          |  _Shrug_ Having tried like every possible method of async I/O
          | with Ruby to eke out moar good perfz back in the day...
         | including various actor implementations (sup, JRuby and
         | Akka)... nothing compares to just using BEAM if you want the
         | actor paradigm. The cooperative scheduling problem is a really
         | enormous pain in the ass most of the time.
        
           | elcritch wrote:
            | That's part of my question (to me at least): why couldn't
            | they implement preemptive actors? At least at the granularity
            | of individual methods, which in Ruby is pretty much any
            | action, right? Async and the whole red/blue function
           | problem seems like a pain.
        
         | holstvoogd wrote:
         | There are different fully concurrent ruby implementations:
          | truffleruby, rubinius, jruby... they all suck at what ruby is
          | most used for: running rails. Last time I tried, cruby was
          | still 10x faster.
         | 
          | So in practice we gladly throw more memory and processes at the
         | problem.
        
           | cortesoft wrote:
           | I wonder what percentage of Ruby is rails... I have been
           | using Ruby pretty much every day for work since 2005, and
           | haven't done any rails since 2009. Wonder if I am that much
           | of an anomaly.
        
             | freedomben wrote:
             | What are you using Ruby for?
             | 
             | I have used ruby since 2013 and started on rails but quit
             | using rails in 2015. I still use ruby all the time though,
             | mostly for scripts/automation that violate my rule on bash
             | v. other lang for scripts (basically do I need arrays,
             | maps, or to parse json beyond simple extractions that jq is
             | great at). I do have a couple sinatra services now though
              | that I maintain. Sinatra is wonderful for simple needs
              | like mine.
             | 
             | I don't know how much of an anomaly I am though, interested
             | in hearing from others as well.
             | 
             | Edit: not looking to debate bash vs other langs here. Of
              | course you're free to do so below, but I won't be
             | engaging (got to focus on work and have had the debate many
             | times and I don't think any of us will ever convince the
             | other. It's become religion at this point).
             | 
             | Also I use Elixir/Phoenix these days for use cases that
             | used to be rails
        
               | mediaman wrote:
               | Anything you miss from Rails now that you're primarily
               | using Phoenix for those use cases? Elixir/Phoenix does
               | look pretty intriguing to me, though the support universe
               | is smaller.
               | 
               | What held me back so far is that a lot of simple B2B, low
               | volume stuff doesn't seem to benefit from the high
               | parallelism that BEAM brings.
               | 
               | But I'm wondering if it brings enough other advantages
               | that I shouldn't view it that simplistically.
        
               | realusername wrote:
               | > But I'm wondering if it brings enough other advantages
               | that I shouldn't view it that simplistically.
               | 
                | So to start with, Phoenix LiveView really is a game
                | changer; you should have a look at Elixir just for that.
                | This is the talk that really blew me away when I first
                | saw it: https://www.youtube.com/watch?v=MZvmYaFkNJI.
               | 
                | As for the other upsides, I like the way everything is
                | architected in Phoenix a little better; there's much less
                | magic and it's easier to follow the data flow and what is
                | going on.
        
               | lawik wrote:
               | Not the previous commenter. But enthusiastic :)
               | 
               | I would recommend trying it and seeing how you like it.
               | I've basically dropped Python which was my daily language
                | in favor of Elixir. I find I get a higher upper bound on
                | what I can do, high-level, expressive but less magical
                | code, and a bunch of capabilities that aren't typically
                | feasible with other runtimes (state handling, resiliency
                | and stuff). The parallelism I get for free with Phoenix
                | whether I try or not.
               | 
                | I'd watch Sasa Juric's talk on the heart of Elixir and
                | Erlang to get more of the technical advantages laid out
                | quite well.
        
               | cortesoft wrote:
               | A lot of services written with Sinatra, some developer
               | tooling, scripts to maintain our systems.
        
           | cpuguy83 wrote:
           | What are you doing that cruby is faster?
           | 
            | I've run both Rubinius and jruby (with rails); both gave me
            | significant performance gains.
        
         | chrisseaton wrote:
         | > As an outsider, why doesn't/can't Ruby implement a full actor
         | system like Erlang/BEAM?
         | 
         | Do you really want an actor system?
         | 
          | Actors are non-deterministic, extremely prone to difficult-to-
         | debug race conditions, and inherently stateful. They're a
         | classic concurrency foot-gun from the dark old ways of doing
         | things.
         | 
         | Don't we want to be moving away from these models that we know
         | trip people up? Can't we do better than this for Ruby?
        
           | elcritch wrote:
           | > They're a classic concurrency foot-gun from the dark old
           | ways of doing things.
           | 
            | Sure you're not thinking of _threads_? Actors have only
            | really been done in Erlang, and more recently in Akka and
            | Orleans. AFAIK, there aren't any other paradigms other than
            | CSP or threads for concurrency.
            | 
            | Actors can have race conditions, same as any other concurrent
            | system, but I've personally never actually had any given the
            | design patterns in OTP and Elixir. It's really helpful (to
            | me) that each actor is always deterministic in itself and
            | doesn't share data, only messages. It also maps nicely onto
            | multicore.
        
             | chrisseaton wrote:
             | > Sure you're not thinking of threads?
             | 
              | No, I'm thinking of actors - stateful, non-deterministic,
             | racey actors. A minefield of classic concurrency bugs!
             | 
             | > Actor's can have race conditions, same as any other
             | concurrent system
             | 
              | Deterministic systems like fork-join don't have race
              | conditions.
             | 
             | > is always deterministic in itself and doesn't share data
             | 
              | But the _global_ state of all the actors is shared. If an
              | actor you send a message to responds differently due to its
              | state, then let's be honest, you're implicitly sharing that
              | state.
             | 
             | > It also maps nicely onto multicore.
             | 
             | It maps directly to multicore, right. Don't we want
             | something higher level and safer than directly mapping to
             | our hardware?
        
           | dnautics wrote:
            | You shouldn't use actor systems as a code organization
            | pattern, only in places where async is necessary, and in
            | those cases you're going to have to debug race conditions
            | anyway. When that happens you will want an actor system
            | because it will make it easier to understand the data flowing
            | through your system and, especially, to effortlessly (0 loc)
            | clean up dangling resources from error conditions and prevent
            | leaks. Moreover, Elixir provides fantastic tools to drive
            | unit and integration testing around your async systems.
        
         | semiquaver wrote:
         | Ractors, also introduced in ruby 3.0, are basically this.
         | Ractors work by passing messages or immutable objects.
         | https://github.com/ruby/ruby/blob/master/doc/ractor.md
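          | 
          | A minimal sketch of the message-passing model (Ruby 3.0; the
          | values are arbitrary):
          | 
          |     r = Ractor.new do
          |       msg = Ractor.receive    # block until a message arrives
          |       Ractor.yield(msg * 2)   # hand a result back to a taker
          |     end
          | 
          |     r.send(21)
          |     puts r.take               # => 42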
        
       | claudiug wrote:
        | Will this fiber scheduler + Ractor combination make Ruby
        | faster?
        
         | nateberkopec wrote:
          | Both of those changes increase parallelism and concurrency;
          | they do not decrease latency by themselves.
        
         | cplanas wrote:
         | Not directly, but they will facilitate using more performant
         | paradigms.
        
       | freedomben wrote:
       | I'm excited to see where this goes, but I must admit I'm
       | conflicted on the idea of seeing async Ruby code. Async is great
       | technically but one of the things I love about Ruby is the
       | elegance and clarity. That tends to go out the window when async
       | is introduced in many languages. async/await has helped a ton
       | though, so hopefully Ruby will get something like that as we go.
       | 
       | I would just suggest that sometimes performance isn't worth
       | sacrificing clarity. Obviously sometimes you have to (Big O can
       | be an unforgiving beast) but not always.
        
         | [deleted]
        
         | ljm wrote:
         | Time will tell, but this could be beneficial for existing
         | libraries like concurrent-rb.
         | 
         | And, maybe it becomes more useful for using ruby in smaller
         | services outside of the Rails context, especially when the go-
         | to solution for a bunch of problems is to run a separate copy
         | of your server in 'worker mode'. Even with Rails, not relying
         | on Delayed Job or Sidekiq by default would be nice. Of course,
         | I'm thinking about Ractor there too.
        
         | Lio wrote:
        | It's worth remembering that ruby code already handles async and
         | threads. For example ActiveRecord database requests have been
         | non-blocking for many years.
         | 
        | This new scheduler interface just gives us a nicer and lighter-
        | weight abstraction for handling it rather than, for example,
        | using OS threads.
        
           | jashmatthews wrote:
           | > ActiveRecord database requests have been non-blocking for
           | many years.
           | 
           | That's blocking IO with multi-threading. The executing thread
           | stops and waits.
        
         | nitrogen wrote:
         | Before I learned Rails I used EventMachine, which was very
         | "async". Everything was a chained callback. So Ruby has had the
         | option of this complexity for a while.
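        | 
        | For anyone who never saw that era, a rough sketch of the style
        | (assuming the em-http-request gem; the URL is arbitrary):
        | 
        |     require "eventmachine"
        |     require "em-http-request"
        | 
        |     EM.run do
        |       http = EventMachine::HttpRequest.new("https://httpbin.org/get").get
        | 
        |       http.callback do
        |         puts http.response_header.status  # runs on success
        |         EM.stop
        |       end
        | 
        |       http.errback do
        |         puts "request failed"             # runs on connection errors
        |         EM.stop
        |       end
        |     end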
        
       | dorianmariefr wrote:
       | Could somebody tell me what's wrong with threads, e.g.
        | 
        |     require "open-uri"
        |     require "json"
        | 
        |     results = []
        | 
        |     (0..100).each_slice(10) do |slice|
        |       slice.map do |i|
        |         Thread.new do
        |           response = URI.open("https://httpbin.org/get?i=#{i}").read
        |           results << JSON.parse(response)["args"]
        |         end
        |       end.each(&:join)
        |     end
        | 
        |     p results
        
       | kyledrake wrote:
        | FYI, this domain is flagged as "dangerous" by McAfee's malicious
        | site detection right now. CenturyLink's nameservers are blocking
       | it. You might want to talk with them about that.
        
       | rdw wrote:
       | > all relevant standard library methods have been patched to
       | yield to the scheduler whenever they encounter a situation where
       | they will block the current fiber
       | 
       | This is huge. They solved the function-coloring problem by
       | deciding that all functions are now potentially async. It makes
       | it more likely that the ecosystem as a whole actually becomes
       | async-compatible. I wish Python had taken this approach, though I
       | understand why they didn't.
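        | 
        | Concretely, something like this rough sketch (assuming the async
        | gem supplies a conforming scheduler; the class name and URL are
        | illustrative):
        | 
        |     require "async"
        |     require "net/http"
        | 
        |     Thread.new do
        |       Fiber.set_scheduler(Async::Scheduler.new)
        | 
        |       3.times do |i|
        |         Fiber.schedule do
        |           # Plain blocking Net::HTTP code; the socket waits now
        |           # yield to the scheduler instead of blocking the thread.
        |           Net::HTTP.get(URI("https://httpbin.org/get?i=#{i}"))
        |         end
        |       end
        |     end.join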
        
         | Rapzid wrote:
         | I don't see how this solves the "coloring" problem. "Colored"
         | functions have a different return type; a future of some sort.
         | If you want to return a future to a caller, or have a method
         | that blocks until the work is done, you still need a way to
         | differentiate that.
         | 
         | Golang does essentially the same thing and uses channels
         | instead of "futures". However, some APIs will have functions
          | that return a channel that you receive a message on later,
          | which is a lot like a future. And if you provide a simpler
          | non-channel API alongside (instead of making the consumer wrap
          | your calls in a goroutine or always use a channel), you now
          | have "colored" functions.
         | 
          | I believe this is just a solution to a scheduling problem
          | (preemptive vs cooperative).
        
           | rdw wrote:
           | It solves the coloring problem by not having a coloring
           | problem in the first place. Maybe I should have said "avoids
           | the function-coloring problem".
        
           | chrisseaton wrote:
           | > "Colored" functions have a different return type; a future
           | of some sort.
           | 
           | This is an aside and possibly pedantic, but everyone seems to
           | have forgotten that the _whole_ idea of futures was
            | originally that they _weren't_ some different type. They
           | were the same type, but blocked transparently on demand and
           | had no `get` operation.
           | 
            | The construct that came before futures, called eventual
            | values, had the different-type thing. The _only_ difference
            | of futures, the _big idea_, was they said 'maybe we can get
           | rid of that part'.
           | 
           | > These "futures" greatly resemble the "eventual values" of
           | Hibbard's Algol 68 [39]. The principal difference is that
           | eventual values are declared as a separate data type,
           | distinct from the type of the value they take on, whereas
           | futures cannot be distinguished from the type of the value
           | they take on.
           | 
           | (Halstead, 1985)
           | 
           | But now people write types like `Future[T]`... which
           | _completely_ misses the point! If you start to write
            | `Future[...` then think to yourself... this isn't a future.
           | And that's how you get this colouring problem you describe...
           | it was already fixed and people have forgotten all about it.
        
         | WJW wrote:
         | (Author here)
         | 
         | I kinda agree but yielding only on I/O is not enough IMO.
         | Sooner or later we'll want to do CPU intensive work in fibers
         | as well, which in the current implementation will block all
         | other fibers on the same thread. Like I mentioned in the
         | article, I'd like to see work stealing between different
         | threads as that would allow fibers to migrate away from threads
         | stuck in CPU intensive work. An alternative way could be to
          | adopt a model similar to Haskell's lightweight threads, where
         | the runtime forces a `yield` after N milliseconds
         | (configurable). That would make sure that CPU intensive work
         | would not block other fibers "too" much.
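          | 
          | A small sketch of the failure mode (assuming a scheduler has
          | been installed with Fiber.set_scheduler):
          | 
          |     Fiber.schedule do
          |       # Pure CPU work: nothing here blocks on IO, so the fiber
          |       # never yields and its siblings on this thread starve.
          |       10_000_000.times { Math.sqrt(rand) }
          |     end
          | 
          |     Fiber.schedule do
          |       puts "only runs after the loop above finishes"
          |     end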
        
           | nesarkvechnep wrote:
            | This is not only the way Haskell's scheduler works but also
            | the Erlang VM's. It's not exactly the same, because the BEAM
            | scheduler uses reductions instead of time, but both
            | scheduling schemes can be classified as preemptive. Since
            | fibers are cooperatively scheduled, I don't believe the core
            | developers of Ruby would just agree to switch the scheduling
            | scheme.
        
           | anonacct38 wrote:
           | Go's journey here has been interesting. Early on it was
           | possible (though rarely seen in practice) to end up with a
            | cpu-bound thread not yielding because it didn't hit a yield
           | point like I/O.
           | 
           | Then they added a guarantee that if your loop called a
           | function, the scheduler would be able to make the goroutine
           | yield. https://golang.org/doc/go1.2#preemption
           | 
           | This mostly worked although you could still have a CPU-bound
           | thread not making any function calls. I also personally ran
           | into a pathological issue where the scheduler was being
           | invoked, but a heuristic kept the current goroutine running
           | so others were still starved.
           | 
            | Finally they added true pre-emption (not yielding) in 1.14
            | (https://golang.org/doc/go1.14#runtime); it looks like it
            | just sends signals and saves state.
           | 
            | One nice thing is that, if I understand the go runtime
           | correctly, work stealing by scheduling goroutines on
           | different threads has been a thing for a long time.
        
         | orf wrote:
         | > They solved the function-coloring problem
         | 
         | Not really, they've just made it super implicit. Any FFI calls
         | are now implicitly coloured, same with anything CPU-heavy.
         | 
         | Like their approach to type annotations, I think this will be a
         | mistake in hindsight.
        
           | lalaithion wrote:
            | In a high-level dynamic language with exceptions, there are
           | already so many implicit "colors" to a function that I think
           | this is still the right choice.
        
           | rdw wrote:
           | This is gonna sound pedantic but I think it's not a coloring
           | problem. Coloring is when syntactically a function has to
           | change its own signature just because it started calling an
           | inner function with the new color.
           | 
           | That said, you're right in that one may have to make some
           | changes to code to get around some new problems. The problem
           | with FFI/CPU-heavy functions is that they prevent _other_
           | fibers from being scheduled while they run.
           | 
           | If it becomes possible to implement work-stealing, then that
           | would mitigate the problem. It would also be solvable by
           | sprinkling "yield to scheduler" calls throughout such
           | functions. Annoying and not always possible, but, since in
           | these cases the signature would not change, technically not
           | coloring.
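            | 
            | One hypothetical way to add such yield points is to hit a
            | primitive that routes through the scheduler, e.g. a zero-
            | length sleep (Kernel#sleep is dispatched to the scheduler's
            | kernel_sleep hook when one is installed); whether that
            | actually reschedules depends on the scheduler. `crunch` and
            | `big_array` are stand-ins:
            | 
            |     Fiber.schedule do
            |       big_array.each_slice(10_000) do |chunk|
            |         crunch(chunk)  # hypothetical CPU-bound helper
            |         sleep(0)       # give the scheduler a chance to run others
            |       end
            |     end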
        
             | gpderetta wrote:
             | It is not pedantic at all, you are completely right; if
              | this implementation were to be classified as coloring,
             | then everything would be, including classic threads.
        
         | cogman10 wrote:
         | This is essentially the approach Java is taking with the
          | upcoming Loom project.
        
       ___________________________________________________________________
       (page generated 2020-12-29 23:00 UTC)