[HN Gopher] Ruby 3.2 preview 1 with support for WASM compilation
       ___________________________________________________________________
        
       Ruby 3.2 preview 1 with support for WASM compilation
        
       Author : pvsukale3
       Score  : 228 points
       Date   : 2022-04-07 17:38 UTC (5 hours ago)
        
 (HTM) web link (www.ruby-lang.org)
 (TXT) w3m dump (www.ruby-lang.org)
        
       | freedomben wrote:
       | Timeouts for Regexp is quite interesting. The engineering purity
       | in me saddens at such a thought, but indeed it seems highly
       | practical.
       | 
       | The syntax feels a little rough although I have no ideas how to
       | make it better:
       | 
       |     Regexp.timeout = 1.0
       |     ...
       |     /^a*b?a*$/ =~ "a" * 50000 + "x"
       | 
       | I think I would favor the:
       | 
       |     long_time_re = Regexp.new("^a*b?a*$", timeout: 1.0)
       | 
       | version instead but I use the `=~` almost entirely, so that would
       | still be a big style change. Probably end up setting a global
       | timeout per app and then overriding for individual checks as
       | needed?
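
A minimal sketch of the "global default plus per-regexp override" setup described above, assuming Ruby 3.2 or later (where `Regexp.timeout=`, the `timeout:` keyword and `Regexp::TimeoutError` exist). `safe_match` is a made-up helper name, the 0.5 second override is an arbitrary illustrative value, and whether this particular pattern actually hits the timeout depends on the Ruby version, so no outcome is promised here:

```ruby
# Sketch only: needs Ruby 3.2+. `safe_match` is a hypothetical helper,
# not part of Ruby's standard library.
def safe_match(re, input)
  re.match?(input)
rescue Regexp::TimeoutError
  :timed_out # the match was abandoned after the configured timeout
end

if Regexp.respond_to?(:timeout=)
  # App-wide default, in seconds; applies to regexp literals used with =~ too.
  Regexp.timeout = 1.0

  # Per-object override for a check that should give up sooner.
  strict_re = Regexp.new("^a*b?a*$", timeout: 0.5)

  input = "a" * 50_000 + "x"
  global_result = safe_match(/^a*b?a*$/, input) # governed by Regexp.timeout
  local_result  = safe_match(strict_re, input)  # governed by timeout: 0.5
end
```

Either call yields `false` (no match) or `:timed_out`; on a Ruby older than 3.2 the guarded block is skipped entirely.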
        
         | speedgoose wrote:
         | A timeout for regexps makes so much sense. And it would end all
         | these denial of service security reports.
        
           | [deleted]
        
           | Thaxll wrote:
           | It does not make sense to me, best solution is to have an
           | implementation like re2 that does not have those problems.
           | 
           | Adding a timeout is a bit strange, first because you don't
           | know in advance how long it's going to take for large search.
           | The timeout is a failsafe against something that should be
           | fixed in the first place.
        
             | jatone wrote:
             | That's the problem with all these syntax sugar features in
             | languages. You literally can't change them without blowing
             | up your entire ecosystem.
        
               | djur wrote:
               | How would this problem be different if Ruby did not have
               | syntax support for regexes and instead offered a regex
               | module in the standard library?
        
             | JohnBooty wrote:
             | You _can_ use re2 in Ruby if you like - it's just not the
             | default. https://github.com/mudge/re2
             | 
             | > best solution is to have an implementation like re2 that
             | > does not have those problems.
             | 
             | By design RE2 isn't fully compatible with Onigmo. As
             | another poster mentioned, a hybrid "use RE2 when possible;
             | fall back to Onigmo otherwise" approach was considered and
             | rejected for well-explained reasons:
             | https://bugs.ruby-lang.org/issues/18653
             | 
             | Maybe in addition to `Regexp.timeout = 1.0` there could
             | also be a `Regexp.parser = :re2` option with `:onigmo`
             | being the default.
        
             | messe wrote:
             | I think a limit on stack/recursion/backtracking depth would
             | be a tad more elegant than a timeout and would keep your
             | code behaviour the same between different machines.
        
               | capableweb wrote:
               | It'd be harder to control perceived performance of user
               | facing applications though. If I can set the timeout, I
               | can guarantee that something happens within X seconds,
               | instead of within X iterations which could have different
               | performance machine to machine.
        
               | MaxLeiter wrote:
               | Is this not the halting problem? You can guarantee a
               | certain depth isn't reached but you can't guarantee a
               | recursion will unwind or anything like that.
        
               | __s wrote:
               | It's not the halting problem; it's bounding computational
               | depth
               | 
               | Their suggestion is essentially making stack overflow a
               | feature in regex, & then allowing that stack depth to be
               | tuned
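
To make the "bound the work, not the wall-clock time" idea from this subthread concrete, here is a toy sketch. It is not how Onigmo or any real engine works: `ToyMatcher`, `BudgetExceeded` and `max_steps` are invented names, and the matcher only understands literal characters optionally followed by `*` or `?`, anchored at both ends. The point is only that a step budget trips after the same number of steps on every machine, whereas a timeout depends on CPU speed and load:

```ruby
# Hypothetical error raised when the matcher runs out of budget.
class BudgetExceeded < StandardError; end

# Toy backtracking matcher with a fixed step budget (illustration only).
class ToyMatcher
  def initialize(pattern, max_steps:)
    # "a*b?a*" becomes [["a", "*"], ["b", "?"], ["a", "*"]]
    @tokens = pattern.scan(/(.)([*?]?)/)
    @max_steps = max_steps
  end

  # Returns true/false, or raises BudgetExceeded if too much work is needed.
  def match?(text)
    @steps = 0
    attempt(0, 0, text)
  end

  private

  # Try to match tokens[ti..] against text[pos..], counting every attempt.
  def attempt(ti, pos, text)
    @steps += 1
    raise BudgetExceeded, "gave up after #{@max_steps} steps" if @steps > @max_steps
    return pos == text.length if ti == @tokens.length

    char, quantifier = @tokens[ti]
    here = text[pos] == char # false at end of text, where text[pos] is nil
    case quantifier
    when "*" # greedy: consume one more char, else move on to the next token
      (here && attempt(ti, pos + 1, text)) || attempt(ti + 1, pos, text)
    when "?" # optional: take the char if possible, else skip the token
      (here && attempt(ti + 1, pos + 1, text)) || attempt(ti + 1, pos, text)
    else     # plain literal
      here && attempt(ti + 1, pos + 1, text)
    end
  end
end

matcher = ToyMatcher.new("a*b?a*", max_steps: 1_000)
matcher.match?("aaba")         # small input, well within budget
matcher.match?("a" * 10 + "x") # fails to match, still within budget

begin
  matcher.match?("a" * 200 + "x") # quadratic backtracking blows the budget
rescue BudgetExceeded => e
  warn e.message
end
```

The trade-off raised in the replies still holds: a step count is reproducible, but it says nothing directly about how many seconds a request will take.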
        
         | andreynering wrote:
         | I wouldn't expect people to have to change that setting often,
         | and 1 second seems very reasonable to me.
         | 
         | So yes, `Regexp.timeout` is supposed to be a default setting
         | for the app, and when really needed you can override it with
         | the `timeout:` key.
        
           | JohnBooty wrote:
           | Yeah the implementation they've chosen seems totally perfect
           | to me. Sane global default, easily overridable globally or
           | locally.
           | 
           | No easy way to override it locally when using =~, but I can't
           | imagine too many cases where you would want to use a local
           | timeout anyway.... can just switch away from =~ syntax for
           | those.
           | 
           | This is mostly a denial-of-service mitigation tool, something
           | you'd just want to apply globally to avoid disasters spawned
           | by malformed or malicious input. In practice, it's hard to
           | imagine a use case where you'd really want to be twiddling
           | the knobs on a regexp-by-regexp basis.
        
           | freedomben wrote:
           | Yes good point, I was initially thinking that it would make
           | sense to always ask yourself "how long should this take" and
           | tune appropriately, but for the vast majority of regexes
           | that's overkill, especially if you're not doing anything
           | O(n^2). Sticking a 1 second timeout in there gives you a lot of
           | headroom and you can just get more specific for any
           | exceptions.
        
             | JohnBooty wrote:
             | > I was initially thinking that it would make sense
             | > to always ask yourself "how long should this take"
             | > and tune appropriately, but for the vast majority
             | > of regexes that's overkill
             | 
             | More than being overkill, it's actually impossible right?
             | 
             | The execution time will also vary greatly based on base CPU
             | performance, and current server load.
             | 
             | A regexp that takes 10ms to process right now might take
             | 500ms tomorrow when your server is under heavy load. So we
             | can't predict how much time each regexp "needs."
             | 
             | But, like you said, we can set a somewhat ridiculously high
             | limit to help prevent regex-based _oopsies_ or re-based DoS
             | attacks from dragging us down =)
        
         | ainar-g wrote:
         | I wonder why they didn't just include an option to use a non-
         | backtracking algorithm, like re2's[1]. As far as I know, that
         | would completely eliminate the possibility of catastrophic
         | backtracking happening.
         | 
         | [1]: https://github.com/google/re2
        
           | byroot wrote:
           | It was explored but decided against, at least for now
           | https://bugs.ruby-lang.org/issues/18653
        
             | dragonwriter wrote:
             | Wrapping RE2 with a fallback to the existing engine to try
             | to maintain compatibility was explored; that, like the
             | timeout approach, is pretty clearly a stopgap measure;
             | actually implementing an RE2-style algorithm without the
             | compatibility and toolchain warts of RE2 for Ruby's
             | existing code and functionality is a bigger but more
             | permanent solution that I don't think has really been
             | ruled out or explored.
        
               | riffraff wrote:
               | if you break compatibility you might as well just use
               | some re2 bindings[0]
               | 
               | [0] https://github.com/mudge/re2
        
         | exfascist wrote:
         | Better arguably would be to use a generator or continuation.
        
       | marcus_cemes wrote:
       | I've been very attracted to learn Ruby a couple of times, being
       | exhausted of the JS ecosystem. Everybody who's used it seems to
       | fall in love with it, but I can't get over just how slow it is...
       | It takes a fresh installation of Discourse over 10 minutes to
       | start-up again on a small underpowered VM and uses 10x as much
       | RAM as an alternative platform such as Flarum.
        
         | inopinatus wrote:
         | My developer experience is that the long initial start time (of
         | Rails in particular) is more than offset by my productivity.
        
         | freedomben wrote:
         | I'm one of those people that fell in love with Ruby, and yeah
         | the speed is the biggest downside. That said, a lot of the bulk
         | is often Rails. I usually use Sinatra now and it's pretty
         | light. On the smallest VM it usually starts quickly and runs
         | fine for quite a while. One even survived an HN Hug. There are
         | also some big improvements coming with Ruby 3 (if you aren't
         | already upgraded to that) and more to come. But you definitely
         | "pay" a fee in CPU/memory for the privilege of using Ruby. In
         | most cases, it's way worth it IMHO. I've also been loving
         | Elixir lately. It's got much the same feeling of beauty that
         | Ruby does, and it's much lighter and lightning fast. I often
         | measure response times in microseconds rather than
         | milliseconds!
        
       | syrusakbary wrote:
       | This is super exciting!
       | 
       | They also created an awesome playground to try Ruby online [1]...
       | all powered by Wasmer/WASI [2]!
       | 
       | [1] https://try.ruby-lang.org/playground/
       | 
       | [2] https://wasmer.io
        
       | eatonphil wrote:
       | This looks awesome! I've already played around with pyodide and
       | coldbrew doing the same thing for CPython. I use it for an in-
       | memory playground [0] of an open-source desktop app I build [1].
       | I've been waiting for Ruby, Julia, and R support to add them in
       | too.
       | 
       | That said, I am not seeing a link in here about how to actually
       | use this code. Is there a good tutorial/example somewhere?
       | 
       | [0] app.datastation.multiprocess.io
       | 
       | [1] github.com/multiprocessio/datastation
        
       | [deleted]
        
       | swlkr wrote:
       | One notable thing is the ruby apps in a single .wasm file. This
       | may make ruby CLI apps easier, as well as eventually replacing
       | things like docker or shipping your ruby code to a server.
        
         | exdsq wrote:
         | Why would it replace docker? You still need the dependencies of
         | the CLI app
         | 
         | Edit: Ah I guess it's just the WASM vm if it includes
         | everything
        
         | cpuguy83 wrote:
         | You don't just execute a .wasm file, it requires a runtime
         | which will JIT compile the code into machine code and handle
         | the (wasi) system interfaces (e.g. read, write, stat, etc).
        
           | specialp wrote:
           | Yes this is true with all interpreted languages. But if you
           | consider the use-case the OP was contrasting with (Docker)
           | that not only has the Ruby runtime, but an entire Linux OS as
           | well.
        
         | qbasic_forever wrote:
         | I was thinking the same thing, isn't ruby particularly hard to
         | package as it doesn't support static compilation? It would be
         | nice to just sidestep all of that with a hermetic little WASM
         | distribution.
        
       | alberth wrote:
       | Does this imply that Rails apps could run as WASM server apps and
       | receive a huge performance boost?
        
         | teeray wrote:
         | I've found that Rails is like the Crysis of Ruby. Usually the
         | answer to "will X ruby runtime run Rails?" is "not yet."
        
           | pqdbr wrote:
           | Lol, good analogy. "But can it run Rails?"
        
         | eatonphil wrote:
         | There are many other existing/mature Ruby bytecode VMs/JITs you
         | could switch to before a WASM bytecode VM/JIT.
        
           | JohnBooty wrote:
           | Yes, but they're not portable/interoperable in the way that a
           | WASM version would be -- which is why the WASM version is
           | exciting, right?
           | 
           | (Somebody correct me if I'm wrong; I know what WASM is but
           | I'm not sure how it's employed in practice outside of in-
           | browser tech demos of games and things)
        
             | alberth wrote:
             | Cloudflare Workers allows you to deploy server side WASM
             | apps.
             | 
             | https://blog.cloudflare.com/webassembly-on-cloudflare-worker...
        
         | Mikeb85 wrote:
         | It doesn't compile Ruby code to wasm code, it compiles the Ruby
         | interpreter to wasm, so it'll be roughly the same performance
         | as the Ruby interpreter on Windows or Linux.
         | 
         | Also Rails is plenty quick these days, tons of people running
         | it at massive scale.
        
           | the_duke wrote:
           | The best case WASM performance is roughly 20-50% slower than
           | native code, depending on the runtime and the type of code
           | executed.
           | 
           | In the browser you have to also factor in the warmup time.
           | 
           | I'd imagine an interpreter will suffer a lot because certain
           | C tricks like computed goto don't work directly. (This will
           | hopefully be improved by future Wasm proposals)
           | 
           | (Note: that's still plenty fast enough for most use cases,
           | and performance will improve)
        
             | titzer wrote:
             | > The _best case_ WASM performance is roughly 20-50% slower
             | 
             | It's more accurate to say that the _average case_ is 20-50%
             | slower. The best case is on par, or slightly faster than
             | native code[1].
             | 
             | [1] Measurements from our original paper,
             | https://dl.acm.org/doi/10.1145/3062341.3062363
             | 
             | Engines are even faster now.
        
           | bpicolo wrote:
           | It's not comparatively quick, but you don't necessarily need
           | quick to scale or be successful
        
         | shafyy wrote:
         | Why do you think this would give them a performance boost?
        
           | anm89 wrote:
           | This is by far my favorite tech talk of all time. It goes
           | into why WASM can run faster than native code in many
           | contexts, the reason being that it gets around the overhead
           | of OS security rings
           | 
           | https://www.destroyallsoftware.com/talks/the-birth-and-death...
        
             | georgyo wrote:
             | The base argument here doesn't make sense.
             | 
             | WASM requires an interpreter which must be native.
             | 
             | The argument is that this interpreter can be smarter about
             | what crosses OS security rings. But those same improvements
             | could be done in the native compiler or interpreter.
             | 
             | The next argument could be that many things using the WASM
             | target would focus more effort on improving it so all WASM
             | targets benefit outpacing their individual optimizations.
             | 
             | This one is harder to dismiss outright, but instead of
             | optimizing for machine code you are now optimizing your
             | WASM output.
             | 
             | Also this intermediate byte code representation already
             | exists for both LLVM and JVM, which many languages target.
             | 
             | It is difficult to see WASM magically improving performance
             | at all and especially not dramatically enough to encourage
             | people to switch to it for that reasoning.
        
             | eatonphil wrote:
             | That talk is from 2014 and Wikipedia says wasm was
             | announced in 2017?
        
               | time_to_smile wrote:
               | While the parent does seem to be treading into Poe's
               | law territory, it's not entirely correct to dismiss that
               | talk's relationship to wasm based on the dates you're
               | quoting.
               | 
               | Bernhardt in the talk explicitly mentions asm.js, which is
               | the precursor to wasm (it's even mentioned in the
               | Wikipedia article you skimmed a bit too quickly). asm.js
               | was released February 2013.
               | 
               | I'm surprised HN has such a short memory, but the impetus
               | for that talk was a clearly disturbing trend at the time
               | implying that everything should be done in javascript.
               | Node.js was gaining rapid popularity, people were
               | discussing javascript as the new C for using as the
               | language to write example code in, and while things like
               | asm.js were exciting, they seemed to point towards the
               | hilariously nightmarish future Bernhardt is discussing
               | there.
        
               | dnsco wrote:
               | This talk is about asm.js, which is a precursor technology
               | to wasm; the parent's logic seems to be "wasm is an
               | improvement on asm.js". I have no idea if the kernel
               | isolation benefits the Gary Bernhardt talk is about
               | apply.
        
               | capableweb wrote:
               | asm.js was first mentioned in 2013. asm.js was eventually
               | superseded by wasm and is pretty much the beginning of
               | wasm as we know it. Didn't watch the talk, but could
               | asm.js be the thing the presenter was talking about?
        
               | AprilArcus wrote:
               | it post-dated and took inspiration from Mozilla's asm.js,
               | which was highly influential on wasm.
        
       | k__ wrote:
       | Is it comparable to Opal?
        
         | AprilArcus wrote:
         | Not really. Opal is a source-to-source compiler that compiles
         | Ruby to JavaScript. Ruby 3.2 compiles the whole Ruby VM and
         | runtime to wasm, which then runs Ruby inside a real Ruby VM
         | nested within the JS VM.
         | 
         | A good analogy is that Opal is like PureScript, whereas Ruby
         | 3.2 is like GHCJS.
        
         | poisonta wrote:
         | I hope it performs better than Opal.
        
       ___________________________________________________________________
       (page generated 2022-04-07 23:00 UTC)