[HN Gopher] Ruby 3.2 preview 1 with support for WASM compilation
___________________________________________________________________
Author : pvsukale3
Score  : 228 points
Date   : 2022-04-07 17:38 UTC (5 hours ago)

(HTM) web link (www.ruby-lang.org)

| freedomben wrote:
| Timeouts for Regexp are quite interesting. The engineering purist
| in me is saddened by such a thought, but indeed it seems highly
| practical.
|
| The syntax feels a little rough, although I have no idea how to
| make it better:
|
|     Regexp.timeout = 1.0
|     ...
|     /^a*b?a*$/ =~ "a" * 50000 + "x"
|
| I think I would favor the:
|
|     long_time_re = Regexp.new("^a*b?a*$", timeout: 1.0)
|
| version instead, but I use `=~` almost entirely, so that would
| still be a big style change. I'd probably end up setting a global
| timeout per app and then overriding it for individual checks as
| needed?
| speedgoose wrote:
| A timeout for regexps makes so much sense. And it would end all
| these denial of service security reports.
| [deleted]
| Thaxll wrote:
| It does not make sense to me; the best solution is to have an
| implementation like re2 that does not have those problems.
|
| Adding a timeout is a bit strange, first because you don't
| know in advance how long it's going to take for a large search.
| The timeout is a failsafe against something that should be
| fixed in the first place.
| jatone wrote:
| That's the problem with all these syntax sugar features in
| languages. You literally can't change them without blowing
| up your entire ecosystem.
| djur wrote:
| How would this problem be different if Ruby did not have
| syntax support for regexes and instead offered a regex
| module in the standard library?
| JohnBooty wrote:
| You _can_ use re2 in Ruby if you like - it's just not the
| default. https://github.com/mudge/re2
|
|     best solution is to have an implementation like re2 that
|     does not have those problems.
|
| By design, RE2 isn't fully compatible with Onigmo. As another
| poster mentioned, a hybrid "use RE2 when possible; fall back to
| Onigmo otherwise" approach was considered and rejected for
| well-explained reasons: https://bugs.ruby-lang.org/issues/18653
|
| Maybe in addition to `Regexp.timeout = 1.0` there could also be a
| `Regexp.parser = :re2` option with `:onigmo` being the default.
| messe wrote:
| I think a limit on stack/recursion/backtracking depth would
| be a tad more elegant than a timeout and would keep your code
| behaviour the same between different machines.
| capableweb wrote:
| It'd be harder to control perceived performance of user
| facing applications though. If I can set the timeout, I
| can guarantee that something happens within X seconds,
| instead of within X iterations, which could have different
| performance machine to machine.
| MaxLeiter wrote:
| Is this not the halting problem? You can guarantee a
| certain depth isn't reached, but you can't guarantee a
| recursion will unwind or anything like that.
| __s wrote:
| It's not the halting problem; it's bounding computational
| depth.
|
| Their suggestion is essentially making stack overflow a
| feature in regex, & then allowing that stack depth to be
| tuned.
| andreynering wrote:
| I wouldn't expect people to have to change that setting often,
| and 1 second seems very reasonable to me.
|
| So yes, `Regexp.timeout` is supposed to be a default setting
| for the app, and when really needed you can override it with
| the `timeout:` key.
| JohnBooty wrote:
| Yeah, the implementation they've chosen seems totally perfect
| to me. Sane global default, easily overridable globally or
| locally.
|
| No easy way to override it locally when using =~, but I can't
| imagine too many cases where you would want to use a local
| timeout anyway... you can just switch away from =~ syntax for
| those.
|
| This is mostly a denial-of-service mitigation tool, something
| you'd just want to apply globally to avoid disasters spawned
| by malformed or malicious input. In practice, it's hard to
| imagine a use case where you'd really want to be twiddling
| the knobs on a regexp-by-regexp basis.
| freedomben wrote:
| Yes, good point. I was initially thinking that it would make
| sense to always ask yourself "how long should this take" and
| tune appropriately, but for the vast majority of regexes
| that's overkill, especially if you're not doing anything
| O(n^2). Sticking a 1 second timeout in there gives you a lot
| of headroom, and you can just get more specific for any
| exceptions.
| JohnBooty wrote:
|     I was initially thinking that it would make sense
|     to always ask yourself "how long should this take"
|     and tune appropriately, but for the vast majority
|     of regexes that's overkill
|
| More than being overkill, it's actually impossible, right?
|
| The execution time will also vary greatly based on base CPU
| performance and current server load.
|
| A regexp that takes 10ms to process right now might take
| 500ms tomorrow when your server is under heavy load. So we
| can't predict how much time each regexp "needs."
|
| But, like you said, we can set a somewhat ridiculously high
| limit to help prevent regex-based _oopsies_ or re-based DoS
| attacks from dragging us down =)
| ainar-g wrote:
| I wonder why they didn't just include an option to use a
| non-backtracking algorithm, like re2's[1]. As far as I know,
| that would completely eliminate the possibility of catastrophic
| backtracking happening.
|
| [1]: https://github.com/google/re2
| byroot wrote:
| It was explored but decided against, at least for now:
| https://bugs.ruby-lang.org/issues/18653
| dragonwriter wrote:
| Wrapping RE2 with a fallback to the existing engine to try
| to maintain compatibility was explored; that, like the
| timeout approach, is pretty clearly a stopgap measure.
| Actually implementing an RE2-style algorithm without the
| compatibility and toolchain warts of RE2 for Ruby's
| existing code and functionality is a bigger but more
| permanent solution that I don't think has really been
| ruled out or explored.
| riffraff wrote:
| If you break compatibility you might as well just use
| some re2 bindings[0].
|
| [0] https://github.com/mudge/re2
| exfascist wrote:
| Better, arguably, would be to use a generator or continuation.
| marcus_cemes wrote:
| I've been tempted to learn Ruby a couple of times, being
| exhausted by the JS ecosystem. Everybody who's used it seems to
| fall in love with it, but I can't get over just how slow it is...
| It takes a fresh installation of Discourse over 10 minutes to
| start up again on a small underpowered VM, and it uses 10x as
| much RAM as an alternative platform such as Flarum.
| inopinatus wrote:
| My developer experience is that the long initial start time (of
| Rails in particular) is more than offset by my productivity.
| freedomben wrote:
| I'm one of those people that fell in love with Ruby, and yeah,
| the speed is the biggest downside. That said, a lot of the bulk
| is often Rails. I usually use Sinatra now and it's pretty
| light. On the smallest VM it usually starts quickly and runs
| fine for quite a while. One even survived an HN Hug. There are
| also some big improvements coming with Ruby 3 (if you aren't
| already upgraded to that) and more to come. But you definitely
| "pay" a fee in CPU/memory for the privilege of using Ruby. In
| most cases, it's way worth it IMHO. I've also been loving
| Elixir lately.
| It's got much the same feeling of beauty that Ruby does, and
| it's much lighter and lightning fast. I often measure response
| times in microseconds rather than milliseconds!
| syrusakbary wrote:
| This is super exciting!
|
| They also created an awesome playground to try Ruby online [1]...
| all powered by Wasmer/WASI [2]!
|
| [1] https://try.ruby-lang.org/playground/
|
| [2] https://wasmer.io
| eatonphil wrote:
| This looks awesome! I've already played around with pyodide and
| coldbrew doing the same thing for CPython. I use it for an
| in-memory playground [0] of an open-source desktop app I build
| [1]. I've been waiting for Ruby, Julia, and R support to add
| them in too.
|
| That said, I am not seeing a link in here about how to actually
| use this code. Is there a good tutorial/example somewhere?
|
| [0] app.datastation.multiprocess.io
|
| [1] github.com/multiprocessio/datastation
| [deleted]
| swlkr wrote:
| One notable thing is the Ruby apps in a single .wasm file. This
| may make Ruby CLI apps easier to distribute, as well as
| eventually replacing things like Docker or shipping your Ruby
| code to a server.
| exdsq wrote:
| Why would it replace Docker? You still need the dependencies of
| the CLI app.
|
| Edit: Ah, I guess it's just the WASM VM if it includes
| everything.
| cpuguy83 wrote:
| You don't just execute a .wasm file; it requires a runtime
| which will JIT compile the code into machine code and handle
| the (WASI) system interfaces (e.g. read, write, stat, etc).
| specialp wrote:
| Yes, this is true of all interpreted languages. But consider
| the use case the OP was contrasting with (Docker): that has
| not only the Ruby runtime but an entire Linux OS as well.
| qbasic_forever wrote:
| I was thinking the same thing; isn't Ruby particularly hard to
| package, as it doesn't support static compilation? It would be
| nice to just sidestep all of that with a hermetic little WASM
| distribution.
| alberth wrote:
| Does this imply that Rails apps could run as WASM server apps and
| receive a huge performance boost?
| teeray wrote:
| I've found that Rails is like the Crysis of Ruby. Usually the
| answer to "will X ruby runtime run Rails?" is "not yet."
| pqdbr wrote:
| Lol, good analogy. "But can it run Rails?"
| eatonphil wrote:
| There are many other existing/mature Ruby bytecode VMs/JITs you
| could switch to before a WASM bytecode VM/JIT.
| JohnBooty wrote:
| Yes, but they're not portable/interoperable in the way that a
| WASM version would be -- which is why the WASM version is
| exciting, right?
|
| (Somebody correct me if I'm wrong; I know what WASM is but
| I'm not sure how it's employed in practice outside of
| in-browser tech demos of games and things)
| alberth wrote:
| Cloudflare Workers allows you to deploy server side WASM
| apps.
|
| https://blog.cloudflare.com/webassembly-on-cloudflare-worker...
| Mikeb85 wrote:
| It doesn't compile Ruby code to wasm code; it compiles the Ruby
| interpreter to wasm, so it'll be roughly the same performance
| as the Ruby interpreter on Windows or Linux.
|
| Also, Rails is plenty quick these days, with tons of people
| running it at massive scale.
| the_duke wrote:
| The best case WASM performance is roughly 20-50% slower than
| native code, depending on the runtime and the type of code
| executed.
|
| In the browser you also have to factor in the warmup time.
|
| I'd imagine an interpreter will suffer a lot because certain
| C tricks like computed goto don't work directly. (This will
| hopefully be improved by future Wasm proposals)
|
| (Note: that's still plenty fast enough for most use cases,
| and performance will improve)
| titzer wrote:
| > The _best case_ WASM performance is roughly 20-50% slower
|
| It's more accurate to say that the _average case_ is 20-50%
| slower. The best case is on par with, or slightly faster
| than, native code[1].
|
| [1] Measurements from our original paper,
| https://dl.acm.org/doi/10.1145/3062341.3062363
|
| Engines are even faster now.
| bpicolo wrote:
| It's not comparatively quick, but you don't necessarily need
| quick to scale or be successful.
| shafyy wrote:
| Why do you think this would give them a performance boost?
| anm89 wrote:
| This is by far my favorite tech talk of all time. It goes
| into why WASM can run faster than native code in many
| contexts, the reason being that it gets around the overhead
| of OS security rings.
|
| https://www.destroyallsoftware.com/talks/the-birth-and-death...
| georgyo wrote:
| The base argument here doesn't make sense.
|
| WASM requires an interpreter, which must be native.
|
| The argument is that this interpreter can be smarter about
| what crosses OS security rings. But those same improvements
| could be done in the native compiler or interpreter.
|
| The next argument could be that many things using the WASM
| target would focus more effort on improving it, so all WASM
| targets benefit, outpacing their individual optimizations.
|
| This one is harder to dismiss outright, but instead of
| optimizing for machine code you are now optimizing your
| WASM output.
|
| Also, this intermediate byte code representation already
| exists for both LLVM and JVM, which many languages target.
|
| It is difficult to see WASM magically improving performance
| at all, and especially not dramatically enough to encourage
| people to switch to it for that reason.
| eatonphil wrote:
| That talk is from 2014 and Wikipedia says wasm was
| announced in 2017?
| time_to_smile wrote:
| While the parent does seem to be treading into Poe's
| law territory, it's not entirely correct to dismiss that
| talk's relationship to wasm based on the dates you're
| quoting.
|
| Bernhardt in the talk explicitly mentions asm.js, which is
| the precursor to wasm (it's even mentioned in the
| Wikipedia article you skimmed a bit too quickly).
| asm.js was released February 2013.
|
| I'm surprised HN has such a short memory, but the impetus
| for that talk was a clearly disturbing trend at the time
| implying that everything should be done in javascript.
| Node.js was gaining rapid popularity, people were
| discussing javascript as the new C for using as the
| language to write example code in, and while things like
| asm.js were exciting, they seemed to point towards the
| hilariously nightmarish future Bernhardt is discussing
| there.
| dnsco wrote:
| This talk is about asm.js, which is a precursor technology
| to wasm; the parent's logic seems to be "wasm is an
| improvement on asm.js". I have no idea if the kernel
| isolation benefits the Gary Bernhardt talk is about
| apply.
| capableweb wrote:
| asm.js was first mentioned in 2013. asm.js was eventually
| superseded by wasm and is pretty much the beginning of
| wasm as we know it. Didn't watch the talk, but could
| asm.js be the thing the presenter was talking about?
| AprilArcus wrote:
| It post-dated and took inspiration from Mozilla's asm.js,
| which was highly influential on wasm.
| k__ wrote:
| Is it comparable to Opal?
| AprilArcus wrote:
| Not really. Opal is a source-to-source compiler that compiles
| Ruby to JavaScript. Ruby 3.2 compiles the whole Ruby VM and
| runtime to wasm, which then runs Ruby inside a real Ruby VM
| nested within the JS VM.
|
| A good analogy is that Opal is like PureScript, whereas Ruby
| 3.2 is like GHCJS.
| poisonta wrote:
| I hope it performs better than Opal.
___________________________________________________________________
(page generated 2022-04-07 23:00 UTC)