hngopher.com

       [HN Gopher] Enigma: Erlang VM Implementation in Rust
       ___________________________________________________________________
        
       Enigma: Erlang VM Implementation in Rust
        
       Author : adamnemecek
       Score  : 273 points
       Date   : 2020-04-30 14:48 UTC (8 hours ago)
        
 (HTM) web link (github.com)
        
       | chx wrote:
       | > sans the distributed bits for now
       | 
       | I thought the very point of Erlang was the distributed nature of
       | BEAM? Failure is normal etc?
        
       | mwcampbell wrote:
       | A similar project is Lumen [1], which is targeted primarily at
       | WebAssembly.
       | 
       | [1]: https://github.com/lumen/lumen
        
         | archseer wrote:
         | Yep! They appeared shortly after I started and were backed by
         | Dockyard. I think they didn't have much success finding
         | external contributors either though :/
        
           | bcardarella wrote:
           | The project is still under heavy development to get to a
           | point where external contributors make sense. We hope to be
           | there soon!
        
       | amelius wrote:
       | What are some good books on the topic of VM implementation? If no
       | books, then papers also welcome.
       | 
       | (Not necessarily related to Erlang)
        
         | enitihas wrote:
         | Crafting Interpreters is a good book, free to read online. For
         | papers you can read "The implementation of Lua 5"
        
       | JimmyRuska wrote:
       | I would love an erlang implementation where there can be many
       | versions of the code in memory, where you could re-order the
       | message pattern matching at runtime, where you can specify
       | arguments to functions in terms of a map, specify the args and
       | the types of that map specification, and have it compile into
       | numbered arguments, that way you don't have to update many,
       | many functions to add another argument.
        
         | toast0 wrote:
         | > many versions of the code in memory,
         | 
         | You can model that today if you script compilation/loading.
         | When loading a module, you could first load it as {?MODULE,
         | ?VERSION} or ?MODULE_?VERSION if we can't stuff a tuple there,
         | and then also load it as ?MODULE, to use when you don't specify
         | a version.
         | 
         | The hard part is deciding what version to call when, and
         | passing that through to the call sites. And also, to figure out
         | how to signal a process that you want it to update its version.
         | 
         | > where you could re-order the message pattern matching at
         | runtime,
         | 
         | Pattern matching order is part of your code, and hot loading is
         | the way to make changes to your code.
         | 
         | > where you can specify arguments to functions in terms of a
         | map, specify the args and the types of that map specification,
         | and have it compile into numbered arguments, that way you don't
         | have to update many, many functions to add another argument
         | 
         | You could do this yourself today as well; a function could
         | check if its argument is a Map and demapify the arguments, or
         | you could make a utility call_function(Module, Function, Arity,
         | Map) that demapifies and calls erlang:apply on the function.
         | Or, you could have your rapidly changing functions all just
         | take Maps; I did that in the past with Proplists.
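
The "demapify" idea above can be sketched roughly as follows. This is shown in Rust rather than Erlang, and every name in it (`connect`, `connect_from_map`, the argument keys, and the default values) is hypothetical and chosen only for illustration. It assumes all map values arrive as strings.

```rust
use std::collections::HashMap;

// Hypothetical positional function whose argument list keeps growing.
fn connect(host: &str, port: u16, timeout_ms: u64) -> String {
    format!("{}:{} (timeout {} ms)", host, port, timeout_ms)
}

// "Demapify": pull named arguments out of a map, apply defaults for
// anything that is missing, then forward to the positional function.
fn connect_from_map(args: &HashMap<&str, String>) -> Result<String, String> {
    // `host` is required; the other two arguments are optional.
    let host = args.get("host").ok_or("missing required argument: host")?;
    let port = match args.get("port") {
        Some(p) => p.parse::<u16>().map_err(|e| format!("bad port: {}", e))?,
        None => 80, // assumed default
    };
    let timeout_ms = match args.get("timeout_ms") {
        Some(t) => t.parse::<u64>().map_err(|e| format!("bad timeout_ms: {}", e))?,
        None => 5_000, // assumed default
    };
    Ok(connect(host, port, timeout_ms))
}
```

With this shape, adding a fourth argument only touches `connect` and the wrapper; callers that build a map keep working as long as the new argument has a default.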
        
           | JimmyRuska wrote:
           | Thanks for these suggestions!
           | 
           | Shouldn't there be a way for the compiler to know enough to
           | convert the map to positional arguments, so long as the map
           | params could be put into a spec? Something like that would be
           | super nice, because I think tail calls in Erlang cost nothing
           | so long as you keep the same arg positions. Having a map is
           | always a convenient way of starting a function and evolving
           | it, but having it with a few more constraints and with equal
           | performance would be better. If I'm not mistaken I remember
           | dart made a similar optimization.
        
         | yetihehe wrote:
         | You could do that already. You can compile code at runtime from
         | binary (self modifying code) if you need or you could extract
         | that into several dynamically compiled modules and call them
         | based on argument.
         | 
         | BUT
         | 
         | Different versions of code in memory - that sounds like a
         | nightmare to debug. Erlang already keeps two versions of newly
         | compiled code: the old one for processes currently using it
         | and the new one for all others. Once all processes jump to the
         | new code (by exiting the old module's functions or calling
         | into the new version with a module:function call) the old code
         | is purged.
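
The two-version rule described above can be modelled in a few lines. This is a toy sketch in Rust, not how BEAM or Enigma implement code loading: `CodeServer`, `ModuleSlot`, and the integer version ids are made-up names, and the model does not track which processes still run the old code.

```rust
use std::collections::HashMap;

/// A loaded module holds at most two versions at once.
#[derive(Debug, Default)]
struct ModuleSlot {
    current: Option<u32>, // version that new fully qualified calls reach
    old: Option<u32>,     // version kept for processes that have not switched yet
}

#[derive(Debug, PartialEq)]
enum LoadError {
    /// An old version is still present and must be purged first.
    NotPurged,
}

#[derive(Debug, Default)]
struct CodeServer {
    modules: HashMap<String, ModuleSlot>,
}

impl CodeServer {
    /// Load a new version: the current one becomes old, the new one becomes current.
    fn load(&mut self, name: &str, version: u32) -> Result<(), LoadError> {
        let slot = self.modules.entry(name.to_string()).or_default();
        if slot.old.is_some() {
            return Err(LoadError::NotPurged);
        }
        slot.old = slot.current.take();
        slot.current = Some(version);
        Ok(())
    }

    /// Drop the old version (assumes no process still needs it).
    fn purge(&mut self, name: &str) {
        if let Some(slot) = self.modules.get_mut(name) {
            slot.old = None;
        }
    }

    /// A module:function call always resolves to the current version.
    fn resolve_remote_call(&self, name: &str) -> Option<u32> {
        self.modules.get(name).and_then(|slot| slot.current)
    }
}
```

Loading a third version while an old one is still around fails in this model until `purge` is called, which is the constraint that keeps the number of live versions at two.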
        
       | archseer wrote:
       | Thanks for sharing, author here! (AMA?)
       | 
       | I had a lot of fun working on this project, having implemented
       | enough of the VM to run both Elixir and IEx before I stopped.
       | 
       | Ultimately development stalled since I couldn't get any community
       | interest (I was hoping to give an ElixirConf talk but wasn't
       | accepted either). Was hoping to raise some interest and find some
       | contributors in similar vein to
       | https://github.com/RustPython/RustPython
       | 
       | Nowadays I write a lot less Elixir and a lot more Rust.
        
         | jacquesm wrote:
         | Why start if you have no intention to see it through? Talks and
         | adoption by the community are a chicken and egg problem, if you
         | don't believe enough in the project to give it staying power
         | then the community is right to not adopt it: they already have
         | a VM for BEAM and it works well. Without an additional selling
         | point 'now in Rust' doesn't cut it.
        
           | GolDDranks wrote:
           | I find this attitude puzzling. If they wanted to start, why
           | not? It seems, according to the comment by the author
           | themselves, that they had a lot of fun, and it seemed like an
           | interesting project. So I don't see any reason to back away
           | from it? On the other hand, for precisely the same reason, I
           | find it perfectly reasonable that they didn't continue with
           | it. If the fun is gone, why continue?
        
           | merlinsbrain wrote:
           | > Why start if you have no intention to see it through
           | 
           | The author had a threshold of good feedback they needed from
           | the community in a certain amount of time. They got the
           | feedback they needed - people aren't interested in it,
           | probably because of the latter part of your comment.
           | 
           | I don't think that's a valid reason to ask why someone
           | started a thing, people start things for a variety of
           | reasons. As far as I am concerned, they saw the development
           | of a reimplementation of solid tech through and learned a lot
           | from it.
           | 
           | > they already have a VM for BEAM and it works well. Without
           | an additional selling point 'now in Rust' doesn't cut it.
           | 
           | This is spot on though.
        
           | busterarm wrote:
           | I see this tendency a lot from the Rust community where
           | there's a lot of "now in Rust" being built, expectations had
           | and then hurt feelings when they're either ignored or shown
           | the door. The community seems to think that "now in Rust"
           | _is_ the selling point. Tools are just tools.
           | 
           | What they don't realize is that they're often building
           | solutions that are looking for problems, rather than
           | solutions to solve problems. It's also vaguely cultish in the
           | approach.
           | 
           | It's a terrific language and there's a lot to learn from it,
           | but I'd like to see it solve real world problems on its own
           | versus try and screw itself into everyone else's.
        
             | 59nadir wrote:
             | While it's undeniable the majority of things to come out of
             | Rust are mostly superfluous rewrites of already solid
             | projects (to your point about "now in Rust!" being the
             | selling point) I think it's clear the author in this
             | particular case was just looking into BEAM internals and
             | started a fun project, so I don't think it applies here.
             | 
             | In general, though, I think people ought to consider that
             | if they are putting the language they wrote their project
             | in in the marketing blurb for it (given a more serious
             | project), maybe that indicates that the project itself is
             | of little value to other people. "* Written in Rust" isn't
             | a value proposition, it's just an implementation detail.
             | Make real claims about zero crashes, zero leaks, something
             | actually concrete and it can be scrutinized for real.
        
               | chc wrote:
               | I think you're possibly making a faulty assumption here.
               | You're right that "written in Rust" has no particular
               | value to people who just want to use the thing, but
               | that's not the only perspective people bring to open-
               | source software. When somebody markets a project as "X
               | written in Y," I generally assume they're marketing to
               | people who might want to hack on it, and it _is_ relevant
               | from that perspective.
        
               | merlinsbrain wrote:
               | Written in X can definitely be a value proposition.
               | 
               | If you are evaluating a tool/lib/etc that moves at a fast
               | pace and your whole shop is extremely fluent in language
               | X there's huge value add to being able to dive in without
               | a context switch to understand how it works, especially
               | when debugging harder problems.
               | 
               | I don't think it applies to _this_ case where we're
               | getting a VM that is _extremely_ battle-tested. Am I
               | going to use a new OS instead of linux in production
               | because someone tried to write an OS in zig? No. Will I
               | congratulate the author for writing an OS in zig? Yes.
               | 
               | If I am looking for a key-value store and two are
               | equivalent in their purported features and stability, I
               | will choose the one that is written in X that my shop is
               | most fluent in.
        
         | tomp wrote:
         | How did you implement the GC? Is it possible to implement an
         | allocator + GC in Rust without hitting UB?
        
           | enitihas wrote:
           | You can simply use unsafe as an escape hatch
        
         | adamnemecek wrote:
         | Do you think it's reasonable to make a Rust library that allows
         | you to do Erlang style binary matching?
         | 
         | That's what I was originally looking for when I found this.
        
           | swsieber wrote:
           | If you're okay with macros, probably nightly only, then it
           | seems reasonable.
           | 
           | There is also slice matching on stable, which lets you match
           | on parts of slices:
           | https://github.com/rust-lang/rust/pull/67712/ . It went out in
           | 1.42. It has some stuff which makes binary stuff easier, but
           | not by much. But perhaps someday you'll get native binary
           | matching in the language that's closer to what Erlang offers.
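
As a rough illustration of what stable slice patterns give you, here is a minimal sketch that parses a made-up frame layout (1-byte version, 2-byte big-endian length, then payload). The `Frame` type and `parse_frame` function are hypothetical. In Erlang the same match would be a single binary pattern along the lines of `<<Version:8, Len:16, Payload:Len/binary, _/binary>>`.

```rust
/// Hypothetical frame: 1-byte version, 2-byte big-endian length, then payload.
#[derive(Debug, PartialEq)]
struct Frame<'a> {
    version: u8,
    payload: &'a [u8],
}

fn parse_frame(bytes: &[u8]) -> Option<Frame<'_>> {
    match bytes {
        // Bind the fixed-size header bytes and capture the remainder with `rest @ ..`.
        [version, len_hi, len_lo, rest @ ..] => {
            let len = u16::from_be_bytes([*len_hi, *len_lo]) as usize;
            // Slice patterns work on whole elements and cannot use a length
            // bound earlier in the same pattern, so the payload is cut by hand.
            let payload = rest.get(..len)?;
            Some(Frame { version: *version, payload })
        }
        // Fewer than three bytes: not even a complete header.
        _ => None,
    }
}
```

The manual length check and the byte-only granularity are the parts where this falls short of Erlang's bit syntax, which can match sub-byte fields and variable-length segments directly in the pattern.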
        
         | hopia wrote:
         | A new Erlang VM with just replicated functionality is a fairly
         | hard sell to the Erlang/Elixir community, who brag about the
         | industrial track record of the BEAM.
         | 
         | I believe you'd get much more interest if there was some
         | ambitious new promise for this new VM, such as 10x sequential
         | performance etc.
        
           | themgt wrote:
           | If the VM is in Rust could it be compiled to WASM?
        
             | rkangel wrote:
             | You wouldn't want to, for various reasons. See this blog
             | post about Lumen and the design decisions:
             | 
             | https://tylerscript.dev/bringing-the-beam-to-webassembly-
             | wit...
        
               | hobofan wrote:
               | You wouldn't want to _right now_. However for almost all
               | points there is a solution underway/in planning. In a
               | year or two it might be feasible.
               | 
               | There would however be other limitations, like filesystem
               | APIs etc. not being available in the browser that a lot
               | of frameworks in BEAM languages expect, that would
               | severely limit the usefulness, though I guess that
               | applies to either implementation strategy.
        
               | dnautics wrote:
               | I think the point though is that architecturally there
               | are performance hits if you don't respect the fact that
               | WASM has a different architecture than what the BEAM
               | expects (harvard vs von neumann IIRC), so you may NEVER
               | want to if you get it right in the first place.
        
               | wahern wrote:
               | The Harvard-Von Neumann dichotomy is wrong. C also works
               | perfectly fine on Harvard architectures--it's why
               | function pointers in C are special, aren't guaranteed to
               | be convertible to a void pointer, and why uintptr_t is
               | optional. POSIX adds these additional guarantees to
               | support dlsym, which returns function addresses as void
               | pointers.
               | 
               | The problems with compiling other languages to Web
               | Assembly are primarily 1) lack of goto and 2) inability
               | to instantiate and jump between different stacks. These
               | limitations are especially problematic for languages like
               | Erlang/BEAM and Go because Web Assembly-based VM
               | implementations require an extra level of indirection in
               | order to implement some of their core language semantics,
               | resulting in quite slow performance compared to even a
               | pure, strictly compliant C implementation (and presuming
               | the WASM VM itself adds no overhead, which is not
               | actually the case).
               | 
               | WASM excluded goto support because it was argued that the
               | relooper algorithm required to translate goto constructs
               | to structured WASM statements was sufficiently capable to
               | cover the vast majority of existing code. And they
               | provided evidence to back up that claim. The flaw in that
               | reasoning is that language implementations and similar
               | niche cases have special needs that application code
               | rarely requires, and in that space constructs like goto
               | are crucial to both simplicity of implementation and
               | performance; the inadequacies of relooper become the norm
               | rather than the exception.
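
To make the "extra level of indirection" concrete, here is a minimal sketch of the loop-and-match dispatch that structured control flow pushes an interpreter toward. It is a toy stack machine with an invented instruction set (`Op`), not BEAM's or Enigma's bytecode. Every jump has to go back through the single `match` instead of branching straight to the next handler, which is what a goto-based interpreter could do.

```rust
/// Invented instruction set for a toy stack machine.
#[derive(Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    Dup,
    /// Unconditional jump to an instruction index.
    Jump(usize),
    /// Pop a value; jump to the index if the value is non-zero.
    JumpIfNonZero(usize),
    /// Stop and return the top of the stack.
    Halt,
}

fn run(code: &[Op]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    loop {
        // Central dispatch point: a jump only updates `pc`, and control
        // then returns here. Running off the end of the code yields None.
        let op = *code.get(pc)?;
        pc += 1;
        match op {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a + b);
            }
            Op::Dup => {
                let top = *stack.last()?;
                stack.push(top);
            }
            Op::Jump(target) => pc = target,
            Op::JumpIfNonZero(target) => {
                if stack.pop()? != 0 {
                    pc = target;
                }
            }
            Op::Halt => return stack.pop(),
        }
    }
}
```

A countdown loop in this machine is `Push(3), Push(-1), Add, Dup, JumpIfNonZero(1), Halt`: the backward jump is just an assignment to `pc`, followed by another trip through the dispatcher.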
        
               | dnautics wrote:
               | thanks for the clarification!
        
           | jlg23 wrote:
           | Just being able to amend job requirements with "or rust
           | experience" is most probably worth it.
        
           | dnautics wrote:
           | Someday this is going to need to happen though. IMO, "the
           | right way" to do this is via the strangler pattern:
           | 
           | https://www.michielrook.nl/2016/11/strangler-pattern-
           | practic...
           | 
           | Probably the language that is most poised to achieve this is
           | Zig; it would be feasible to start by wrapping the entire
           | BEAM in a zig compilation unit; which at the very least
           | potentially offers an easier path to maintaining the codebase
           | across multiple platforms. Followed by hodgepodge doing bits
           | and pieces in zig, which could be achieved via
           | straightforward transliteration at first.
           | 
           | The very different mindset of the rust PL lends itself to
           | total rewrites, which I don't think will sit well in the BEAM
           | community. On the other hand erlang has tons of strange
           | rewrites happening over its own internal ecosystem all the
           | time (gen_fsm -> gen_statem, pg -> pg2 -> pg), etc.
        
             | muizelaar wrote:
             | Do you know of any examples of this being done with Zig? I
             | can think of a couple with Rust:
             | 
             | - https://gitlab.gnome.org/GNOME/librsvg completed a
             | migration to Rust.
             | 
             | - https://github.com/RazrFalcon/rustybuzz and
             | https://github.com/immunant/rexpat are making decent
             | progress.
        
               | dnautics wrote:
               | No, because zig is still in 0.6.0, and the BDFL says
               | "don't use this in prod yet"? Yeesh.
        
           | fortran77 wrote:
           | We do a lot of Erlang work here. The BEAM is so reliable that
           | I wouldn't spend 1 minute looking at an experimental
           | alternative.
        
             | carapace wrote:
             | Yeah, this. I'm just getting started with Erlang and I
             | already feel like an idiot for not using it sooner. When I
             | think of some of the things I've done to try to achieve
             | what the BEAM does out-of-the-box...
        
               | rhlsthrm wrote:
               | Can you give some examples? I've been getting more and
               | more interested in Erlang.
        
               | battery_cowboy wrote:
               | RPC is basically built in, so you'll probably never use
               | REST internally. There's an in memory database built in
               | (ETS) that will replace redis for most key value storage
               | cases. There's easy recovery from crashes via supervision
               | trees and associated features. You can do hot upgrades
               | while your system is fully operational.
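
For readers who have not seen supervision before, the restart idea can be approximated outside the BEAM. The sketch below is in Rust with OS threads standing in for Erlang processes; `supervise` is a made-up helper. It covers a single worker with a simple restart limit, whereas OTP supervisors offer several restart strategies and limit restarts over a time window.

```rust
use std::sync::Arc;
use std::thread;

/// Run `worker` in its own thread and restart it if it panics, up to
/// `max_restarts` times. Returns Ok(restarts used) once the worker
/// finishes normally, or Err(restarts used) when the limit is exhausted.
fn supervise<F>(max_restarts: u32, worker: F) -> Result<u32, u32>
where
    F: Fn() + Send + Sync + 'static,
{
    let worker = Arc::new(worker);
    let mut restarts = 0;
    loop {
        let w = Arc::clone(&worker);
        // `join` returns Err if the worker thread panicked, so a crash in
        // the worker does not take the supervising thread down with it.
        match thread::spawn(move || (*w)()).join() {
            Ok(()) => return Ok(restarts),
            Err(_) if restarts < max_restarts => restarts += 1,
            Err(_) => return Err(restarts),
        }
    }
}
```

The key property carried over from Erlang is isolation: the failure is contained in the worker, and the supervisor's only job is to decide whether to start a fresh one.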
        
             | spockz wrote:
             | Can you share something on this reliability? Is it more
             | reliable than the JVM? Or is it more predictable in terms
             | of performance? I've found the jvm itself to be rock solid
             | as well.
        
               | hopia wrote:
               | Erlang VM is consistent/predictable in terms of latency,
               | it's engineered that way. For pure computational
               | performance you won't find it optimal.
        
           | archseer wrote:
           | Yeah I definitely didn't see it being production ready any
           | time soon, but I thought it was an interesting project for
           | people that wanted to learn BEAM internals. That's how I got
           | started with it at least; I had problems trying to contribute
           | to BEAM just because of the sheer size of the codebase and
           | lacking the domain specific knowledge.
           | 
           | I do think that having alternative implementations is good
           | for experimentation though, similar to how Ruby was improved
           | upon ideas from JRuby and Rubinius, even if most users never
           | used those two directly.
        
           | conradfr wrote:
           | There are some Go projects like [1] that can connect to Erlang
           | nodes and claim to be speedier.
           | 
           | [1] https://github.com/halturin/ergo
        
         | organicfigs wrote:
         | Nice work! If I wanted to learn how to build VMs, how would I
         | start? My experience is in backend development/distributed
         | systems in Java and Go (so assume I know nothing outside of an
         | introductory OS course)
        
           | Aqua_Geek wrote:
           | The "Crafting Interpreters" book by Bob Nystrom is probably a
           | good way to dig in. It has a whole chunk on implementing a
           | VM: http://craftinginterpreters.com/a-bytecode-virtual-
           | machine.h...
        
             | organicfigs wrote:
             | This is pretty engaging, thank you!
        
             | shijie wrote:
             | Thank you for posting this! Thanks to you I'm digging in to
             | this book right now and having a blast. The author is an
             | engaging writer and it's been tremendous fun thus far.
        
               | Aqua_Geek wrote:
               | His book, "Game Programming Patterns," is great as well:
               | http://gameprogrammingpatterns.com
        
         | songshuu wrote:
         | A VM is more than enough of a project, but are there any
         | thoughts of a Phoenix port?
        
           | ethelward wrote:
           | Why would you want to port Phoenix? As long as the underlying
           | VM follows the spec, Phoenix should be oblivious to which VM
           | it is running on.
        
         | d4mi3n wrote:
         | I love seeing new implementations of popular languages.
         | 
         | Curious: Did implementing this in Rust expose any bad or
         | interesting behavior when replicating the Erlang language spec
         | (https://github.com/erlang/spec) or whatever reference
         | implementation you were targeting?
        
           | archseer wrote:
           | If I remember correctly I found a few edge cases, but they
           | weren't ever hit by OTP, just by partially implemented VMs
           | like mine :)
           | 
           | It was kind of interesting exploring the OTP internals,
           | especially some of the parts that haven't changed in a long
           | time. One example is the PAM: I think it stood for "patrick's
           | abstract machine" and it would compile erlang terms into
           | bytecode for pattern matches (intended for fast ETS lookups).
           | It's all there in one file and it took a fair bit of digging
           | to figure out how it works since it's been static for a long
           | while and nothing on the internet really documented it.
        
             | callamdelaney wrote:
             | The pattern matching algorithm was originally based on the
             | algorithm described in `The implementation of Functional
             | Programming Languages`, the 1987 edition (there are two
             | versions, one is more basic).
             | 
             | Edit: this book is available for free here:
             | https://www.microsoft.com/en-us/research/publication/the-
             | imp...
        
             | d4mi3n wrote:
             | Hah! Great find!
             | 
             | If this is still the case you should definitely consider
             | contributing to the documentation of those files. Odds are
             | they'll be used by the next person to try something
             | similar. :)
        
       | throwaway894345 wrote:
       | This is cool, but looks like development has stalled. Last commit
       | was Sept 2019.
        
       ___________________________________________________________________
       (page generated 2020-04-30 23:00 UTC)