[HN Gopher] Our Experience Porting the YJIT Ruby Compiler to Rust
       ___________________________________________________________________
        
       Our Experience Porting the YJIT Ruby Compiler to Rust
        
       Author : thunderbong
       Score  : 86 points
       Date   : 2022-05-11 18:41 UTC (4 hours ago)
        
 (HTM) web link (shopify.engineering)
 (TXT) w3m dump (shopify.engineering)
        
       | epage wrote:
       | Glad to see an explanation for why they bypassed cargo. I had
       | wondered about that when the PR was posted as it didn't contain
       | any documentation for why they were going off the beaten path.
       | However, I'm a bit confused at the reasoning. Because a solution
       | felt sub-optimal (vendoring an optional dependency), they went
       | the route of invoking rustc directly? That seems a bit scorched
       | earth especially when they later talk about needing another
       | dependency offline and instead of vendoring both they instead
       | worked around that in a different way.
        
       | Thaxll wrote:
       | Only 3month for such a project is impressive, how big was the
       | original C code base?
        
         | vlang1dot0 wrote:
         | According to the article:
         | 
         | > YJIT is a relatively simple JIT compiler that totals about
         | 11,000 lines of C code.
        
           | SemanticStrengh wrote:
           | the rust port is 16K LOCs despite the crazy C verbosity??
           | (I'm reffering to the github addition count)
        
         | pawelduda wrote:
         | They stated it's not that big of a project and since it was a
         | direct code conversion rather than to idiomatic Rust, it
         | probably reduced the workload.
        
       | ibraheemdev wrote:
       | The rewrite commit is here for anyone interested:
       | https://github.com/ruby/ruby/pull/5826
        
       | steveklabnik wrote:
       | I've been tickled pink to watch this work happen. A wild fusion
       | of my past and my present. And of course, much cleaner and
       | properly done, as opposed to fun experiments I used to like to
       | do, like https://github.com/steveklabnik/ruby/tree/rust
        
         | rubyskills wrote:
         | Haha, love it. Both of your passions melded together! (We met
         | in Michigan at our mutual rust conf btw)
        
         | throwaway38311 wrote:
        
       | avgcorrection wrote:
       | You might recognize the author from https://pointersgonewild.com/
        
       | faitswulff wrote:
       | I wonder what the author would have preferred to the visual
       | litter of `unsafe` blocks? Can you use `unsafe` to mark off
       | entire scopes to avoid having to redeclare it? Seems unavoidable
       | in a project that is meant to heavily interoperate with a C99
       | code base like Ruby.
       | 
       | Cargo having to phone home all the time sounds annoying, though.
       | I hope that gets improved, though I know the cargo team is
       | swamped right now.
        
         | steveklabnik wrote:
         | > Can you use `unsafe` to mark off entire scopes to avoid
         | having to redeclare it?
         | 
         | Yes.
         | 
         | > Cargo having to phone home all the time sounds annoying,
         | though.
         | 
         | It does not have to phone home, ever. They for some reason
         | didn't seem to want to use the right workflow, but that
         | workflow does exist, and is an important and well-supported
         | use-case.
         | 
         | (It seems from the bug report that they did hit a confusing
         | error message, which is unfortunate.)
        
           | faitswulff wrote:
           | Thanks for the clarifications!
        
             | steveklabnik wrote:
             | No problem. And I am serious about the "for some reason"
             | bit, I don't know why `cargo vendor` isn't acceptable for
             | them, but they know their requirements better than I do.
        
           | tptacek wrote:
           | She gave the reason in the piece, and linked to the ticket
           | she filed:
           | 
           | https://github.com/rust-lang/cargo/issues/10352
        
             | steveklabnik wrote:
             | I read the piece and the ticket. I still don't understand.
             | That's fine, it's not my responsibility to anymore. I'm
             | glad they found a solution they're happy with.
             | 
             | Specifically, cargo vendor was suggested in the ticket, but
             | there was no response as to why it wasn't adequate for this
             | case, just that it doesn't "feel" good. I'm probably
             | missing something.
        
               | tptacek wrote:
               | The reason is right there in the ticket. They have
               | exactly one dependency, and it's optional. They don't
               | want to vendor it just so they can build the no-capstone
               | artifact in CI without having to phone home to crates.io
               | or whatever. You don't have to agree with her, but you
               | can't pretend she didn't give a clear reason.
        
               | epage wrote:
               | > They have exactly one dependency, and it's optional.
               | 
               | Except they would have two dependencies (the other being
               | libc) but they wanted to avoid vendoring for that as
               | well.
        
               | tptacek wrote:
               | That's a good point, thanks!
        
               | steveklabnik wrote:
               | Saying "we don't want to" is technically a reason, sure,
               | but it doesn't mean I understand the why. Why is that a
               | problem?
               | 
               | (Depending on that why, I have an idea or two for a
               | workaround, but you can't suggest solutions until you
               | understand the actual issue. I have my own laundry list
               | of issues with Cargo that I have to work around; it's
               | painful and I have a lot of empathy for that.)
        
               | tptacek wrote:
               | For all the same reasons that vendoring isn't the
               | _default_ option in any dependency management system in
               | any language: because there is a cost to vendoring. She
               | didn 't say that, but I wouldn't think she'd have to. You
               | don't vendor all of your dependencies either: why not?
               | There's your answer.
               | 
               | I'm wondering if maybe you've missed the part where they
               | don't need this single dependency in the builds they're
               | talking about doing.
               | 
               | It seems like the Rust community people responding on the
               | ticket had no trouble understanding what the issue is.
               | There's a difference between acknowledging something and
               | conceding that it needs to be changed.
        
               | steveklabnik wrote:
               | I don't because I don't need to worry about offline
               | builds. If I did, then I'd vendor them. Hence not
               | understanding. (And even if it literally is that only,
               | then I have at least one idea for what they _could_ do
               | that would make this work. But maybe that solution would
               | fail for other reasons, and I want to acknowledge that
               | it's not like I have the full requirements list and am
               | saying "oh gosh just do this," because I am not.)
               | 
               | That it's a single dependency is why it feels extreme to
               | me; won't that be a very, very tiny thing to vendor?
               | 
               | Anyway I don't really know why you're being so aggro
               | here. I don't know how many times I can say "I'm probably
               | missing something" and "they know better than I do." I
               | really like you, Thomas, but I also hear the passive
               | aggression, and don't like it. It makes me want to return
               | it, but that doesn't benefit anyone.
        
               | tptacek wrote:
               | You said "They for some reason didn't seem to want to use
               | the right workflow", which by my lights manages to be
               | both loaded and dismissive (if that was the goal, well
               | played). I pointed out that the "some reason" is right
               | there in the post, and you doubled down. I'm not sure why
               | you did; you could have just said, "oh, interesting,
               | thanks" and I'd have said "pip pip! cheerio!" but if you
               | want instead to... uh... explore the contours of this
               | controversy? I'm always game, which is what you're
               | experiencing now.
               | 
               | If you want to keep me invested in a programming
               | discussion, one very good way to do it is to imply that
               | you can't for the life of you understand why someone
               | wouldn't want to vendor Capstone.
        
         | tptacek wrote:
         | For what it's worth: the complaint here makes sense, but
         | "unsafe" isn't just telling the compiler what's going on, it's
         | (probably more importantly) telling other programmers what's
         | going on.
        
       | VWWHFSfQ wrote:
       | > Unlike C, Rust won't automatically promote integer types to
       | wider types. It forces you to manually cast any mismatching
       | integer types for every operation
       | 
       | > Rust's insistence on manual casting everywhere encourages
       | people to write inefficient code, because writing verbose code
       | feels uncomfortable and adds friction.
       | 
       | This reads a lot to me like "we like how easy it is to create
       | bugs in C. We'd like to be able to do that in Rust, too." But
       | maybe I'm misunderstanding. I'm sure these people know a lot more
       | about the consequences of implicit integer casting than I do.
        
         | colejohnson66 wrote:
         | Yes, but:
         | 
         | > ..there's no reason why you couldn't safely promote a u8,
         | u16, or u32 into a usize, and asking programmers to manually
         | cast integers everywhere makes the code more noisy and verbose.
         | 
         | They have a valid point. In C#, numerical casts that won't lose
         | precision are implicit, and any that could are explicit. I
         | think that's a nice compromise. So, int to uint requires a
         | cast, but int/uint to long doesn't:                   int myInt
         | = /* ... */;         long myLong = myInt;         uint myUInt =
         | myInt; // error CS0266: Cannot implicitly convert type 'int' to
         | 'uint'. An explicit conversion exists (are you missing a cast?)
         | ulong myULong = myInt; // error CS0266: Cannot implicitly
         | convert type 'int' to 'ulong'. An explicit conversion exists
         | (are you missing a cast?)              uint myUInt = /* ... */;
         | int myInt = myUInt; // error CS0266: Cannot implicitly convert
         | type 'uint' to 'int'. An explicit conversion exists (are you
         | missing a cast?)         long myLong = myUInt;         ulont
         | myULong = myUInt;
        
           | stonemetal12 wrote:
           | u8 max value + 1 means something as a u8. u8 max value + 1
           | means something different as a u16. Implicitly changing
           | behavior is never cool.
        
           | shepmaster wrote:
           | And chiming in:
           | 
           | > there's no reason why you couldn't safely promote a u8,
           | u16, or u32 into a usize
           | 
           | The people running Rust on a 16-bit architecture (e.g. AVR)
           | would disagree. On those platforms, usize is 16-bit, so
           | promoting u32 to usize would not be possible.
           | 
           | I _do_ wish there was a nicer way to declare up-front "I
           | *never* want this crate to compile on something smaller than
           | 32-bit (or 64-bit), so assume that".
           | 
           | IIRC, similar ideas like this have floated around under the
           | name of a "platform compatibility lint".
        
             | wahern wrote:
             | But whether implicitly or explicitly converted, the
             | compiler would reject the code all the same, no? (Unless an
             | explicit conversion would actually succeed, which is even
             | _worse_. I 'm not a Rust programmer so don't know. But this
             | is why explicit casts are frowned upon in C.)
        
               | JoshTriplett wrote:
               | The proposal is that by default the compiler would refuse
               | to do an infallible conversion from `u32` to `usize`, but
               | if you declare up front "this code assumes at least a
               | 32-bit target", then you gain the ability to do
               | infallible conversions from u32 to usize. And if you
               | declare up front "this code assumes at least a 64-bit
               | target", you gain the ability to do infallible
               | conversions from u64 to usize.
               | 
               | Also, you can already convert u8 and u16 to usize
               | infallibly, today. Only u32 and u64 currently require
               | fallible conversions via try_from.
        
         | prirun wrote:
         | Nim had this same problem a couple of years ago, and it's very
         | annoying. I just checked some stdlib functions and it's still
         | there:
         | 
         | https://github.com/nim-lang/Nim/blob/devel/lib/std/varints.n...
         | 
         | For example, in readVu64:                 18: proc readVu64*(z:
         | openArray[byte]; pResult: var uint64): int =       24: pResult
         | = (uint64 z[0] - 241) * 256 + z[1].uint64 + 240       28:
         | pResult = 2288u64 + 256u64*z[1].uint64 + z[2].uint64       31:
         | pResult = (z[1].uint64 shl 16u64) + (z[2].uint64 shl 8u64) +
         | z[3].uint64       33: let x = (z[1].uint64 shl 24) +
         | (z[2].uint64 shl 16) + (z[3].uint64 shl 8) + z[4].uint64
         | 
         | I get that there has to be some indication to the compiler to
         | distinguish between doing 8-bit unsigned math vs 64-bit
         | unsigned math to evaluate these expressions. But having such a
         | mish-mash in a stdlib function makes new users think, "There
         | must have been a good reason to do it like this since it's in
         | stdlib, but dang if I know what it is!"
         | 
         | I ran into this myself a couple of years ago when looking at
         | Nim and agree it is annoying, verbose, and ugly to be forced to
         | specify types for obvious integer things.
        
           | SemanticStrengh wrote:
           | This is a big reason on why I loosed interest on rust
        
         | __s wrote:
         | It's more complaining that Rust is verbose when one wants to
         | store small integers in structs but have intermediate
         | computation use machine width integers
         | 
         | Implicit widening doesn't tend to cause bugs. I don't think
         | it'd be so bad if vec[u8] implicitly worked without needing
         | vec[u8 as usize]
        
           | estebank wrote:
           | FWIW, you can have a newtype wrapper around Vec that provides
           | indexing access with any type you want. If you are porting or
           | writing code that uses i32 for indices for some reason, then
           | you use the wrapper instead of Vec with no runtime cost and
           | no inline type conversions.
           | 
           | This hasn't been solved at the language level yet because if
           | we `impl Idx for u8` then inference in `vec[42]` breaks.
           | Making inference support that in a principled manner will
           | require a lot of work, and doing it in a hacky way might lock
           | us into a suboptimal place that we haven't fully explored
           | yet. I want to see _some_ solution here at some point.
        
         | [deleted]
        
       | SemanticStrengh wrote:
       | I hope shopify will keep investing in the disruptive truffleruby
        
       ___________________________________________________________________
       (page generated 2022-05-11 23:00 UTC)