[HN Gopher] Our Experience Porting the YJIT Ruby Compiler to Rust ___________________________________________________________________ Our Experience Porting the YJIT Ruby Compiler to Rust Author : thunderbong Score : 86 points Date : 2022-05-11 18:41 UTC (4 hours ago) (HTM) web link (shopify.engineering) (TXT) w3m dump (shopify.engineering) | epage wrote: | Glad to see an explanation for why they bypassed cargo. I had | wondered about that when the PR was posted as it didn't contain | any documentation for why they were going off the beaten path. | However, I'm a bit confused at the reasoning. Because a solution | felt sub-optimal (vendoring an optional dependency), they went | the route of invoking rustc directly? That seems a bit scorched | earth especially when they later talk about needing another | dependency offline and instead of vendoring both they instead | worked around that in a different way. | Thaxll wrote: | Only 3month for such a project is impressive, how big was the | original C code base? | vlang1dot0 wrote: | According to the article: | | > YJIT is a relatively simple JIT compiler that totals about | 11,000 lines of C code. | SemanticStrengh wrote: | the rust port is 16K LOCs despite the crazy C verbosity?? | (I'm reffering to the github addition count) | pawelduda wrote: | They stated it's not that big of a project and since it was a | direct code conversion rather than to idiomatic Rust, it | probably reduced the workload. | ibraheemdev wrote: | The rewrite commit is here for anyone interested: | https://github.com/ruby/ruby/pull/5826 | steveklabnik wrote: | I've been tickled pink to watch this work happen. A wild fusion | of my past and my present. And of course, much cleaner and | properly done, as opposed to fun experiments I used to like to | do, like https://github.com/steveklabnik/ruby/tree/rust | rubyskills wrote: | Haha, love it. Both of your passions melded together! (We met | in Michigan at our mutual rust conf btw) | throwaway38311 wrote: | avgcorrection wrote: | You might recognize the author from https://pointersgonewild.com/ | faitswulff wrote: | I wonder what the author would have preferred to the visual | litter of `unsafe` blocks? Can you use `unsafe` to mark off | entire scopes to avoid having to redeclare it? Seems unavoidable | in a project that is meant to heavily interoperate with a C99 | code base like Ruby. | | Cargo having to phone home all the time sounds annoying, though. | I hope that gets improved, though I know the cargo team is | swamped right now. | steveklabnik wrote: | > Can you use `unsafe` to mark off entire scopes to avoid | having to redeclare it? | | Yes. | | > Cargo having to phone home all the time sounds annoying, | though. | | It does not have to phone home, ever. They for some reason | didn't seem to want to use the right workflow, but that | workflow does exist, and is an important and well-supported | use-case. | | (It seems from the bug report that they did hit a confusing | error message, which is unfortunate.) | faitswulff wrote: | Thanks for the clarifications! | steveklabnik wrote: | No problem. And I am serious about the "for some reason" | bit, I don't know why `cargo vendor` isn't acceptable for | them, but they know their requirements better than I do. | tptacek wrote: | She gave the reason in the piece, and linked to the ticket | she filed: | | https://github.com/rust-lang/cargo/issues/10352 | steveklabnik wrote: | I read the piece and the ticket. I still don't understand. | That's fine, it's not my responsibility to anymore. I'm | glad they found a solution they're happy with. | | Specifically, cargo vendor was suggested in the ticket, but | there was no response as to why it wasn't adequate for this | case, just that it doesn't "feel" good. I'm probably | missing something. | tptacek wrote: | The reason is right there in the ticket. They have | exactly one dependency, and it's optional. They don't | want to vendor it just so they can build the no-capstone | artifact in CI without having to phone home to crates.io | or whatever. You don't have to agree with her, but you | can't pretend she didn't give a clear reason. | epage wrote: | > They have exactly one dependency, and it's optional. | | Except they would have two dependencies (the other being | libc) but they wanted to avoid vendoring for that as | well. | tptacek wrote: | That's a good point, thanks! | steveklabnik wrote: | Saying "we don't want to" is technically a reason, sure, | but it doesn't mean I understand the why. Why is that a | problem? | | (Depending on that why, I have an idea or two for a | workaround, but you can't suggest solutions until you | understand the actual issue. I have my own laundry list | of issues with Cargo that I have to work around; it's | painful and I have a lot of empathy for that.) | tptacek wrote: | For all the same reasons that vendoring isn't the | _default_ option in any dependency management system in | any language: because there is a cost to vendoring. She | didn 't say that, but I wouldn't think she'd have to. You | don't vendor all of your dependencies either: why not? | There's your answer. | | I'm wondering if maybe you've missed the part where they | don't need this single dependency in the builds they're | talking about doing. | | It seems like the Rust community people responding on the | ticket had no trouble understanding what the issue is. | There's a difference between acknowledging something and | conceding that it needs to be changed. | steveklabnik wrote: | I don't because I don't need to worry about offline | builds. If I did, then I'd vendor them. Hence not | understanding. (And even if it literally is that only, | then I have at least one idea for what they _could_ do | that would make this work. But maybe that solution would | fail for other reasons, and I want to acknowledge that | it's not like I have the full requirements list and am | saying "oh gosh just do this," because I am not.) | | That it's a single dependency is why it feels extreme to | me; won't that be a very, very tiny thing to vendor? | | Anyway I don't really know why you're being so aggro | here. I don't know how many times I can say "I'm probably | missing something" and "they know better than I do." I | really like you, Thomas, but I also hear the passive | aggression, and don't like it. It makes me want to return | it, but that doesn't benefit anyone. | tptacek wrote: | You said "They for some reason didn't seem to want to use | the right workflow", which by my lights manages to be | both loaded and dismissive (if that was the goal, well | played). I pointed out that the "some reason" is right | there in the post, and you doubled down. I'm not sure why | you did; you could have just said, "oh, interesting, | thanks" and I'd have said "pip pip! cheerio!" but if you | want instead to... uh... explore the contours of this | controversy? I'm always game, which is what you're | experiencing now. | | If you want to keep me invested in a programming | discussion, one very good way to do it is to imply that | you can't for the life of you understand why someone | wouldn't want to vendor Capstone. | tptacek wrote: | For what it's worth: the complaint here makes sense, but | "unsafe" isn't just telling the compiler what's going on, it's | (probably more importantly) telling other programmers what's | going on. | VWWHFSfQ wrote: | > Unlike C, Rust won't automatically promote integer types to | wider types. It forces you to manually cast any mismatching | integer types for every operation | | > Rust's insistence on manual casting everywhere encourages | people to write inefficient code, because writing verbose code | feels uncomfortable and adds friction. | | This reads a lot to me like "we like how easy it is to create | bugs in C. We'd like to be able to do that in Rust, too." But | maybe I'm misunderstanding. I'm sure these people know a lot more | about the consequences of implicit integer casting than I do. | colejohnson66 wrote: | Yes, but: | | > ..there's no reason why you couldn't safely promote a u8, | u16, or u32 into a usize, and asking programmers to manually | cast integers everywhere makes the code more noisy and verbose. | | They have a valid point. In C#, numerical casts that won't lose | precision are implicit, and any that could are explicit. I | think that's a nice compromise. So, int to uint requires a | cast, but int/uint to long doesn't: int myInt | = /* ... */; long myLong = myInt; uint myUInt = | myInt; // error CS0266: Cannot implicitly convert type 'int' to | 'uint'. An explicit conversion exists (are you missing a cast?) | ulong myULong = myInt; // error CS0266: Cannot implicitly | convert type 'int' to 'ulong'. An explicit conversion exists | (are you missing a cast?) uint myUInt = /* ... */; | int myInt = myUInt; // error CS0266: Cannot implicitly convert | type 'uint' to 'int'. An explicit conversion exists (are you | missing a cast?) long myLong = myUInt; ulont | myULong = myUInt; | stonemetal12 wrote: | u8 max value + 1 means something as a u8. u8 max value + 1 | means something different as a u16. Implicitly changing | behavior is never cool. | shepmaster wrote: | And chiming in: | | > there's no reason why you couldn't safely promote a u8, | u16, or u32 into a usize | | The people running Rust on a 16-bit architecture (e.g. AVR) | would disagree. On those platforms, usize is 16-bit, so | promoting u32 to usize would not be possible. | | I _do_ wish there was a nicer way to declare up-front "I | *never* want this crate to compile on something smaller than | 32-bit (or 64-bit), so assume that". | | IIRC, similar ideas like this have floated around under the | name of a "platform compatibility lint". | wahern wrote: | But whether implicitly or explicitly converted, the | compiler would reject the code all the same, no? (Unless an | explicit conversion would actually succeed, which is even | _worse_. I 'm not a Rust programmer so don't know. But this | is why explicit casts are frowned upon in C.) | JoshTriplett wrote: | The proposal is that by default the compiler would refuse | to do an infallible conversion from `u32` to `usize`, but | if you declare up front "this code assumes at least a | 32-bit target", then you gain the ability to do | infallible conversions from u32 to usize. And if you | declare up front "this code assumes at least a 64-bit | target", you gain the ability to do infallible | conversions from u64 to usize. | | Also, you can already convert u8 and u16 to usize | infallibly, today. Only u32 and u64 currently require | fallible conversions via try_from. | prirun wrote: | Nim had this same problem a couple of years ago, and it's very | annoying. I just checked some stdlib functions and it's still | there: | | https://github.com/nim-lang/Nim/blob/devel/lib/std/varints.n... | | For example, in readVu64: 18: proc readVu64*(z: | openArray[byte]; pResult: var uint64): int = 24: pResult | = (uint64 z[0] - 241) * 256 + z[1].uint64 + 240 28: | pResult = 2288u64 + 256u64*z[1].uint64 + z[2].uint64 31: | pResult = (z[1].uint64 shl 16u64) + (z[2].uint64 shl 8u64) + | z[3].uint64 33: let x = (z[1].uint64 shl 24) + | (z[2].uint64 shl 16) + (z[3].uint64 shl 8) + z[4].uint64 | | I get that there has to be some indication to the compiler to | distinguish between doing 8-bit unsigned math vs 64-bit | unsigned math to evaluate these expressions. But having such a | mish-mash in a stdlib function makes new users think, "There | must have been a good reason to do it like this since it's in | stdlib, but dang if I know what it is!" | | I ran into this myself a couple of years ago when looking at | Nim and agree it is annoying, verbose, and ugly to be forced to | specify types for obvious integer things. | SemanticStrengh wrote: | This is a big reason on why I loosed interest on rust | __s wrote: | It's more complaining that Rust is verbose when one wants to | store small integers in structs but have intermediate | computation use machine width integers | | Implicit widening doesn't tend to cause bugs. I don't think | it'd be so bad if vec[u8] implicitly worked without needing | vec[u8 as usize] | estebank wrote: | FWIW, you can have a newtype wrapper around Vec that provides | indexing access with any type you want. If you are porting or | writing code that uses i32 for indices for some reason, then | you use the wrapper instead of Vec with no runtime cost and | no inline type conversions. | | This hasn't been solved at the language level yet because if | we `impl Idx for u8` then inference in `vec[42]` breaks. | Making inference support that in a principled manner will | require a lot of work, and doing it in a hacky way might lock | us into a suboptimal place that we haven't fully explored | yet. I want to see _some_ solution here at some point. | [deleted] | SemanticStrengh wrote: | I hope shopify will keep investing in the disruptive truffleruby ___________________________________________________________________ (page generated 2022-05-11 23:00 UTC)