[HN Gopher] Emitting Safer Rust with C2Rust
___________________________________________________________________

Author : dtolnay
Score  : 96 points
Date   : 2023-03-14 05:32 UTC (1 day ago)

(HTM) web link (immunant.com)
(TXT) w3m dump (immunant.com)

| Animats wrote:
| DARPA is funding this. Good.
|
| They haven't reached inter-procedural static analysis yet, which
| means they can't solve the big problem: how big is an array? Most
| of the troubles in C come from that. Whoever creates the array
| knows how big it is. Everybody else is guessing.
|
| A bit of machine learning might help here. If you see
|
|     void dosomethingwitharray(int arr[], size_t n) {}
|
| a good conjecture is that _n_ is the length of _arr_. So, the
| question is, if this is translated to
|
|     fn dosomethingwitharray(arr: &[i32]) {}
|
| does it break anything? Both caller and callee have to be
| analyzed. The C caller has the constraint
|
|     assert_eq!(arr.len(), n);
|
| That's a proof goal. If a simple SMT-type prover can prove that
| true, then the call can be simplified to just use an ordinary
| Rust slice. If not, conversion to Rust has to drop to those ugly
| C pointer forms, preferably with a comment inserted. So you need
| something that makes good guesses, which is a large language
| model kind of thing, and something which checks them, which is a
| formalism kind of thing.
|
| The process can be assisted by putting asserts in the original C,
| as checks on the C and hints to the conversion process. That's
| probably the cleanest way to provide human assistance.
|
| I've wanted this for conversion of OpenJPEG code to Rust. That's
| a tangle of code doing wavelet transforms, with long blocks of
| touchy subscripting and arithmetic, plus encoders and decoders
| for an overly complex binary format containing offsets and
| lengths. Someone recently ran it through c2rust. The unsafe Rust
| code works. It's compatible with the original C - it segfaults
| for the same test cases which cause the C code to segfault. This
| is why a naive transpiler isn't too helpful.
|
| (The date at the bottom of the article is 2022-06-13. Has there
| been further progress?)

| meepmorp wrote:
| > The date at the bottom of the article is 2022-06-13. Has
| there been further progress?
|
| The article links to their GitHub repo:
|
| https://github.com/immunant/c2rust
|
| There are commits from the last hour, so at least some signal of
| life.

| mtlmtlmtlmtl wrote:
| Has anyone put this to serious use? I played around with it at
| some point when it was fairly new, and at that time I was able to
| transpile the C into Rust just fine, but that didn't help me
| much. The idea was to be able to use the Rust toolchain to better
| understand the code, but the resulting Rust code was even less
| understandable, and also much harder to refactor. In this case I
| wasn't attempting a rewrite per se, just trying to understand a C
| codebase plagued with memory safety issues. I quickly gave up on
| this avenue at that point and just started carefully refactoring
| the C to make the bugs easier to shake out.
|
| Would love to see a technical write-up from someone outside
| Immunant using this on a real-world codebase for whatever
| purpose.

| diego_moita wrote:
| I am very curious to see how these transpiler problems will be
| handled by GPT-4 in the upcoming months.

| boredumb wrote:
| C2Rust is really cool, but if you're familiar with writing Rust
| and run even a trivial C function through it, it produces
| something absolutely terrifying. I really enjoy Rust and pray I
| don't find myself working in a code base someone just ran c2rust
| against.

| FridgeSeal wrote:
| Isn't the point to generate _semantically_ equivalent Rust code
| from C, so that you can just get it re-compiling under Rust,
| and then from there you have a working base from which to start
| rewriting into safer Rust?
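
To make the "working base first, safer Rust second" workflow concrete, here is a minimal sketch of the two stages. The function names `sum_array_raw` and `sum_array` are hypothetical, and the first function only imitates the general shape of a mechanical pointer-plus-length translation; it is not actual C2Rust output.

```rust
/// Stage 1: roughly what a direct translation of the C signature
/// `int sum_array(const int *arr, size_t n)` looks like. The pointer
/// and the length travel separately, just as in C.
///
/// # Safety
/// `arr` must point to at least `n` readable, initialized `i32` values.
pub unsafe fn sum_array_raw(arr: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < n {
        // SAFETY: the caller promised that `arr[0..n]` is valid to read.
        // Nothing in the type system ties `n` to the real allocation.
        total += unsafe { *arr.add(i) };
        i += 1;
    }
    total
}

/// Stage 2: a hand-refactored version. The slice carries its own
/// length, so the separate `n` parameter and the `unsafe` both go away.
pub fn sum_array(arr: &[i32]) -> i32 {
    arr.iter().sum()
}
```

A caller of the first version has to uphold the length contract by hand, for example `unsafe { sum_array_raw(data.as_ptr(), data.len()) }`, whereas the second is called as `sum_array(&data)` and is bounds-safe by construction. Moving from the first form to the second is only valid if every call site really passes the array's true length as `n`.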
| masklinn wrote:
| Yes, it's literally spelled out in TFA:
|
| > this provides a starting point for manual refactoring into
| idiomatic and safe Rust

| FpUser wrote:
| I do not know this particular tool, but some automated
| language-to-language transpilers I have seen produce code one
| would not be able to comprehend, never mind edit if the need
| arises.

| masklinn wrote:
| The goal of C2Rust is not to provide a usable code base per se,
| it's to provide a convenient base for conversion: once the
| project is in unsafe Rust it can be managed entirely via Rust
| tooling and is hopefully a lot easier to finish up than if you
| keep having to redefine bindings as you move code from C to
| Rust.
|
| C2Rust is a springboard; if you move C2Rust-ed code to
| production you're doing it very wrong.

| 0cf8612b2e1e wrote:
| On the other hand, if I have some working C dependency which
| I never intend to modify (owing to its complexity or
| stability), plopping in the autogenerated Rust code simplifies
| the build step.

| anticrymactic wrote:
| What problem does C2Rust solve exactly? Isn't it just gonna
| produce "garbage" Rust?
|
| Calling C directly is already possible in Rust.

| kelnos wrote:
| This isn't about calling external C code from Rust; it helps
| people "rewrite" their C code in Rust.
|
| You can debate the merits of doing so, of course, but some
| people do want to do that, and a tool to generate safe,
| somewhat idiomatic Rust from C code would seem to be useful.

| pohl wrote:
| From c2rust.com:
|
| _The C2Rust project is being developed by Galois and Immunant.
| This tool is able to translate most C modules into semantically
| equivalent Rust code. These modules are intended to be compiled
| in isolation in order to produce compatible object files. We
| are developing several tools that help transform the initial
| Rust sources into idiomatic Rust._
|
| _The translator focuses on supporting the C99 standard. C source
| code is parsed and typechecked using clang before being
| translated by our tool._

| eptcyka wrote:
| It helps by lowering the barrier to entry when working on
| rewriting a codebase in Rust.

| masklinn wrote:
| It moves the project directly into Rust land and tooling, which
| hopefully makes it easier to convert it without needing to set
| up multi-language tooling and a moving barrier / interface
| between the two languages.

| dureuill wrote:
| From reading the article, I get that the latest version can
| transform some C into _safe_ Rust.
|
| This gains us machine-proved memory safety. This is huge.

| kccqzy wrote:
| The article shows what improvements they are thinking of so
| that it _doesn't_ produce garbage Rust. (If by garbage Rust
| you mean unsafe Rust.)

| hardwaregeek wrote:
| The post does address this and shows their attempt to produce
| higher quality Rust. I've also seen it used to move off of a C
| toolchain and onto a pure Rust toolchain by porting C code to
| Rust.

| jandrese wrote:
| It makes it easier to get your project on the front page of HN
| as you can claim it is written in Rust.

| hardwaregeek wrote:
| I'm very excited at the possibilities for C2Rust! Dynamic
| analysis to fill in the gaps of static analysis makes a lot of
| sense. I've wanted something similar for inferring TypeScript
| types via runtime analysis (would not be surprised if it exists
| already).
|
| I could see a really compelling use case in cross-compilation
| where you compile your C code to Rust, then use a Rust toolchain
| to cross-compile. Or avoiding interop as well.

| CharlesW wrote:
| This seems like an interesting project to bridge the "boil the
| ocean" approach of rewriting in Rust wholesale.
|
| (For anyone else who found it slightly difficult to read, you can
| remove the added 0.06em `letter-spacing` using your browser's
| developer tools.)
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)