[HN Gopher] Compilers and IRs: LLVM IR, SPIR-V, and MLIR
___________________________________________________________________
Compilers and IRs: LLVM IR, SPIR-V, and MLIR
Author : matt_d
Score : 36 points
Date : 2022-10-29 19:19 UTC (3 hours ago)
(HTM) web link (www.lei.chat)
(TXT) w3m dump (www.lei.chat)

| thechao wrote:
| I have an irrational dislike of SPIR-V. On the flip side of the
| coin I think MLIR is a work of genius -- especially as a
| springboard for ideas in developing custom IR.
| fooker wrote:
| MLIR is also siloing compiler research and development.
|
| Everyone and their mother has their own proprietary MLIR
| dialect nowadays, and the era of competitive open source
| compilers is sort of fading.
| k4st wrote:
| At Trail of Bits, we are creating a new compiler front/middle end
| for Clang called VAST [1]. It consumes Clang ASTs and creates a
| high-level, information-rich MLIR dialect. Then, we progressively
| lower it through various other dialects, eventually down to the
| LLVM dialect in MLIR, which can be translated directly to LLVM
| IR.
|
| Our goals with this pipeline are to enable static analyses that
| can choose the right abstraction level(s) for their goals, and
| using provenance, cross abstraction levels to relate results back
| to source code.
|
| Neither Clang ASTs nor LLVM IR alone meet our needs for static
| analysis. Clang ASTs are too verbose and lack explicit
| representations for implicit behaviours in C++. LLVM IR isn't
| really "one IR," it's two IRs (LLVM proper, and metadata),
| where LLVM proper is an unspecified family of dialects (-O0, -O1,
| -O2, -O3, then all the arch-specific stuff). LLVM IR also isn't
| easy to relate to source, even in the presence of maximal debug
| information. The Clang codegen process does ABI-specific lowering
| that takes high-level types/values and transforms them to be more
| amenable to storing in target-cpu locations (e.g. registers).
| This actively works against relating information across levels;
| something that we want to solve with intermediate MLIR dialects.
|
| Beyond our static analysis goals, I think an MLIR-based setup
| will be a key enabler of library-aware compiler optimizations.
| Right now, library-aware optimizations are challenging because
| Clang ASTs are hard to mutate, and by the time things are in LLVM
| IR, the abstraction boundaries provided by libraries are broken
| down by optimizations (e.g. inlining, specialization, folding),
| forcing optimization passes to reckon with the mechanics of how
| libraries are implemented.
|
| We're very excited about MLIR, and we're pushing full steam ahead
| with VAST. MLIR is a technology that we can use to fix a lot of
| issues in Clang/LLVM that hinder really good static analysis.
|
| [1] https://github.com/trailofbits/vast
| erichocean wrote:
| > _LLVM dialect in MLIR, which can be translated directly to
| MLIR_
|
| Should be:
|
| LLVM dialect in MLIR, which can be translated directly to _LLVM
| IR_
|
| Otherwise, great project! We're also using MLIR internally and
| it's been awesome, game-changing even when considering how much
| can be accomplished with a reasonable amount of effort.
| k4st wrote:
| Typo fixed! Thanks :-)
|
| I think the next big problems for MLIR to address are things
| like: metadata/location maintenance when integrating with
| third-party dialects and transformations. With LLVM
| optimizations, getting the optimization right has always
| seemed like the top priority, and then maybe getting metadata
| propagation working came a distant second.
|
| I think the opportunity with MLIR is that metadata/location
| info can be the old nodes or other dialects. In our work, we
| want a tower/progression of IRs, and we want them
| _simultaneously_ in memory, all living together. You could
| think of the debug metadata for a lower level dialect being
| the higher level dialect.
| This is why I sometimes think about
| LLVM IR as really being two IRs: LLVM "code" and metadata
| nodes. Metadata nodes in LLVM IR can represent arbitrary
| structures, but lack concrete checks/balances. MLIR fixes
| this by unifying the representations, bringing in structure
| while retaining flexibility.
| manv1 wrote:
| Funny that there was no mention of GCC, since its IR was probably
| one of the first IRs that anyone encountered IRL. If I remember
| correctly, one motivation for Clang/LLVM was that GCC's IR was
| so bad.
|
| I knew people that wrote backends for gcc, and they pretty much
| all agreed it was a nightmare.
___________________________________________________________________
(page generated 2022-10-29 23:00 UTC)