[HN Gopher] How to learn compilers: LLVM Edition
___________________________________________________________________

How to learn compilers: LLVM Edition

Author : AlexDenisov
Score  : 62 points
Date   : 2021-11-04 21:00 UTC (1 days ago)

(HTM) web link (lowlevelbits.org)
(TXT) w3m dump (lowlevelbits.org)

| anonymousDan wrote:
| I have to say, personally I find general program analysis (e.g.
| for security) a much more interesting topic than most vanilla
| compiler courses. For example, I recently came across this course
| by the maintainers of Soot:
| https://youtube.com/playlist?list=PLamk8lFsMyPXrUIQm5naAQ08a...
|
| Any pointers to similar courses much appreciated!
| the_benno wrote:
| Anders Moeller and Michael Schwartzbach's book [1] on static
| program analysis is a fantastic resource, with (I think) a
| great balance of theory and practice. If you want to get really
| deep into the theory of program analysis, Patrick Cousot just
| published an incredibly thorough book on abstract
| interpretation (I just got my copy this week, so I haven't
| explored it enough to have much of an opinion on it as a
| pedagogical resource).
|
| [1] cs.au.dk/~amoeller/spa
| andrewchambers wrote:
| More great things:
|
| - https://c9x.me/compile/
|
| - https://github.com/vnmakarov/mir
| tester34 wrote:
| Since there are many compiler hackers here, I'd like to ask a
| question:
|
| How do you distribute your frontend with LLVM?
|
| Let's say that I have a lexer, parser and emitter written in e.g.
| Haskell (random example).
|
| I emit LLVM IR and then use LLVM to generate something else from
| it.
|
| But the problem is that I need to have LLVM binaries, and I'd
| rather avoid telling people who want to contribute to my OSS
| project to install LLVM, because it's a painful process as hell.
|
| So I thought about just adding a "binaries" folder to my repo and
| putting the executables there, but the problem is that they're huge
| as hell!
| And also, when you're on Linux, you don't need the Windows
| binaries.
|
| Another problem is that the LLVM installer doesn't include all the
| LLVM components that I need (llc and wasm-ld), so I have to compile
| it myself and tell CMake (IIRC) to generate those.
|
| I thought about creating a second repo containing the binaries
| compiled for all platforms (Mac, Linux, Windows); after cloning
| my_repo1, the instructions would point to the specific binaries to
| download.
|
| How do you people do it?
| staticfloat wrote:
| In the Julia world, we make redistributable binaries for all
| sorts of things; you can find lots of packages here [0], and
| for LLVM in particular (which Julia uses to do its codegen) you
| can find _just_ libLLVM.so (plus a few supporting files) here
| [1]. If you want a more fully-featured, batteries-included
| build of LLVM, check out this package [2].
|
| When using these JLL packages from Julia, it will automatically
| download and load in dependencies, but if you're using it from
| some other system, you'll probably need to manually check out
| the `Project.toml` file and see what other JLL packages are
| listed as dependencies. As an example, `LLVM_full_jll` requires
| `Zlib_jll` [3], since we build with support for compressed ELF
| sections. As you may have guessed, you can get `Zlib_jll` from
| [4], and it thankfully does not have any transitive
| dependencies.
|
| In the Julia world, we're typically concerned with dynamic
| linking (we `dlopen()` and `dlsym()` our way into all our
| binary dependencies), so this may not meet all your needs, but I
| figured I'd give it a shout out as it is one of the easier ways
| to get some binaries; just `curl -L $url | tar -zxv` and you're
| done. Some larger packages like GTK need to have environment
| variables set to get them to work from strange locations like
| the user's home directory.
| We set those in Julia code when the package is loaded [5], so if
| you try to use a dependency like one of those, you're on your own
| to set whatever environment variables/configuration options are
| needed in order to make something work at an unusual location on
| disk. Luckily, LLVM (at least the way we use it, via
| `libLLVM.so`) doesn't require any such shenanigans.
|
| [0] https://github.com/JuliaBinaryWrappers/
| [1] https://github.com/JuliaBinaryWrappers/libLLVM_jll.jl/releas...
| [2] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/rele...
| [3] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/blob...
| [4] https://github.com/JuliaBinaryWrappers/Zlib_jll.jl/releases
| [5] https://github.com/JuliaGraphics/Gtk.jl/blob/0ff744723c32c3f...
| xrisk wrote:
| You can statically link LLVM, no problem.
|
| In fact, you never have to call any binaries specifically; just
| do it through code and everything should link at compile-time
| and become one big binary.
| 10000truths wrote:
| This is in fact what Zig does. Everything is statically
| linked into one binary that is used for compiling, linking,
| building, testing etc.
| HowardStark wrote:
| Not a compiler hacker and unfamiliar with the scene but is
| there a specific reason that `git-lfs` wouldn't work? It's the
| first thing that came to mind reading this. You can also pretty
| easily fetch specific objects as opposed to everything, so in
| your README you could direct contributors to only fetch
| specific binaries for given tasks.
| jcranmer wrote:
| When you have an LLVM frontend, what you generally do is have
| your driver run the optimization and code generation steps
| itself using the LLVM APIs rather than using opt/llc binaries
| to drive this step. That way, you don't need the LLVM binaries,
| just the libraries that you statically link into your
| executable.
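One way to picture "using the LLVM APIs" from a driver that is not written in C++ is to call the LLVM-C API through an FFI. The sketch below does that from Python with `ctypes`. It is only an illustration under stated assumptions: it loads a shared LLVM library dynamically instead of statically linking it (that is what `ctypes` supports), it assumes the library can be found under the name `LLVM` (real installs often use versioned names, hence the `path` override), and `load_llvm` and `empty_module_ir` are hypothetical helper names. It only creates an empty module and prints its IR; optimization and code generation would need further LLVM-C calls that are not shown.

```python
import ctypes
import ctypes.util
from typing import Optional


def load_llvm(path: Optional[str] = None) -> Optional[ctypes.CDLL]:
    """Try to load a shared LLVM library; return None if it is unavailable."""
    # Assumption: the library is discoverable as "LLVM" (e.g. libLLVM.so).
    # Pass an explicit path when the install uses a versioned file name.
    candidate = path or ctypes.util.find_library("LLVM")
    if candidate is None:
        return None
    try:
        return ctypes.CDLL(candidate)
    except OSError:
        return None


def empty_module_ir(name: str, lib: Optional[ctypes.CDLL] = None) -> Optional[str]:
    """Create an empty module via the LLVM-C API and return its textual IR."""
    if lib is None:
        lib = load_llvm()
    if lib is None:
        return None  # no LLVM library on this machine

    # Declare the C signatures so pointers are not truncated to int.
    lib.LLVMModuleCreateWithName.argtypes = [ctypes.c_char_p]
    lib.LLVMModuleCreateWithName.restype = ctypes.c_void_p
    lib.LLVMPrintModuleToString.argtypes = [ctypes.c_void_p]
    lib.LLVMPrintModuleToString.restype = ctypes.c_void_p
    lib.LLVMDisposeMessage.argtypes = [ctypes.c_void_p]
    lib.LLVMDisposeMessage.restype = None
    lib.LLVMDisposeModule.argtypes = [ctypes.c_void_p]
    lib.LLVMDisposeModule.restype = None

    module = lib.LLVMModuleCreateWithName(name.encode("utf-8"))
    text_ptr = lib.LLVMPrintModuleToString(module)
    try:
        # Copy the C string into Python before LLVM frees it.
        return ctypes.string_at(text_ptr).decode("utf-8")
    finally:
        lib.LLVMDisposeMessage(text_ptr)
        lib.LLVMDisposeModule(module)


if __name__ == "__main__":
    ir = empty_module_ir("demo")
    print(ir if ir is not None else "LLVM shared library not found")
```

Returning `None` when the library is missing keeps the sketch usable on machines without LLVM; a real driver would more likely fail loudly or ship the library alongside itself.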
|
| For example, all of the code in clang to do this is located in
| https://github.com/llvm/llvm-project/blob/main/clang/lib/Cod...
| tester756 wrote:
| What if my frontend isn't written in C++, e.g. Haskell, JS,
| Java, C#, etc.?
| jcranmer wrote:
| You use the LLVM-C bindings via your favorite FFI mechanism
| to generate the code then, usually.
| [deleted]
| chrisaycock wrote:
| I learned a lot about LLVM by looking at the compiler output from
| Clang:
|
|   clang -emit-llvm -S sample.cpp
|
| The article mentions Clang's AST, which can also be emitted:
|
|   clang -Xclang -ast-dump -fsyntax-only sample.cpp
|
| And for checking compiler outputs across lots of languages and
| implementations, there's always Matt Godbolt's Compiler Explorer:
| https://godbolt.org
| CalChris wrote:
| 1. _Getting Started with LLVM Core Libraries_
|
| It's a bit dated (covers DAGISel rather than GlobalISel) but it
| gives a thorough introduction.
|
| 2. LLVM Developer Meeting tutorials
|
| These are _really_ good, although you'll have to put them in
| order yourself. They will be a little out of date; LLVM is a
| moving target. Also, you don't have to go through every tutorial.
| For example, MLIR is not for me.
|
| 3. LLVM documentation
|
| I spent less time reading this than going through the Developer
| Meeting tutorials. I generally use it as a reference.
|
| 4. Discord, LLVM email list, git blame, LLVM Weekly
|
| ... because you will have questions.
|
| 5. MyFirstTypoFix (in the docs)
|
| ... when it comes time to submit a patch.
|
| 6. Mips backend
|
| If you're doing a backend, you will need a place to start. The
| LLVM documentation points you to the horribly out of date SPARC
| backend. Don't even touch that. AArch64 and x86 are very full
| featured and thus very complex (100 kloc+). Don't use those
| either. RISC-V is ok but concerns itself mostly with supporting
| new RISC-V features rather than keeping up to date with LLVM
| compiler services.
| Don't use that either, although _definitely_ work through Alex
| Bradbury's RISC-V backend tutorials. Read the Mips backend. It is
| actively maintained. It has good GlobalISel support almost on par
| with the flagship AArch64 and x86 backends.
|
| BTW, Chris Lattner is a super nice guy.
| jcranmer wrote:
| > 4. Discord, LLVM email list, git blame
|
| and don't forget IRC!
| UncleOxidant wrote:
| > LLVM Developer Meeting tutorials
|
| Are these all in one place or scattered about?
| CalChris wrote:
| Either llvm.org under Developer Meetings or the LLVM YouTube
| channel. The advantage of llvm.org is that it has a lot of
| the PDFs for the presentations as well as some old,
| pre-YouTube tutorials.
___________________________________________________________________
(page generated 2021-11-05 23:00 UTC)