[HN Gopher] How to learn compilers: LLVM Edition
       ___________________________________________________________________
        
       How to learn compilers: LLVM Edition
        
       Author : AlexDenisov
       Score  : 62 points
       Date   : 2021-11-04 21:00 UTC (1 days ago)
        
 (HTM) web link (lowlevelbits.org)
 (TXT) w3m dump (lowlevelbits.org)
        
       | anonymousDan wrote:
       | I have to say personally I find general program analysis (e.g.
       | for security) a much more interesting topic than most vanilla
       | compiler courses. For example I recently came across this course
       | by the maintainers of soot:
       | https://youtube.com/playlist?list=PLamk8lFsMyPXrUIQm5naAQ08a...
       | 
       | Any pointers to similar courses much appreciated!
        
         | the_benno wrote:
         | Anders Moeller and Michael Schwartzbach's book [1] on static
         | program analysis is a fantastic resource, with (I think) a
         | great balance of theory and practice. If you want to get really
         | deep into the theory of program analysis, Patrick Cousot just
         | published an incredibly thorough book on abstract
         | interpretation (just got my copy this week, so haven't fully
         | explored enough to have much of an opinion on it as a
         | pedagogical resource).
         | 
         | [1] cs.au.dk/~amoeller/spa
        
       | andrewchambers wrote:
       | More great things:
       | 
       | - https://c9x.me/compile/
       | 
       | - https://github.com/vnmakarov/mir
        
       | tester34 wrote:
       | Since there are many compiler hackers here, I'd like to ask a
       | question:
       | 
       | How do you distribute your frontend with LLVM?
       | 
       | Let's say that I have a lexer, parser, and emitter written in,
       | e.g., Haskell (random example).
       | 
       | I emit LLVM IR and then I use LLVM to generate something else,
       | 
       | but the problem is that I need to have the LLVM binaries, and I'd
       | rather avoid telling people who want to contribute to my OSS
       | project to install LLVM, because it's a painful process.
       | 
       | So I thought about just adding a "binaries" folder to my repo and
       | putting the executables there, but the problem is that they're
       | huge as hell! Also, when you're on Linux, you don't need the
       | Windows binaries.
       | 
       | Another problem is that the LLVM installer doesn't include all
       | the LLVM components that I need (llc and wasm-ld), so I have to
       | compile it myself and tell CMake (IIRC) to generate those.
       | 
       | I thought about creating a second repo holding binaries compiled
       | for all platforms (Mac, Linux, Windows); after cloning my_repo1,
       | the instructions would point to the specific binaries to
       | download.
       | 
       | How do you people do it?
        
         | staticfloat wrote:
         | In the Julia world, we make redistributable binaries for all
         | sorts of things; you can find lots of packages here [0], and
         | for LLVM in particular (which Julia uses to do its codegen) you
         | can find _just_ libLLVM.so (plus a few supporting files) here
         | [1]. If you want a more fully-featured, batteries-included
         | build of LLVM, check out this package [2].
         | 
         | When using these JLL packages from Julia, it will automatically
         | download and load in dependencies, but if you're using it from
         | some other system, you'll probably need to manually check out
         | the `Project.toml` file and see what other JLL packages are
         | listed as dependencies. As an example, `LLVM_full_jll` requires
         | `Zlib_jll` [3], since we build with support for compressed ELF
         | sections. As you may have guessed, you can get `Zlib_jll` from
         | [4], and it thankfully does not have any transitive
         | dependencies.
         | 
         | In the Julia world, we're typically concerned with dynamic
         | linking (we `dlopen()` and `dlsym()` our way into all our
         | binary dependencies), so this may not meet all your needs, but I
         | figured I'd give it a shout out as it is one of the easier ways
         | to get some binaries; just `curl -L $url | tar -zxv` and you're
         | done. Some larger packages like GTK need to have environment
         | variables set to get them to work from strange locations like
         | the user's home directory. We set those in Julia code when the
         | package is loaded [5], so if you try to use a dependency like
         | one of those, you're on your own to set whatever environment
         | variables/configuration options are needed in order to make
         | something work at an unusual location on disk. Luckily, LLVM
         | (at least the way we use it, via `libLLVM.so`) doesn't require
         | any such shenanigans.
         | 
         | [0] https://github.com/JuliaBinaryWrappers/
         | [1] https://github.com/JuliaBinaryWrappers/libLLVM_jll.jl/releas...
         | [2] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/rele...
         | [3] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/blob...
         | [4] https://github.com/JuliaBinaryWrappers/Zlib_jll.jl/releases
         | [5] https://github.com/JuliaGraphics/Gtk.jl/blob/0ff744723c32c3f...
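
A minimal sketch of the manual dependency check described above, for anyone consuming JLL tarballs outside Julia: read the `[deps]` table of a package's `Project.toml` and keep the entries ending in `_jll`. The TOML text, the placeholder UUIDs, and the helper name `jll_deps` are illustrative assumptions, not the real contents of `LLVM_full_jll`, and the hand-rolled parser only understands simple `name = "value"` lines.

```python
def jll_deps(project_toml):
    """Return the names listed under [deps] that end in `_jll`."""
    deps = []
    in_deps = False
    for raw in project_toml.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("["):
            # A new table header starts; only [deps] is of interest.
            in_deps = (line == "[deps]")
            continue
        if in_deps and "=" in line:
            name = line.split("=", 1)[0].strip()
            if name.endswith("_jll"):
                deps.append(name)
    return deps


# Hypothetical Project.toml text; the UUIDs are placeholders.
EXAMPLE = """
name = "LLVM_full_jll"

[deps]
Libdl = "00000000-0000-0000-0000-000000000001"
Zlib_jll = "00000000-0000-0000-0000-000000000002"

[compat]
julia = "1.6"
"""

if __name__ == "__main__":
    print(jll_deps(EXAMPLE))
```

Each name returned would then need its own tarball fetched and the same check repeated on its `Project.toml`, since dependencies can be transitive.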
        
         | xrisk wrote:
         | You can statically link LLVM, no problem.
         | 
         | In fact, you never have to call any binaries specifically; just
         | do it through code and everything should link at compile-time
         | and become one big binary.
        
           | 10000truths wrote:
           | This is in fact what Zig does. Everything is statically
           | linked into one binary that is used for compiling, linking,
           | building, testing etc.
        
         | HowardStark wrote:
         | Not a compiler hacker and unfamiliar with the scene but is
         | there a specific reason that `git-lfs` wouldn't work? It's the
         | first thing that came to mind reading this. You can also pretty
         | easily fetch specific objects as opposed to everything, so in
         | your README you could direct contributors to only fetch
         | specific binaries for given tasks.
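
One way the selective fetch could look, as a rough sketch: map the contributor's platform to a directory of LFS-tracked binaries and pull only that path with `git lfs pull --include`. The `binaries/<os>-<arch>/` layout and the helper names are assumptions made for illustration; nothing in the thread specifies them.

```python
import platform

# Hypothetical repo layout: binaries/<os>-<arch>/ holding llc, wasm-ld, ...
_OS_NAMES = {"Linux": "linux", "Darwin": "mac", "Windows": "windows"}
_ARCH_NAMES = {
    "x86_64": "x86_64",
    "AMD64": "x86_64",   # what Windows reports for x86_64
    "arm64": "aarch64",  # what macOS reports for 64-bit ARM
    "aarch64": "aarch64",
}


def lfs_include_path(system=None, machine=None):
    """Return the LFS path pattern for the given (or current) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return "binaries/{}-{}/*".format(_OS_NAMES[system], _ARCH_NAMES[machine])
    except KeyError:
        raise ValueError(
            "no prebuilt binaries for {} {}".format(system, machine)
        ) from None


def lfs_pull_command(system=None, machine=None):
    """Build the git-lfs invocation that fetches only one platform's files."""
    return ["git", "lfs", "pull", "--include=" + lfs_include_path(system, machine)]


if __name__ == "__main__":
    print(" ".join(lfs_pull_command("Linux", "x86_64")))
```

Inside a clone with git-lfs installed, the returned list could be handed to `subprocess.run(..., check=True)`; calling the helpers without arguments picks the current platform.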
        
         | jcranmer wrote:
         | When you have an LLVM frontend, what you generally do is have
         | your driver run the optimization and code generation steps
         | itself using the LLVM APIs rather than using opt/llc binaries
         | to drive this step. That way, you don't need the LLVM binaries,
         | just the libraries that you statically link into your
         | executable.
         | 
         | For example, all of the code in clang to do this is located in
         | https://github.com/llvm/llvm-project/blob/main/clang/lib/Cod...
        
           | tester756 wrote:
           | What if my frontend is written in something other than C++,
           | e.g. Haskell, JS, Java, C#, etc.?
        
             | jcranmer wrote:
             | You use the LLVM-C bindings via your favorite FFI mechanism
             | to generate the code then, usually.
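
As a hedged illustration of that route, here is a minimal sketch that calls four LLVM-C functions (`LLVMModuleCreateWithName`, `LLVMPrintModuleToString`, `LLVMDisposeMessage`, `LLVMDisposeModule`) through Python's `ctypes`. Python only stands in for whatever language the frontend is written in. How the shared library is located (`find_library("LLVM")`) and the helper names `load_llvm` and `empty_module_ir` are assumptions; the sketch returns `None` when no usable libLLVM is found.

```python
import ctypes
import ctypes.util


def load_llvm(path=None):
    """Try to load the LLVM shared library; return None if unavailable."""
    path = path or ctypes.util.find_library("LLVM")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Declare the C signatures so pointers are not truncated to int.
        lib.LLVMModuleCreateWithName.argtypes = [ctypes.c_char_p]
        lib.LLVMModuleCreateWithName.restype = ctypes.c_void_p
        lib.LLVMPrintModuleToString.argtypes = [ctypes.c_void_p]
        lib.LLVMPrintModuleToString.restype = ctypes.c_void_p
        lib.LLVMDisposeMessage.argtypes = [ctypes.c_void_p]
        lib.LLVMDisposeMessage.restype = None
        lib.LLVMDisposeModule.argtypes = [ctypes.c_void_p]
        lib.LLVMDisposeModule.restype = None
    except (OSError, AttributeError):
        # Library failed to load or does not export the LLVM-C symbols.
        return None
    return lib


def empty_module_ir(name, lib=None):
    """Create an empty module and return its textual IR, or None without LLVM."""
    if lib is None:
        lib = load_llvm()
    if lib is None:
        return None
    module = lib.LLVMModuleCreateWithName(name.encode())
    text_ptr = lib.LLVMPrintModuleToString(module)
    try:
        # Copy the C string before LLVM frees it.
        return ctypes.string_at(text_ptr).decode()
    finally:
        lib.LLVMDisposeMessage(text_ptr)
        lib.LLVMDisposeModule(module)


if __name__ == "__main__":
    print(empty_module_ir("demo"))
```

A real frontend would go on to add functions and instructions through the same C API and then run code generation from it; the loading and signature-declaration pattern stays the same.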
        
       | [deleted]
        
       | chrisaycock wrote:
       | I learned a lot about LLVM by looking at the compiler output from
       | Clang:
       | 
       |     clang -emit-llvm -S sample.cpp
       | 
       | The article mentions Clang's AST, which can also be emitted:
       | 
       |     clang -Xclang -ast-dump -fsyntax-only sample.cpp
       | 
       | And for checking compiler outputs across lots of languages and
       | implementations, there's always Matt Godbolt's Compiler Explorer:
       | https://godbolt.org
        
       | CalChris wrote:
       | 1. _Getting Started with LLVM Core Libraries_
       | 
       | It's a bit dated (covers DAGISel rather than GlobalISel) but it
       | gives a thorough introduction.
       | 
       | 2. LLVM Developer Meeting tutorials
       | 
       | These are _really_ good although you'll have to put them in
       | order yourself. They will be out of date, a little. LLVM is a
       | moving target. Also, you don't have to go through every tutorial.
       | For example, MLIR is not for me.
       | 
       | 3. LLVM documentation
       | 
       | I spent less time reading this than going through the Developer
       | Meeting tutorials. I generally use it as a reference.
       | 
       | 4. Discord, LLVM email list, git blame, LLVM Weekly
       | 
       | ... because you will have questions.
       | 
       | 5. MyFirstTypoFix (in the docs)
       | 
       | ... when it comes time to submit a patch.
       | 
       | 6. Mips backend
       | 
       | If you're doing a backend, you will need a place to start. The
       | LLVM documentation points you to the horribly out of date SPARC
       | backend. Don't even touch that. AArch64 and x86 are very full
       | featured and thus very complex (100 kloc+). Don't use those
       | either. RISC-V is ok but concerns itself mostly with supporting
       | new RISC-V features rather than keeping up to date with LLVM
       | compiler services. Don't use that either although _definitely_
       | work through Alex Bradbury's RISC-V backend tutorials. Read the
       | Mips backend. It is actively maintained. It has good GlobalISel
       | support almost on par with the flagship AArch64 and x86 backends.
       | 
       | BTW, Chris Lattner is a super nice guy.
        
         | jcranmer wrote:
         | > 4. Discord, LLVM email list, git blame
         | 
         | and don't forget IRC!
        
         | UncleOxidant wrote:
         | > LLVM Developer Meeting tutorials
         | 
         | Are these all in one place or scattered about?
        
           | CalChris wrote:
           | Either llvm.org under Developer Meetings or the LLVM Youtube
           | channel. The advantage of llvm.org is that it has a lot of
           | the PDFs for the presentations as well as some old, pre-
           | Youtube tutorials.
        
       ___________________________________________________________________
       (page generated 2021-11-05 23:00 UTC)