[HN Gopher] So you want to be a compiler wizard (2016)
       ___________________________________________________________________
        
       So you want to be a compiler wizard (2016)
        
       Author : pcr910303
       Score  : 131 points
       Date   : 2020-04-12 14:28 UTC (8 hours ago)
        
 (HTM) web link (belkadan.com)
 (TXT) w3m dump (belkadan.com)
        
       | azhenley wrote:
       | I also recommend checking out the Make a Lisp project:
       | https://github.com/kanaka/mal
       | 
       | You can see the source code for a small Lisp interpreter in 81
       | different languages.
        
       | jmercouris wrote:
       | There is missing a clear path of how to become a compiler
       | "wizard". Abstract suggestions on things to learn don't turn you
       | into a compiler wizard. Also, the bit at the end about open
       | source developer types felt like a generalized and incoherent
       | rant.
        
         | tom_mellior wrote:
         | I don't think there is a clear path, or a collection of clear
         | paths. If you're interested, read stuff and try stuff. With
         | hard work and enough luck you might be a "wizard" some day. It
         | would be silly to make promises.
         | 
         | For whatever it's worth, I have a PhD in compilers, and I work
         | as a compiler developer. I have done nothing listed in the
         | article except learn about regular expressions.
         | 
         | Also, I disagree with your disagreement with the last part of
         | the article.
        
       | dang wrote:
       | Discussed at the time:
       | https://news.ycombinator.com/item?id=11777631
        
       | smabie wrote:
       | Here's a valuable tip for making a compiler/interpreter: don't
       | use a separate lexer and parser like lex and yacc. Parser
       | combinators (also known as parser monads) are _so_ much easier to
       | use. In fact, if you already have decided upon your language 's
       | grammar, you can probably write yourself a parser and lexer with
       | a parser combinator inside a day.
        
         | samatman wrote:
         | Agreed, and this goes even smoother when one realizes the
         | isomorphism between (naive) parser combinators and parsing
         | expression grammars.
        
         | popotamonga wrote:
         | How easier are they? I did the lex and yacc for javascript and
         | it was easy (except parsing js's // regex format in lex needed
         | a little hack)
        
         | amelius wrote:
         | What is the class of grammar they can handle? (I mean in the
         | sense of LALR(k), LL(k), etc.)
         | 
         | And is there a large (e.g. exponential) performance penalty for
         | large lookahead?
        
         | millstone wrote:
         | This is going to be language-specific: parser combinators are
         | much more pleasant in Haskell than in C++ for example.
         | 
         | Handwritten recursive descent has a lot going for it, error
         | messages in particular.
        
       | [deleted]
        
       | vivekseth wrote:
       | In addition to the suggestions in the post, I would strongly
       | recommend https://www.craftinginterpreters.com/
       | 
       | It's a very clear introduction to creating a language and
       | building parser, interpreter, compiler, and VM for it. The book
       | uses Java and C, but you can use pretty much any language you
       | want (ex: I used Swift).
        
         | twalla wrote:
         | Also recommended:
         | 
         | Peter Norvig's lisp interpreter series (Python):
         | 
         | https://norvig.com/lispy.html
         | 
         | https://norvig.com/lispy2.html
         | 
         | Thorsten Ball's interpreter and compiler book (Go):
         | 
         | https://interpreterbook.com/
        
       | userbinator wrote:
       | The Crenshaw tutorial is also great reading:
       | https://compilers.iecc.com/crenshaw/
       | 
       | It starts from the "build a calculator" angle which I think is a
       | very good way to get into compilers, since it emphasises the
       | recursive nature of things, and extending the calculator to a
       | full programming language becomes easier that way.
        
       | rbtprograms wrote:
       | I can second the 'make a calculator' as a place to get started
       | with this, that's a solid project to get the basics of flexing ->
       | parsing -> evaluation. Helped me learn a ton when I first wanted
       | to approach this problem.
       | 
       | On a side note, I pretty annoyed by the prevelance of 'So you
       | want to be a <insert something here>' titled articles. It's so
       | common around now and just doesn't seem to express good intent
       | about what is actually going to be in an article at this point.
        
         | titanomachy wrote:
         | S/flexing/lexing ? Nice typo :)
        
           | samatman wrote:
           | Presumably a pun on `flex`, the Gnu implementation of `lex`.
        
             | rbtprograms wrote:
             | I would love to be this clever
        
       | saagarjha wrote:
       | From 2016, it looks like.
        
         | dang wrote:
         | Added. Thanks!
        
       | hackandtrip wrote:
       | What about something more specific for production-grade
       | compilers?
       | 
       | The majority of people those references will learn toy-compilers,
       | that are surely important but a completely different league than
       | production-grade compilers, e.g: LLVM.
       | 
       | Talking specifically about LLVM, does someone have their go-to
       | references to start and have a sense of the infrastructure, or
       | even some specific reading about a part of the (huge)
       | infrastructure?
        
       | _hardwaregeek wrote:
       | Some other good projects:
       | 
       | * Lambda calculus evaluator
       | 
       | * Hindley Milner typechecker for lambda calculus
       | 
       | * Stack based calculator (extend with variables)
       | 
       | * Play around with macros
       | 
       | * Work through Crafting Interpreters
       | 
       | But really in my experience the best way to get better at
       | compilers (I can't claim to be a wizard) is to just build a
       | goddamn compiler. Start by writing a parser (you can crib from
       | Crafting Interpreters), writing the simplest possible typechecker
       | for arithmetic (crib from Hindley Milner), then a code generator
       | to whatever target you want. Code generation is tricky, but if
       | you're generating arithmetic, it's really not that bad.
       | 
       | If at any point you get confused or have no clue what to do, take
       | a deep breath and guess! Yep, just guess how to do it. I've done
       | this quite a few times and later learned that my guess was really
       | a dumbed down version of idk, solving the AST typing problem, or
       | lambda lifting, or whatever. If that's too hard, scroll through
       | other compilers and figure out what they do.
       | 
       | Once you get something working, celebrate! It's pretty cool
       | seeing even arithmetic end up as code generated and run.
       | 
       | Then start adding other features! Add local variables. Add nested
       | functions. Add whatever.
       | 
       | The secret is to treat a compiler as a challenging software
       | development project. Because that's what it is. Start writing
       | code and figure it out.
       | 
       | Granted this is not great advice if you want a usable compiler at
       | the end. I'm just trying to learn, not make something for people
       | to use.
        
         | j88439h84 wrote:
         | How do I learn to write a type checker? Got any suggested
         | readings?
        
           | krat0sprakhar wrote:
           | I have a simple implementation of the Hindley Milner Type
           | Inference algorithm in OCaml that should be easy to grok:
           | https://github.com/prakhar1989/type-inference
        
           | PeCaN wrote:
           | Type checkers are actually surprisingly easy to write. I
           | recommend learning about unification (in the logic sense),
           | maybe just by playing around with Prolog. After you
           | understand that I think type checking is rather easy to
           | figure out (it's literally just unification).
        
             | chubot wrote:
             | I think it depends on whether you have type inference or
             | not. If you have type inference, then it's like
             | unification. If you have explicit type annotations, it's
             | even simpler than that.
             | 
             | https://eli.thegreenplace.net/2018/unification/
             | 
             | https://eli.thegreenplace.net/2018/type-inference/
        
           | barrkel wrote:
           | Imperative languages without type inference (for variables
           | and functions, that is) work from the bottom up. Symbols
           | (variables, functions) have declared types, constants have
           | known types. Operators have overloads, and depending on the
           | arguments a particular overload is chosen which best matches,
           | and its return type is the type of the operator expression.
           | Type conversions may need to occur along the way to make the
           | arguments to operators or functions fit.
           | 
           | The hairiest thing is usually overload resolution (both
           | operator and function, if implemented). Determining the set
           | of applicable overloads may depend in the types of the
           | arguments, scoping rules, or generic instantiation if it's in
           | the language, and determining the best overload depends on
           | weighing up coercions, subtyping and polymorphism. Simple
           | rules may lead to rejection of overload selection as
           | ambiguous in situations where programmers don't want to
           | define more overloads.
        
         | zanderwohl wrote:
         | I second your recommendation of Crafting Interpreters. Not only
         | is it free, but it's an excellent way to learn tokenization,
         | parsing, and tree walking, it's a lot of fun to read.
        
         | WalterBright wrote:
         | > in my experience the best way to get better at compilers (I
         | can't claim to be a wizard) is to just build a goddamn
         | compiler.
         | 
         | Yup. In fact, the best way to start is to build a desk
         | calculator, where you type in strings like:                   4
         | + 6
         | 
         | and it gives you back 10. There was one in the old K+R book. A
         | compiler just scales that up.
        
         | azhenley wrote:
         | Exactly. Just start from the beginning and work through each
         | problem as you get to it. Keep the scope small and iterate.
         | You'll probably find yourself wanting to start from scratch a
         | lot once you learn how to do it better, that's ok.
         | 
         | I'm writing up a tutorial for my students right now. It is just
         | a stripped down dialect of BASIC. I think seeing the entire
         | compiler process from start to finish on a tiny language is
         | what helped me the most.
        
           | WalterBright wrote:
           | I got my start from the Tiny Pascal compiler listing from
           | BYTE Magazine in the 1970's. It was a complete compiler and
           | interpreter.
        
             | kick wrote:
             | https://archive.org/details/byte-magazine-1978-09
             | 
             | https://archive.org/details/byte-magazine-1978-10-rescan
             | 
             | https://archive.org/details/byte-magazine-1978-11-rescan
             | 
             | For anyone curious.
        
               | zoomablemind wrote:
               | Thanks! First time reading Byte mag. Kinda almost half
               | century late :)
               | 
               | The article is written with amazing (these days) clarity.
               | Kinda coincided with my current interest in LLVM. Nice to
               | see this article also be coming from UofI Urbana-
               | Champaign grads.
               | 
               | Reportedly the article series misses implementation of
               | one component - run-time, which is needed in order to
               | fully replicate it now. Well, anyway, reading on!
               | 
               | Btw: HN thread on Tiny Pascal from 2018
               | https://news.ycombinator.com/item?id=17220507
        
               | WalterBright wrote:
               | Thank you. I was too lazy to look it up.
        
               | kick wrote:
               | Always happy to lend context to these things; BYTE was so
               | fantastic.
        
       | [deleted]
        
       | Scarbutt wrote:
       | There also is: http://www.buildyourownlisp.com/
        
       ___________________________________________________________________
       (page generated 2020-04-12 23:00 UTC)