[HN Gopher] So you want to be a compiler wizard (2016) ___________________________________________________________________ So you want to be a compiler wizard (2016) Author : pcr910303 Score : 131 points Date : 2020-04-12 14:28 UTC (8 hours ago) (HTM) web link (belkadan.com) (TXT) w3m dump (belkadan.com) | azhenley wrote: | I also recommend checking out the Make a Lisp project: | https://github.com/kanaka/mal | | You can see the source code for a small Lisp interpreter in 81 | different languages. | jmercouris wrote: | There is missing a clear path of how to become a compiler | "wizard". Abstract suggestions on things to learn don't turn you | into a compiler wizard. Also, the bit at the end about open | source developer types felt like a generalized and incoherent | rant. | tom_mellior wrote: | I don't think there is a clear path, or a collection of clear | paths. If you're interested, read stuff and try stuff. With | hard work and enough luck you might be a "wizard" some day. It | would be silly to make promises. | | For whatever it's worth, I have a PhD in compilers, and I work | as a compiler developer. I have done nothing listed in the | article except learn about regular expressions. | | Also, I disagree with your disagreement with the last part of | the article. | dang wrote: | Discussed at the time: | https://news.ycombinator.com/item?id=11777631 | smabie wrote: | Here's a valuable tip for making a compiler/interpreter: don't | use a separate lexer and parser like lex and yacc. Parser | combinators (also known as parser monads) are _so_ much easier to | use. In fact, if you already have decided upon your language 's | grammar, you can probably write yourself a parser and lexer with | a parser combinator inside a day. | samatman wrote: | Agreed, and this goes even smoother when one realizes the | isomorphism between (naive) parser combinators and parsing | expression grammars. | popotamonga wrote: | How easier are they? I did the lex and yacc for javascript and | it was easy (except parsing js's // regex format in lex needed | a little hack) | amelius wrote: | What is the class of grammar they can handle? (I mean in the | sense of LALR(k), LL(k), etc.) | | And is there a large (e.g. exponential) performance penalty for | large lookahead? | millstone wrote: | This is going to be language-specific: parser combinators are | much more pleasant in Haskell than in C++ for example. | | Handwritten recursive descent has a lot going for it, error | messages in particular. | [deleted] | vivekseth wrote: | In addition to the suggestions in the post, I would strongly | recommend https://www.craftinginterpreters.com/ | | It's a very clear introduction to creating a language and | building parser, interpreter, compiler, and VM for it. The book | uses Java and C, but you can use pretty much any language you | want (ex: I used Swift). | twalla wrote: | Also recommended: | | Peter Norvig's lisp interpreter series (Python): | | https://norvig.com/lispy.html | | https://norvig.com/lispy2.html | | Thorsten Ball's interpreter and compiler book (Go): | | https://interpreterbook.com/ | userbinator wrote: | The Crenshaw tutorial is also great reading: | https://compilers.iecc.com/crenshaw/ | | It starts from the "build a calculator" angle which I think is a | very good way to get into compilers, since it emphasises the | recursive nature of things, and extending the calculator to a | full programming language becomes easier that way. | rbtprograms wrote: | I can second the 'make a calculator' as a place to get started | with this, that's a solid project to get the basics of flexing -> | parsing -> evaluation. Helped me learn a ton when I first wanted | to approach this problem. | | On a side note, I pretty annoyed by the prevelance of 'So you | want to be a <insert something here>' titled articles. It's so | common around now and just doesn't seem to express good intent | about what is actually going to be in an article at this point. | titanomachy wrote: | S/flexing/lexing ? Nice typo :) | samatman wrote: | Presumably a pun on `flex`, the Gnu implementation of `lex`. | rbtprograms wrote: | I would love to be this clever | saagarjha wrote: | From 2016, it looks like. | dang wrote: | Added. Thanks! | hackandtrip wrote: | What about something more specific for production-grade | compilers? | | The majority of people those references will learn toy-compilers, | that are surely important but a completely different league than | production-grade compilers, e.g: LLVM. | | Talking specifically about LLVM, does someone have their go-to | references to start and have a sense of the infrastructure, or | even some specific reading about a part of the (huge) | infrastructure? | _hardwaregeek wrote: | Some other good projects: | | * Lambda calculus evaluator | | * Hindley Milner typechecker for lambda calculus | | * Stack based calculator (extend with variables) | | * Play around with macros | | * Work through Crafting Interpreters | | But really in my experience the best way to get better at | compilers (I can't claim to be a wizard) is to just build a | goddamn compiler. Start by writing a parser (you can crib from | Crafting Interpreters), writing the simplest possible typechecker | for arithmetic (crib from Hindley Milner), then a code generator | to whatever target you want. Code generation is tricky, but if | you're generating arithmetic, it's really not that bad. | | If at any point you get confused or have no clue what to do, take | a deep breath and guess! Yep, just guess how to do it. I've done | this quite a few times and later learned that my guess was really | a dumbed down version of idk, solving the AST typing problem, or | lambda lifting, or whatever. If that's too hard, scroll through | other compilers and figure out what they do. | | Once you get something working, celebrate! It's pretty cool | seeing even arithmetic end up as code generated and run. | | Then start adding other features! Add local variables. Add nested | functions. Add whatever. | | The secret is to treat a compiler as a challenging software | development project. Because that's what it is. Start writing | code and figure it out. | | Granted this is not great advice if you want a usable compiler at | the end. I'm just trying to learn, not make something for people | to use. | j88439h84 wrote: | How do I learn to write a type checker? Got any suggested | readings? | krat0sprakhar wrote: | I have a simple implementation of the Hindley Milner Type | Inference algorithm in OCaml that should be easy to grok: | https://github.com/prakhar1989/type-inference | PeCaN wrote: | Type checkers are actually surprisingly easy to write. I | recommend learning about unification (in the logic sense), | maybe just by playing around with Prolog. After you | understand that I think type checking is rather easy to | figure out (it's literally just unification). | chubot wrote: | I think it depends on whether you have type inference or | not. If you have type inference, then it's like | unification. If you have explicit type annotations, it's | even simpler than that. | | https://eli.thegreenplace.net/2018/unification/ | | https://eli.thegreenplace.net/2018/type-inference/ | barrkel wrote: | Imperative languages without type inference (for variables | and functions, that is) work from the bottom up. Symbols | (variables, functions) have declared types, constants have | known types. Operators have overloads, and depending on the | arguments a particular overload is chosen which best matches, | and its return type is the type of the operator expression. | Type conversions may need to occur along the way to make the | arguments to operators or functions fit. | | The hairiest thing is usually overload resolution (both | operator and function, if implemented). Determining the set | of applicable overloads may depend in the types of the | arguments, scoping rules, or generic instantiation if it's in | the language, and determining the best overload depends on | weighing up coercions, subtyping and polymorphism. Simple | rules may lead to rejection of overload selection as | ambiguous in situations where programmers don't want to | define more overloads. | zanderwohl wrote: | I second your recommendation of Crafting Interpreters. Not only | is it free, but it's an excellent way to learn tokenization, | parsing, and tree walking, it's a lot of fun to read. | WalterBright wrote: | > in my experience the best way to get better at compilers (I | can't claim to be a wizard) is to just build a goddamn | compiler. | | Yup. In fact, the best way to start is to build a desk | calculator, where you type in strings like: 4 | + 6 | | and it gives you back 10. There was one in the old K+R book. A | compiler just scales that up. | azhenley wrote: | Exactly. Just start from the beginning and work through each | problem as you get to it. Keep the scope small and iterate. | You'll probably find yourself wanting to start from scratch a | lot once you learn how to do it better, that's ok. | | I'm writing up a tutorial for my students right now. It is just | a stripped down dialect of BASIC. I think seeing the entire | compiler process from start to finish on a tiny language is | what helped me the most. | WalterBright wrote: | I got my start from the Tiny Pascal compiler listing from | BYTE Magazine in the 1970's. It was a complete compiler and | interpreter. | kick wrote: | https://archive.org/details/byte-magazine-1978-09 | | https://archive.org/details/byte-magazine-1978-10-rescan | | https://archive.org/details/byte-magazine-1978-11-rescan | | For anyone curious. | zoomablemind wrote: | Thanks! First time reading Byte mag. Kinda almost half | century late :) | | The article is written with amazing (these days) clarity. | Kinda coincided with my current interest in LLVM. Nice to | see this article also be coming from UofI Urbana- | Champaign grads. | | Reportedly the article series misses implementation of | one component - run-time, which is needed in order to | fully replicate it now. Well, anyway, reading on! | | Btw: HN thread on Tiny Pascal from 2018 | https://news.ycombinator.com/item?id=17220507 | WalterBright wrote: | Thank you. I was too lazy to look it up. | kick wrote: | Always happy to lend context to these things; BYTE was so | fantastic. | [deleted] | Scarbutt wrote: | There also is: http://www.buildyourownlisp.com/ ___________________________________________________________________ (page generated 2020-04-12 23:00 UTC)