[HN Gopher] Google Common Lisp style guide ___________________________________________________________________ Google Common Lisp style guide Author : fanf2 Score : 114 points Date : 2020-07-07 17:07 UTC (5 hours ago) (HTM) web link (google.github.io) (TXT) w3m dump (google.github.io) | kazinator wrote: | > _You must not use INTERN or UNINTERN at runtime._ | | I.e. you must not read Lisp data at run time, if it contains | symbols, because that will call _intern_. | | > _You should avoid using a list as anything besides a container | of elements of like type._ | | Good-bye, code-is-data. | | I could reduce this guide by a good 30% with "You should avoid | using Lisp as anything as Go or Java". | | But that could be seen as defining a macro, which you must seldom | do. | Spivak wrote: | I really don't think that this is bad advice in general. Mess | with the code all you want at compile time, but don't touch it | at runtime is the good kind of boring. CL is an extremely | powerful language, it doesn't mean you should be using it all | the time in your day-to-day work. | avmich wrote: | > CL is an extremely powerful language, it doesn't mean you | should be using it all the time in your day-to-day work. | | That's the problem of code guides - they try to avoid | problems with some abuses, but to make a good judgement when | some rare decision is justified is hard. So the guide misses | the mark by making an approximate limitation - often on the | safe side. | | It has benefits to write on boring, safe subsets of | languages. Still writing code guidelines is hard. | kazinator wrote: | The thing is, you usually shouldn't be calling INTERN even at | compile time. A better rule is this: | | _Don't use code to calculate character strings, which are | then converted to symbols via INTERN. 
The main exceptions to | this rule are structs (which generate slot-reader functions | by combining the structure name and slot names)._ | | Macrology which calculates names using code, which are then | supposed to be explicitly referenced in code, is pretty | stinky. | | E.g.: (define-blob foo ...) | | Here, you're supposed to Just Know that the above is | referenced as _blob-foo_ and not _foo_, because internally | it catenates "BLOB-" onto (symbol-name 'foo), and calls | intern on that. | Spivak wrote: | I feel like there needs to be a refinement that allows for | something like (define-blob foo) | | to produce the symbol FOO but no other symbols. | dreamcompiler wrote: | Above is one of the many reasons I prefer defclass to | defstruct. Defclass doesn't do this ridiculous nonsense. | AnimalMuppet wrote: | >> You should avoid using a list as anything besides a | container of elements of like type. | | > Good-bye, code-is-data. | | Could you regard that as a list of AST nodes? | divs1210 wrote: | > you must not read Lisp data at run time, if it contains | symbols, because that will call intern | | or use a non-interning reader. Clojure does exactly that - | instead of clojure.core/read, you use clojure.edn/read to read | data without running it as code | kazinator wrote: | That doesn't work if you want symbols to be symbols, such | that if _x_ occurs in two places in the data, it is the same | object according to _eq_. | | According to this coding guideline, you cannot develop a | .fasl format that is made of Lisp read syntax, or exploit | Lisp for sophisticated, structured data formats in general. | ambulancechaser wrote: | can you explain? ((juxt type identity) | (clojure.edn/read-string "x")) [clojure.lang.Symbol | x] | | It seems that the reader returns symbols just fine. | kazinator wrote: | I don't use Clojure; I have no idea. 
If two occurrences | of "x" are mapped to the same object, that is interning; | maybe what that does is use its own package-like | namespace, separately allocated for each call. | | The documentation for EDN says that "nil, booleans, | strings, characters, and symbols are equal to values of | the same type with the same edn representation." The only | way two symbol values can be equal is if they are | actually the same symbol, I would hope. | kmill wrote: | Did you see the justifications? For example, | | > Not only does [INTERN] cons, it either creates a permanent | symbol that won't be collected or gives access to internal | symbols. This creates opportunities for memory leaks, denial of | service attacks, unauthorized access to internals, clashes with | other symbols. | | It even has some advice on using wrappers for INTERN if you | really need it. | | The document has provisions for exceptions to the rules. | There's discussion about using EVAL despite the fact the rule | says you _must not_ use it. | | Also, "should avoid" means that you need a good reason, | addressed in a comment and code review. Many examples you're | probably thinking about are easily containers of elements of | like type anyway (allowing mild cases of sum types and such). | Though things tend to be more robust with intentionally created | data types, I find. | [deleted] | rvense wrote: | What it says is to not abuse lists, isn't it? I think what they | mean is something like, don't use a list as an "object"/tuple | that gets picked apart with a lot of cdddaring? Rather, use | real structures and other containers as appropriate, keeping in | mind the performance characteristics. It specifically says | lists are appropriate for macros and functions used by macros | at compile-time. | reikonomusha wrote: | This is exactly what they mean. 
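For readers who want the contrast rvense describes in concrete form, here is a minimal Common Lisp sketch. Everything in it (the `point3` structure, `list-point-z`, the sample values) is made up for illustration and comes from neither the style guide nor the thread.

```lisp
;; Hypothetical example: a 3D point stored two ways.

;; 1. Ad-hoc list. The layout (x y z) is only a convention,
;;    and each field is reached by position.
(defun list-point-z (p)
  (caddr p))

;; 2. Structure. DEFSTRUCT generates a constructor, named accessors,
;;    a type predicate, and a readable printed form.
(defstruct point3
  (x 0 :type real)
  (y 0 :type real)
  (z 0 :type real))

(defparameter *point-as-list* (list 1 2 3))
(defparameter *point-as-struct* (make-point3 :x 1 :y 2 :z 3))

(list-point-z *point-as-list*)   ; positional access
(point3-z *point-as-struct*)     ; named access
(point3-p *point-as-list*)       ; a bare list carries no type of its own
```

Both forms hold the same data; the difference is that the structure version names its fields and can be told apart from any other three-element list. Whether that is worth the extra definition for a short-lived value is the judgment call the thread is arguing about.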
| kazinator wrote: | I think it's perfectly fine to use a nested list object | that gets picked apart with destructuring until the point | that it becomes a maintenance or performance problem. (That | point could arrive later that same day, or it might come | never). | | This approach is one of the things that make Lisp Lisp; if | it gives you an allergic reaction, use something else. | dreamcompiler wrote: | I have seen things you people wouldn't believe. 500-line | 7-deep plists off the shoulder of Orion. I watched CLOS | glitter in the dark, unused, near the Tannhauser Gate. | 28-clause typecase statements because the author didn't | understand generic functions. All those moments will be | lost in time, like tears in rain. | | And thank dog for that. | TeMPOraL wrote: | I once won a huge performance win in a commercial project | by discovering that some numerically-heavy computations | on large vectors of numbers were actually using _lists_ | to keep those numbers. I replaced those universally with | arrays, yielding some ridiculous efficiency benefit. The | lists were fine when there were 5 numbers in the bag, but | nobody noticed when the amount of numbers grew to | thousands... | kazinator wrote: | Such things are quantifiable. We can have an exact rule | which says that you can use an ad-hoc list as a structure | if (for instance): | | 1. No more than 17 functions handle this datum, spread | among no more than three source files. | | 2. The structure contains no more than 8 conses. | | 3. In a long-running application under a typical | production load, no more than 10,000 of these objects are | freshly allocated in any five minute period. | | 4. A major software component (such as a library) can | internally have at most three separate instances of such | a data type, and they are not to be involved in the APIs | between major software components. | | Okay, that's now a target we can enforce without wishy | washy judgment calls. 
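TeMPOraL's anecdote about numeric data kept in lists can be sketched roughly as follows in Common Lisp. The function names, the sample size, and the choice of `double-float` are assumptions made for illustration; the comment does not say what the original code looked like, and any speedup depends on the implementation and the data.

```lisp
;; Hypothetical example: the same sum over a list and over a
;; vector specialized to double-floats.

(defun sum-list (numbers)
  "Sum a list of double-floats by walking its conses."
  (let ((total 0d0))
    (dolist (n numbers total)
      (incf total n))))

(defun sum-vector (numbers)
  "Sum a (SIMPLE-ARRAY DOUBLE-FLOAT (*)) by index."
  (declare (type (simple-array double-float (*)) numbers))
  (let ((total 0d0))
    (declare (type double-float total))
    (dotimes (i (length numbers) total)
      (incf total (aref numbers i)))))

(defparameter *sample-size* 1000)

;; The numbers 0, 1, ..., 999 as double-floats, stored both ways.
(defparameter *samples-list*
  (loop for i below *sample-size* collect (float i 1d0)))

(defparameter *samples-vector*
  (make-array *sample-size*
              :element-type 'double-float
              :initial-contents *samples-list*))

(sum-list *samples-list*)
(sum-vector *samples-vector*)
```

The vector version gives the compiler a declared element type and constant-time indexed access, while a list costs one cons per element and makes random access with `nth` linear in the index. For five numbers the difference is unlikely to matter, which fits the comment's point that nobody noticed until the data grew.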
| joshuamorton wrote: | > Such things are quantifiable. We can have an exact rule | which says that you can use an ad-hoc list as a structure | if (for instance): | | So what happens when you have a datum that is used in | exactly 17 functions, but you need to add another | feature? | | Or, what happens when people combine functions into | larger ones to avoid having to define a real datatype, | or... | | These concerns exist for each thing. | | "Don't use a list where you really have a struct" is much | more concrete and quantifiable. | kazinator wrote: | > _but you need to add another feature?_ | | You have to add your feature in such a way that the | resulting code meets the rules. | | It's exactly the same as when you need to add something | to a line that is already 79 characters long (maximum | allowed by your coding convention). Or if you have to add | lines of code into a function that is already 200 lines | long (coding style max). Or if you need another argument | in some API that already has the maximum of 8. You have | to step back to some extent and change more of the | surrounding program, than just shoving your intended | change into it. | | > _Or, what happens when people combine functions into | larger ones to avoid having to define a real datatype_ | | If the code remains under the quantified limit for | function size, then it is complying with the document. | | > _"Don't use a list where you really have a struct" is | much more concrete and quantifiable._ | | There are two answers to "where do you really have a | struct?" One is purely opinionated, and one is objective. | The objective answer is: "you really have a struct in all | situations where the list isn't a variable length | container of items of the same type". | | So the concrete and quantifiable (and therefore the | right) interpretation of the rule amounts to _never_ | using a fundamentally characteristic Lisp technique, in | Lisp! 
| dragonwriter wrote: | > Okay, that's now a target we can enforce without wishy | washy judgment calls. | | Sure, you can have an exact set of rules like that, and | feel free to have an automated enforcement of your own | set of exact rules. There are good reasons coding style | guides often include things which are _not_ exact rules, | and the target for them is often not automated | enforcement but supporting human judgement that balances | multiple factors. Yes, that results in fuzzy boundaries, | but it's because experience has both shown that there is | an issue but has not provided (yet) sufficient basis for | a quantifiable boundary, because the set of factors being | balanced is complex and multidimensional. Reducing the | dimensionality for simplicity of automated enforcement is | easier, but not necessarily _better_. | monadic2 wrote: | For an alternative point of view, I write lisp for my day | job and would not approve any code that just shoved a | bunch of unrelated data in a list. Thinking about the | destructuring code, especially if you don't have access | to pattern matching, is enough to give me a headache. You | know that's not going to be documented, and if it is, | people won't be able to find the documentation. Just use | records; every lisp has records, and it's the point of | records, and it's not any less "elegant" or "beautiful" | or "lispy" or whatever to have named fields and nice | printing and accessors for free. | | The one exception I can think of is procedures with | variadic arguments. | kazinator wrote: | Why don't you have access to pattern matching in your | Lisp day job? | | (There is at least a coffee machine, health benefits, and | a halfway ergonomic chair to sit on, hopefully.) | reikonomusha wrote: | The purpose of code guidelines is to do exactly that: | guide developers into a certain practice that's | consistent and broadly agreeable in a team setting. 
Using | lists to fake objects isn't good modern style, and it's a | reasonable guideline to suggest not doing that. I'm sure | Googlers won't gripe if you temporarily cons up a list to | shuttle it around. I'm sure they will gripe if you're | doing SICP 101 style object definitions exclusively with | lists. The latter easily gets out of hand in a language | renowned for its several robust facilities for object | definitions. | | MAXIMA (nee MACSYMA) is known for lists-as-objects not | only being depended upon, but also getting so out of hand | it's nearly impossible to refactor now. MAXIMA at least | has the excuse of being very old software. | | I feel you're interpreting this style guide as some kind | of draconian law of writing Lisp code at Google, and | making awful conclusions from it (e.g., "well if you're | allergic to objects as lists might as well not use Lisp" | or "most of this can be summed up as writing Lisp like | Java"). | kazinator wrote: | In my experience, these rules end up deployed in such a | way that you cannot commit any change until they are | obeyed to the letter. (That goes doubly so if they get | encoded into an automated tool.) | | I don't disagree with that; a coding standard that is not | enforceable without generous judgment calls is far less | useful than a rigorous one. | | Make it exact, and then get everyone to stick to it. | aidenn0 wrote: | I have the exact opposite experience; my job with the | favorite coding standard explicitly had ways of allowing | variances. The CEO came down hard on anyone who said that | "The standard forbids X" and would tell them to reread | the standard because it actually said something like | "Don't do X without doing Y first." | | The basic mentality was if you couldn't responsibly | follow the style guidelines after working there for a | year, then you should be looking for work elsewhere. 
| alaaalawi wrote: | YMMV but I do prefer the following: | https://www.cs.umd.edu/~nau/cmsc421/norvig-lisp-style.pdf | lordgrenville wrote: | Slightly off-topic but is this official, and if so...why is | Google hosting stuff on Github Pages? Seems sort of amateurish. | Not to mention it belongs to a rival of theirs. | jgodbout wrote: | Most Google open sourced code is on Github. Generally the code | made by developers isn't "official" it's just code (documents) | made by people at Google. | cbarrick wrote: | Google moved a lot of their open source projects to GitHub | after they shut down Google Code. That was before the MS | acquisition. | | You can find a lot of smaller projects at | https://github.com/google, and obviously there are some big | ones like https://github.com/tensorflow/tensorflow. | | Notable exceptions are Android and Fuchsia, which have their | own hosted git repos at | https://{android,fuschia}.googlesource.com. | lordgrenville wrote: | Thanks for clarifying. It makes sense to host repos on GitHub | - it's so widespread - but what struck me as weird was using | the .github.io domain, which I usually associate with amateur | bloggers who don't want the hassle of registering a domain. | Jtsummers wrote: | GitHub hasn't always been owned by MS, I believe this was up on | GitHub before the acquisition (along with a lot of other Google | content after they closed up Google Code). | | And yes, it's official. Google acquired ITA which was rather | famous as a Common Lisp shop. This meant that they had acquired | a substantial Common Lisp codebase. If you drop the document | from the URL you end up at [0] which includes links to other | language style guides. | | [0] https://google.github.io/styleguide/ | slenk wrote: | Google doesn't have a product like that... | brundolf wrote: | > Macros bring syntactic abstraction, which is a wonderful thing. 
| It helps make your code clearer, by describing your intent | without getting bogged in implementation details (indeed | abstracting those details away). It helps make your code more | concise and more readable, by eliminating both redundancy and | irrelevant details. But it comes at a cost to the reader, which | is learning a new syntactic concept for each macro. And so it | should not be abused. | | I really think this just applies to any kind of indirection - | classes, functions, even named constants (vs literals). | skybrian wrote: | Syntax changes have different consequences for reading | comprehension. You can often skip over a function call, making | only a reasonable guess at what the function does and relying | on invariants for all function calls. A macro can arbitrarily | change the lexical environment of anything contained in it with | few constraints, so reading other parts of a file without | knowing what each macro does is more precarious. | | And when it comes to navigating large amounts of code, you do | need to stop somewhere; you can't do a depth-first read of | everything before doing anything. | lioeters wrote: | Indeed, for what are classes and functions but specific kinds | of macro (roughly speaking); or, macros and classes as special | kinds of functions.. | | I'd include overloading operators in that list. It can be | convenient, but comes at a cost to newcomers to the codebase. | | I suppose any kind of shortcut or abbreviation carries this | risk, to increase the cognitive load of the reader - things | they have to remember and mentally substitute the shortcuts | until they become second nature. | | (Oh, right, what we call "shortcut" and "indirection" are both | examples of _abstraction_ , its value and cost.) | aidenn0 wrote: | Right; all abstractions are only as good as they don't leak, | but the question is "how easy is it to debug when it does | leak?" 
| | I think you listed classes, functions, and named constants in | approximately the order of debugability too. | | It can be unclear even which macroexpansions are in play from a | backtrace, much less which one caused the breakage. (non | inlined) functions are right there in the backtrace, and of | course, debugging a named constant is as simple as typing its | name into the REPL. | fiddlerwoaroof wrote: | In emacs, the macrostep expander solves most of the Macro | debugging issues: you usually can expand a macro use, and see | exactly what code is being generated or "refactor" the macro | away by copying the expansion at a certain level of detail | and replacing the original form with it. | aidenn0 wrote: | Yes, this is a useful tool; I still maintain that debugging | macros with this tool is harder than debugging functions | with the various other SLIME tools. | dreamcompiler wrote: | > I really think this just applies to any kind of indirection - | classes, functions, even named constants (vs literals). | | Disagree. Macros are fundamentally different in that they can | change the syntax of the language. As such, they can inhibit | the code's readability in ways the other defining forms cannot. | Looking up something by name in CL is as simple as meta-dot. | But you cannot meta-dot syntax. (Well, you could on a Lisp | Machine, but not so much on current CL implementations.) | TeMPOraL wrote: | > _But you cannot meta-dot syntax._ | | You can C-c M-e the macro form to have SLIME expand it for | you inline (read-only), and then navigate around it, using e | to expand and c to collapse back. Super useful. | brundolf wrote: | It's a difference of degree, not of kind. Any indirection | forces the user to reason about a constructed concept instead | of the literal facts of what's happening. 
By introducing one, | you're making the assertion that the abstract concept is | easier to reason about (including any effort required to | learn it) than the contents being abstracted away. | butterisgood wrote: | Went looking for advice on concurrency/parallelism and error | handling. | aidenn0 wrote: | The 2 word guide to concurrency and parallelism: Use fork(). | danielam wrote: | FWIW, speaking from memory, if QPX, Google's (previously ITA's) | low airfare search engine, is any indication, then the lack of | any real mention is not surprising because QPX did not make use | of parallelization or concurrency. The way it was written made | that move very difficult. From what I recall, there was | interest in parallelizing some bits of the computation, but I | don't recall it ever really going anywhere. (QRES, ITA's | reservation system, which was also written in Common Lisp, may | have made use of concurrency or parallelization in some way, | but my knowledge of that system is limited.) | | N.b., QPX did not quite follow all of these recommended | practices during my tenure (e.g., ubiquitous SETFing of object | slots). | ch_123 wrote: | I assume (perhaps naively) that Google must have a non trivial | amount of CL development if they have a style guide for the | language... anyone know what they use CL for? | shaftway wrote: | It doesn't take much need to write a style guide. And this one | is rather lackluster. By comparison look at the shell style | guide: https://google.github.io/styleguide/shellguide.html | gpanders wrote: | From the linked shell style guide: | | > Indent 2 spaces. No tabs. | | > Use blank lines between blocks to improve readability. | Indentation is two spaces. Whatever you do, don't use tabs. | | The recent obsession with 2 space indents is boggling to me. | I find it much more difficult to read (especially in long | blocks of code with lots of indentation switches) and I'm not | even visually impaired. 
| | They also apparently _really_ don't like tabs, which I find | interesting. Personally, I was converted to the tabs camp | after learning how much better tabs are for accessibility. | I'm surprised that Google of all places doesn't take that | more seriously. | carry_bit wrote: | My guess: Google bought ITA Software about 10 years ago, and | their search engine was written in Common Lisp. | danielam wrote: | Yes. The authors of this document were all engineers who | worked on QPX/QRES. | jimbokun wrote: | Is it weird that I enjoy reading this, even though I haven't | programmed in Common Lisp for a long while? | dang wrote: | If curious see also | | 2012 https://news.ycombinator.com/item?id=4639490 | | Not Google's, from last year: | https://news.ycombinator.com/item?id=20505807 | airstrike wrote: | Beat me to it! Link to the article in the 2012 submission is | broken, so here's a web archive copy: | http://web.archive.org/web/20130114221734/https://google-sty... | | Seems like the exact same document which harks back to the ITA | Software acquisition per comments at the time of that | submission ___________________________________________________________________ (page generated 2020-07-07 23:00 UTC)