[HN Gopher] Google Common Lisp style guide
       ___________________________________________________________________
        
       Google Common Lisp style guide
        
       Author : fanf2
       Score  : 114 points
       Date   : 2020-07-07 17:07 UTC (5 hours ago)
        
 (HTM) web link (google.github.io)
 (TXT) w3m dump (google.github.io)
        
       | kazinator wrote:
       | > _You must not use INTERN or UNINTERN at runtime._
       | 
       | I.e. you must not read Lisp data at run time, if it contains
       | symbols, because that will call _intern_.
       | 
       | > _You should avoid using a list as anything besides a container
       | of elements of like type._
       | 
       | Good-bye, code-is-data.
       | 
       | I could reduce this guide by a good 30% with "You should avoid
       | using Lisp as anything but Go or Java".
       | 
       | But that could be seen as defining a macro, which you must seldom
       | do.
        
         | Spivak wrote:
         | I really don't think that this is bad advice in general. "Mess
         | with the code all you want at compile time, but don't touch it
         | at runtime" is the good kind of boring. CL is an extremely
         | powerful language, it doesn't mean you should be using it all
         | the time in your day-to-day work.
        
           | avmich wrote:
           | > CL is an extremely powerful language, it doesn't mean you
           | should be using it all the time in your day-to-day work.
           | 
           | That's the problem with code guides: they try to head off
           | certain abuses, but it is hard to judge well when a rare
           | exception is justified. So the guide misses the mark by
           | drawing an approximate limit, usually on the safe side.
           | 
           | There are benefits to writing in boring, safe subsets of
           | languages. Still, writing code guidelines is hard.
        
           | kazinator wrote:
           | The thing is, you usually shouldn't be calling INTERN even at
           | compile time. A better rule is this:
           | 
           |  _Don't use code to calculate character strings, which are
           | then converted to symbols via INTERN. The main exceptions to
           | this rule are structs (which generate slot-reader functions
           | by combining the structure name and slot names)._
           | 
           | Macrology which calculates names using code, which are then
           | supposed to be explicitly referenced in code, is pretty
           | stinky.
           | 
           | E.g.:
           | 
           |     (define-blob foo ...)
           | 
           | Here, you're supposed to Just Know that the above is
           | referenced as _blob-foo_ and not _foo_, because internally
           | it catenates "BLOB-" onto (symbol-name 'foo), and calls
           | intern on that.
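           | 
           | A minimal sketch of what such a macro might look like. The
           | define-blob below is hypothetical, written only to illustrate
           | the name-calculating pattern described above; it is not taken
           | from any real library:

```lisp
;; Hypothetical DEFINE-BLOB, for illustration only.
(defmacro define-blob (name &rest slots)
  ;; The defined name is computed: "BLOB-" is prepended to the symbol
  ;; name and the result is interned in the current package at
  ;; macroexpansion time.
  (let ((full-name (intern (concatenate 'string "BLOB-" (symbol-name name)))))
    `(defun ,full-name ()
       (list ,@slots))))

(define-blob foo 1 2 3)

;; Callers must know to write BLOB-FOO; FOO itself is never defined.
(blob-foo) ; => (1 2 3)
```

           | Nothing at the definition site mentions BLOB-FOO, so searching
           | the source for that name will not find where it comes from.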
        
             | Spivak wrote:
             | I feel like there needs to be a refinement that allows for
             | something like
             | 
             |     (define-blob foo)
             | 
             | to produce the symbol FOO but no other symbols.
        
             | dreamcompiler wrote:
             | Above is one of the many reasons I prefer defclass to
             | defstruct. Defclass doesn't do this ridiculous nonsense.
        
         | AnimalMuppet wrote:
         | >> You should avoid using a list as anything besides a
         | container of elements of like type.
         | 
         | > Good-bye, code-is-data.
         | 
         | Could you regard that as a list of AST nodes?
        
         | divs1210 wrote:
         | > you must not read Lisp data at run time, if it contains
         | symbols, because that will call intern
         | 
         | or use a non-interning reader. Clojure does exactly that -
         | instead of clojure.core/read, you use clojure.edn/read to read
         | data without running it as code
        
           | kazinator wrote:
           | That doesn't work if you want symbols to be symbols, such
           | that if _x_ occurs in two places in the data, it is the same
           | object according to _eq_.
           | 
           | According to this coding guideline, you cannot develop a
           | .fasl format that is made of Lisp read syntax, or exploit
           | Lisp for sophisticated, structured data formats in general.
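           | 
           | One possible middle ground, sketched under assumptions: read
           | into a throwaway package, so repeated names are still the
           | same object under _eq_ while no long-lived package grows.
           | The name read-in-scratch-package is made up here, and this
           | is not claimed to be the wrapper the guide has in mind:

```lisp
(defun read-in-scratch-package (string)
  "Read STRING with *PACKAGE* bound to a fresh package, then delete the package."
  (let ((scratch (make-package (gensym "SCRATCH") :use '())))
    (unwind-protect
         (let ((*package* scratch)
               (*read-eval* nil))   ; disable #. evaluation on untrusted input
           (read-from-string string))
      ;; Deleting the package leaves the symbols usable but homeless,
      ;; so they can be garbage collected along with the data.
      (delete-package scratch))))

(let ((data (read-in-scratch-package "(x y x)")))
  (list (eq (first data) (third data))     ; both X: same symbol
        (eq (first data) (second data))))  ; X vs Y: different symbols
;; => (T NIL)
```

           | The reader still calls intern internally; the sketch only
           | limits where the interned symbols end up.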
        
             | ambulancechaser wrote:
             | can you explain?
             | 
             |     ((juxt type identity) (clojure.edn/read-string "x"))
             |     [clojure.lang.Symbol x]
             | 
             | It seems that the reader returns symbols just fine.
        
               | kazinator wrote:
               | I don't use Clojure; I have no idea. If two occurrences
               | of "x" are mapped to the same object, that is interning;
               | maybe what that does is use its own package-like
               | namespace, separately allocated for each call.
               | 
               | The documentation for EDN says that "nil, booleans,
               | strings, characters, and symbols are equal to values of
               | the same type with the same edn representation." The only
               | way two symbol values can be equal is if they are
               | actually the same symbol, I would hope.
        
         | kmill wrote:
         | Did you see the justifications? For example,
         | 
         | > Not only does [INTERN] cons, it either creates a permanent
         | symbol that won't be collected or gives access to internal
         | symbols. This creates opportunities for memory leaks, denial of
         | service attacks, unauthorized access to internals, clashes with
         | other symbols.
         | 
         | It even has some advice on using wrappers for INTERN if you
         | really need it.
         | 
         | The document has provisions for exceptions to the rules.
         | There's discussion about using EVAL despite the fact the rule
         | says you _must not_ use it.
         | 
         | Also, "should avoid" means that you need a good reason,
         | addressed in a comment and code review. Many examples you're
         | probably thinking about are easily containers of elements of
         | like type anyway (allowing mild cases of sum types and such).
         | Though things tend to be more robust with intentionally created
         | data types, I find.
        
           | [deleted]
        
         | rvense wrote:
         | What it says is to not abuse lists, isn't it? I think what they
         | mean is something like, don't use a list as an "object"/tuple
         | that gets picked apart with a lot of cdddaring? Rather, use
         | real structures and other containers as appropriate, keeping in
         | mind the performance characteristics. It specifically says
         | lists are appropriate for macros and functions used by macros
         | at compile-time.
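         | 
         | A small sketch of the contrast, with made-up data; the point
         | structure and its slots are assumptions for illustration, not
         | something from the guide:

```lisp
;; An ad-hoc list used as a record: positions carry the meaning.
(defparameter *point-as-list* (list 3 4 "origin-ish"))
(third *point-as-list*) ; the label, but only if you know the layout

;; The same data as a structure: named accessors and a readable
;; printed representation come for free.
(defstruct point x y label)
(defparameter *point* (make-point :x 3 :y 4 :label "origin-ish"))
(point-label *point*)
```

         | Both forms return the same label; the difference is whether the
         | layout lives in the reader's head or in the definition.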
        
           | reikonomusha wrote:
           | This is exactly what they mean.
        
             | kazinator wrote:
             | I think it's perfectly fine to use a nested list object
             | that gets picked apart with destructuring until the point
             | that it becomes a maintenance or performance problem. (That
             | point could arrive later that same day, or it might come
             | never).
             | 
             | This approach is one of the things that make Lisp Lisp; if
             | it gives you an allergic reaction, use something else.
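             | 
             | For readers unfamiliar with the technique, a minimal
             | sketch using the standard destructuring-bind; the
             | *order* data and its layout are invented for the example:

```lisp
;; A nested list used as an ad-hoc record: (id (item quantity) shipping)
(defparameter *order* '(42 ("widget" 3) :express))

;; DESTRUCTURING-BIND names the parts by position, mirroring the
;; shape of the list, and signals an error if the shape does not match.
(destructuring-bind (id (item quantity) shipping) *order*
  (list id item quantity shipping))
;; => (42 "widget" 3 :EXPRESS)
```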
        
               | dreamcompiler wrote:
               | I have seen things you people wouldn't believe. 500-line
               | 7-deep plists off the shoulder of Orion. I watched CLOS
               | glitter in the dark, unused, near the Tannhauser Gate.
               | 28-clause typecase statements because the author didn't
               | understand generic functions. All those moments will be
               | lost in time, like tears in rain.
               | 
               | And thank dog for that.
        
               | TeMPOraL wrote:
               | I once scored a huge performance win in a commercial project
               | by discovering that some numerically-heavy computations
               | on large vectors of numbers were actually using _lists_
               | to keep those numbers. I replaced those universally with
               | arrays, yielding some ridiculous efficiency benefit. The
               | lists were fine when there were 5 numbers in the bag, but
               | nobody noticed when the amount of numbers grew to
               | thousands...
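               | 
               | A sketch of how this kind of slowdown can arise. It
               | assumes the numbers were accessed by index, which the
               | story above does not actually say; sum-by-index is a
               | made-up helper:

```lisp
(defun sum-by-index (sequence)
  "Sum SEQUENCE by index with ELT: O(1) per access on a vector, O(n) on a list."
  (let ((total 0))
    (dotimes (i (length sequence) total)
      (incf total (elt sequence i)))))

(defparameter *numbers-list* (loop for i from 1 to 2000 collect i))
(defparameter *numbers-vector* (coerce *numbers-list* 'vector))

;; Same result either way, but the list version walks from the head on
;; every ELT call, so the loop as a whole is quadratic in the length.
(= (sum-by-index *numbers-list*) (sum-by-index *numbers-vector*)) ; => T
```

               | With 5 elements the difference is invisible; with
               | thousands it is not.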
        
               | kazinator wrote:
               | Such things are quantifiable. We can have an exact rule
               | which says that you can use an ad-hoc list as a structure
               | if (for instance):
               | 
               | 1. No more than 17 functions handle this datum, spread
               | among no more than three source files.
               | 
               | 2. The structure contains no more than 8 conses.
               | 
               | 3. In a long-running application under a typical
               | production load, no more than 10,000 of these objects are
               | freshly allocated in any five minute period.
               | 
               | 4. A major software component (such as a library) can
               | internally have at most three separate instances of such
               | a data type, and they are not to be involved in the APIs
               | between major software components.
               | 
               | Okay, that's now a target we can enforce without wishy
               | washy judgment calls.
        
               | joshuamorton wrote:
               | > Such things are quantifiable. We can have an exact rule
               | which says that you can use an ad-hoc list as a structure
               | if (for instance):
               | 
               | So what happens when you have a datum that is used in
               | exactly 17 functions, but you need to add another
               | feature?
               | 
               | Or, what happens when people combine functions into
               | larger ones to avoid having to define a real datatype,
               | or...
               | 
               | These concerns exist for each thing.
               | 
               | "Don't use a list where you really have a struct" is much
               | more concrete and quantifiable.
        
               | kazinator wrote:
               | > _but you need to add another feature?_
               | 
               | You have to add your feature in such a way that the
               | resulting code meets the rules.
               | 
               | It's exactly the same as when you need to add something
               | to a line that is already 79 characters long (maximum
               | allowed by your coding convention). Or if you have to add
               | lines of code into a function that is already 200 lines
               | long (coding style max). Or if you need another argument
               | in some API that already has the maximum of 8. You have
               | to step back to some extent and change more of the
               | surrounding program than just shoving your intended
               | change into it.
               | 
               | > _Or, what happens when people combine functions into
               | larger ones to avoid having to define a real datatype_
               | 
               | If the code remains under the quantified limit for
               | function size, then it is complying with the document.
               | 
               | > _"Don't use a list where you really have a struct" is
               | much more concrete and quantifiable._
               | 
               | There are two answers to "where do you really have a
               | struct?" One is purely opinionated, and one is objective.
               | The objective answer is: "you really have a struct in all
               | situations where the list isn't a variable length
               | container of items of the same type".
               | 
               | So the concrete and quantifiable (and therefore the
               | right) interpretation of the rule amounts to _never_
               | using a fundamentally characteristic Lisp technique, in
               | Lisp!
        
               | dragonwriter wrote:
               | > Okay, that's now a target we can enforce without wishy
               | washy judgment calls.
               | 
               | Sure, you can have an exact set of rules like that, and
               | feel free to have an automated enforcement of your own
               | set of exact rules. There are good reasons coding style
               | guides often include things which are _not_ exact rules,
               | and the target for them is often not automated
               | enforcement but supporting human judgement that balances
               | multiple factors. Yes, that results in fuzzy boundaries,
               | but that's because experience has shown that there is an
               | issue without (yet) providing a sufficient basis for a
               | quantifiable boundary: the set of factors being
               | balanced is complex and multidimensional. Reducing the
               | dimensionality for simplicity of automated enforcement is
               | easier, but not necessarily _better_.
        
               | monadic2 wrote:
               | For an alternative point of view, I write lisp for my day
               | job and would not approve any code that just shoved a
               | bunch of unrelated data in a list. Thinking about the
               | destructuring code, especially if you don't have access
               | to pattern matching, is enough to give me a headache. You
               | know that's not going to be documented, and if it is,
               | people won't be able to find the documentation. Just use
               | records; every lisp has records, and it's the point of
               | records, and it's not any less "elegant" or "beautiful"
               | or "lispy" or whatever to have named fields and nice
               | printing and accessors for free.
               | 
               | The one exception I can think of is procedures with
               | variadic arguments.
        
               | kazinator wrote:
               | Why don't you have access to pattern matching in your
               | Lisp day job?
               | 
               | (There is at least a coffee machine, health benefits, and
               | a halfway ergonomic chair to sit on, hopefully.)
        
               | reikonomusha wrote:
               | The purpose of code guidelines is to do exactly that:
               | guide developers into a certain practice that's
               | consistent and broadly agreeable in a team setting. Using
               | lists to fake objects isn't good modern style, and it's a
               | reasonable guideline to suggest not doing that. I'm sure
               | Googlers won't gripe if you temporarily cons up a list to
               | shuttle it around. I'm sure they will gripe if you're
               | doing SICP 101 style object definitions exclusively with
               | lists. The latter easily gets out of hand in a language
               | renowned for its several robust facilities for object
               | definitions.
               | 
               | MAXIMA (nee MACSYMA) is known for lists-as-objects not
               | only being depended upon, but also getting so out of hand
               | it's nearly impossible to refactor now. MAXIMA at least
               | has the excuse of being very old software.
               | 
               | I feel you're interpreting this style guide as some kind
               | of draconian law of writing Lisp code at Google, and
               | making awful conclusions from it (e.g., "well if you're
               | allergic to objects as lists might as well not use Lisp"
               | or "most of this can be summed up as writing Lisp like
               | Java").
        
               | kazinator wrote:
               | In my experience, these rules end up deployed in such a
               | way that you cannot commit any change until they are
               | obeyed to the letter. (That goes doubly so if they get
               | encoded into an automated tool.)
               | 
               | I don't disagree with that; a coding standard that is not
               | enforceable without generous judgment calls is far less
               | useful than a rigorous one.
               | 
               | Make it exact, and then get everyone to stick to it.
        
               | aidenn0 wrote:
               | I have the exact opposite experience; the job whose coding
               | standard I liked best explicitly had ways of allowing
               | variances. The CEO came down hard on anyone who said that
               | "The standard forbids X" and would tell them to reread
               | the standard because it actually said something like
               | "Don't do X without doing Y first."
               | 
               | The basic mentality was if you couldn't responsibly
               | follow the style guidelines after working there for a
               | year, then you should be looking for work elsewhere.
        
       | alaaalawi wrote:
       | YMMV but I do prefer the following:
       | https://www.cs.umd.edu/~nau/cmsc421/norvig-lisp-style.pdf
        
       | lordgrenville wrote:
       | Slightly off-topic but is this official, and if so...why is
       | Google hosting stuff on Github Pages? Seems sort of amateurish.
       | Not to mention it belongs to a rival of theirs.
        
         | jgodbout wrote:
         | Most Google open sourced code is on Github. Generally the code
         | made by developers isn't "official"; it's just code (documents)
         | made by people at Google.
        
         | cbarrick wrote:
         | Google moved a lot of their open source projects to GitHub
         | after they shut down Google Code. That was before the MS
         | acquisition.
         | 
         | You can find a lot of smaller projects at
         | https://github.com/google, and obviously there are some big
         | ones like https://github.com/tensorflow/tensorflow.
         | 
         | Notable exceptions are Android and Fuchsia, which have their
         | own hosted git repos at
         | https://{android,fuchsia}.googlesource.com.
        
           | lordgrenville wrote:
           | Thanks for clarifying. It makes sense to host repos on GitHub
           | - it's so widespread - but what struck me as weird was using
           | the .github.io domain, which I usually associate with amateur
           | bloggers who don't want the hassle of registering a domain.
        
         | Jtsummers wrote:
         | GitHub hasn't always been owned by MS, I believe this was up on
         | GitHub before the acquisition (along with a lot of other Google
         | content after they closed up Google Code).
         | 
         | And yes, it's official. Google acquired ITA which was rather
         | famous as a Common Lisp shop. This meant that they had acquired
         | a substantial Common Lisp codebase. If you drop the document
         | from the URL you end up at [0] which includes links to other
         | language style guides.
         | 
         | [0] https://google.github.io/styleguide/
        
         | slenk wrote:
         | Google doesn't have a product like that...
        
       | brundolf wrote:
       | > Macros bring syntactic abstraction, which is a wonderful thing.
       | It helps make your code clearer, by describing your intent
       | without getting bogged in implementation details (indeed
       | abstracting those details away). It helps make your code more
       | concise and more readable, by eliminating both redundancy and
       | irrelevant details. But it comes at a cost to the reader, which
       | is learning a new syntactic concept for each macro. And so it
       | should not be abused.
       | 
       | I really think this just applies to any kind of indirection -
       | classes, functions, even named constants (vs literals).
        
         | skybrian wrote:
         | Syntax changes have different consequences for reading
         | comprehension. You can often skip over a function call, making
         | only a reasonable guess at what the function does and relying
         | on invariants for all function calls. A macro can arbitrarily
         | change the lexical environment of anything contained in it with
         | few constraints, so reading other parts of a file without
         | knowing what each macro does is more precarious.
         | 
         | And when it comes to navigating large amounts of code, you do
         | need to stop somewhere; you can't do a depth-first read of
         | everything before doing anything.
        
         | lioeters wrote:
         | Indeed, for what are classes and functions but specific kinds
         | of macro (roughly speaking); or, macros and classes as special
         | kinds of functions.
         | 
         | I'd include overloading operators in that list. It can be
         | convenient, but comes at a cost to newcomers to the codebase.
         | 
         | I suppose any kind of shortcut or abbreviation carries this
         | risk, to increase the cognitive load of the reader - things
         | they have to remember and mentally substitute the shortcuts
         | until they become second nature.
         | 
         | (Oh, right, what we call "shortcut" and "indirection" are both
         | examples of _abstraction_, with its value and cost.)
        
         | aidenn0 wrote:
         | Right; abstractions are good only insofar as they don't leak,
         | but the question is "how easy is it to debug when it does
         | leak?"
         | 
         | I think you listed classes, functions, and named constants in
         | approximately the order of debuggability too.
         | 
         | It can be unclear even which macroexpansions are in play from a
         | backtrace, much less which one caused the breakage.
         | (Non-inlined) functions are right there in the backtrace, and of
         | course, debugging a named constant is as simple as typing its
         | name into the REPL.
        
           | fiddlerwoaroof wrote:
           | In Emacs, the macrostep expander solves most of the macro
           | debugging issues: you usually can expand a macro use, and see
           | exactly what code is being generated or "refactor" the macro
           | away by copying the expansion at a certain level of detail
           | and replacing the original form with it.
        
             | aidenn0 wrote:
             | Yes, this is a useful tool; I still maintain that debugging
             | macros with this tool is harder than debugging functions
             | with the various other SLIME tools.
        
         | dreamcompiler wrote:
         | > I really think this just applies to any kind of indirection -
         | classes, functions, even named constants (vs literals).
         | 
         | Disagree. Macros are fundamentally different in that they can
         | change the syntax of the language. As such, they can inhibit
         | the code's readability in ways the other defining forms cannot.
         | Looking up something by name in CL is as simple as meta-dot.
         | But you cannot meta-dot syntax. (Well, you could on a Lisp
         | Machine, but not so much on current CL implementations.)
        
           | TeMPOraL wrote:
           | > _But you cannot meta-dot syntax._
           | 
           | You can C-c M-e the macro form to have SLIME expand it for
           | you inline (read-only), and then navigate around it, using e
           | to expand and c to collapse back. Super useful.
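           | 
           | Outside the editor, the standard macroexpand-1 gives a
           | similar view at the REPL. A minimal sketch; with-doubled
           | is a made-up macro that exists only to have something to
           | expand:

```lisp
;; Hypothetical macro used only for illustration.
(defmacro with-doubled ((var value) &body body)
  `(let ((,var (* 2 ,value)))
     ,@body))

;; MACROEXPAND-1 performs a single expansion step and returns the
;; resulting form as ordinary list data.
(macroexpand-1 '(with-doubled (n 21) (+ n 0)))
;; => (LET ((N (* 2 21))) (+ N 0))
```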
        
           | brundolf wrote:
           | It's a difference of degree, not of kind. Any indirection
           | forces the user to reason about a constructed concept instead
           | of the literal facts of what's happening. By introducing one,
           | you're making the assertion that the abstract concept is
           | easier to reason about (including any effort required to
           | learn it) than the contents being abstracted away.
        
       | butterisgood wrote:
       | Went looking for advice on concurrency/parallelism and error
       | handling.
        
         | aidenn0 wrote:
         | The 2 word guide to concurrency and parallelism: Use fork().
        
         | danielam wrote:
         | FWIW, speaking from memory, if QPX, Google's (previously ITA's)
         | low airfare search engine, is any indication, then the lack of
         | any real mention is not surprising because QPX did not make use
         | of parallelization or concurrency. The way it was written made
         | that move very difficult. From what I recall, there was
         | interest in parallelizing some bits of the computation, but I
         | don't recall it ever really going anywhere. (QRES, ITA's
         | reservation system, which was also written in Common Lisp, may
         | have made use of concurrency or parallelization in some way,
         | but my knowledge of that system is limited.)
         | 
         | N.b., QPX did not quite follow all of these recommended
         | practices during my tenure (e.g., ubiquitous SETFing of object
         | slots).
        
       | ch_123 wrote:
       | I assume (perhaps naively) that Google must have a non-trivial
       | amount of CL development if they have a style guide for the
       | language... anyone know what they use CL for?
        
         | shaftway wrote:
         | It doesn't take much need to write a style guide. And this one
         | is rather lackluster. By comparison look at the shell style
         | guide: https://google.github.io/styleguide/shellguide.html
        
           | gpanders wrote:
           | From the linked shell style guide:
           | 
           | > Indent 2 spaces. No tabs.
           | 
           | > Use blank lines between blocks to improve readability.
           | Indentation is two spaces. Whatever you do, don't use tabs.
           | 
           | The recent obsession with 2 space indents is boggling to me.
           | I find it much more difficult to read (especially in long
           | blocks of code with lots of indentation switches) and I'm not
           | even visually impaired.
           | 
           | They also apparently _really_ don't like tabs, which I find
           | interesting. Personally, I was converted to the tabs camp
           | after learning how much better tabs are for accessibility.
           | I'm surprised that Google of all places doesn't take that
           | more seriously.
        
         | carry_bit wrote:
         | My guess: Google bought ITA Software about 10 years ago, and
         | their search engine was written in Common Lisp.
        
           | danielam wrote:
           | Yes. The authors of this document were all engineers who
           | worked on QPX/QRES.
        
       | jimbokun wrote:
       | Is it weird that I enjoy reading this, even though I haven't
       | programmed in Common Lisp for a long while?
        
       | dang wrote:
       | If curious see also
       | 
       | 2012 https://news.ycombinator.com/item?id=4639490
       | 
       | Not Google's, from last year:
       | https://news.ycombinator.com/item?id=20505807
        
         | airstrike wrote:
         | Beat me to it! Link to the article in the 2012 submission is
         | broken, so here's a web archive copy:
         | http://web.archive.org/web/20130114221734/https://google-sty...
         | 
         | Seems like the exact same document, which harks back to the ITA
         | Software acquisition per comments at the time of that
         | submission.
        
       ___________________________________________________________________
       (page generated 2020-07-07 23:00 UTC)