[HN Gopher] The hardest program I've ever written (2015)
___________________________________________________________________

The hardest program I've ever written (2015)

Author : graderjs
Score  : 230 points
Date   : 2022-03-05 10:06 UTC (12 hours ago)

(HTM) web link (journal.stuffwithstuff.com)
(TXT) w3m dump (journal.stuffwithstuff.com)

| 734129837261 wrote:
| The day job interviews for programmers ask "write me a language formatter, you have 3 hours", I'll probably end up in jail. Those things are way beyond my skillset and I'm glad smarter people than me exist. If you're one of those people: thank you. I love you.

| viginti_tres wrote:
| Why jail? Cause you'll lose your mind and do harmful things? Asking for a friend

| andai wrote:
| The result will be so bad, it will be considered a violation of the Geneva Convention.

| dang wrote:
| Related:
|
| _The Hardest Program I've Ever Written - How a code formatter works (2015)_ - https://news.ycombinator.com/item?id=22706242 - March 2020 (125 comments)
|
| _The Hardest Program I've Ever Written (2015)_ - https://news.ycombinator.com/item?id=17271963 - June 2018 (76 comments)
|
| _The Hardest Program I've Ever Written (2015)_ - https://news.ycombinator.com/item?id=15063193 - Aug 2017 (48 comments)
|
| _The Hardest Program I've Ever Written_ - https://news.ycombinator.com/item?id=10195091 - Sept 2015 (76 comments)

| paxys wrote:
| If I ever have to write a code formatter, it will strictly enforce one line per statement and disallow artificial line breaks. Devs who end up writing 5000-character function chains better have a wide monitor.

| LudwigNagasena wrote:
| Long statements are one of the reasons I dislike "fluent interfaces". To me long statements feel like a problem of bad language design. And a super smart formatter feels like a crutch when what you really want is a leg.

| cosmiccatnap wrote:
| [deleted]

| amelius wrote:
| Does it find a true optimum, or just some approximation?

| algon33 wrote:
| Approximation

| munificent wrote:
| As long as it doesn't hit a built-in hard limit for search space exploration (which is in practice only encountered on pathological generated code), it will find the optimally scored set of line breaks.

| coliveira wrote:
| I don't understand why people have this fetish for automatic formatters. If you really want this, you should be using old style FORTRAN or something similar. The good thing about modern languages is that you don't depend on the location of code in the page for it to work. If you start worrying too much about exact formatting, you throw away this big advantage. I really prefer code in the location where I put it, not where a machine thinks it is best.
|
| And if you think that formatting is a problem to understand the code, let's get real: this is the smallest of the problems. There are tons of other things that make code complicated to read, like variable and function names, the particular style of your code, how you split it into classes and files, the algorithm you're using, and so many other, more important things. I can guarantee you that if a piece of code is well written, you can understand it independent of where you put braces or the number of spaces you're using.

| dahart wrote:
| In addition to what @derefr said, in order to not want automatic formatting, first you have to get to the place where zero people in your team/company care about formatting & whitespace at all. Disagreements over whitespace consume people time, and those disagreements go away when automated formatting is used. This is the strongest reason in my experience to use automatic formatting: to eliminate time spent talking about formatting.
|
| Auto-formatting tools in editors exist, and they're very common, and they're not always configured properly, so people change formatting by accident. Sometimes formatting changes can cause code reviews to take more time than necessary. Having tabs in code can cause actual problems, for example, since tabs aren't the same size everywhere.
|
| This is not just a code understanding problem, and shouldn't be written off as trivial, IMO.

| [deleted]

| avl999 wrote:
| Auto Format is not for the machine, it is for other humans who work with you.

| paxys wrote:
| If you truly think that, you have never worked on a codebase with a team size > 3.

| umanwizard wrote:
| I like automatic formatters (if they're a deterministic function of AST to text) because I think of what I'm writing as a syntax tree, and the fact that it's stored as text as a historical accident.
|
| I just want to write the tokens without ever thinking about where they go on the page, periodically save and let my formatter deal with it.

| derefr wrote:
| I take it that you don't work on a large corporate code-base / don't have to code-review other people's code?
|
| Auto-formatting (esp. when used as a pre-commit hook) means that _changes_ people make to the style are ignored/reverted (and/or, that places where people introduce a different style in new code, are auto-formatted back into the existing style immediately, rather than that needing to be an additional commit later on.) Thus, no spurious diff lines from formatting. Thus, not having to wade through a bunch of "noise" diff-lines, to get to the "signal" of semantic changes at code-review time.
|
| Also, having auto-formatting on both your main branch + development branches, makes merge/rebase conflicts less likely to happen. (Which basically boils down to "fewer noise diff-lines" again.)
|
| In other words: auto-formatting makes code more machine-legible to _syntax-blind_ parsers; which in turn allows tooling like diff(1) to be more helpful.
|
| (Yes, we _could_ just have language-syntax-aware semantic-level diff/merge/etc. tools. Not sure why nobody ever made these. I bet this is one of those things where Lisp users have had it for ages but using their own parallel world of abstractions that doesn't exist in C/POSIX.)

| coliveira wrote:
| This has nothing to do with formatting. When you create a change to a code base you should be submitting only the lines that are new/changed. If someone is submitting purely formatting changes, he/she's just wrong and you should reject that during review.

| derefr wrote:
| > If someone is submitting purely formatting changes, he/she's just wrong and you should reject that during review.
|
| If you add a line between two existing lines, and then insert after it a new _blank_ line to serve as a sort of "paragraph marker", is that a "pure formatting" change?
|
| If you add a constant in a group of constants, whose name is longer than the existing ones, do you pad the spacing of the values of those constants so they line up with one-another?
|
| For that matter, if you fully-qualify a previously-unqualified and potentially-ambiguous identifier, is _that_ a "pure formatting" change? Some auto-formatter tools do this, after all.
|
| These are things that people may or may not do in code-bases, that "fly under the radar" of even the most stringent of human code-reviewers, because they're so irrelevant to _understanding_ the code. They're "fluff." But because of this, how people introduce that fluff is essentially random, and so the cause of a lot of diff noise. These are the things that auto-formatters can "lock down" to only happen a certain way.
|
| But I think you're missing the forest for the trees, as I mostly wasn't talking about _pure_ formatting changes. What I'm talking about is more like:
|
| You add a formal parameter to a function. Before, the function's clause head was less than 80 characters. Now it's more than 80 characters. Do you break the formal parameter list onto the next line? If so, how far do you indent it? Do you split the formal-parameter list up so that _each_ parameter is now on its own line? Etc.
|
| Done by humans with no strict standard, these sorts of one-off judgements made arbitrarily will add up to "syntax rot" -- not something you observe with your eyes, but a sort of "potential energy" of un-made formatting changes, that means that any given _semantic_ change by a sufficiently-motivated human might become the impetus for a manual reformatting _during_ that semantic change, such that that reformatting will happen at a _random_ time, inflating a patch where affecting that additional code _wasn't_ strictly necessary. (If you ask the programmer why they did it, they'll say they needed to "clean up the code they were working on" so that they could understand it well enough to apply the fix.) Which is horrible for both code review _and_ merge predictability.
|
| On the other hand, an auto-formatting tool will apply that transformation exactly _when_ it becomes necessary; and will pick some way of formatting the additional lines and stick to it. There's no "potential energy" there. At all times, the codebase is "at rest", with no chance of anyone introducing "arbitrary" (but actually _left-over_) formatting changes.
|
| Human formatting is like a sequence of DML statements in an RDBMS. Auto-formatting is like a sequence of operations against a CRDT. Given a bunch of changes run in a _random order_, the output of human formatters will be arbitrary, while the output of auto-formatting will be deterministic.
| Which is what you want, if you're doing complex things involving e.g. long-maintained stable branches for 1.x that cherry-pick changes from 2.x.

| brtmr wrote:
| > The good thing about modern languages is that you don't depend on the location of code in the page for it to work. If you start worrying too much about exact formatting, you throw away this big advantage.
|
| Counterpoint: When using a formatter, I stop worrying about formatting. It's a job for a computer, done by a computer. Humans are bad at consistency and discipline, computers are great at it. I want to concentrate on the things that matter, and formatting isn't one of those.
|
| Especially in larger teams, consistent formatting is just nice. No conflicting styles in the same file, and more meaningful diffs.

| coliveira wrote:
| If you really want to stop worrying about code formatting, just stop doing it. It is not really that important. I have never spent any time worrying about it, and I don't see why people would be upset about formatting.
|
| Moreover, using an automatic formatter will not fix it, because, guess what, there is no universal code formatter. All of them have different results and a long list of parameters. Determining the best way to use one will create more work for you as you manage your team, and will inevitably add a new step to your already complex building process. Just stop worrying and use that time in more productive ways.

| RussianCow wrote:
| > Determining the best way to use one will create more work for you as you manage your team, and will inevitably add a new step to your already complex building process.
|
| I don't know. I write JavaScript at $DAY_JOB and setting up Prettier on our repos took all of ~30 minutes, with an additional ~15 to determine which options to use. (There aren't many because Prettier is fairly opinionated.) I have seen far more time wasted quibbling about code styling in code reviews.

| jan_Inkepa wrote:
| I've been thinking of working on an automatic formatter for one particular programming language in order to easily be able to guarantee consistency of the documentation examples for it. (I get occasional bug reports about stylistic inconsistency or inconsistent spacing in them every so often)

| yashap wrote:
| Strongly disagree:
|
| - Not having to format my code manually at all, just letting the formatter do it for me, is a significant productivity win. I write code as fast as I can, with the minimum number of key strokes, in a way that would normally be super ugly, and it comes out the same. I have my editor set up to auto-format the current file on save, so it's just type a bit of code with zero formatting, cmd+s, then it's instantly perfectly formatted
|
| - For a codebase with 10s or 100s of devs working on it, uniform formatting does significantly help readability. Sure I can still read it if there's dozens of different formatting styles going on, but I can read it faster if the formatting is always consistent
|
| - Re: the above, yes you can keep consistent formatting without a code formatter, with a style guide that everyone learns, and that you enforce in code reviews. But that's a waste of time both for on-boarding new devs, and a basically neverending waste of time during code reviews. Also a waste of time writing and maintaining the style guide itself
|
| The first point helps me write faster, the second helps me read faster, and the third keeps code reviews and the like quicker.
|
| Code formatters are such a clear, easy win, especially with large teams, that it's hard for me to understand why anyone would opt out of them. It's not a MASSIVE win, but IMO it clearly makes for a more productive development environment, and they're generally dead simple to set up.

| coliveira wrote:
| I have worked on teams that do automatic formatting and others that didn't.
| I have never seen any advantage of automatic formatting. In my experience, people who like to complain about simple things like where to put braces or where to break a line will move the goal posts and start to complain about particular parameters of the formatter, or try to change the formatter to something "more powerful". People who don't care about location of braces will continue working without problems, and everything will be the same as before, just with the added complexity.

| yashap wrote:
| The point is not that any one code formatting style is best, the point is that consistency in formatting across a codebase helps you read code faster. Our eyes and brains are good at picking up patterns - consistent patterns let our brains parse code faster than if every file is written in a different style.
|
| Furthermore, not worrying at all about indentation, spacing, brace placement, semicolons or not, etc. lets me write code faster, not just read it faster. Type it out with zero effort expended on formatting, save, editor auto-formats.
|
| It's not that any of this saves crazy amounts of time, but it does make all of code writing, code reading and code reviews slightly faster. When it's so easy to set up, why not do it?
|
| The only argument I can see against auto-formatters is that people like to put their own artistic touch on the code they write. I get that, but it wastes time, especially when everyone starts doing things in their own style.
|
| I've been working professionally as a dev for 9 years, also on teams that use auto-formatters, and teams that don't. I think they're a small but clear productivity booster.

| oneeyedpigeon wrote:
| I have a particular fondness for well-formatted html that I can read via 'view source' - in contrast to the div-overloaded soup that I usually encounter.
| Periodically, I toy with automated formatting for html until I remember that it's essentially impossible - two html sources can have a single space character difference and simultaneously produce the same output and different output, depending on an external CSS file. This kind of stuff is tricky.

| [deleted]

| ameliaquining wrote:
| I'm confused, don't the browser dev tools do exactly this? That seems better than sending pretty-printed HTML on the wire, which is a bunch of unnecessary bytes that users have to pay for.

| oneeyedpigeon wrote:
| The browser dev tools display something very different to the source of the page - it's a live view of the document, for a start.

| ameliaquining wrote:
| That's addressed easily enough: disable JavaScript, so that the document can't change.

| lesam wrote:
| It's interesting that the article specifically mentions the go formatter, but fails to notice that the go formatter sidesteps this problem entirely by not setting a line length constraint: https://news.ycombinator.com/item?id=16434566

| xphx wrote:
| > _If every statement fit within the column limit of the page, yup. It's a piece of cake. (I think that's what gofmt does.) But our formatter also keeps your code within the line length limit._

| CodesInChaos wrote:
| As a programmer I prefer formatters that don't introduce those heuristic line-breaks based on line length.
|
| I'm still hoping that Rust will eventually get such a formatter. Unfortunately the people responsible for rustfmt seem to have a strong preference for the "ignore line-breaks the user inserted" approach.

| boardwaalk wrote:
| This grinds my gears. Especially when you have multiple similar lines and one is a character longer and the formatter breaks that line.
|
| It's like, I could scan the code easier and understand it better without you doing that, thank you!
|
| A similar thing is match statements -- some arms using braces vs statements.
|
| There's also a bunch of heuristics in rustfmt that are complex to the point that I literally couldn't format the code the way it does without building some sort of decision tree annotated with uneven column limits (e.g. "70% of the column limit"). If I can't and therefore wouldn't format the code the way the formatter does, there's an issue.
|
| Formatters are for consistency, I think, and sometimes they work against that.

| jynelson wrote:
| It's not the maintainers - rustfmt has an official style guide it's not allowed to break without an RFC: https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide...

| mbrock wrote:
| Zig's formatter has no notion of line length. If you want an argument list to be stacked vertically, you insert a trailing comma after the last argument, otherwise the formatter will put it all on a single line. I was a bit bothered by this at first but I came to really like it.

| umvi wrote:
| Agreed. Formatters are supposed to remove all cognitive burden related to formatting. But formatters like Black (for python) will do line-length based formatting which reintroduces the cognitive burden again ("oh crap, my variable names are too long, better shorten them so this line stops getting broken up"). I like gofmt better for this reason. It doesn't break up your lines based on some arbitrary line length.

| sixstringtheory wrote:
| I usually set the line length limit to somewhere around 80 so I can have several columns visible at once without wrapping or truncation.
|
| So far my magic number is three columns: three code files, or one/two code files with a terminal and/or browser window thrown in the mix. Or any of these columns can be split into two vertically stacked boxes for a total of six things.
|
| I'm also a (nonultrawide) single screen coder (after many forays into the multi screen world) which has undoubtedly guided my preference.

| munificent wrote:
| Ultimately code is a visual medium (for most users). It is a data format consumed primarily by human eyeballs, so there is no escaping the reality that things like identifier length, line length, wrapping, etc. matter.
|
| Any other strategy is like trying to design a chair without thinking about butts. You may come up with some sort of elegant Bauhaus mathematically perfect work of art, but no one will want to sit in it.

| jll29 wrote:
| (2015)

| svalorzen wrote:
| I'm not sure I understand why dynamic programming wouldn't work (and the author explicitly mentioned Knuth). TeX's main job is literally doing line breaks, which is the exact same problem being tackled here. I would expect a similar approach (progressively build a graph of the most promising breaking points) to be effective. Why wouldn't it be the case here?

| mirekrusin wrote:
| Yes, strange, it looks like that's his solution plus some ad hoc logic. At the same time he's more knowledgeable than I am so dunno.

| pantsforbirds wrote:
| There are some nice clarifications to the problems he ran into with his DP implementation with the little skulls at the bottom of the article.

| svat wrote:
| As someone very familiar with the Knuth-Plass line-breaking algorithm (https://tex.stackexchange.com/a/423578/48), an important difference I see here is that for paragraphs (the domain of TeX), there is no "state" that needs to be preserved across lines: if you know that your paragraph is going to choose a certain break-point, then you can pretty much typeset the "before" and "after" parts independently, each optimally. (With one exception: there is a penalty for hyphens being on successive lines, so we need to track whether the previous line was hyphenated.) This is the "optimal substructures" property that makes it so amenable to dynamic programming.
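
To make the "optimal substructure" point concrete, here is a minimal sketch of paragraph line breaking by dynamic programming. It is not TeX's Knuth-Plass algorithm: the cost function (squared trailing space, with the last line free), the name `break_lines`, and the handling of overlong words are all assumptions chosen for illustration. The property it demonstrates is the one described above: once a break before word `i` is fixed, the best layout of the remaining words depends only on `i`, so one table entry per position is enough.

```python
def break_lines(words, width):
    """Split words into lines of at most `width` columns, minimizing the
    sum of squared trailing spaces (the last line costs nothing).

    Illustrative cost model only; real Knuth-Plass uses stretchable glue,
    penalties and demerits.
    """
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost to set words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: index where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1                  # cancels the space counted before the first word
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            # Stop once the line overflows, but still allow a single
            # overlong word to occupy a line on its own.
            if length > width and j > i + 1:
                break
            cost = 0 if j == n else (width - length) ** 2
            # The subproblem best[j] is independent of how we got to j.
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


if __name__ == "__main__":
    for line in break_lines("aaa bb cc ddddd".split(), 6):
        print(line)
```

On this input a greedy first-fit would pack "aaa bb" onto the first line and leave "cc" alone on the second; the table-based version prefers a more even split. Note that the table is keyed by position alone, which is exactly what stops working once indentation state has to be carried across breaks.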
|
| With the code formatter, to format the part after a certain character, you need to keep track of the indentation depth of all the expressions that have not yet terminated at this point -- because you presumably want parallel expressions to be formatted with the same indentation depth, for closing parentheses to match their corresponding opening parentheses, etc.
|
| For example, in this example:
|
|     experimental = document.querySelectorAll('link').any((link) =>
|         link.attributes['rel'] == 'import' &&
|         link.attributes['href'] == POLYMER_EXPERIMENTAL_HTML);
|
| and, say (I'm making this up):
|
|     experimental = document.querySelectorAll('link').any(
|         (link) =>
|             link.attributes['rel'] == 'import' &&
|             link.attributes['href'] == POLYMER_EXPERIMENTAL_HTML);
|
| -- knowing that there's a break after the `&&` is not enough; you also need to know the indentation of the previous expressions, to decide how you're going to format the part after the `&&`.
|
| This is what the author alludes to in the post:
|
| > _A line break changes the indentation of the remainder of the statement, which in turn affects which other line breaks are needed. Sorry, Knuth. No dynamic programming this time. [...] For most of the time, the formatter_ did _use dynamic programming and memoization. [...] It worked fairly well, but was a nightmare to debug._
|
| > _It was_ highly _recursive, and ensuring that the keys to the memoization table were precise enough to not cause bugs but not_ so _precise that the cache lookups always fail was a very delicate balancing act.
| Over time, the amount of data needed to uniquely identify the state of a subproblem grew, including things like the entire expression nesting stack at a point in the line, and the memoization table performed worse and worse._
|
| In TeX, paragraphs have each line of the same width (simple case) or can have a \parshape (in general), but these are "global" constraints that don't depend on what breaks you choose.

| erichocean wrote:
| > _I would expect a similar approach (progressively build a graph of the most promising breaking points) to be effective. Why wouldn't it be the case here?_
|
| That's...how he does it.

| raverbashing wrote:
| Whenever I read something like this I wonder that current languages (even the higher level ones) are poor at expressing higher-level concepts like that in a practical way and capturing that complexity (in an easily manageable form)
|
| One of the hardest parts of programming is understanding what's happening from reading code. And if you abstract too much "the traditional way" then it is just even harder to understand.

| r34 wrote:
| That's something that I was thinking about yesterday, while writing my code in PHPStorm. I thought how much easier those modern tools make programmer life, how intuitive they are and how hard it must have been to get to the current state of art. Thanks for that, creators!

| Shadonototra wrote:
| language formatter/server with semantic analysis are the hardest thing ever

| thedatamonger wrote:
| Bravo! Well written and informative! And as someone who's obsessed with Dart at the moment timely! Thanks!

| tgv wrote:
| Heuristics? That sounds like a job for machine learning, and I'm not being frivolous. I think that it is doable, and when it gets it wrong, the consequences are almost nil. It would at least make a decent graduation project.
|
| Let's not think of the spin-off: FaaS.

| munificent wrote:
| There is research into using ML for automated formatting. Personally, I'm not a fan. The heuristics are relatively simple and when hand-authored can be _explained_. Throwing ML at it discards explainability and risks really weird formatting decisions on edge cases for relatively little upside.
|
| My experience is that people prefer formatting that is:
|
| 1. Unsurprising.
|
| 2. Nice looking.
|
| 3. Simple.
|
| In roughly that order. Using ML might increase 2 but at the expense of 1 and certainly 3.

| iudqnolq wrote:
| Part of the complexity is if I format my code and commit it, and then you checkout and make a change, we don't want your formatter to have different opinions on how to format my code. For this you need (partial) stability at least across minor versions, which is harder with a less-explainable algorithm.

| dorianmariefr wrote:
| Related for JavaScript, Ruby, HTML and many more https://prettier.io
|
| And the creator of prettier/plugin-ruby worked on a pure-Ruby implementation https://github.com/ruby-syntax-tree/syntax_tree

| knorker wrote:
| Shouldn't this be marked 2015?

| checker659 wrote:
| This is the second or third article on the difficulty of line breaking I've read on HN just this week. Why aren't there any good *exhaustive* tomes on the art of text editors / line breaking / text shaping / text rendering / text on the GPU etc. I'd pay good money.

| xarope wrote:
| is this /s, in reference to latex/knuth and the precursor to yak shaving?

| bombcar wrote:
| There's not a single reference that I know of, at least not covering all aspects. An interesting place to start may be this PDF and its references: https://mirror.math.princeton.edu/pub/CTAN/info/memdesign/me...
|
| TeX famously does line breaking in a perhaps decent way - but it and text shaping become more of an art than a definite list of rules to follow.

| wheelerof4te wrote:
| So many formatters popping up.
| If the old-timers could do without them, I wonder about the usefulness of them.
|
| The best code formatter is _you_.

| sixstringtheory wrote:
| Just because they did without them doesn't mean they wouldn't have preferred to have had them, given the opportunity. And I really doubt you've surveyed them all to make such an authoritative call on this.

| runarberg wrote:
| It's not that formatters are _essential_ but they are extremely convenient. They don't just save you from formatting your own code, they also:
|
| * Prevent arguments on which formatting convention to adopt,
|
| * Save a peer reviewer from shallow comments if formatting conventions are broken,
|
| * Prevent fill-in white space commit from filling in the git history, and
|
| * Decrease the risk of senior developers imposing weird styles on the code base.
|
| Formatters _might_ help you write better code by freeing you from worrying about one aspect of coding, but--much more importantly--they help create and maintain a better culture around your code.

| Hendrikto wrote:
| Formatting is NOT the purely stylistic choice many code formatter authors make it out to be, though.
|
| It can greatly affect readability, lead to/prevent unnecessary merge conflicts, and aid in/stand in the way of using nice VCS features (blame, revert, bisect, cherry-pick, ...).
|
| Many automatic code formatters ignore and do not optimize for these metrics.

| sixstringtheory wrote:
| I've never seen a formatter cause merge or blame issues, but that could be because I always have them run on a precommit hook and even then always squash all commits on a PR so there is never a commit in mainline history solely for code format. I would not admit a PR solely for formatting, either.
|
| Git history trumps code style IMO, I would not e.g. add a formatter to a legacy codebase and then reformat everything, or do so after changing a rule. Only diffs get formatted.
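
One way the "only diffs get formatted" policy could be approached is to extract the changed line ranges from a unified diff and hand only those ranges to a formatter that accepts them. The sketch below covers just the first half; the function name `changed_ranges` is hypothetical, and it assumes the diff was produced with zero context lines (e.g. `git diff -U0`), since otherwise the hunk ranges also cover unchanged context.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,2 +12,3 @@" and
# captures the start line and (optional) length on the new-file side.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_ranges(diff_text):
    """Return inclusive (first_line, last_line) ranges of new-file lines
    touched by each hunk in a unified diff.

    Assumes a zero-context diff; pure deletions (length 0) are skipped
    because there is nothing left in the new file to format.
    """
    ranges = []
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if not match:
            continue
        start = int(match.group(1))
        # An omitted length means exactly one line.
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count > 0:
            ranges.append((start, start + count - 1))
    return ranges


if __name__ == "__main__":
    sample = "@@ -1,0 +2,3 @@\n+a\n+b\n+c\n@@ -10 +13 @@\n-x\n+y\n"
    print(changed_ranges(sample))
```

Whether a given formatter can restrict itself to line ranges varies by tool, so the second half of the pipeline is left out rather than guessed at.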

| majkinetor wrote:
| They also:
|
| 1. Make programming more boring
|
| 2. Prevent me from organizing related thoughts on single lines
|
| 3. Prevent me from doing meaningful and more readable indentation in specific contexts (such as aligning on the equals sign etc.)
|
| I don't like them, nor do I like any of the style checkers. The equivalent is as if somebody wrote a book, and you gave it to GPT-3 to make it more readable for the entire world. Fascinating BS.

| sixstringtheory wrote:
| > _Make programming more boring_
|
| This is a terrible argument for anything to do with programming/code. It is pure opinion and preference and therefore is not falsifiable.
|
| If you're on a team of one, go wild. But if you're on an actual team trying to get things done, please don't bore the other people to death with the "interminably soul crushing debates over code formatting" as the article aptly puts it.
|
| The team wants to see exciting results, not "exciting" code. Code is a means, not an end.
|
| And especially if you're on the job, you are not being compensated with excitement, but money. Go seek excitement in your personal time.
|
| Formatters are not comparable to GPT because the code semantics have not changed, only the form. You just don't want to retrain yourself to read code that isn't written exactly to your liking. That's laziness.

| hhmc wrote:
| Most code formatters have pragmas to allow you to break out of the autoformatter, for when you really do need to override (your case 3).

| cinntaile wrote:
| I can see why it is preferred for teams, if you don't work in teams you can decide for yourself of course.

| hhmc wrote:
| Will you also forgo antibiotics the next time you have a raging infection, or do you only appeal to antiquity _sometimes_?

| wheelerof4te wrote:
| Keep your straws, strawman. I don't want them, be they short or long.

| junga wrote:
| But you are not me and here we are.

| lordpankake wrote:
| Old-timers did without memory safety too, which is why I'm switching our company's stack over to good ol' C

| wheelerof4te wrote:
| Some of the older programming languages had memory safety. C is not the only old programming language.

| tomjen3 wrote:
| Do you also wonder about the usefulness of git?

| wheelerof4te wrote:
| Sometimes, yes.
|
| But I never worked in a big professional team before, where git's features shine.
|
| And git != code formatter.

| unfocussed_mike wrote:
| Preaching to the choir here I suspect, but if other lone developers are reading this:
|
| Git's features shine just as brightly, IMO, if you work for yourself.
|
| Granted a good number of them are often not very relevant to a lone developer, but if you work for or by yourself on any projects over the long term, you should build git (or some other distributed source control, but probably git) into the way you work.
|
| Not to excess, by any means. I use a very small subset of git's features and in practice many of my simpler projects don't even use branches, because it's not in the nature of the changes I am making to those projects or the time management I do.
|
| But for example you should consider git an essential step between dev and live -- using it to deploy -- and you should look at how you could use it to facilitate staging and testing.
|
| Combined with a changelog and relentless use of comments and notes, git helps "structured forgetting", which as a freelancer is pretty crucial; sometimes you work frantically on a thing for a month, get paid and then it comes back to you years later.
|
| > And git != code formatter.
|
| No, obviously, but the needs of the former are supported by the benefits of the latter.
|
| That said, I use one for golang but not, in general, for PHP.
I should find a code-formatter I can bend to my will | for PHP, but after 17 years of increasingly complex lone | PHP development, I know what I need from my own formatting | requirements in order to manage projects. | scq wrote: | In a team is also where code formatters shine, IMO. | | The biggest advantages (to me at least) are that they | almost entirely eliminate formatting from code review, and | the consistent style makes it easy to read and edit code | written by different people. | ZephyrBlu wrote: | Git is insanely useful regardless of whether you're in a | team or not. Having a history of your changes, being able | to define an atomic change, branches, etc are all very | useful even when working solo. | tomjen3 wrote: | I have never regretted issuing a git init command, but I | have regretted not doing that. | | Granted when I am developing for myself, most/all commits | are just going to master but even so. | bombcar wrote: | Git is how we store all the white space changes as the | formatting wars rage. | | Apparently some actual code changes are in there but we've | never found them. | munificent wrote: | There's two ways to look at how people of the past compare to | people of today: | | 1. They didn't have X, so obviously we don't need X now. | | 2. There were the ones who _created_ X, so clearly they felt | the absence of X was a problem to be solved. | | I'm inclined to believe 2 comes into play more often than 1. | The present was created by those living in the past. | maerch wrote: | So many hours have been saved thanks to formatters. Not only | writing it, but also countless hours in PR-reviews without | senseless nitpicking. | | One of the best trends recently. | judofyr wrote: | > Not only writing it, but also countless hours in PR-reviews | without senseless nitpicking. | | I've never understood this point. Programmers will _always_ | find something they can nitpick about. 
Ultimately you want a | culture which focuses on the critical parts: Correctness of | code, good test coverage, big-picture architecture which has | long-term impact. Bringing an auto-formatter into the picture | may reduce some senseless nitpicking, but you haven't | actually done anything to solve the _real_ culture problem. | If your team was getting blocked because people were arguing | about _formatting_ you have bigger problems that won't be | magically solved by adding an auto-formatter. | | To give some examples: | | A while back I worked with one person on a project where we | had both Prettier and super strict ESLint, and I would still | get PRs rejected because they wanted the code to be slightly | refactored in a way which was entirely subjective and had no | impact on correctness (e.g. "flip this negation"). | | And right now I'm working on a team where we explicitly tag | some PR comments with "nitpick". This will _not_ block the PR | from getting merged, but instead it's a way of saying "I | prefer it this way, but it's not that important in the bigger | scheme of things". This is also a signal that it's not | something that we want to start a bigger discussion around. | | (We use auto-formatters and linters as they are very useful.) | [deleted] | mjevans wrote: | The Dart formatter sounds really advanced, and reflects potential | complexity of the language. | | I think I'd still prefer to see a formatter attempt to preserve | any formatting which is already 'good enough' to pass as an | output threshold. Code isn't just a recipe for a computer to do | something, it's a language for explaining to other programmers | what that thing is and what's important to the structure of | accomplishing it. The choice of where to place a break can matter | for cognition and can be almost as important as the printed | characters for organizing thoughts.
| munificent wrote: | _> reflects potential complexity of the language._ | | The language is fairly complex syntactically and that | definitely adds some cost to formatting. | | But I think much of the complexity comes from two things: | | 1. A lot of idiomatic Dart uses function literals in a block- | like way, as in: test(() { expect(1 + 2, 3); }); | | 2. But Dart doesn't actually have trailing block argument | syntax like Smalltalk, Ruby, and Kotlin. So the formatter has | to look at the closures passed to an argument list and decide | which ones look better using block formatting, versus regular | argument list formatting like: someFunction( | () { expect(1 + 2, 3); }); | | Also, at the time I first wrote dartfmt, there was a lot of | very nicely hand-formatted code in the wild that used different | subtle layout choices to make different argument lists look | nice. In order to persuade people to adopt the formatter at | all, it had to be sophisticated enough to figure out many of | those patterns and apply them automatically. | | It's not as good as a human (mainly because it doesn't have | semantic context) but it had to be pretty close or people | wouldn't have tried it. | | Now that it's well established, I think it would probably be | possible to simplify how it formats while still making users | happy. Possibly _happier_ because the results would be a little | easier to predict. | | _> Code isn't just a recipe for a computer to do something, | it's a language for explaining to other programmers what that | thing is and what's important to the structure of accomplishing | it._ | | An automated formatter will never be as good as carefully | crafted artisanal formatting. In particular, automated | formatters don't know what stuff _means_.
A good human might | choose to line break a function call like so: | setColor(red: 123, green: 54, blue: 26, alpha: 45); | | Because they know that "RGB" is a single coherent concept and | alpha is less closely related. An automated formatter doesn't | (and probably shouldn't) have that domain knowledge. | | But the value proposition of automated formatting is not just | "how nice is the resulting code to read". You have to look at | the total value proposition of completely yielding formatting | to a tool versus allowing human control over it. When it's | completely automated: | | 1. You can run it on generated code that contains absolutely no | whitespace and still get nice output. | | 2. Humans can do large-scale refactorings, format, and get | output that is consistent with the existing state of the | codebase without having to understand any local style | preferences. | | 3. Humans never have to spend time deciding how to format. | Further, they don't even have to spend time deciding _if_ they | should format. | | 4. When reading a random codebase, it is likely to be formatted | in a style you are used to even if you have zero communication | with that team. This is particularly important in open source. | | 5. The code looks familiar to you wherever you encounter it: | IDEs, plain text editors, code review tools, blog posts, | StackOverflow answers. As opposed to letting everyone pick | their own style and relying on users to apply their preferred | style locally, it's just always in a familiar style. | | 6. Like any automation, the tool doesn't make mistakes. Even | very careful humans hand-formatting make more mistakes than | they realize. (I know because I've looked at their code). Those | mistakes can be distracting for readers. | | 7. It gets people out of the mindset of being nitpicky about | style. It encourages them to stay focused on the structure and | naming of their code, which is what really matters. | | 8. 
It eliminates style arguments in code reviews. Those take | time and, worse, cause disharmony, for next to no benefit. | | I think it's very worth accepting some small loss of overall | formatting quality to get those in return. | ramraj07 wrote: | It might be true but as OP says in the article, the moment you | have potentially multiple ways to show the same piece of code, | you're going to surface engineers with each opinion on the code | review. Getting an opinionated formatter is the best way to | bring engineers back to do what they really need to be doing in | code review, which is review the code not its formatting. I'll | never go back to Python without Black and isort! | fxtentacle wrote: | I feel like enforcing a specific text style is the wrong | approach. | | In my opinion, different people have different formatting | preferences and forcing someone to use the "wrong" one will | lead to slower reading speed and the risk of overlooking | errors. | | That's why I believe we should treat it like font or color | choices. The IDE should display source code in the viewer's | preferred style, so that each person sees what they expect. | And then the actual source code formatting becomes | irrelevant. | | Go already goes a good step in this direction by making a | language AST part of their core libraries. And opening | Go source code in JetBrains' GoLand will show you additional | annotations, spacing, etc. based on the parsed source code | tree (and not based on the source code's text). | coldtea wrote: | > _In my opinion, different people have different | formatting preferences and forcing someone to use the | "wrong" one will lead to slower reading speed and the risk | of overlooking errors._ | | Not if everybody using the language is strongly forced (as | with Go), since then those with "different formatting | preferences" will eventually (and soon) just get used to | the enforced style.
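A minimal sketch of the Go point raised above, namely that the parser and printer ship in the standard library, so source text can be round-tripped through a syntax tree. It uses `go/parser` and `go/format`; the helper name `canonical`, the file name `example.go`, and the sample inputs are hypothetical and only there for illustration. It is not a description of how GoLand or any other tool renders code.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
)

// canonical (a hypothetical helper) parses Go source text into an AST
// and prints it back out in the standard gofmt style.
func canonical(src string) (string, error) {
	fset := token.NewFileSet()
	// ParseComments keeps comments attached to the tree so they survive printing.
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Two authors, two spacing and indentation habits, same program.
	tight := "package p\n\nfunc add(a int,b int) int {\nreturn a+b\n}\n"
	loose := "package p\n\nfunc   add( a int , b int )   int {\n        return a  +  b\n}\n"

	a, errA := canonical(tight)
	b, errB := canonical(loose)
	if errA != nil || errB != nil {
		fmt.Println("parse error:", errA, errB)
		return
	}
	fmt.Print(a)
	fmt.Println("same canonical form:", a == b)
}
```

The two inputs differ only in spacing and indentation on purpose: as far as I know the Go printer keeps many of the author's line-break choices, so a viewer that re-renders code per reader would need more than this round trip.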
| duped wrote: | The AST usually isn't enough, you want a CST (or what a few | sources call "full" syntax tree, preserving white space, | comments, etc). I think .NET has the best implementation of | this out there, unsurprisingly they have incredible tooling | support. | eyelidlessness wrote: | For anyone interested in CST tooling outside of the .NET | ecosystem, tree-sitter[1] is general purpose, _quite_ | fast, supports a wide range of grammars, and has bindings | in quite a lot of environments (including WASM, so likely | can be used anywhere with some effort). | | 1: https://tree-sitter.github.io/tree-sitter/ | duped wrote: | Tree sitter is absurdly complicated to use in a real | project, a hand written parser might be slower but it's | way easier to implement and build. | ramraj07 wrote: | This seems like the arguments devs stuck on vim and Emacs | keep making. "Ohhh I'm tired of moving my fingers from one | key to another!!" [1] It's just code, if you're in a good | team each PR is just a small diff, it's hard to believe | that's somehow too complicated for an ostensibly good | engineer. | | [1] Incidentally the only people I know with carpal tunnel | are folks who are entrenched in command line text editors. | Maybe there's some benefit to moving your arms around? | Similarly maybe there's benefit in reading code in a | different format, you might actually read it slower and | hence comprehend it better. | awild wrote: | From experience I can tell that one can get used to other | people's styles and just work with that oneself. Unifying | among an enforced style and sticking to it really reduces | the amount of unnecessary discussions. | | I genuinely feel that focusing a lot on the detriment of | someone else's or an established code style is a sign of a | lack of team work. | davemp wrote: | Having a strictly (formatter) enforced style is actually | how you allow people the freedom to use their preferred | style.
| | Everyone can just set up a pre-commit hook to automatically | format the code back into the official style, then everyone | is free to put the codebase back into their preferred | style. | | Otherwise, if developers reformatted code to their | preferred style regularly, it would create massive diffs. | hiccuphippo wrote: | And this is how git works with the different newline | styles. It can convert them to a single enforced style on | commit and converted back when you pull changes. | delusional wrote: | I've never had that work correctly. Mostly it just gets | in the way when the line endings actually matter (because | my javascript has to load in IE7). | fxtentacle wrote: | I feel like I might have explained that badly. In my | suggestion, the source code would be diffed as a machine- | readable AST tree. That means all source code | reformatting actions which do not change the meaning of | the source code also do not appear in the diff. | | In this explanation about Go: | | https://golangdocs.com/golang-ast-package | | the text following "and we get a nice structure" is what | the source code management tools would be working on. | It's an abstract representation of the source code's | meaning, but not tied to how things are formatted or | indented. | ameliaquining wrote: | I agree it would (maybe) be better if things had been | designed this way from the beginning, but as it is it | would be totally incompatible with all existing source | code management tools, which is a nonstarter. Even if you | can switch to a new editor or IDE, you still have to | worry about the GitHub web UI and your error reporting | tooling and who even knows what else. | rowanG077 wrote: | On paper I like this idea. But I can't shake the feeling | that every non-text based format I have ever encountered | sucks. | chrisweekly wrote: | Agreed! 
I use Prettier (driven by ESLint) in all my web | projects (including enterprise clients', where I've helped | introduce / improve / standardize tooling). IME it's a | mistake not to use it. | rowanG077 wrote: | I'd really hate a formatter to attempt to preserve formatting | if it's "close enough"(tm). Because what is "close enough"(tm) | is hugely subjective, like formatting style itself. Formatter | should just format to a common style. No funny business. | CodesInChaos wrote: | > The Dart formatter sounds really advanced, and reflects | potential complexity of the language. | | I would not be surprised if a lisp formatter following the same | philosophy as the dart formatter would be similarly complex, | since the complexity comes from optimizing line length, not | parsing. | mbrock wrote: | Incidentally I spent yesterday implementing a Lisp formatter | with the algorithm described by Jean-Philippe Bernardy in the | simple and elegant paper "A Pretty But Not Greedy Printer." | | https://jyp.github.io/pdf/Prettiest.pdf | | It's based on introducing choice between vertical and | horizontal stacking. It avoids combinatorial explosion by | pruning away strictly suboptimal choices. With just a few | extra rules for Lisp forms, the results are quite good. | | I don't handle comments though, since I only need it for | pretty printing values so far. | virtualwhys wrote: | > The Dart formatter sounds really advanced, and reflects | potential complexity of the language. | | You got that right -- check out this epic GitHub thread on | optional semi-colons [1], and the author's own comment on the | subject in an HN thread [2] | | [1] https://github.com/dart-lang/language/issues/69 | | [2] https://news.ycombinator.com/item?id=22706645 | mushyhammer wrote: | > I think I'd still prefer to see a formatter attempt to | preserve any formatting which is already 'good enough' to pass | as an output threshold. | | That feels like Prettier sometimes. 
I think it leaves some | objects on multiple lines or one line depending on how you | leave them. But I'm not too sure. | rendall wrote: | There is tension between having a single source of truth for | source code, including an opinionated formatter; and allowing | individual programmers expressive facility to format and | structure code in a way that makes most sense to them. | | I've been thinking lately of a formatter that would resolve | this tension. It would be a local, individual formatter, | complementary to the global, opinionated, "Prettier" formatter. | | When the source code was committed to remote for review, the | opinionated formatter would do its thing, making sure the code | was formatted properly according to whatever the team agreed. | But locally, the code would remain however the developer liked | it. | hprotagonist wrote: | a recipe for merge conflicts. | | i think maybe your only hope is to somehow "version the AST", | and let formatting be a style sheet or something. But i'm 1. | talking out my butt and 2. sure i'm missing something. | fragmede wrote: | It's a decent idea! | | One problem though is all of the non-code elements, aka | comments and how they intentionally align up with the code | (mixed with spaces and tabs), so the ideal format is the | one the original programmer wrote it in, and the second | programmer using a different stylesheet can't edit comments | and have them end up pretty for the first programmer. (The | solution, obviously, is to not comment any code >_< ) | Cullinet wrote: | I agree with the decency of the ide | | But isn't this the nearest argument why we should start | publishing negative results papers? | | The most important thing that I have ever learned was a | announcement made by Google about how their industry | consortium seeking to trade print magazine advertising | inventory, failed with reference to the nature of this | failure. 
No matter how lacking in detail this notice was, | no matter that I was sent down a seven years solitary and | very lonely path of mercifully ultimate discovery, on the | heels of my startups exit collapse due to my cofounder | tragically dying but we'd have been sunk by the problems | indirectly revealed in that Advertising Age news item. | However painful, and I'm talking about therapy for years | after I emerged from reclusion, and real health issues | giving up running and my diet turning to junk energy | hits...I have learned more from being told "no can do Z | because of y" and I seriously think that we /seriously | have to start already getting over the reasons why we | don't talk about our failures/. I mean holy smoke | wouldn't we ever run out of good conversation ever again | if we could have a good old banter and brew over our | cockups? I'm thinking that this is how women are so much | more successful in reproduction if they actually want to. | What could we do if we tried? | VBprogrammer wrote: | Was this GPT-2 trying to write a HN comment? | olvy0 wrote: | I'm not sure. Their posting history has multiple posts of | this type with disconnected sentences. With GPT 2 at | least there is usually some continuity and a semblance of | a shared context between the sentences. | freedomben wrote: | After reading several of the comments, I think OP is not | a native English speaker, and that makes for some awkward | grammar/sentence structure. | andai wrote: | I am reminded of Terry Davis. | javbit wrote: | If the display of the code is removed from its | representation, I think the same could be done for | comments. Comments could be kept as part of the AST and | rendered how you like. | | E.g. (+ 1 2) ; Add two numbers. | | Would become (comment (+ 1 2) "Add two | numbers") | | With semantics like const. | | Another would be (+ ; Adding | 1 ; one 2 ; and two. 
) | | To ((comment + "add") (comment | 1 "one") (comment 2 "two")) | | You could display comments as popups, marginalia, or even | in a traditional fashion (since some intent is captured | by the comment scoping). You could also have different | types of comments like annotation to have different kinds | of display types. | anchpop wrote: | This is exactly what Unison is trying to do | rendall wrote: | > _let formatting be a style sheet_ | | That's a good expression of the idea. If merge conflicts | happen, we're doing it all wrong. Version control shouldn't | even really be aware of this. | Cullinet wrote: | the most extreme validation of typographical precision is | the banknote. Discuss?... | hprotagonist wrote: | tree-sitter and git: now kith! | MeteorMarc wrote: | Maybe a code formatter should be just brutally simple and | predictable. Fearing the look of long, complicated statements, | coders will shorten their statements and just do one thing per | statement. | criddell wrote: | It also encourages shorter and often less descriptive names. | That's not necessarily a good thing. | eyelidlessness wrote: | This really depends a lot on the language syntax, naming of | built ins, and common idioms. Like Java is notorious for its | long lines because it's so verbose and long naming is the | common convention. Lisps are notoriously far off to the other | extreme. There's quite a lot of room between those extremes, | occupied by languages like Python and Ruby. And then there's | languages like JavaScript and especially TypeScript which | span most of the range depending on preference. | egberts1 wrote: | Sheesh. That is hard. And that does NOT pale in comparison to me | polishing the "tiny" Bash regex for removal of inline comment (as | denoted by hash, semicolon or double-slash) in the INI-format | (version 1.4 2009) file ... while ...
and while permitting those | same inline comment characters in quoted string to be allowed in | (along with its subsequent string up to its ending pair of | matching single/double quote symbol). | | I have a working regex (passed by many regex online testers) but | in bash yet, NO! | | To graderjs of HN, the Author of Dart code formatter, you got | mad respect from me. | jb3689 wrote: | I've been the primary maintainer for a vim plug-in for a number | of years. Dealing with indentation expectations of a modern, | complex programming language without an AST parser (you need to | be able to deal with code that doesn't compile) is one of the | hardest problems I've had to work on. Dealing with string and | comment detection, dealing with the constant influx of new | features, keeping it performant and maintainable in a dedicated | scripting language with bespoke debugging tools. The best | approach I've come up with is to be ruthless about testing and | use strictly TDD for everything. | | Take a look at how complex the code is for something like vim- | ruby to get a feel for what I'm talking about | axydlbaaxr wrote: | So wise to share the failures and pitfalls, along with the | successes. ___________________________________________________________________ (page generated 2022-03-05 23:00 UTC)