[HN Gopher] What's wrong with Git? A conceptual design analysis ...
___________________________________________________________________
What's wrong with Git? A conceptual design analysis (2016)

Author : edward
Score : 72 points
Date : 2021-04-27 20:24 UTC (2 hours ago)

(HTM) web link (blog.acolyer.org)
(TXT) w3m dump (blog.acolyer.org)

| yarg wrote:
| A DAG of features rather than a DAG of changes would make bug
| patching a nicer experience.

| runningmike wrote:
| conceptual designs should be created before. Not afterwards.
| Using it for analysis of a product will of course create a
| complicated view since evolution of a product is missed.

| crazygringo wrote:
| Huh?
|
| This is a critique of the conceptual designs, which is valid.
| And an improvement upon them IMHO.
|
| I don't understand what you're arguing.

| ketzo wrote:
| This feels more like a learning exercise than an actual
| critique of git that would necessitate some kind of response. I
| thought it was a useful way to practice thinking about software
| design with a tool I already know.

| ajarmst wrote:
| They should perhaps revisit their title, then.

| ketzo wrote:
| I mean, it's absolutely a _criticism_ of git, so in that
| sense the title's spot on; it's not a pull request, or a
| suggestion that we should all go out and stop using git.
|
| I mostly took issue with GP's implication that there was no
| reason to perform a conceptual design review after the
| software's already created.

| kerblang wrote:
| I have a dedicated shell script for each thing I do in git,
| because it is impossible for me to memorize the original syntax
| and I'm terrified I'll gut the repo by getting a flag wrong. I
| have an "undo" script, that just gets rid of unstaged changes, a
| branch deletion script for removing local & remote branches, and
| so on. I just take my scripts with me everywhere I go. They're
| not ingenious, just a necessity. I advise doing this.
|
| People like to tell me, "Oh see you just need to understand that
| git has something called a 'directed acyclic graph'
| blahblahblahblah" and no, that has nothing to do with it. It's
| just incredibly arcane and bizarre syntax for the most obvious
| everyday operations. But since my scripts make life easy, I sort
| of don't care at this point.
|
| Also I don't rebase and thus I don't force push and thus I don't
| reduce my repo to a pile of burning rubble and work til 3AM
| repairing it and if people ask me to rebase enough times I'll
| threaten to literally take a dump on their desk/car/front porch
| which usually makes them stop. I advise doing this as well.

| domano wrote:
| I have never understood the benefit of hiding away the actual
| history. Many people i know find rebasing to be superior
| because you have this pretty history, but i actually love
| having the ability to understand what happened when exactly as
| it happened in reality.

| formerly_proven wrote:
| "No broken commits" is one, "nobody cares how often you
| merged master into your feature branch" is another. Most of
| the real reasons revolve around making a bisect quicker and
| easier.

| kerblang wrote:
| If it helps any, I usually squash-merge my final output
| into sparkling, concise commits with nice messages (my
| original branch usually has "x" for every commit message).
| And yes I have a dedicated script for squash-merging too...

| formerly_proven wrote:
| > I'm terrified I'll gut the repo by getting a flag wrong.
|
| There are pretty few operations in git which can't be undone by
| a quick glance at git-reflog.

| jrochkind1 wrote:
| Last listed release of gitless seems to be two years ago. Is it
| still alive? How's it going?
|
| https://gitless.com/

| crazygringo wrote:
| This actually... makes a ton of sense.
|
| Just as one example, automatically stashing and unstashing when
| switching branches is so obviously convenient, in hindsight I'm
| puzzled as to why git doesn't just do this automatically.
|
| So whatever happened to this project?
|
| In the same way SourceTree adopted git-flow with a few special
| menu commands, I'd love to also see it be able to adopt a
| "gitless" mode.
|
| Because the great thing about this model is that it can still
| remain the more complicated git underneath. It just makes using
| it more straightforward.

| deathanatos wrote:
| > _Just as one example, automatically stashing and unstashing
| when switching branches is so obviously convenient, in
| hindsight I'm puzzled as to why git doesn't just do this
| automatically._
|
| Yeah, I too have wished for this. Do note that that can fail,
| if there are conflicts, so you'd have to decide whether to fail
| the command (like today) or to drop the user into conflict
| resolution.

| hvdijk wrote:
| That's the exception, though. If the changes are
| automatically stashed when you switch away from a branch,
| restored when you switch back, and operations that act on
| branches require you to first switch to that branch, the
| stash and branch won't get out of sync, the stash will be
| right on top of the branch. Conflicts shouldn't happen unless
| you do weird things like reset a branch (git checkout -B) or
| update it directly without checking it out (git update-ref),
| in which case you might not even need conflict resolution,
| such operations might just be specified to immediately drop
| the stash.

| hvdijk wrote:
| Git could not do this (automatically stashing and unstashing)
| in a backwards-compatible fashion, but it could be opt-in with
| a new config key and does seem like a very useful feature.
| It also seems like one that should be relatively easy to implement
| as all the main logic to actually stash and unstash is already
| part of Git, it would merely need to be saved to a different
| ref than refs/stash, so all it needs is someone with enough
| interest to actually implement it.

| deathanatos wrote:
| > _the elimination of the staging area was enthusiastically
| received as a major reduction in complexity_
|
| Well, that'd be a huge negative to me... it is much easier to
| build up a commit piece by piece, verify that it's correct, and
| then commit it. The only two alternatives I can imagine are
| worse: disallowing partial commits (I use this feature all the
| time; I guess I could replace it by a temp commit on a branch and
| then so long as cherry-pick -p still exists, use that) or
| specifying it to git-commit (that would be one gnarly set of
| flags...)
|
| Sure, a git without a staging area is less complex. But it
| removes essential complexity, and would no longer solve my
| problems.
|
| (And in systems that lack it, like Perforce, I've sorely wanted
| it, and there was just not a good workaround. But yeah, that
| aspect of the VCS was simpler, I suppose...)
|
| I do agree _teaching_ the staging area seems problematic.
| Newcomers struggle with it. I also agree with how the article
| mentions other commands sometimes affecting one or more of the
| staging area & working dir, and it being unclear when and where
| they affect what. `reset` is perhaps the most confusing command
| in that regard. Outside of "git reset", "git reset -p" and "git
| reset --hard", I'm looking at the manual while I run it.
| (Granted, those first 3 cover probably 99% of the use cases...)

| erik_seaberg wrote:
| I can't build the staging area. I would much rather git stash
| -p the changes I _don't_ want to commit yet, and test and
| commit the workspace.
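A minimal sketch of the stash-first workflow described in the comment above: set aside the changes that are not ready, test and commit the working tree as it stands, then bring the rest back. It assumes a throwaway repository with hypothetical file names, and it uses `git stash push -- <path>` as a non-interactive stand-in for the interactive `git stash -p`.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all file names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "v1" > feature.txt
echo "v1" > debug.txt
git add .
git commit -q -m "initial"

# Two unrelated edits sit in the working tree.
echo "v2 with the real change" > feature.txt
echo "temporary logging" >> debug.txt

# Set aside the change that is NOT ready yet.
# (Path-limited stand-in for choosing hunks with `git stash -p`.)
git stash push -q -m "not ready" -- debug.txt

# The working tree now holds exactly what will be committed,
# so a build or test run here exercises the real commit contents.
git commit -q -a -m "update feature"

# Restore the set-aside change on top of the new commit.
git stash pop -q

status=$(git status --porcelain)
commits=$(git rev-list --count HEAD)
stashes=$(git stash list)
echo "status after pop: $status"
echo "commits on branch: $commits"
```

The point of the ordering is that what gets tested is the working tree itself, whereas a staged partial commit is never what the build actually sees.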
| jcranmer wrote:
| > The only two alternatives I can imagine are worse:
| disallowing partial commits (I use this feature all the time; I
| guess I could replace it by a temp commit on a branch and then
| so long as cherry-pick -p still exists, use that) or specifying
| it to git-commit (that would be one gnarly set of flags...)
|
| Are you aware that git commit --amend exists? That replaces
| anything I might ever be tempted to use the staging area for.

| Arelius wrote:
| I'm not sure how you use --amend to do partial commits of a
| file.

| jcranmer wrote:
| Well, that would be solved with --interactive, would it
| not?

| williesleg wrote:
| It's not China?

| ajross wrote:
| All the examples[1] are about the index. And indeed, the index is
| probably a feature that should have been optional. While it's
| supremely useful for maintainer-side operations (splitting
| patches, picking out parts of them, committing only the meat of a
| patch and not debug code, etc...) it's really not useful for a
| typical new user. At all. New users would be best served by a
| "commit" variant that simply updated the HEAD commit in place.
|
| [1] Actually not the last one. I couldn't figure out what they
| were talking about. "Only committed versions of a file can be
| part of a branch" -- I mean... duh? Branches are a DAG of
| committed trees. I don't understand what they're asking for when
| they complain that the working version of a file can't be part of
| more than one branch. It's not part of any yet!

| tsimionescu wrote:
| The problem they are talking about is that you can't checkout a
| new branch without first committing, stashing or undoing your
| work.
|
| In other VCS tools, including gitless apparently, your
| uncommitted work is also branched (usually by virtue of
| different branches being checked out in different directories
| on disk). You can also achieve this in Git by using different
| working trees.

| ajross wrote:
| Ah, OK.
| And this is why I think lots of criticism like this
| is a little off the rails. It's too dependent on what to my
| eyes is just a broken mental model. I mean... branches are
| commits. If you want your uncommitted work to be on a branch
| you... commit it. There's a command for that. It's called
| "commit".
|
| I mean, literally "git commit -a -m WIP; git checkout
| $OTHER_BRANCH" is all that's required. It's even "orthogonal"
| per the requirements in the article.

| gmueckl wrote:
| When I read comments like this, I feel sad. Between the
| lines I always read a resignation to design choices that
| create unnecessary accidental complexity and are an obvious
| step backwards compared to other version control systems,
| even ones that came before. Competing systems like
| Mercurial, Plastic or fossil are a simple proof by
| existence for this claim. Yet there is some
| snobbery/mysticism around git that makes any meaningful
| evolution in this space impossible.

| ajross wrote:
| Quoted without comment:
|
| > When I read comments like this, I feel sad. Between the
| lines I always read a resignation to design choices that
| create unnecessary accidental complexity and are an
| obvious step backwards compared to other version control
| systems, even ones that came before. Competing systems
| like Mercurial, Plastic or fossil are a simple proof by
| existence for this claim
|
| And then _IMMEDIATELY_:
|
| > Yet there is some snobbery/mysticism around git

| ajkjk wrote:
| The idea is that that mental model isn't a very useful and
| error-proof one. Saying "well that's the mental model"
| doesn't evade criticism of the mental model.

| zwieback wrote:
| Where I work nobody enforces which RCS to use so we have
| lots of SVN, lots of git and some Perforce and maybe
| another oddball somewhere.
|
| Even though it's ugly I'm sold on git but every time I
| recommend it to someone who is not used to it I tell them:
| "get used to the idea that you only ever have one version
| of your source on your PC, no concurrent branches in
| different directories." I know git half-heartedly supports
| multiple working trees but I think you're best off doing
| the "commit-WIP-then-branch" approach. Even stash is pretty
| ugly, I only use it for things like config files someone
| committed that I in turn would never commit (like IDE proj
| files, etc.)

| lowbloodsugar wrote:
| Which is, of course, bollocks, because the first thing
| anybody does when working on a hotfix is to _create a new
| directory and create a new tree there_. The only
| difference between git and perforce is that in perforce,
| the concept of directory<->branch is built into the
| system, but with git, git knows nothing about it and it's
| all manual. Instead, git is one of the VCS that says "you
| can have only one working copy on disk" and when you
| switch branches we'll believe that you want to throw away
| what you were just working on.
|
| I used to think the "one directory" for everything was
| the way to go. In a startup around 2001, I tried ten
| different VCS systems, including looking for ones I'd
| never heard of (e.g. AccuRev), and I initially vetoed
| perforce _because_ it wanted to put different branches in
| different folders. When all was said and done, I selected
| perforce for precisely that reason.
|
| But git goes further than simply not having "multiple
| checked out branches" as a concept. Instead, its
| conceptual model for _why_ I have branches and _what_ I
| want to do with them is fundamentally wrong, as this
| article points out.

| formerly_proven wrote:
| > because the first thing anybody does when working on a
| hotfix is to create a new directory and create a new tree
| there.
|
| This doesn't make any sense, at all, in any way or form
| to me.
| Everything from build systems to IDEs is tied to
| where the tree is, I most certainly _don't_ want to
| constantly change that.
|
| > and when you switch branches we'll believe that you
| want to throw away what you were just working on.
|
| Switching branches checks if something would be thrown
| away, and will not proceed if that is the case.

| geofft wrote:
| This is the workflow I follow, but my experience is it's
| pretty high-mental-overhead in practice - Git doesn't draw
| a distinction between commits that I created because I'm
| still working on something and want to switch branches, and
| are therefore reasonable to modify, and commits that came
| from some remote source and which you want to keep intact.
| They're both "commits." That means you, the user, have to
| keep track of which is which and think about which Git
| operations might disturb the wrong type of commits.
|
| I'd like Git to have a mode where, say, git rebase -i won't
| show me published commits by default, only work-in-progress
| ones. (This isn't necessarily the same as "commits since
| the remote branch we're tracking," since it's possible to
| end up tracking a branch that's behind for some reason, or
| even in a situation like a detached HEAD where you're not
| tracking anything at all.)
|
| I'd also like Git to keep track of whether I'm done with a
| commit and _ready_ to publish it. Usually when I commit to
| switch branches, it's because I'm thinking about the
| branch I'm switching _to_, so I don't want to spend the
| time yet to write a three-paragraph commit message about my
| current work. But that makes it too easy to push an
| incomplete commit message.

| gmueckl wrote:
| What you describe seems to be very close to phases in
| Mercurial. There, the repository keeps track of whether
| commits were pushed to other repositories.
| And you can mark commits as secret, which will stop them from
| getting pushed out:
|
| https://www.mercurial-scm.org/wiki/Phases

| leephillips wrote:
| 'New users would be best served by a "commit" variant that
| simply updated the HEAD commit in place.'
|
| Isn't this what `git commit -a` does? Or do I misunderstand?

| gmueckl wrote:
| The article already contains an eloquent critique of this
| option: its semantics are too broad and inflexible to cover
| that use case.

| gmueckl wrote:
| An actually sane VCS would stash your changes when switching to
| another branch and restore the stash when switching back, at
| least by default. Git instead handles this in a way that
| introduces a handful of potentially data shredding gotchas.

| hctaw wrote:
| I always just commit my changes and switch branches, come
| back and squash.

| slaymaker1907 wrote:
| My biggest complaint with git is more around how it is difficult
| to throw away changes and how ignoring files is such a stateful
| process. I almost never need something nearly as specific as "git
| checkout" or "git reset". I generally either want to throw
| everything away up to the last local commit or I want to throw
| away everything and get my version to be the same as the branch
| on the server/another local branch. If I need something more fine
| grained, I will still do this, just on a per file basis. At no
| point have I ever wanted to just get rid of untracked files.
|
| That leads to my next point about ignores. I hate that I have to
| manually untrack a file even if it would be ignored according to
| a new ignore file. The best practice there in my opinion would be
| to explicitly exclude said file IN THE IGNORE FILE ITSELF from
| being ignored. The other problem with ignore rules in Git is that
| for some reason Git tries to hide the fact that it has a per repo
| ignore file in the .git folder which doesn't get synced at all.
|
| Instead of having untracked/tracked files, I would really rather
| just use ignore files, both local and remote, so that excluding
| something from being committed is an explicit action.

| tejohnso wrote:
| > I want to throw away everything and get my version to be the
| same as the branch on the server/another local branch.
|
| Isn't that `git reset --hard remote/branch`?

| ajkjk wrote:
| It is also really absurd that you can reset everything with
| `git reset --hard` but you have to use `git checkout -- ` to
| reset individual files. It's not even clear why `checkout` is
| involved in resetting files at all.

| wereHamster wrote:
| git-reset - Reset current HEAD to the specified state
| git-checkout - Switch branches or restore working tree files
|
| reset works primarily to modify HEAD, and when you include
| `--hard` it's like saying: oh and btw also update the working
| tree.
|
| checkout works primarily (in the case of checkout <pathspec>)
| to copy file from somewhere (commit, index) into the working
| tree. You can use checkout to mimic a hard reset: `git reset
| --hard` is pretty much equivalent to `git checkout HEAD .`

| veilrap wrote:
| For me, checkout is easy to remember because I'm literally
| checking out an older version of the file.

| johnmaguire wrote:
| `git reset --hard` resets your worktree to the HEAD state.
|
| `git checkout -- <file>` checks out the HEAD version of a
| given file.

| infogulch wrote:
| Yes, but you can also word it in two other ways that make
| the redundancy more apparent at the cost of a tiny bit of
| inaccuracy:
|
| A:
|
| `git reset --hard` checks out the HEAD version of your
| worktree.
|
| `git checkout -- <file>` checks out the HEAD version of a
| given file.
|
| B:
|
| `git reset --hard` resets your worktree to the HEAD state.
|
| `git checkout -- <file>` resets a file to the HEAD state.
|
| I don't think a newbie could even grok the difference
| between these three descriptions.
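A minimal sketch contrasting the two commands discussed above, run in a throwaway repository (file names and contents are hypothetical). It is meant to show that a path-limited `git checkout --` touches only the named file, while `git reset --hard` resets the index and every tracked file, and additionally moves the current branch when it is given a commit.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "one" > a.txt
echo "one" > b.txt
git add .
git commit -q -m "first"
echo "two" > a.txt
git commit -q -a -m "second"

# Dirty both tracked files.
echo "scratch work" > a.txt
echo "scratch work" > b.txt

# Path-limited checkout: restores a.txt from the index (which matches
# HEAD here because nothing is staged). b.txt stays modified and the
# branch does not move.
git checkout -- a.txt
after_checkout=$(git status --porcelain)

# reset --hard without a commit: index and working tree are set to HEAD,
# so the remaining change in b.txt is discarded as well.
git reset -q --hard
after_reset=$(git status --porcelain)

# reset --hard with a commit: the current branch itself is moved back.
git reset -q --hard HEAD~1
a_contents=$(cat a.txt)
commit_count=$(git rev-list --count HEAD)

echo "after checkout: $after_checkout"
echo "after reset:    $after_reset"
echo "a.txt now:      $a_contents ($commit_count commit on the branch)"
```

The last step is the part the one-line summaries tend to leave out: with a commit argument, `reset` rewrites where the branch points, which no path-limited `checkout` does.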
| gpanders wrote:
| git reset --hard does not only modify the work tree
| though, it also updates where your current branch points
| to. In fact, that is the main purpose of 'git reset', the
| fact that it modifies your work tree _at all_ is only
| because of the '--hard' flag. So it's not correct to say
| that "reset --hard and checkout do basically the same
| thing".

| hinkley wrote:
| I find that I often want to throw everything away except 2
| files. Occasionally I want to stash everything except 2 files.
| These are both especially hard to do, especially if you renamed
| or copied a file, which is now a staged change.

| jayd16 wrote:
| As a tech lead on game projects that feature less than the most
| technical developers (artists, game designers) the biggest
| "problem" with git happens to be its strength.
|
| The inconsistent cli and naming can be papered over with cleaner
| GUIs. No, the real issue is git's configurability. Not just the
| configurability but the _enforced_ decentralization of that
| configurability.
|
| For better or worse, it seems like git is actively hostile to any
| kind of concept of a central authority. Can I configure hooks for
| my team? No. Can I configure diff tools for my team? No. Can I set
| recursive pulls to be the default? No. Can I enforce line
| endings? No. Can I even enforce LFS is set up right? No.
|
| I certainly can't even enforce things in receive hooks and expect
| team members to know how to change their histories to adhere to
| our policies.
|
| I love git but it's such a pain.
|
| I really wish someone would come up with some kind of "accept
| remote config" policy or some such thing.

| nightpool wrote:
| It seems like what you're asking for is a way to configure your
| coworkers' remote computers. Shouldn't you do that with a more
| general configuration management tool, in the same way you
| configure any other application for your team?
| For example, on Windows you could use a custom Default User
| Profile with .git/config already set up the way you like it.

| geofft wrote:
| There are two ways to solve this:
|
| 1. Write and distribute a wrapper script that sets up the clone
| in the way you want, with hooks, diff tools, LFS, etc.
|
| If you're talking about diff tools and LFS, you need some way
| to actually install those components, so presumably it
| shouldn't be hard to get this wrapper script onto end user
| machines.
|
| My employer uses this and it works well - we have a special
| command that creates a new clone and configures it
| appropriately. Once you have a clone, you can use the git
| commands as usual. People seem fine with this workflow.
| (Previously the special command was named something like
| "git-xyzclone" so you could just run "git xyzclone" instead of
| "git clone". But as we added a few more build/review/release
| features, we made a general "xyzdev" command with a few
| subcommands, of which "clone" is one.)
|
| 2. Set up /etc/gitconfig (or ~/.gitconfig) and in particular
| the init.templateDir setting to customize the settings you
| need.
|
| Again, presumably you have some way to install things - but if
| you don't, tell employees to run "curl -O
| https://corp.example.com/.gitconfig" as part of setup, and then
| you can bootstrap yourself from there.
|
| Git doesn't and cannot accept executable hooks/LFS helpers/etc.
| from an arbitrary remote server because then as soon as one of
| your employees clones something from GitHub they'll get
| event-stream'd. Your authority to enforce configuration comes from
| your authority (either technical or social) to get employees to
| install things on their machines - if you have that authority,
| then there are multiple ways to accomplish this.
|
| (I suppose Git could have a
| "core.executeArbitraryCodeFrom=https://git.example.com" setting
| or something, but if you have the ability to push out that
| setting to all your users, you have the ability to push the
| actual settings you want.)

| jayd16 wrote:
| > Git doesn't and cannot accept executable hooks/LFS
| helpers/etc.
|
| I understand the sentiment but I feel like this is just not
| true. It's designed to share code you plan to build and run,
| after all. As you say, it seems like we could solve it with
| signatures and trust but I doubt we'll ever get there, sadly.

| geofft wrote:
| Sure, then you have option 3: Put a "setup-git.sh" script
| in your repo, and have the README tell users to run that to
| set things up.
|
| In fact you can set the default branch to contain nothing
| but those two files, and have the setup script switch to
| the real branch once it's done with its work.

| satya71 wrote:
| What you're looking for is called Perforce.

| dijit wrote:
| Perforce for assets? Sure.
|
| Perforce for code? It tries... but, no thank you. Even if the
| client was good (it's really not, and will spin a CPU core
| when it detects a network hiccup) -- it still has a
| frustrating workflow paradigm, p4ignore is configured
| per-client, everything is read-only (leading to ugly context
| switches unless your editor directly integrates with p4) and
| the command line is so terrible that I think it was actually
| a joke that someone ran with.
|
| No, perforce is not what you want, without even going into
| the "decentralised vs centralised" aspect of it.

| secondcoming wrote:
| Last time I used Perforce was around ~2006 in a mobile OS
| company of approx 3000 devs. Boy was that thing slow.
|
| Still though, P4Merge is my go-to diffing tool.

| icedchai wrote:
| I worked at a company that used Perforce. It took 2 weeks to
| get a branch created. No thank you.
| [deleted]

| ralph84 wrote:
| Enterprise Config for Git might help:
| https://github.com/Autodesk/enterprise-config-for-git

| [deleted]

| dgfitz wrote:
| Someone can correct me if I'm wrong but I think mercurial would
| solve the issues you describe.

| alkonaut wrote:
| Git isn't chosen on its own merits - it's what the _other
| tools_ support. If CI servers, issue managers, container
| tools etc all had a nice VCS-agnostic interface that would be
| great. But they don't. We're stuck with git or at least
| something that speaks git. Choosing an objectively "better"
| VCS is often directly coupled to the loss of some other tool.

| brundolf wrote:
| For me it's the enormous number of (mostly hidden) states that a
| local repository can be in. That's what gives me anxiety as soon
| as I go off the golden path: you have to keep a running mental
| model of where you are (where the repo is) in any ongoing
| process, and if you don't do the right thing as the next step,
| you can get into an even worse and more obscure state that you
| have to chart your way out of (mostly blindly).
|
| It's like that old saying about democracy: "Git is the worst
| version-control system, except for all the others."

| t-writescode wrote:
| Have you looked at using a good gui for the repository? I find
| that helps a lot with the mess.
|
| I used to use Git Extensions and lately I've been using the one
| built into IntelliJ's products. It's not as good, but it's
| available on mac, so my options are limited.

| jmchuster wrote:
| For git on mac, i've found https://www.git-tower.com/ to be
| ridiculously good. Now that i primarily work on linux, i've
| settled for Sublime Merge, which has customizable actions,
| but isn't as intuitive a match for how i want scm to work,
| like Tower was.

| Guthur wrote:
| I use magit and everything seems a lot less wrong :).
|
| I personally found Mercurial a lot more palatable but at this
| stage git is pretty much ubiquitous.
| jcranmer wrote:
| > I personally found Mercurial a lot more palatable but at
| this stage git is pretty much ubiquitous.
|
| What I really want at this point is somebody to build a
| mercurial frontend for the git backend, so that I get to use
| the clean frontend of mercurial to work with git repos. No,
| hg-git doesn't really work for this purpose, and I have tried it
| for my work git repos (which are in the 100k-commit range).
|
| Apparently, there's a new git extension for mercurial that is
| closer to the model I want, but I haven't had a chance to try
| it yet.

| globular-toast wrote:
| Magit is by far the best way to use git. I've been the git
| expert at every job I've been in but I owe it all to magit. It
| really can't be overstated how big of a game changer it is.
|
| I've thought about writing a stand-alone TUI version for people
| who are scared of emacs. But then I realise it's impossible.
| Magit is inseparable from emacs. Emacs is what makes it so
| powerful. Emacs is not a text editor, it's an environment for
| interacting with text-based tools. Having your text editor and
| git, as well as every other text tool, in the same environment
| with the same interface is unbeatable.

| crdrost wrote:
| I was trying to explain Mercurial to other folks at a
| Subversion shop to make the case for DVCS to them, seemed like
| it'd be easier to describe the shift from SVN to Hg rather than
| SVN to Git. As part of this I drew a diagram on a whiteboard I
| had put up in my cubicle, which I then forgot to erase.
|
| A week later I was in the middle of coding when one of the
| nontechnical people from the company came by my cubicle, I
| motioned for him to wait a minute for me to finish what I was
| writing before talking to him, which he did. When I finally
| looked up, "okay, what's up," he said "first, let me see if I
| got the gist of this diagram. Inside of a folder you have this
| repository file named .hg..."
| and proceeded to give me a half-good
| explanation of how Mercurial works. And I was just floored
| and I said "yep, yep, there you go," and at the end he was like
| "so why do we save all of these things in Fileserv with a
| timestamp at the end?" and I was like "well I'll send you a
| link to TortoiseHg and you can just start using it, but yeah
| most people just want to do the thing that's in front of them
| and don't want to learn a whole tool that does things
| differently to get started." (And sadly that was true and as
| far as I know he also never invested the time to learn
| TortoiseHg and get started with that, haha.)
|
| Left an impression on me, for sure.
___________________________________________________________________
(page generated 2021-04-27 23:00 UTC)