[HN Gopher] Anu: A sound, distributed version control system
       ___________________________________________________________________
        
       Anu: A sound, distributed version control system
        
       Author : lelf
       Score  : 81 points
       Date   : 2020-11-05 20:16 UTC (2 hours ago)
        
 (HTM) web link (anu.dev)
 (TXT) w3m dump (anu.dev)
        
       | harikb wrote:
       | I was hoping to get a vcs that distributes its files via sound
        
         | rhizome31 wrote:
         | And I was expecting version control for sound files.
        
           | AceJohnny2 wrote:
           | And I was expecting sound version control for files
        
       | okokok___ wrote:
       | https://nest.anu.dev/anu/manual
       | 
       | This is returning "Not found" for me.
        
       | wocram wrote:
       | Is there demand for this?
       | 
       | Most real complaints about git are around scalability of giant
       | monorepos, and a lot of work has gone into various solutions.
       | 
       | The secondary complaints about usability seem to be papered over
       | by popularity, and of course the relevant xkcd:
       | https://xkcd.com/1597/
        
         | alkonaut wrote:
         | I'd take a new git with just the UX fixed. If the underlying
         | implementation of a new VCS is also better, that's great, but
         | my main problem with git isn't that it's not sound but that the
         | UX is a steaming pile of legacy cruft, and it's really more an
         | API for version control than a polished interactive app for
         | humans.
        
         | rakoo wrote:
         | My complaints with git are a bit different, having never felt
         | the burden of giant monorepos (but it's definitely related)
         | 
         | git was built as a tool for completely distributed source
         | versioning, but most of us are using it in a centralized way.
         | It's nice to be able to work offline, but when we need to
         | synchronize there's always a huge dance of fetching first, see
         | if it has moved, merge/rebase, etc... git is good at storing
         | what we did, but it doesn't help at all at saving what we are
         | _doing_: all changes to the working directory are ephemeral,
         | like files stored in ramfs. When working on public
         | repositories, you can't push a branch prefixed with your name;
         | you have to fork the whole project _and_ push a branch before
         | you can start interacting. Instead of having one server and a
         | client, you now have 1 central server, 1 other server that only
         | _you_ can access and will in practice contain 1 branch, and
         | will be abandoned as soon as you're tired of it, and a client.
         | Rights can't be managed at the branch level, so I'm just going
         | to copy-paste the whole thing from the beginning of history and
         | give it to you.
         | 
         | What I would like to see in a VCS:
         | 
         | - There is one central place where people coordinate
         | - There is exactly one commit associated to a branch, and that
         |   association is the same on all machines at the same time (I
         |   don't want to git fetch)
         | - If you want to do changes to a branch, you do a sub-branch
         | - That sub-branch, along with your local changes in or out of
         |   the staging area, is synchronized to the server. If
         |   authorized, other clients can have a view of those as well
         | 
         | It seems it already exists with fossil
         | (https://fossil-scm.org/home/doc/trunk/www/concepts.wiki#work...)
         | and with older SCMs, although older SCMs are plagued with the
         | locking problem.
         | 
         | In a way the work that is done to handle giant monorepos is
         | helping git move in this direction: all branches are
         | automatically synchronized, and the vision with this kind of
         | repo is that it's ok to commit often, even in small batches.
         | But it's not quite there yet. I've read an account of how
         | things are done in Google
         | (https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...)
         | and it's closer to my dream system.
        
           | felipelemos wrote:
           | > When working on public repositories, you can't push a
           | branch prefixed with your name; you have to fork the whole
           | project _and_ push a branch before you can start interacting.
           | 
           | I believe this is more of an issue with GitHub than with git
           | itself.
        
         | astine wrote:
         | People definitely complain about merging difficulties with Git.
         | The idea with Git is that the data model is simple enough that
         | you can basically manually fix issues that come up. That
         | unfortunately means that you have to take the time to
         | understand Git's data model and not just memorize the interface
         | and a _lot_ of people have complained about that over the
         | years. I think the idea with Anu is that issues don't come up
         | in the first place and that it's hopefully more intuitive to
         | use in the long run.
        
         | rudedogg wrote:
         | Take a look at https://anu.dev/documentation/associativity.html
         | 
         | The patch model is simpler IMO
        
           | spockz wrote:
           | Wasn't the patching fixed already with darcs and mercurial?
        
             | gwenzek wrote:
             | I don't think Mercurial is patch based. Pijul is more
             | similar to Darcs. They claim to have a sounder and faster
             | patch algorithm.
             | 
             | And Anu is apparently even sounder and faster?
             | 
             | https://pijul.org/manual/why_pijul.html#pijul-for-darcs-user...
        
         | miloignis wrote:
         | The site has https://anu.dev/documentation/why.html
         | 
         | For me, I'm excited about better, more rigorous merging and
         | being able to cherry-pick & rollback changes without causing
         | conflicts later on. (Cherry-pick in Git makes a new, unrelated
         | commit, so merging with the branch you cherry-picked from can
         | often cause merge conflicts, etc)
         | 
         | In general, tracking and working with actual dependence between
         | patches seems to open up more workflows, and less hacky ones.
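
To illustrate the point about cherry-picks: a Git commit ID is a hash over the snapshot, the parent commits, and metadata, so the same change replayed on a different parent gets a different ID. The sketch below uses a made-up encoding (not Git's real object format), and `change_id` is a hypothetical stand-in for a patch-based identity that depends only on the change itself.

```python
import hashlib


def h(*parts: str) -> str:
    """Hash a tuple of strings; a stand-in for a real object hash."""
    return hashlib.sha1("\0".join(parts).encode()).hexdigest()


def commit_id(tree: str, parent: str, message: str) -> str:
    # Snapshot-style identity: depends on the parent commit and the full tree.
    return h("commit", tree, parent, message)


def change_id(diff: str, message: str) -> str:
    # Patch-style identity (assumption for illustration): depends only on the change.
    return h("change", diff, message)


diff = "-old line\n+new line\n"

# The same fix committed on a feature branch and then cherry-picked onto main.
original = commit_id(tree="tree-on-feature", parent="feature-tip", message="fix bug")
picked = commit_id(tree="tree-on-main", parent="main-tip", message="fix bug")

same_commit = original == picked
same_change = change_id(diff, "fix bug") == change_id(diff, "fix bug")
print(same_commit, same_change)
```

With snapshot-style IDs the tool cannot tell from the identifiers alone that both commits carry the same change, which is one reason a later merge may have to rediscover it from file contents.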
        
       | ComputerGuru wrote:
       | The lack of a clear "why this rewrite was needed" somewhere
       | accessible is a pretty big "f u" to anyone that evangelized for
       | Pijul in the past.
        
         | Koshkin wrote:
         | > _why this rewrite was needed_
         | 
         | We all know _why_
        
           | Ygg2 wrote:
           | Why though?
        
           | hobofan wrote:
           | No we don't. Care to enlighten us?
        
         | wtracy wrote:
         | Speaking as someone who is completely unfamiliar with Pijul, the
         | explanation on this page is pretty lackluster.
         | 
         | "It is based on changes rather than snapshots"
         | 
         | Well, every VCS I'm familiar with is based on changes/deltas. I
         | assume that these terms have specific meanings here that I'm
         | not familiar with, but it manages to sound like the author has
         | never heard of git or Mercurial.
        
           | tomjakubowski wrote:
           | git fundamentally tracks and stores snapshots, not changes. I
           | believe mentioning this is meant to emphasize a difference
           | between Anu (or Pijul) and git.
        
             | ben509 wrote:
             | This is correct. The common misunderstanding is due to the
             | fact that git's pack files are delta compressed, but that's
             | an implementation detail.
        
           | zanecodes wrote:
           | Git actually does not work with deltas; each commit contains
           | the hash of a tree object [0], which contains the hashes of
           | the files within it [1]. This tree is effectively a snapshot,
           | since it contains every file hash in the working directory at
           | a certain point in time. Since Git uses the file hash instead
           | of just the file path, it doesn't have to download the files
           | whose hashes haven't changed since the last commit, which is
           | what makes it behave somewhat as if it is operating on a
           | delta.
           | 
           | [0] https://git-scm.com/book/en/v2/Git-Internals-Git-Objects#_gi...
           | [1] https://git-scm.com/book/en/v2/Git-Internals-Git-Objects#_tr...
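
A small sketch of the content-addressing described above. `git_blob_id` follows the blob hashing scheme Git uses in SHA-1 repositories (a `blob <size>\0` header followed by the contents). The `snapshot_*` dictionaries are a simplified stand-in for tree objects, which in real Git are binary and also carry file modes.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID Git assigns to a file's contents in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Each snapshot maps every path to the ID of that file's full contents.
snapshot_1 = {
    "README": git_blob_id(b"hello\n"),
    "main.c": git_blob_id(b"int main(){}\n"),
}
snapshot_2 = {
    "README": git_blob_id(b"hello\n"),
    "main.c": git_blob_id(b"int main(){return 0;}\n"),
}

# Unchanged files keep the same ID, so only differing IDs need to be transferred.
changed = [path for path in snapshot_1 if snapshot_1[path] != snapshot_2[path]]
print(changed)
```

Comparing IDs between two snapshots is what makes the snapshot model look like a delta model from the outside, even though no delta is stored at this level.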
        
         | gwenzek wrote:
         | My understanding is that they wanted to change the algorithm and
         | that the codebase needed a major refactoring. I think there were
         | performance issues with the first implementation and design so
         | they had to make very large changes.
         | 
         | https://discourse.pijul.org/t/is-this-project-still-active-y...
        
       | dan-robertson wrote:
       | It seems this was posted very shortly after the webpage became
       | live and that the website is probably not in a fully fleshed out
       | state. Therefore it's missing some details. Probably the website
       | is mostly targeted at people who already know what pijul is.
       | 
       | Here are some attempts at short descriptions of what I think this
       | aims to achieve:
       | 
       | - pijul 1.0. A stable repo format with performance problems
       | resolved and a good foundation for further work
       | 
       | - darcs except the algorithm is more convincingly correct and
       | merges don't take exponential time
       | 
       | - A version control system where certain things behave in
       | reasonable ways avoiding potential strange behaviour, eg it
       | doesn't matter what order you merge things in, you always get the
       | same result.
       | 
       | - A version control system that provides a good user interface to
       | humans, a simple mental model, and asymptotically good
       | performance.
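
One way to picture the "merge order doesn't matter" property from the list above is a toy model in which a branch is just a set of change identifiers and merging is set union. This is an illustrative assumption, not Anu's or Pijul's actual data structure; the change names are made up.

```python
from itertools import permutations


def merge(*branches: frozenset) -> frozenset:
    """Merge branches by taking the union of the changes they contain."""
    result = frozenset()
    for branch in branches:
        result |= branch
    return result


alice = frozenset({"A1", "A2"})
bob = frozenset({"B1"})
carol = frozenset({"A1", "C1"})  # Carol already pulled one of Alice's changes

# Merge the three branches in every possible order and collect distinct results.
results = {merge(*order) for order in permutations([alice, bob, carol])}
print(len(results))
```

Because union is commutative and associative, every merge order yields the same set. The hard part in a real system is turning that set of changes back into file contents in an equally order-independent way.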
        
       | bobuk wrote:
       | For lazy people: Anu is Pijul (modern distributed version control
       | system) rewritten from scratch, also in Rust. It's at a very early
       | stage and for now mainly of academic/research interest.
        
         | nerdponx wrote:
         | So can I read Pijul repos with Anu and vice versa?
        
           | bobuk wrote:
           | It's rewritten from scratch and the authors didn't answer about
           | compatibility so I checked it right now. No, it doesn't work
           | with a pijul repo, at least in my scenarios.
        
         | glandium wrote:
         | So... pijul is dead?
        
           | dan-robertson wrote:
           | Pijul has been dead waiting for pijul 1.0 for about a year.
           | People were already expecting the file format to change with
           | pijul 1.0, however I think the name change was unexpected.
        
           | bobuk wrote:
           | It's like a zombie now: the whole body is working but there's no
           | brain activity, at least for the last 6 months.
        
         | rudedogg wrote:
         | Is Anu GPL2 like Pijul? I can't find a license in
         | https://nest.anu.dev/anu/anu.
         | 
         | Edit: crates.io says gpl2
        
       | gigatexal wrote:
       | missing docs ugh
        
       | gigatexal wrote:
       | I can't get this to build on Ubuntu because it can't find
       | libssl
        
       | thefurman wrote:
       | Taking into account the history of how lines have changed isn't
       | much better, sorry Anu. (Or if you think it is, please give some
       | very compelling real world examples).
       | 
       | I believe that you need to understand the semantics of the code
       | to truly do what you are trying to do well, and for all other
       | cases the snapshot model is more than good enough and given how
       | we structure and modify code, it works out really well in
       | practice. Code dealing with a single aspect should be, and almost
       | always is, co-located, so to get a conflict of intention in a
       | merge is very rare. There are other human aspects like code
       | ownership and collaborating teams which makes the issue even less
       | of a problem.
        
         | garmaine wrote:
         | I don't know about Anu (haven't looked at it yet), but with
         | Pijul it would be perfectly possible to take advantage of
         | semantic knowledge. Line-based changes are the default, but you
         | could certainly apply file deltas based on a richer
         | understanding of the underlying filetype.
        
           | dan-robertson wrote:
           | I'm not convinced by this but I'm also not convinced by the
           | argument of the comment you're replying to. The theoretical
           | foundation of Pijul/Anu works by starting with files as lists of
           | lines (or some other thing) and patches as (injective)
           | mappings from one list of lines to another which preserve the
           | relative order between lines, then constructing the smallest
           | generalisation of this structure to one where all merges
           | exist and are, in some sense, well behaved. This
           | generalisation is from lists of lines to partial orders of
           | lines, where "B is preceded by A" becomes "A<B".
           | 
           | To do something similar with more structured files, one must
           | find the corresponding idea to "a list of lines", and this
           | must work in a good way (e.g. changes like x -> (x); [a; b]
           | -> [a] foo [b]; [[p, q], [r, s]] -> [p, q, r, s] must in some
           | sense be natural operations in your structure (and diffs need
           | to be reasonably easy to compute)). And of course it still
           | needs to work in a sane way for unstructured data in big
           | comments. Therefore I don't agree that Anu would be easily
           | generalised to this.
           | 
           | I think this is basically impossible to do for situations
           | where you want to capture all the structure (such that a
           | patch to rename something merges well with other patches). I
           | think it's likely extremely hard for a part way solution.
           | 
           | Finally I'm not convinced that the change would be that
           | useful. Much of the structure of computer programs is
           | implicit in the scoping rules in such a way that the "move
           | blocks around" changes that line-based VCSes often struggle
           | with will still be invalid with structural diffs.
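
The "partial order of lines" idea in the comment above can be sketched with a toy model: a file state is a set of lines plus a set of "a comes before b" pairs, a merge is the union of both, and a state is an ordinary file only when exactly one ordering satisfies all the pairs. The representation and the function names are assumptions made for illustration, not Pijul's or Anu's implementation.

```python
from itertools import permutations


def merge(g1, g2):
    """Union of the lines and of the ordering constraints of two states."""
    return (g1[0] | g2[0], g1[1] | g2[1])


def linearizations(graph):
    """All orderings of the lines that respect every 'a before b' pair."""
    lines, order = graph
    return [
        p
        for p in permutations(sorted(lines))
        if all(p.index(a) < p.index(b) for a, b in order)
    ]


# Base file: A then C. Alice inserts B between them, Bob inserts X between them.
alice = (frozenset("ABC"), frozenset({("A", "C"), ("A", "B"), ("B", "C")}))
bob = (frozenset("AXC"), frozenset({("A", "C"), ("A", "X"), ("X", "C")}))

merged = merge(alice, bob)
print(linearizations(alice))
print(linearizations(merged))
```

Alice's state has a single valid ordering, so it is a plain file. The merged state has two, because nothing orders B relative to X. That ambiguity is what the model reports as a conflict, and a later patch adding the pair `("B", "X")` would resolve it.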
        
             | garmaine wrote:
             | This is the same underlying theory as the "operational
             | transformation" that is used by Google Docs to merge
             | out-of-order changes by simultaneous editors and resolve into a
             | single consistent shared global state. So take that as a
             | proof of principle that it works for more complex
             | structured information.
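
The technique usually associated with collaborative editors of this kind is called operational transformation. Below is a minimal sketch of its core step for two concurrent text insertions, assuming a `site` number as a tie-breaker. It is a textbook-style illustration only and says nothing about how any particular product implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text goes
    text: str
    site: int   # editor ID, used only to break ties at equal positions


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can be applied after `against` has been applied."""
    if against.pos < op.pos or (against.pos == op.pos and against.site < op.site):
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


doc = "abc"
a = Insert(1, "X", site=1)  # editor 1 inserts X after "a"
b = Insert(2, "Y", site=2)  # editor 2 concurrently inserts Y after "b"

# Each editor applies its own operation first, then the transformed remote one.
at_editor_1 = apply(apply(doc, a), transform(b, a))
at_editor_2 = apply(apply(doc, b), transform(a, b))
print(at_editor_1, at_editor_2)
```

Both editors converge on the same document even though they applied the operations in different orders.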
        
       | ChrisMarshallNY wrote:
       | Good luck. I mean that.
       | 
       | Git isn't perfect, but I've been using version control since
       | Apple Projector (in the late 1980s), and Git has done the best
       | for me. I've been using it for many years.
       | 
       | I don't miss Projector one tiny bit.
       | 
       | VSS (Visual SourceSafe) was a dog. It was direct file-based, and
       | server connections would get _very_ busy. It was the old-
       | fashioned kind, with the need to check out files.
       | 
       | But it had one very cool feature: You could create "aliases" of
       | repo components; essentially creating a virtual repo that pointed
       | into several other repos, taking just a couple of files from
       | each.
       | 
       | I could see how that would be a technical nightmare to implement,
       | but I like it a lot more than "the whole kit & kaboodle" approach
       | that Git takes.
       | 
       | I also used Perforce for many years. It was a robust and
       | dependable system, but had that need to check out files to work
       | on them, and that drove me nuts.
       | 
       | I like Git, because it is "team-friendly," and has a really light
       | touch. It encourages many small checkins, which is how I think I
       | should usually work.
       | 
       | I wish it handled big files better, but that's not really a big
       | deal to me. I think this might be why Perforce is still preferred
       | for game development (their asset libraries get _big_).
        
         | skissane wrote:
         | > I've been using version control since Apple Projector (in the
         | late 1980s)
         | 
         | Never heard of Apple Projector before. I've always been
         | interested in the history of version control systems, so I'd
         | like to learn more about it. But when I search for the
         | term, almost all I find is stuff about using projectors with
         | Macs/iPhones/iPads/etc. Can anyone point to any information
         | sources on it?
        
       | thewebcount wrote:
       | I've been very disappointed with the pains of using git. I would
       | really like something like this, but the steps to install it are:
       | 
       | > Anu is written in Rust, and can be installed by first
       | installing Rust, and then...
       | 
       | Yeah, I'm not installing an entire language just to use your
       | tool. I don't need to install a C or C++ compiler to run
       | Photoshop or Microsoft Word. Why do I need to install a compiler,
       | libraries, etc. just to try out your tool? No thanks.
        
         | wtetzner wrote:
         | I mean, it's apparently still in alpha. I imagine there will be
         | installers and it'll be included in package managers when it
         | gets to 1.0.
        
         | astine wrote:
         | Because that's traditionally how open source has been done? For
         | decades? Especially if you're in a Unix/Linux environment?
         | Eventually we started getting package managers and the distro
         | maintainers started creating binaries of most of the packages
         | you'd want to install, but the distro maintainers usually build
         | those packages from source. This is new software and I imagine
         | the distro maintainers will package it up if it starts to gain
         | steam.
         | 
         | Doing it from source also has the advantage that the package
         | maintainers can customize the build process so that it works
         | better with their system. Photoshop and MS Word are closed
         | source and proprietary, which creates issues if you want to
         | package them.
        
       | consultutah wrote:
       | I'm excited to see there is still work being done on new VCS. Git
       | will be hard to beat, but it looks like Anu is hitting on some of
       | its weak points.
        
         | minerjoe wrote:
         | > Git will be hard to beat.
         | 
         | With the latest kerfuffle at GitHub, I've started moving to
         | fossil. Having everything, wiki, pull requests, etc. as part of
         | the repo is looking like a good move.
         | 
         | Why let yet another corporation have control over something
         | they should have never been given?
         | 
         | https://fossil-scm.org
        
           | andrewzah wrote:
           | github != git ... I'm not sure why people strongly conflate
           | these two so much.
           | 
           | It would be equally as valid to self-host
           | gitea/gogs/sourcehut/gitlab and/or an issue tracker of your
           | choice, which arguably is preferable to adopting a completely
           | different tool over what is a provider issue.
        
             | reificator wrote:
             | While that's a common conflation I don't think the GP was
             | doing that. While I tend to self-host git, I can see the
             | value they're claiming fossil has.
             | 
             | Whether self-hosted git or hosting on Github, your issue
             | trackers and such are typically separate from your main
             | repository. Most platforms offer wikis as a side-by-side
             | repository so that should be easy to move, but the rest is
             | at the whims of the platform.
             | 
             | The GP is claiming they moved to fossil because the one
             | repository contains all of this data.
        
             | wtracy wrote:
             | I think minerjoe is trying to emphasize that fossil has all
             | the features of GitHub included in the VCS itself,
             | eliminating the need for any of the tools you listed above.
             | 
             | I haven't followed Fossil, so hearing that it includes
             | things like a wiki is news to me.
        
             | dan-robertson wrote:
             | I think all git has going for it is its existing inertia
             | and GitHub. I think the foundations were a bigger deal when
             | git was newer. Other DVCSes have decent foundations.
             | 
             | Going against git is an atrocious user interface (if it
             | were good then [1] would be neither funny nor sad). Most
             | people just memorise a few commands and if they stop
             | working they transfer their changes elsewhere, delete the
             | repo, and start again. Sometimes a team will have a "git
             | expert" who has merely memorised a few more commands and is
             | better able to get a repo out of a broken state. Git fails
             | badly at an important job for a developer tool: largely getting
             | out of the way.
             | 
             | [1] https://git-man-page-generator.lokaltog.net/
        
               | reificator wrote:
               | You'll never hear me say that git's interface is good,
               | but this seems to be blowing things way out of
               | proportion. I haven't seen someone blow up and recreate a
               | git repo in maybe a decade.
               | 
               | I've definitely pulled out the BFG here and there to
               | clean up credentials but that's an issue in any VCS.
               | 
               | Maybe I'm biased because I am "better able to get a
               | repo out of a broken state", but for the record it's
               | definitely not because I've "memorised a few more
               | commands".
        
         | alquemist wrote:
         | What are those weak points and how does Anu fix them? I found a
         | technical example in
         | https://anu.dev/documentation/associativity.html, but it was
         | unconvincing. In their example, three way merge should produce
         | a conflict for a human to resolve. Which is good, we don't
         | want a 'smart' tool to do the wrong thing and silently
         | introduce bugs. If the argument is that pijul/anu reduce the
         | number of conflicts by exploiting history, is there a
         | quantification of this benefit for practical workloads, e.g.
         | popular git repos?
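
For readers unfamiliar with the three-way merge mentioned above, here is a deliberately simplified sketch. It assumes all three versions have the same number of lines and compares them position by position. Real tools align lines with a diff first, so this is not how Git implements it.

```python
def three_way_merge(base, ours, theirs):
    """Positional three-way merge of equal-length line lists.

    Returns the merged lines and the indices that need a human decision.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles equal-length inputs")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:          # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:        # only their side changed the line
            merged.append(t)
        elif t == b:        # only our side changed the line
            merged.append(o)
        else:               # both changed it differently: leave it to a human
            conflicts.append(i)
            merged.append(f"<<< {o} ||| {t} >>>")
    return merged, conflicts


base = ["a", "b", "c"]
ours = ["a", "B1", "c"]
theirs = ["a", "B2", "C"]

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged, conflicts)
```

The line changed on only one side is merged silently, while the line both sides touched is surfaced as a conflict rather than guessed at.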
        
           | IshKebab wrote:
           | I agree. Also in their example they lose a nice property of
           | git - invariance to squashing. If Alice squashes her two
           | changes then the final merge behaves differently to if she
           | hadn't. Very confusing!
           | 
           | Git definitely could be much smarter about merge conflicts, but
           | it's a hard research problem so I'm not surprised it isn't.
        
           | andolanra wrote:
           | That example, while technically correct, is a little bit
           | misleading. From a practical point of view, the thing that
           | Pijul/Anu both do is not "automatically resolve conflicts"
           | but rather "allow repo operations to happen even in the
           | presence of conflicts". In Git, if you've got a conflict, Git
           | will require you to fix it before doing anything else. In
           | Pijul or Anu, you can continue applying changes--possibly
           | creating more conflicts!--in a way that's guaranteed to never
           | throw away changes. At the end of that, a human still needs
           | to resolve those merges manually.
           | 
           | But there are scenarios in which this avoids tedious human
           | merges. Consider that I'm applying a series of patches which
           | make changes in a file and later on walk those changes back,
           | and run into a merge conflict there. In Git, I could squash
           | those changes to avoid dealing with the conflict, but then
           | I've lost history. I could apply the changes, skipping the
           | relevant patches, but if those patches still contained useful
           | work elsewhere, then I'd have to go in and resolve those
           | problems manually.
           | 
           | In contrast, this same scenario in Pijul and Anu would just
           | trivially work in a way that didn't produce conflicts. I
           | would apply the sequence of patches, and one patch would
           | produce a conflict... but because they can keep doing work in
           | the presence of conflicts, they could keep applying
           | subsequent patches, including the patches which walk back the
           | changes, and in doing so resolve the conflict automatically, but
           | unlike the Git approach where I flattened the changes first,
           | I would still have the full commit history associated with
           | that sequence.
           | 
           | Now, that doesn't mean that Pijul or Anu will automatically
           | fix all merges. If you have two separate code edits to
           | reconcile, you might still need a human in the loop to
           | reconcile them. But the fact that they can keep making
           | changes in the presence of conflicts allows them to avoid a
           | certain kind of "busywork" that comes with managing git
           | history.
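
The "keep working in the presence of conflicts" behaviour described above can be pictured with a toy model in which a conflict is just a state (more than one live candidate for the same spot in a file) rather than an error that blocks further operations. The `Repo` class and the patch names are hypothetical and only illustrate the idea; they do not reflect Pijul's or Anu's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    alive: set = field(default_factory=set)      # live candidates for one contested spot
    history: list = field(default_factory=list)  # every patch ever applied, in order

    def apply(self, name, add=(), delete=()):
        # Never refuses: applying a patch may create or remove a conflict.
        self.alive |= set(add)
        self.alive -= set(delete)
        self.history.append(name)

    @property
    def conflicted(self) -> bool:
        return len(self.alive) > 1


repo = Repo()
repo.apply("alice: set timeout", add={"timeout = 30"})
repo.apply("bob: set timeout", add={"timeout = 60"})
during = repo.conflicted            # two candidates are alive at once

repo.apply("bob: walk back timeout change", delete={"timeout = 60"})
after = repo.conflicted             # the later patch removed one candidate

print(during, after, len(repo.history))
```

The conflict appears and disappears as patches are applied, and all three patches remain in the history, which matches the scenario of a change that is later walked back.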
        
       ___________________________________________________________________
       (page generated 2020-11-05 23:00 UTC)