[HN Gopher] Using CRDTs for multiplayer text editing
       ___________________________________________________________________
        
       Using CRDTs for multiplayer text editing
        
       Author : henning
       Score  : 70 points
       Date   : 2022-12-01 19:03 UTC (3 hours ago)
        
 (HTM) web link (zed.dev)
 (TXT) w3m dump (zed.dev)
        
       | eep_social wrote:
       | This reminds me of Oracle's SCN which blew my mind back in ~2009
       | when I moved up from mysql home gamer stuff to the corporate
       | world.
       | 
       | > Even long edit histories barely compare to the memory savings
       | Zed obtains from not being built with Electron.
       | 
       | I got a good chuckle out of this. Speaking of performance, I
       | think it's I nteresting that ms word and google docs are still
       | using OT rather than CRDT. The goog version seems to work well
       | but I have had nothing but frustration with ms word. Bad merges
       | and weird states are typical, particularly from the fat client.
       | 
       | Anyone else interested in the background of OT vs CRDT might find
       | this thread useful: https://news.ycombinator.com/item?id=18191867
        
       | tym0 wrote:
       | Has Microsoft published anything about their pairing plug-in for
       | VS Code?
        
       | rubatuga wrote:
       | Yeah CRDTs are pretty annoying for collaborative text editing,
       | even Google Docs doesn't use them. Some companies have switched
       | from CRDTs back to traditional methods.
        
         | paulgb wrote:
         | In addition to OP (nice work, Nathan!), one of the yjs authors
         | has made a pretty in-depth argument that (modern) CRDTs are
         | suitable for human-scale text editing.
         | https://blog.kevinjahns.de/are-crdts-suitable-for-shared-edi...
        
           | nathansobo wrote:
           | Thanks very much!
        
         | nathansobo wrote:
         | We're quite happy with them. Why do you find them annoying?
        
           | tlb wrote:
           | One common failure mode is that two people start typing at
           | the beginning of the same line, and rather than getting two
           | lines, it alternates characters. At least, Etherpad did this.
        
             | maxbond wrote:
             | This is a common artifact of operational transforms I
             | believe. Etherpad launched in 2008; CRDTs were proposed in
             | 2011. AFAIK Etherpad used (uses?) OT.
             | 
             | Open to correction though, it's been a while since I dug
             | into the differences in these approaches & my memory is
             | imperfect.
        
             | muxator wrote:
             | Etherpad used Operational Transform, not CRDTs.
             | 
             | Source: I have been etherpad's maintainer for two years.
        
             | josephg wrote:
             | This is called the "interleaving problem". It shows up with
             | simple list algorithms like fractional indexing.
             | 
             | All the main text editing CRDT algorithms around today
             | solve this no problem. (Yjs, automerge, diamond types,
             | etc).
        
               | nathansobo wrote:
               | We honestly don't solve it yet, but haven't found it that
               | big of an issue in practice. Would be curious to see the
               | best resource on it.
        
               | josephg wrote:
               | I remember reading about it in one of Martin Kleppmann's
               | papers, though I can't remember which one.
               | 
               | It's an ordering problem that comes from some of the
               | simpler ordering algorithms. For Diamond types I'm using
               | a variant of Yjs's ordering. But even RGA doesn't have
               | this problem because each character's insert location is
               | specified by naming the character immediately to the left
               | when that character was typed.
               | 
               | This repository implements a few different list CRDTs
               | using an insertion sort approach, where the algorithm
               | scans for the appropriate location every time an insert
               | happens. This is the scanning function for RGA
               | (automerge's algorithm):
               | 
               | https://github.com/josephg/reference-
               | crdts/blob/fed747255df9...
               | 
               | And this is an interactive visualisation of how diamond
               | types works (which uses Yjs's algorithm instead),
               | complete with run-length encoding:
               | https://home.seph.codes/public/diamond-vis/
        
           | maxbond wrote:
           | I'll offer up that I've had a hard time wrapping my mind
           | around them and that in practice it seems like you need to
           | implement a check pointing operation on top of them which is
           | necessarily not conflict-free, or your replication log will
           | expand without bound & eventually overwhelm your system.
           | (Though perhaps not, depending on your problem domain. Not
           | CRDTs but in this interview the engineers discuss a massive,
           | high frequency replication log that they don't checkpoint,
           | which they've been running at scale for years. Though you
           | could also say they implicitly checkpoint every trading day,
           | and they're working on implementing checkpointing.
           | https://signalsandthreads.com/state-machine-replication-
           | and-...)
           | 
           | That being said I would use CRDTs for any greenfield
           | collaboration project.
        
         | maxbond wrote:
         | The first version of Google Docs was written in 2006; CRDTs
         | weren't proposed until 2011, and still seem to be maturing as a
         | technology.
         | 
         | So, I don't think this is a reflection on the merits of CRDTs
         | versus operational transforms as much as it is a reflection on
         | the ecosystem and the history of the codebase.
        
           | josephg wrote:
           | Yep. I've been working in the collaborative editing space for
           | over a decade - with both OT algorithms and CRDT algorithms.
           | The modern CRDTs I've been developing are significantly
           | faster and smaller than any OT system I made 5+ years ago.
           | 
           | Much more complex though. (~3k loc for a good, high
           | performance text crdt merging algorithm, vs 300 loc for a
           | good text OT algorithm.)
        
             | HWR_14 wrote:
             | Sorry, I'm not sure I properly understood OT as an acronym.
             | Did you mean "operational transformation"? Because I
             | thought that OT was method of making operational based
             | CRDTs[0].
             | 
             | But I'm probably wrong in at least one way! Hoping to
             | learn.
             | 
             | [0] https://link.springer.com/chapter/10.1007/978-3-662-433
             | 52-2_...
        
             | zackoverflow wrote:
             | Can you explain why they're more complex? I've always heard
             | the narrative that OT is harder to get right because of
             | edge case
        
       | nmlt wrote:
       | This webpage crashed my iPad 6 Gen Safari when repeatedly
       | scrolling down without reading (iPadOS 16.1.1). It's probably
       | just too little memory, but still, isn't this just a blog post?
        
         | iamnbutler wrote:
         | Apologies, looks like we underestimated how heavy the
         | autoplaying diagrams would be. If you could try again in about
         | 5 minutes and see if you still have the problem I'd appreciate
         | it!
         | 
         | Sorry for the hassle.
        
         | nathansobo wrote:
         | We embeded a lot of videos for animated diagrams. I think we
         | need to disable autoplay, at least on mobile.
        
       | sghosh2 wrote:
       | Very cool use case for CRDTs! I've seen a bunch of different use
       | cases from other products like https://liveblocks.io/ and
       | https://electric-sql.com/. It's interesting how CRDTs are now
       | taking hold so much for all these collaborative syncing
       | scenarios. Wonder what's driving the proliferation now given
       | they've been around for awhile?
        
       ___________________________________________________________________
       (page generated 2022-12-01 23:00 UTC)