[HN Gopher] Faster CRDTs: An Adventure in Optimization ___________________________________________________________________ Faster CRDTs: An Adventure in Optimization Author : xnx Score : 469 points Date : 2021-07-31 11:37 UTC (11 hours ago) (HTM) web link (josephg.com) (TXT) w3m dump (josephg.com) | hinkley wrote: | I'm getting mixed messages on CRDTs. Are we at the point now | where they are general enough that the human observer is not | constantly confronted with 'surprises' from the behavior of the | system? | | Some of the talks by Kleppmann go straight into the weeds and | make it hard to tell if he's just nerding out about finer points | or lamenting unsolved problems, or even paradoxes. | JW_00000 wrote: | By the way, as someone who has published academic papers, if | you're ever bothered about a paper or have some comments, don't | hesitate to mail the authors. (Their e-mail addresses are always | on the paper; especially target the first author because they | have normally done the work.) We are happy to hear when someone | has read our work and I at least would've liked to have known if | someone found a problem with my papers. | zladuric wrote: | > especially target the first author because they have normally | done the work | | As someone living with a recently promoted? (is that the | correct term?) PhD in social sciences, this surprises me. Is | that something specific for my country, for social sciences or | my wife simply landed in a case full of rotten apples? | galgalesh wrote: | In my uni in Belgium, this is the case too. First author does | most of the work. | shrimpx wrote: | In computer science the first author does the work and is | usually a PhD student. The last author is usually the | professor that pushed and helped develop the idea, provided | funding, and probably wrote or was heavily involved in | writing the paper's abstract, intro and conclusion sections | -- the bulk of "framing" the work. | | But there are exceptions. 
Some profs are less student-oriented or don't like delegating so much, and remain | "individual contributors" deep in their careers. Those tend | to publish a nonzero number of first- and single-author papers. | | Edit: I've noticed that in Theory and Algorithms, profs tend | to take first author even though the student slaved out the | proofs. That field is kind of an outlier in that it's close to | pure math, and I think borrows cultural artifacts from math | research. | [deleted] | exmadscientist wrote: | At least in the fields I've published in, first author does | the work (or at least most of it, almost always including | writing the paper) and last author secured the funding. | Sometimes authors are listed in alphabetical order, in which | case one or two are usually listed as "corresponding author". | The less-senior corresponding author usually did the work, | and the more-senior corresponding author is usually either | the one who secured the funding, or a long-term member of the | research group who is likely to actually be _around_ to | answer correspondence in the future (and who also probably | helped out enough to be worth corresponding with). | anonymousDan wrote: | I think it's pretty field-specific. In a lot of CS, for | example, the first author did most of the work and the last | author got the funding. The people in the middle may have | done varying degrees of work, ranging from helping | substantially with the implementation, evaluation or writing | of the paper to once having read the paper abstract :) | robbles wrote: | I think the term you're looking for might be "postdoc" | (postdoctoral researcher)? I've seen "recent PhD" a few times | as well, seems pretty clear. | lewisjoe wrote: | I've been looking for a practical OT alternative for our online | word processor (https://zoho.com/writer).
We already use OT for | syncing our realtime edits and are exploring CRDTs, targeting | stronger consistency, for tackling offline edits (which are | typically huge & defragmented, since the edits are not syncing in | realtime). | | So the baseline is that OT has a better model for holding state | in terms of performance/memory, since the edits can be compiled | into plain string types. CRDTs in comparison force us to hold | deleted states as well and demand richer information per unit | (character/string/etc) - which makes it harder on the CPU/RAM. | | Here's the story as I understand it: | | 1. Automerge tackles this by just moving to a better lower-level | runtime: Rust. | | 2. Yjs handles this by using a similar technique, i.e. relying on | V8's hidden classes to handle the performance optimizations and | assuming real-world cases to narrow down and optimize | datastructures. | | But none of these seems to be a fundamental breakthrough in the | efficiency of the algorithm itself. They all at best look like | workarounds, and this keeps bothering me. | kevinjahns wrote: | I know that it is hard to comprehend why modern CRDT | implementations are fast. But the data confirms that they work | great. OT seems to be much simpler, but there are real | advantages to using CRDTs. The performance problems have been | solved through an efficient representation of the CRDT model. | | The gist of the read below [1] is that it is impossible for a | human to create a document that Yjs can't handle (even in the | worst-case scenario). But yes, it handles real-world scenarios | particularly well. | | The concept of "hidden classes" is super old. It was first | implemented in a fork of Smalltalk and then became a foundational | concept of runtime engines for scripting languages. It is | implemented in V8, python, ruby, spidermonkey, ... | | Yjs does not assume a "real-world scenario" and it is not | optimized for any specific runtime engine. It runs fast in any | browser.
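To make the "hidden classes" point concrete, here is a minimal sketch of the monomorphic-shape pattern that engines like V8 reward. The `Item` layout is purely illustrative and is not Yjs's actual data structure:

```javascript
// Sketch: keeping objects monomorphic so the engine can reuse one
// hidden class (shape) and inline-cache property access.
// The field names here are illustrative, not Yjs's real layout.
function makeItem(id, content, deleted) {
  // Always initialize every field, in the same order, with stable types.
  return { id, content, deleted };
}

const items = [];
for (let i = 0; i < 1000; i++) {
  items.push(makeItem(i, "x", false));
}

// Every item shares one shape, so the property loads in this loop
// stay monomorphic and can be inline-cached.
let live = 0;
for (const it of items) {
  if (!it.deleted) live++;
}
```

The key point is consistency: constructing objects with the same fields in the same order lets the engine keep a single shape, instead of falling back to slower megamorphic property lookups.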
The benchmarks confirm this. [2] | | Yjs is being used in practice by several companies (eg Nimbus | Notes with >3 million users) for quite some time now. I'm not | aware of any performance problems. | | [1]: https://blog.kevinjahns.de/are-crdts-suitable-for-shared-edi... [2]: https://github.com/dmonad/crdt-benchmarks | pfraze wrote: | You can remove tombstones in a cleanup pass if you constrain | behavior a bit. | | For instance, if there's just a single server, after 5 minutes | of no active connections you could clean out tombstones. After | that, if a client connects with some changes they had been | holding onto but got DCed, you can reject the write and let the | user's client merge by some other means (perhaps even manual). | josephg wrote: | If you've got big offline edits (or you're merging multiple | large sets of edits), even existing CRDTs will generally handle | that more efficiently than OT will. OT algorithms are usually | O(n * m) time complexity when merging _n_ edits from one peer | with _m_ edits from another peer. A CRDT like diamond-types is | O((n + m) * log(s)), where s is the current size of the | document. In practice it's super fast. | | As for holding deleted states and richer information per unit, | it's not so bad in absolute terms. 1-2 MB of data in memory for a | 17-page document is honestly fine. But there are also a few | different techniques that exist to solve this in CRDTs: | | 1. Yjs supports "garbage collection" APIs. Essentially you say | "anything deleted earlier than this point is irrelevant now" | and the data structures will flatten all runs of items which | were deleted earlier. So storage stays proportional to the size | of the not-deleted content. | | 2. Sync9 has an algorithm called "antimatter" which mike still | hasn't written up _poke poke mike!_. Antimatter actively tracks | the set of all peers which are on the network.
When a version | is known to have been witnessed by all peers, all extra | information is safely discarded. You can also set it up to | assume any peer which has been offline for however long is gone | forever. | | 3. Personally I want a library to have an API method for taking | all the old data and just saving it to disk somewhere. The idea | would be to reintroduce the same devops simplicity of OT, where | you can just archive old history when you know it probably | won't ever be referenced again. Keep the last week or two hot, | and delete or archive history at will. If you combined this | with a "rename" operation, you could reduce the "hot" dataset | to basically nothing. This would also make the implementation | much simpler - because we wouldn't need all these performance | tricks to make a CRDT like diamond-types fast if the dataset | stayed tiny anyway. | vanderZwan wrote: | On a meta-level, does anyone else think that the whole idea of | writing a peer-reviewed paper that is just a benchmark of | different algorithms should be really rigorously reviewed before | being accepted? Writing good benchmarks is hard, and so highly | contextual that writing fair comparisons between algorithms (or | data structures) is almost impossible unless you're an expert in | all of the algorithms involved. | thysultan wrote: | Only the holy experts can divulge the realm of gods that is | benchmarking and reveal to us the results. How do we | beseech them, o wise one? | Diggsey wrote: | Yeah, I've also seen several academic papers on performance or | "optimization" of existing algorithms which just demonstrate a | complete lack of knowledge about how those algorithms are | implemented in practice. | | For example, there was a paper explaining how you could | optimize the GJK algorithm by reducing the number of distance | checks required, and in turn the number of square-roots...
| Despite the fact that everyone (including the authors of the | original GJK algorithm) knows that you don't _actually_ need to | do a square-root to compare distances... | meepmorp wrote: | > Despite the fact that everyone (including the authors of | the original GJK algorithm) knows that you don't actually | need to do a square-root to compare distances... | | Academia's purpose is to produce research, typically measured | in publications per unit time. Optimizing one paper leads to | a global reduction in the size of the literature by pruning | opportunities for subsequent research, harming the overall | performance of the system. | lostdog wrote: | Benchmarking papers are inaccurate when the original algorithms | are not open-sourced and the grad student needs to rewrite the | algorithm from scratch. They can easily get implementation | details wrong, and wind up with an implementation that's | slower than the original. | | I do think that the original algorithm authors should have the | opportunity to correct the benchmarking code, or to release | their original implementation as open source to be benchmarked. | | In some sense, the benchmarking paper with a slower | implementation is more "correct", since an engineer evaluating | which algorithm to use is just as likely to implement the | algorithm in a slower way than the original paper did. The | incentives are right too: the original paper's author should be | giving enough details to recreate their work, and the | benchmarker is showing that their published algorithm really is | slow. | knuthsat wrote: | The problem is that academics are rarely experts at programming, or | as knowledgeable about computer architectures as someone in | the industry. There are various tricks that are never taught in | college, so academics have no idea some of this stuff even | exists. | | The best example is discrete optimization research (traveling | salesman, vehicle routing and its variants, schedule rostering, | etc.).
Stuff you find in the papers there achieves state-of-the-art | results very slowly (using integer linear programming | or some rarely-optimized heuristics), making you believe these | instances of a general NP-hard problem can't be solved quickly. | | When you start tinkering, you either find that data structures | can be added that reduce the complexity significantly, or that | there are regularities in instances that, when exploited, | support massive speedups. | | I would say that TSP research is an exception, but most of the | stuff coming out that has a lot of citations is way too slow and | is never as brilliantly implemented as the Lin-Kernighan | heuristic or other work from the age of insanely slow computers. | makmanalp wrote: | > Problem is that academics are rarely experts at programming | or have knowledge of computer architectures as much as | someone in the industry. There are various tricks that are | never taught at college, therefore academics have no idea | some stuff even exists. | | I want to push back on this generalization a bit. The | academics that are focused on pushing the mathematical | boundaries of discrete optimization are focused, no surprise, | on only that. If theoretical improvements are not what you | want, don't read those papers; read the ones by people more | focused on applications. Read literally any databases paper, | or stuff like hyperscan and simdjson. I'd argue that a | non-trivial amount of these are vastly /ahead/ of what's standard | in industry, but industry is slow to adopt new ideas and | catch up because of legacy software and existing codebases. | Very similar stuff in the programming languages sphere: it | took ages for industry to adopt ideas that were popular in | academia for a long time (e.g. Rust, LINQ). The idea that | academia (at least in CS) is an ivory tower, far from the | concerns of industry, is not very true these days.
There's | a lot of cross-pollination and a revolving door of people | going back and forth from one to the other spreading ideas. | knuthsat wrote: | I must say that when it comes to discrete optimization, the | genetic/ant/simulated-annealing/etc. stuff is more popular | in academia than in industry (at least the industry that | doesn't heavily include academics). | | Works like the Lin-Kernighan heuristic are extremely rare, and a | bunch of knowledge exists in industry only. Even that | heuristic was implemented incorrectly for decades until one | individual came along and demonstrated its superiority (K. | Helsgaun). | | I mean, most of the code that gets released today doesn't | even handle the time-windows constraint in the best way | possible (optimal updates, constant-time feasibility | evaluations, etc.). I believe all open-source vehicle | routing libraries lack any data-structure-relevant | optimizations for handling time windows, task precedence, | dynamic workshift starts, etc. All are fairly simple | implementation tricks that just got lost under giant | amounts of population-based heuristics or mathematical | linear programming approaches. | hodgesrm wrote: | 100x +1. The whole field of query optimization has been | devoted to improving the efficiency and speed of database | systems for decades. [1] Academic results are studied quite | closely in industry. Also, "academic" includes people like | Mike Stonebraker and Andy Pavlo. They aren't exactly | slouches when it comes to performance. | | More generally, major waves of performance innovation in | the IT field have been driven by advances in storage, | compression, vectorization, virtualization, etc. Academic | results have led to entirely new products like Vertica | (from C-Store [2]) or VMware (from the Disco system [3]).
| | [1] https://en.wikipedia.org/wiki/Query_optimization | | [2] https://w6113.github.io/files/papers/cstore-vldb05.pdf | | [3] http://www.cs.cmu.edu/~15712/papers/bugnion97.pdf | | edit: clarity | kaba0 wrote: | You do realize there is a whole area of the research field | dedicated to heuristic algorithms? They have a proper | academic basis just as much as the "correct" solutions. | pyrale wrote: | From what I saw (I worked in an academia-consulting | partnership to solve scheduling problems in transport), the | OR researchers very frequently work in association with OR | consulting shops, and they're well aware of the difference | between academic datasets and industrial datasets. In the | conferences I saw, it was not infrequent to see different | tracks for industrial and academic papers, both attended by | the same crowd. | | The point I agree with, though, is that this is not reflected | in the papers. Academic papers focus on academic instances | because they are more general (and usually harder, as you | said) and because optimizations for specific instances of the | problem are not that useful from an academic POV. | | It's hard to know who works with whom and who has experience | with what if you're not an insider, though. | raverbashing wrote: | > Stuff you find in the papers there achieves state-of-the- | art results very slowly (using integer linear programming or | some rarely optimized heuristics) making you believe these | instances of a general NP-hard problem can't be solved | quickly. | | Yes, or looking for things like mathematical purity, linear | problem statements, etc. | | In practice: you don't need the best solution, and you can get | a great solution in a multitude of ways. | comicjk wrote: | On the one hand, peer review takes long enough already. On the | other... I saw an influential paper that published obviously | wrong timing data, essentially saying that time(A) + time(B) < | time(B).
It seems they were including the Python interpreter | startup time (~0.1s) in a profile of a ~0.01s algorithm. | z3t4 wrote: | What I like about "tests" in software development is that anyone | can run them: just download the source code, then run ./test or | right-click and "run tests". It would be cool if computer science | could offer the same experience: just download the source code | and run it, compare whether you got the same result, inspect and | learn from the source code, etc. Instead we get "here's some | pseudo-code we've never tried", and "here's a mathematical | formula that you need to be a mathematics professor to | understand"... Yes, we know you are not a professional software | developer, and the code is going to be at a beginner's level, but | that is fine. I am not reading your paper to criticize your code | for being "impure", or for not using the latest syntax and | frameworks; I'm reading it to understand how to implement | something, to solve a problem. | paulgb wrote: | This is great! I'd like to quote a line here, because I think the | answer is "someone on HN knows" and I'd like to hear the answer | as well. | | > V8 is actually suspiciously fast at this part, so maybe v8 | isn't using an array internally to implement Arrays? Who knows! | bouk wrote: | The V8 blog is a good starting point for learning how it works | under the hood: https://v8.dev/blog/fast-properties | vlovich123 wrote: | Hmm... I would have thought they implemented something | similar to a std::deque so that you have amortized O(1) | insertions into the middle of a vector. | mfbx9da4 wrote: | Reminds me of the data structures in this markdown library | https://github.com/markdown-it/markdown-it/blob/master/docs/... | | The author hand-waves away the possibility that there could be | memory-locality performance benefits from using arrays in JS, | but my hunch is that there is something in that.
I know the react code | base, for example, went for a monomorphic Object structure to | represent components to leverage inline caching of hidden | classes. | dpiepgrass wrote: | > my range tree is just a slightly modified b-tree. But usually | when people talk about b-trees they mean a BTreeMap. That's not | what I'm doing here. Instead of storing keys, each internal node | of the b-tree stores the total number of characters (recursively) | in that item's children. So we can look up any item in the | document by character position, or insert or delete anywhere in | the document in log(n) time. | | Cool! This is essentially the same idea I implemented in 2012; I | call it the AList or A-List data structure: | http://core.loyc.net/collections/alists-part1 | | Ever since I made it, I've been looking for a good application | for it. I guess this is it! I mean, I knew A-List was a good data | structure for text editing, but most applications can use a Gap | Buffer, which is much simpler. But when it comes to concurrent | editing, you've got multiple editing points, so a Gap Buffer is | suddenly much less attractive. | | > Honestly I'm shocked and a little suspicious of how little ram | Yjs uses in this test. | | It's good, but it still uses ~30x as much RAM as plain string | edits. Not surprisingly, you got 3x better memory usage by using | A-List and a more efficient language (Rust in this case, but C# | and C/C++ can also do well). | | There is a great article about a CRDT concept called "Causal | Trees" [1]. I wonder how it compares to flat-list-based CRDTs | (it's been too long since I researched this). | | By the way, Microsoft has a new set of libraries for concurrent | editing called Fluid Framework [2]. I'm told it's a "generalized | data structure" that was inspired by Causal Trees but with a | unique and intention-preserving editing scheme. I found out about | it after they decided to use my fast-cloning copy-on-write B+Tree | for TypeScript [3]...
they sent me a pull request for diffing | versions of B+ trees, but I haven't yet looked into the | architecture of their concurrent data type. | | [1] http://archagon.net/blog/2018/03/24/data-laced-with-history/ | | [2] https://fluidframework.com/ | | [3] https://www.npmjs.com/package/sorted-btree | stephc_int13 wrote: | Trees are a powerful and practical data structure, but even | though it does not appear clearly in O(n)-style complexity | analysis, they are usually slow. | | Unfortunately, the difference between slow and fast can be | several orders of magnitude, while the perception of the | programmer doing a back-of-the-envelope analysis seems to be a | logarithmic scaling of the reality... | josephg wrote: | Trees seemed to work pretty well for me here! | | The problem with trees is usually that people don't pack enough | data into each node of the tree. I implemented a skip list in C | a few years ago [1]. For a lark I benchmarked its performance | against the C++ SGI rope class which was shipped in the C++ | header directory somewhere. My skip list was 20x faster - which | was suspicious. I looked into what the SGI rope does, and it | turns out it was only putting one character into each leaf node | of the tree it constructed. Benchmarking showed the optimal | number was ~120 or so characters per leaf. Memcpy is much, much | faster in practice than main-memory lookups. | | In diamond-types (benchmarked here), the internal nodes in my | B-tree store 16 pointers and leaf nodes store 32 entries. With | run-length encoding, all 180,000 inserted characters in this | data set end up in a tree with just 88 internal nodes and a | depth of 3. It goes fast like this. But if you think an | array-based solution would work better, I'd love to see it! It | would certainly need a lot less code.
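A tiny sketch of the run-length encoding idea mentioned above: consecutive insertions by the same agent collapse into a single run. The op fields here are illustrative, not diamond-types' actual format:

```javascript
// Sketch: run-length encoding of consecutive insertions.
// Typing "hello" left to right produces five ops, but they all
// extend the previous run, so they collapse into one entry.
function appendOp(runs, op) {
  const last = runs[runs.length - 1];
  // Merge when the same agent inserts directly after the previous run.
  if (
    last &&
    last.agent === op.agent &&
    last.pos + last.content.length === op.pos
  ) {
    last.content += op.content;
  } else {
    runs.push({ ...op });
  }
  return runs;
}

const runs = [];
for (const [i, ch] of [..."hello"].entries()) {
  appendOp(runs, { agent: "A", pos: i, content: ch });
}
// runs is now a single entry: { agent: "A", pos: 0, content: "hello" }
```

This is why 180,000 single-character inserts can end up occupying only a few thousand runs: human editing is overwhelmingly sequential.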
| [1] https://github.com/josephg/librope | feikname wrote: | Correct me if I'm mistaken: | | the difference between diamond native and diamond WASM | demonstrates how, even with WASM, native implementations beat | browsers _hard_, and native implementations are still very much | worth it performance-wise, especially for lower-powered devices | and, perhaps, for reducing battery usage (as a consequence of | less CPU use) on mobile devices. | Jweb_Guru wrote: | The wasm implementation here was still running under a | JavaScript test harness, so I suspect it's the JS-WASM boundary | interactions that are causing the slowdown. WASM itself (if it | doesn't need to interact with JavaScript) usually runs with a | much smaller performance penalty. | josephg wrote: | I suspect so too - given there are 280,000 calls across the | JS-wasm boundary, and most of those calls pass a string. I'd | love to know for sure though. I considered making this | benchmark pass the whole data set in one go through JSON or | something, but that felt like cheating - since that's not how | the API would be used in practice during a real editing | session. | | But even paying that cost, it still seems much faster to use | rust + WASM than to run the same algorithm in native javascript. | And the JS-wasm boundary will probably get gradually faster | over the next few years. | __s wrote: | Yes. Ultimately WASM is executing within a sandbox & involves | being JIT-compiled (read: not heavily optimized, except for hot | loops eventually). If native compilation is an option, it makes | sense to go that route. | | WASM competes with asm.js, not asm (or, arguably, the JVM etc.) | Rusky wrote: | WASM JIT implementations tend to be quite a bit different | from JavaScript JITs, so that's not really where the perf | difference comes from. | | First, WASM gets all the heavy AOT optimizations from the | middle end of the compiler producing it.
At runtime, a WASM JIT | doesn't start from program source, but from something that's | already been through inlining, constant propagation, common | subexpression elimination, loop optimizations, dead code | elimination, etc. And WASM is already typed, so the JIT | doesn't have to bother with inline caching, collecting type | feedback, or supporting deoptimization. | | Because of that, the only really beneficial work left to do | is from the back-end (i.e. arch-specific) part of the | compiler - basically, register allocation and instruction | selection. WASM JIT compilers don't bother trying to find hot | loops or functions before optimizing. Instead, they do a fast | "streaming" or "baseline" codegen pass for fast startup, and | then eagerly run a smarter tier over the whole module and | hot-swap it in as soon as possible. (See e.g. | https://hacks.mozilla.org/2018/01/making-webassembly-even-fa...) | | The perf difference vs native rather comes from the | sandboxing itself - memory access is bounds-checked, support | for threads and SIMD is limited (for now), talking to the | browser has some overhead from crossing the boundary into | JavaScript (though this overhead will go down over time as | WASM evolves), etc. | galesky wrote: | This is awesome! I'm researching delta-state-based CRDTs for my | master's dissertation; these kinds of optimizations on op-based | CRDTs are really interesting. | rawoke083600 wrote: | Damn! What a fantastic read! And of course brilliant | coding/thinking :) Well done, sir! | lewisjoe wrote: | People who are interested in the topic: I just found out they | have open meetings about the parent project, and it seems like | anybody can join - https://braid.org/ | | Great way to share progress. Kudos! :) | toomim wrote: | Yep! Our next meeting is two Mondays from now, on August 2nd, | at 4:00pm Pacific Time. All are welcome: | https://braid.org/meeting-16 | iampims wrote: | This is a fantastic write-up.
Worth a read even if you don't care | about CRDTs. | josephg wrote: | Hello HN! Post author here. I'm happy to answer questions & fix | typos once morning rolls around here in Australia | gfodor wrote: | Thanks for ShareDB. It's dope. I extended it to support | collaborative voxel editing (https://jel.app) and works great. | teodorlu wrote: | Have you used CRDTs to solve any practical problems? | | If so, how does the CRDT solution compare to a non-CRDT | solution? If a non-CRDT solution is feasible at all? | mirekrusin wrote: | Article mentions at the beginning that the author used CRDT | in Google Wave/ShareJS. | mirekrusin wrote: | My mistake, it says OT was used, thank you for correcting | me. | paulgb wrote: | Wave used OT rather than CRDTs, as the author discusses in | another post: https://josephg.com/blog/crdts-are-the- | future/ | saurik wrote: | AFAIK Wave and ShareJS both used OT (which the paper that | this article referred to was also attempting to benchmark). | | FWIW, I am myself also curious about this (the question of | comparing CRDT to non-CRDT solutions): I found OT | beautiful, but never really felt CRDT had the same feeling | of elegance; and so I am downright fascinated to see the | person I have always seen as a "god of OT" deciding to | forsake it and move to CRDTs going forward. Are they really | that great? Is there something simple I am missing for the | explanation for why they are so much better? (Is it maybe | that they can support arbitrary merging without a sequencer | to linearize the edits? Honestly, my question probably | sucks a bit as I haven't spent any time thinking about OT | or CRDT in at least three years--and even then only once | since a few years before that, as I have had other things | to spend most of my mental energy on recently--and so I am | failing to remember the details of my use cases or the | tradeoffs I saw or the implementation issues I felt were | easy/hard.) 
| the-smug-one wrote: | Do you want a centralized server to control the data? | Then just use OT. Do you want users to control the data, | and have your server essentially just be a | forever-present user? Then use CRDT. | | CRDTs certainly do have a mathematical elegance to them. | josephg wrote: | Yep, this is the best practical advice at the moment. | Well, for list CRDTs. State CRDTs (like a counter) are | small and fast, and kinda better than OT in every way. | | List ("operation-based") CRDTs and OT systems are | "equivalent" in a very academic sense that nobody really | talks about or understands. It's really not obvious unless | you've been staring at this stuff for years, but the | equivalence is there: | | You can make a CRDT out of any OT system by just shipping | the entire history of operations to each peer. List CRDTs | essentially do that, with a whole lot of tricks to | compress that data set and use it without needing to | linearly scan. | | And you can convert the other way too. You can add a | "rename" operation into a list CRDT which assigns a new | name to each element currently in the document. Before | the rename operation, document "hello" might have IDs [a4, | b2, b3, b1, a5]. The rename operation changes the IDs to | [c1, c2, c3, c4, c5]. When an operation happens, you | specify the version and the ID of the predecessor at that | version (e.g. c2). The insert happens there. Then you | need a method to take the ID at one version and | "transform" it into the ID of the same item at a different | version. Do the rename operation implicitly after _every_ | change, and voilà! You now have OT semantics. "Insert | after c1" means "insert after position 1". | | OT systems have one big advantage, which is that you don't | have to ship the CRDT state to every peer. With a rename | operation, we can add back the operational simplicity of | OT systems into a CRDT. But the code is (and always will | be) much more complicated.
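The rename step can be sketched roughly like this (purely illustrative; this is not diamond-types' API, and a real implementation would track versions, not just a flat map):

```javascript
// Sketch: a "rename" assigns fresh sequential IDs to every surviving
// element, so an op anchored at an old ID can be transformed to the
// new namespace - "insert after c1" collapses to "insert after
// position 1". The epoch prefix "c" is just an illustrative label.
function rename(ids, epoch) {
  const mapping = new Map();
  const renamed = ids.map((oldId, i) => {
    const newId = `${epoch}${i + 1}`;
    mapping.set(oldId, newId);
    return newId;
  });
  return { renamed, mapping };
}

const { renamed, mapping } = rename(["a4", "b2", "b3", "b1", "a5"], "c");
// renamed is ["c1", "c2", "c3", "c4", "c5"]
// An op anchored at "b2" under the old version transforms to "c2",
// whose index directly encodes its document position.
```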
So I think OT makes sense for | strictly server-client systems. | | You can also have a hybrid server, which talks CRDT to | full peers on the network but just does OT when talking | to browser clients and things like that. We talked about | this at our public braid meeting at the start of the | week. The discussion about this stuff starts about 30 | minutes in: https://braid.org/meeting-15 | zozbot234 wrote: | > And you can convert the other way too. You can add a | "rename" operation into a list CRDT which assigns a new | name to each element currently in the document. | | Operations in a CRDT must be commutative for merge/update | to be well-defined, so it's not immediately clear how a | "rename" operation can be expected to work properly. | Jailbird wrote: | Drifting off-topic, but I've wondered this myself - I've been | interested in CRDTs in a "read the news" way but not a "this | rises to something I'm going to implement" way. | | Perhaps it's blindingly obvious to all here, so no one | mentions it: thinking about more practical and real-world | problems seems like collaboration on more | complex/structured data. Today, it seems one would still have | to transform that data to a large flat string underneath, and | implement an editor that only performs edits that maintain | the integrity of the higher-level object, while the flat | string provides collaboration features. Perhaps it's possible | to implement an XML schema known to the editor so all | insertions/deletions keep the document syntactically correct. | | I wonder if there's some other way to either generalize on | top of "big string" or create another base model (somehow?) | detaro wrote: | CRDT is a general concept; editing text is just one | possible application. If you have a stronger datatype, | great - you can build operations on top of it to implement a | CRDT system, depending on its properties.
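The classic minimal example of a non-text CRDT is a state-based grow-only counter (G-Counter): each replica increments its own slot, and merge takes the per-slot maximum, which is commutative, associative and idempotent:

```javascript
// A minimal state-based CRDT: the grow-only counter (G-Counter).
// State is a map from replica id to that replica's local count.
function increment(state, replica) {
  return { ...state, [replica]: (state[replica] ?? 0) + 1 };
}

// Merge = per-replica max. Order of merging never matters.
function merge(a, b) {
  const out = { ...a };
  for (const [replica, count] of Object.entries(b)) {
    out[replica] = Math.max(out[replica] ?? 0, count);
  }
  return out;
}

// The counter's value is the sum over all replicas.
function value(state) {
  return Object.values(state).reduce((sum, n) => sum + n, 0);
}

// Two replicas increment concurrently; merging in either order converges.
const a = increment({}, "a");               // { a: 1 }
const b = increment(increment({}, "b"), "b"); // { b: 2 }
// value(merge(a, b)) === value(merge(b, a)) === 3
```

Every state-based CRDT follows this pattern: the merge function is a join on a lattice, so replicas can exchange states in any order and still converge.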
 | Jailbird wrote:
 | What I've read talks about character insertion and
 | concepts that apply to editing text.
 |
 | Perhaps I just need to find bedrock to build up from:
 | the properties of a stronger datatype that allow
 | CRDTs to work.
 | detaro wrote:
 | https://arxiv.org/pdf/1805.06358.pdf looks like a decent
 | starting point, with references to work on other types.
 | Jailbird wrote:
 | and pointers to other papers, too - thanks!
 | josephg wrote:
 | > Today, it seems one would still have to transform that
 | data to a large flat string underneath, and implement an
 | editor that only performs edits that maintain the integrity
 | of the higher-level object, while the flat string provides
 | collaboration features.
 |
 | Lots of people think this and have mentioned it over the
 | years, but it's a dangerous idea. The way concurrent edits
 | are handled makes it really easy for the automatic merging
 | algorithms to mess up the syntax of your XML / JSON
 | content. And that's really hard for non-programmers to fix.
 |
 | The right answer for this stuff is to just make CRDTs which
 | support other data types, the same way we did for OT with
 | ot-types[1]. We need CRDT code for lists + text (working
 | now - text is just a list of characters). And rich-text and
 | JSON. And that'd cover 99% of the use cases. I'm tackling
 | strings first in diamond-types because it's probably the
 | hardest case, but I want to have native support for other
 | data structures soon too.
 |
 | Yjs and automerge already do this. They have support for
 | plain text, XML and arbitrary JSON structures.
 |
 | The simplest implementation for JSON structures + tuples is
 | probably shelf[2], which is so simple you can implement it
 | in about 25 lines of javascript. Shelf doesn't support
 | lists, but combining shelf with the list code I already
 | have in diamond-types is probably going to be good enough
 | for most applications.
 | | [1] https://github.com/ottypes/
 |
 | [2] Shelf's description + code is here:
 | https://braid.org/algorithms/shelf . Or you can watch the
 | video of Greg (shelf's author) discussing the algorithm
 | here: https://braid.org/meeting-8 . Kevin Jahns (Yjs's
 | author) is in that recording too, and he's super jazzed
 | about how simple and beautiful shelf is.
 | lewisl9029 wrote:
 | First of all, thank you for the amazing read! I
 | thoroughly enjoyed the entire article, and it gave me a
 | new perspective on the feasibility of CRDTs for real-world
 | applications performance-wise.
 |
 | Though I am curious now to hear your thoughts on the
 | conflict resolution side of the equation for complex data
 | structures like deeply nested JSON.
 |
 | The biggest takeaway I got from Martin's talk on the
 | topic from a few years ago was that while CRDTs are
 | theoretically guaranteed to eventually converge, the
 | resulting converged state might not make any sense to
 | applications that need to consume and act on that state
 | [1].
 |
 | It seemed like a pretty fundamental challenge to using
 | CRDTs to store arbitrary application state to me at the
 | time, but I imagine the field has progressed immensely
 | since then, so would love to hear any insights you might
 | have around strategies that could be used to alleviate or
 | at least work around this challenge if I wanted to build
 | a CRDT-based app today.
 |
 | [1] https://youtu.be/8_DfwEpHE88?t=2232
 | username91 wrote:
 | It's a great article - really informative and enjoyable to
 | read. Thanks for making it happen. :)
 | conaclos wrote:
 | Hi josephg, I'm a CRDT researcher. It's great to see so much
 | work around CRDTs nowadays!
 |
 | Some of the optimizations you discuss have already been
 | proposed in papers and implementations.
 |
 | For instance, LogootSplit [1] proposes an implementation based
 | on an AVL tree with extra metadata to get a range tree.
 | LogootSplit also proposes a block-wise approach that stores
 | strings instead of individual characters. Xray [2], an
 | experimental editor built by GitHub and written in Rust, uses a
 | copy-on-write B-tree. Teletype [3] uses a splay tree to speed up
 | local insertions/deletions, based on the observation that a user
 | performs several edits on the same region.
 |
 | [1]
 | https://members.loria.fr/CIgnat/files/pdf/AndreCollabCom13.p...
 | [2] https://github.com/atom-archive/xray [3]
 | https://github.com/atom/teletype
 | josephg wrote:
 | Cool! It'd be interesting to see those CRDT implementations
 | added to Kevin Jahns' CRDT Benchmarks page[1]. The
 | LogootSplit paper looks interesting. It looks like xray is
 | abandoned, and I'm not sure about teletype. Though teletype's
 | CRDT looks to be entirely implemented in javascript[2]? If
 | the authors are around I'd love to see some benchmarks so we
 | can compare approaches and learn what actually works well.
 |
 | And I'm not surprised these techniques have been invented
 | before. Realising a tree is an appropriate data structure
 | here is a pretty obvious step if you have a mind for data
 | structures.
 |
 | To name it, I often find myself feeling defensive when people
 | read my work and respond with a bunch of links to academic
 | papers. It's probably totally unfair and a complete projection
 | from my side, but I hear a voice in my head reword your
 | comment to instead say something awful like: "Cool, but
 | everything you did was done before. Even if they didn't make
 | any of their work practical, usable or good, they still
 | published first, and you obviously didn't do a good enough
 | literature review if you didn't know that." And I feel an
 | unfair defensiveness arise in me as a result that wants to
 | find excuses to dismiss the work, even if the work might be
 | otherwise interesting.
 | | It's hard to compare their benchmark results because they used
 | synthetic randomized editing traces, which always have
 | different performance profiles than real edits for this
 | stuff. Their own university gathered some great real-world
 | data in an earlier study. It would have been much more
 | instructive if that data set was used here. At a glance their
 | RAM usage looks to be about 2 orders of magnitude worse than
 | diamond-types or yjs. And their CPU usage... ?? I can't tell
 | because they have no tables of results. Just some hard-to-read
 | charts with log scales, so you can't even really eyeball
 | the figures. So it's really hard to tell if their work ends up
 | performance-competitive without spending a couple days
 | getting their enterprise-style Java code running with a
 | better data set. Do you think that's worth doing?
 |
 | [1] https://github.com/dmonad/crdt-benchmarks
 |
 | [2] https://github.com/atom/teletype-crdt
 | mirekrusin wrote:
 | It seems that reproducibility in computer science, where no
 | gigantic/proprietary datasets are needed, should be solvable
 | by simply publishing a repository with the code. Are
 | there any forces present that make it so rare in practice?
 | josephg wrote:
 | Credit where it's due, the academics did publish the code they
 | wrote on GitHub. But I don't know if anyone - reviewers or
 | readers - actually took the time to read it. Let alone
 | understand why it throws doubt on the paper's conclusions.
 | JW_00000 wrote:
 | Usually (at least in my specific niche of computer
 | science), if the code is published, it's only
 | published after the paper has been reviewed. This is partly
 | to preserve anonymity during the review process, and also
 | because usually the code isn't seen as "part of the paper"
 | (i.e. "the paper should stand on its own"). Although I
 | agree that you could argue that for papers about
 | benchmarks, the code should definitely be considered an
 | essential part of the paper.
 | robmorris wrote:
 | That's an impressive optimisation! Out of curiosity, what do
 | you think are the most interesting or useful possible
 | applications for an optimised CRDT?
 |
 | When you're approaching an optimisation like this, do you mind
 | me asking how you think about it and approach it?
 | politician wrote:
 | Great post! I had no idea that list CRDTs could actually be
 | fast because I read the papers showing how they were
 | impractical. Thanks for investigating and writing this up --
 | please accept my offer of a legitimate academic credential.
 | [deleted]
 | GlennS wrote:
 | When optimizing `findItem`, did you consider storing the
 | original index of each item on itself and using that as a
 | starting point?
 |
 | Obviously this might move later (maybe it can only increase?),
 | but usually not by much, so I would guess it would make an
 | efficient starting point / be immediately correct 99% of the
 | time?
 |
 | Looks like you already have 2 good solutions to this though
 | (start from index of recent edits and range tree).
 | benjismith wrote:
 | I've been following your work for years (and I'm actually
 | neck-deep in a ShareDB project right now) so I just want to say
 | thank you for all of your contributions! I especially enjoyed
 | this post.
 | pookeh wrote:
 | Wait, it doesn't look like you used the performance branch of
 | automerge (which is now merged into master). It is
 | significantly faster.
 |
 | https://github.com/automerge/automerge/pull/253
 | josephg wrote:
 | I did use the performance branch. And I had a chat with a few
 | people in the automerge community about the performance
 | numbers I was seeing long before I published, to see if I was
 | doing anything wrong. I tested a few different versions of
 | automerge, but in this test there wasn't much performance
 | difference between 0.12 and the 1.0.x-preview versions (which
 | are built from the merged performance branch); I also tried
 | the unreleased automerge-rs.
When I ran my tests, timing numbers
 | for automerge ranged from about 5m with the old
 | non-performance branch down to about 4m20s or so with
 | automerge-rs. Still far from Yjs's 0.9 seconds.
 |
 | I just checked and it looks like automerge 1.0.1-preview-4
 | has landed. I wrote the post benchmarking preview-2. I've
 | been knee-deep in diamond-types lately and haven't been
 | watching. Fingers crossed there are some more performance
 | improvements in the pipeline. I'd love to do a follow-up in 6
 | months showing much improved performance.
 | lewisjoe wrote:
 | Thank you for writing this piece, Joseph.
 |
 | Just want to check whether something's a typo or I'm
 | getting it all wrong :)
 |
 | Quote: "But how do we figure out which character goes first? We
 | could just sort using their agent IDs or something. But argh,
 | if we do that the document could end up as _abcX_ , even though
 | Mike inserted X before the b. That would be really confusing."
 |
 | Since the conflict is only between the children of _(seph, 0)_,
 | the only possibilities are either ending up with _"aXbc"_ or
 | _"abXc"_, right? Or is there a legitimate possibility of ending
 | up with "abcX"?
 |
 | I'm assuming we'll apply a common sorting logic only to
 | clashing siblings.
 | josephg wrote:
 | Good question. That part of the article could probably use
 | another diagram to explain it.
 |
 | The resulting document is generated by doing a depth-first
 | prefix traversal of the tree. The ambiguity comes because "b"
 | and "X" are both direct children of "a". So it's not clear how
 | they should be ordered relative to each other. Because "c" is
 | a child of "b" in this example, the "X" can't appear between
 | the "c" and "b". The only valid orderings are, as I said,
 | "aXbc" or "abcX". But without knowing how "b" and "X" should
 | be ordered, it's ambiguous which one to use.
 |
 | Let me know if that's still confusing! This stuff is hard to
 | explain without a whiteboard.
 | lewisjoe wrote:
 | Clear enough, Joseph.
Thanks :)
 | trishume wrote:
 | Have you seen my Xi CRDT writeup from 2017?
 | https://xi-editor.io/docs/crdt-details.html
 |
 | It's a CRDT in Rust and it uses a lot of similar ideas. Raph
 | and I had a plan for how to make it fast and memory efficient
 | in very similar ways to your implementation. I think the piece
 | I got working during my internship hits most of the memory
 | efficiency goals, like using a Rope and segment list
 | representation. However, we put off some of the speed
 | optimizations you've done, like using a range tree instead of a
 | Vec of ranges. I think it also uses a different style of
 | algorithm without any parents.
 |
 | We never finished optimizing and polishing it up, so it's
 | awesome that there's now an optimized text CRDT in Rust people
 | can use!
 | josephg wrote:
 | Oooohhhh no I haven't read that - thanks for the link! I feel
 | embarrassed to say this, but even though I knew about Xi editor
 | years ago, I totally forgot to go read & learn about your CRDT
 | implementation when I was learning about Yjs and automerge
 | and others. I'll have a read.
 |
 | And thanks for writing such an in-depth article. It's really
 | valuable going forward. Maybe it's addressed in your write-up,
 | but are there any plans for that code, or has everyone moved
 | on? I'd love to have a Zoom chat about it and hear about your
 | experiences at some point if you'd be willing.
 | neolog wrote:
 | Out of curiosity, what do you use to make those diagrams?
 | trishume wrote:
 | https://www.figma.com/ and putting a lot of effort into
 | them
 | ta988 wrote:
 | This was a great read, thank you. I wish there were more
 | explanations of the "black magic" part of Yjs. I'll have to dig
 | into that.
 | josephg wrote:
 | If you're interested in learning more about Yjs's internals,
 | I interviewed Kevin for 3 hours on Zoom and got him to take
 | me through Yjs's code. He's a great guy.
 | | The video of that conversation is here:
 | https://youtu.be/0l5XgnQ6rB4
 | thechao wrote:
 | I love high-level systems languages like C/C++ and Rust... but
 | everything you said about JavaScript being slow is the same
 | thing assembly programmers experience when optimizing
 | high-level systems languages.
 |
 | In general, when I see C code and I'm asked to speed it up, I
 | always use "100x" as my target baseline.
 | josephg wrote:
 | Whoa, that's a really impressive baseline to reach for when
 | optimizing C code! I'd love to hear some war stories.
 |
 | As you can probably tell from my article, most of my skill at
 | this stuff is from hard-won tricks I've picked up over the
 | years - like reducing heap allocations and packing memory for
 | cache coherency. There's probably lots of things I just
 | haven't learned because I haven't discovered them on my own.
 |
 | Do you have a blog, or any recommendations for stuff to read
 | by you or others?
 | thechao wrote:
 | I learned low-level programming while at Intel, working
 | with one of their performance teams; so, nothing written.
 | In general, though, assembly is more _expressive_ than
 | higher-level languages; it lets you "tighten up" the code
 | the computer executes, reducing work & bandwidth.
 |
 | Specifically, a guy named Mike Abrash taught me some of
 | this stuff, and he's got some books explaining the theory
 | in terms of practice.
 | Majromax wrote:
 | When you write:
 |
 | > Yjs does one more thing to improve performance. Humans
 | usually type in runs of characters. So when we type "hello" in
 | a document, instead of storing ['h','e','l','l','o'], Yjs just
 | stores: ['hello']. [...] This is the same information, just
 | stored more compactly.
 |
 | Isn't this not just the same information when faced with
 | multiple editors? In the first implementation, if I pause to
 | think after typing 'hel', another editor might be able to
 | interject with 'd' to finish the word in another way.
 | | In my view, these data structures are only "the same
 | information" if you provide for a reasonably-sized, fixed
 | quantum of synchronization. The merging makes sense if e.g. you
 | batch changes every one or two seconds. It makes less sense if
 | you would otherwise stream changes to the coordinating agent as
 | they happen, even with latency.
 | goldenkey wrote:
 | The chunking should depend entirely on the network
 | synchronization. If synchronization is editing-debounced,
 | then chunking could be applied rather safely.
 | josephg wrote:
 | I didn't explain this well, but the transformation is
 | lossless. No data is lost from compressing like this. It has
 | no impact on the concurrency protocol or network protocol; it
 | just impacts how the data is stored locally.
 |
 | If we need to, we could split the run back out again into
 | individual characters without losing information. And that
 | does happen - we do that if something later gets inserted
 | into the middle of the run of characters. Pausing doesn't
 | help. Even the same user could later type something in the
 | middle of a word they'd typed previously.
 |
 | This limits what we can run-length encode. The code in
 | diamond only run-length encodes items when they are:
 |
 | - Typed sequentially in time
 |
 | - And sequentially in insert location (position 10, 11, 12,
 | 13, ...)
 |
 | - And either all of the characters have been deleted or none
 | of them have been deleted
 |
 | Notice the other fields in that example in the post. Each
 | subsequent character has:
 |
 | - An id field = the previous id + 1
 |
 | - A parent field = the id of the previous item
 |
 | - The same value for _isDeleted_
 |
 | If any of this stuff didn't match, the run would be split up.
 |
 | Instead of storing those fields individually, we can store
 | the id and parent of the first item, an isDeleted flag and a
 | length field. That's all the information we actually need to
 | store.
With yjs style entries (which is what diamond-types | actually implements), the code is here if you can make heads | or tails of it. The poorly named "truncate" method splits an | item in two. Spans are only extended if can_append() returns | true: https://github.com/josephg/diamond- | types/blob/c4d24499b70a23... | | With real data from Martin's editing trace, 182 000 inserted | characters get compacted down to 12 500 list items. Which is | still pretty good - thats 14x fewer entries. And it means | performance doesn't stutter when large paste events happen. | jg23 wrote: | I have a related question to this, if you're storing | ["hello"] as one chunk, what happens when you perform an | edit to say adding an extra ["e"] after the ["e"]? In the | unoptimised structure I know you can just add the new ["e"] | as a child of the original ["e"]. So here would you then | delete the chunk ["hello"] and split it into two halves | like ["he"] and ["llo"]? | josephg wrote: | Yes exactly. You replace the chunk with "he" and "llo" | then insert the extra item in the middle. The result is | ["he", "e", "llo"]. The code to do all this inline in a | b-tree is pretty hairy but it works pretty well! | Majromax wrote: | Ah, I see. I had thought that the consolidation gave a | batched update with a single ID, so 'h' + 'ell' + 'o' would | have IDs of 1, 2, and 3 respectively. That would have made | an editing conflict in the middle of 'ell' impossible. | josephg wrote: | Ah that makes sense! Concurrent changes are one thing, | but concurrent changes are rare. The big problem if we | did it that way is that it would become impossible to | insert in the middle of "ell", because we wouldn't be | able to name any of those internal positions. | | To get around that we assign a sequential ID for every | inserted character, regardless of how they're typed or | stored. Typing "hello" would assign IDs 1-5 even if you | paste "hello" from the clipboard. 
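The run-length rules described above (sequential IDs, sequential positions, parent pointing at the previous character, same deleted state) can be sketched in a few lines. This is an illustrative JavaScript sketch with hypothetical field names; the real can_append logic lives in diamond-types' Rust code:

```javascript
// A run covers `length` consecutive characters starting at `id`,
// inserted at `pos`, all with the same isDeleted flag.
// A new single-character item can extend the previous run only when it
// continues it exactly, per the rules quoted above.
function canAppend(run, item) {
  return (
    item.id === run.id + run.length &&         // next sequential ID
    item.pos === run.pos + run.length &&       // next sequential position
    item.parent === run.id + run.length - 1 && // parented on run's last char
    item.isDeleted === run.isDeleted           // same deleted state
  );
}

function append(runs, item) {
  const last = runs[runs.length - 1];
  if (last && canAppend(last, item)) {
    last.length += 1;                 // extend the run in place
  } else {
    runs.push({ ...item, length: 1 }); // start a new run
  }
  return runs;
}

// Typing "hi" sequentially collapses into one run; inserting at the
// front of the document breaks the pattern and starts a new run.
const runs = [];
append(runs, { id: 1, parent: 0, pos: 0, isDeleted: false }); // "h"
append(runs, { id: 2, parent: 1, pos: 1, isDeleted: false }); // "i"
append(runs, { id: 3, parent: 0, pos: 0, isDeleted: false }); // front insert
console.log(runs.length); // 2
```

Splitting works in reverse: when an insert lands inside a run, the run is cut in two (the "truncate" mentioned above) and the new item goes between the halves.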
 | kohlerm wrote:
 | Very nice. When I read "double linked list" I immediately thought
 | "what about a B-tree-like structure?" I guess Martin's idea to
 | replace the IDs comes from the "vector clock" idea for concurrent
 | updates.
 | audidude wrote:
 | If anyone is looking for the combination of a piece table and
 | B+tree (which appears to be what is talked about in this
 | article), I have one that I've been using for years across
 | various GTK components.
 |
 | https://gitlab.gnome.org/chergert/textrange/
 | gritzko wrote:
 | About a decade ago, I implemented the Causal Tree CRDT (aka RGA,
 | Timestamped Insertion Tree) _in regular expressions_, using a
 | Unicode string as storage. Later we made a collaborative editor
 | for Yandex based on that code. It used many of the tricks
 | described in the text, even the optimization where you remember
 | the last insertion point. So I am terribly glad it all gets
 | rediscovered.
 |
 | The code is on GitHub [1]. There is also an article, which might
 | be a tough read, so no link. Recently I greatly improved CTs by
 | introducing the Chronofold data structure [2]. Regarding that
 | benchmark article, I spent many years in academia, so the quality
 | of content problem is familiar to me. In other words, I don't
 | take such articles seriously. CRDTs are fast enough, that is not
 | a concern.
 |
 | [1]: https://github.com/gritzko/citrea-model
 |
 | [2]: https://arxiv.org/abs/2002.09511
 | omgtehlion wrote:
 | That is nice!
 |
 | A couple of questions: Have you released a CT implementation
 | on top of Chronofold? Do you have any plans to benchmark it
 | against other algorithms?
 | gritzko wrote:
 | Thanks. No, I didn't release it, although there are
 | implementations on GitHub. Rust and Kotlin, at least. The RON
 | proof-of-concept wiki runs on the (unreleased) C++
 | implementation [1]. I benchmarked it at some point;
 | everything was like "k nanoseconds" [2].
 | | [1]: http://doc.replicated.cc/%5EWiki/ron.sm
 |
 | [2]: https://github.com/dmonad/crdt-benchmarks/issues/3#issue-599...
 | omgtehlion wrote:
 | Thanks!
 |
 | I could only google one impl in Rust:
 | https://github.com/dkellner/chronofold, which seems to
 | produce invalid results on some inputs. Actually, the hard
 | part (integration) is made of hacks there...
 |
 | That PoC wiki sounds really interesting, as does the whole
 | replicated.cc project! Any plans on releasing it?
 | gritzko wrote:
 | Here is Kotlin: https://github.com/decentralized-hse/collab-edit
 |
 | libron will be released, yes.
 | _hl_ wrote:
 | I remember seeing that (regex CTs) and immediately thinking
 | "wtf, why would anyone want to do that". Took me quite a while
 | to understand that it's actually a pretty clever way to write
 | fast state machines in browserland. So thank you for this work!
 | ta988 wrote:
 | CRDT: https://en.m.wikipedia.org/wiki/Conflict-free_replicated_dat...
 | teleforce wrote:
 | This previous HN discussion of the OT vs CRDT paper is an
 | excellent overview of the topic [1], [2].
 |
 | From the paper conclusions: "Our discoveries from this work
 | revealed facts and evidences that refute CRDT superiority claims
 | over OT on all accounts, which helps to explain the underlying
 | reasons behind the choices between OT and CRDT in the real
 | world."
 |
 | The fact that the paper, in its footnotes, refutes one of the
 | HN discussion points made in [1] (i.e. a reference to itself)
 | regarding the claimed CRDT superiority is rather amusing, and
 | the first such attempt I have personally seen in a published
 | paper.
 |
 | [1] https://news.ycombinator.com/item?id=18191867
 |
 | [2] https://arxiv.org/abs/1810.02137
 | tekkk wrote:
 | Excellent article! As someone who has to work with collaborative
 | editing, I must say the complexity of the whole area is at times
 | daunting, to say the least. So many edge cases. So many mines to
 | step on.
 | | Now I think I am convinced that the OT vs CRDT performance
 | comparison is kind of a moot point, and the question is more
 | about the user experience: which version produces nicer results
 | when two very diverged documents are merged. Maybe one of these
 | days I'll read an article about that too.
 |
 | To get off on a tangent a little bit, I'd be interested to know
 | how one could add tracking of changes to Diamond or another
 | CRDT. Can you add an arbitrary payload to the operation and then
 | just materialize the areas different from the original snapshot?
 | I know Yjs can do this by creating snapshots and then comparing
 | them to another snapshot, but it seemed a bit awkward and not
 | suited for real-time editing.
 | johnarno wrote:
 | If it weren't for academic papers, we probably wouldn't have the
 | beautiful and nice concept of a CRDT in the first place. We might
 | have 100 different solutions promising us 100 different
 | replication schemes, some with and some without guarantees,
 | hiding behind a gazillion different undefined terms that want to
 | make us believe each solution is the best thing since sliced
 | bread. Also, I don't think Google Wave failed due to its OT
 | algorithm.
 | uyt wrote:
 | I think this data structure is usually called a Counted B-tree
 | (https://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtr...)
 | rather than a range tree.
 ___________________________________________________________________
 (page generated 2021-07-31 23:00 UTC)