[HN Gopher] A simple way to build collaborative web apps
___________________________________________________________________
Author : thezjy
Score  : 165 points
Date   : 2021-08-17 13:42 UTC (9 hours ago)

(HTM) web link (zjy.cloud)

| Gabrielwxf wrote:
| > Dealing with a global database brings in much complexity that is not essential to the subject matter of this article, which will wait for another piece.
|
| Excellent write-up. It would be great to know why CockroachDB failed to meet your needs.
| BackBlast wrote:
| You could build this with CouchDB multi-master regional servers and PouchDB on the client, and have full consistency, with replication to both clients and servers as well as conflict resolution (in case of collision) done for you.
|
| This route seems like a lot of extra work for pretty similar functionality.
| nesarkvechnep wrote:
| I'm interested in how this stacks up against Phoenix Channels + Presence.
| deathtrader666 wrote:
| Yes, it would be great if someone with this experience could chime in, especially since Phoenix has CRDTs built in.
| sambroner wrote:
| I'm really glad to see an article like this. I've worked in the space for a while (Fluid Framework) and there's a growing number of libraries addressing realtime collab. One of the key things that many folks miss is that building a collaborative app with real time coauthoring is tricky. Setting up a websocket and hoping for the best won't work.
|
| The libraries are also not functionally equivalent. Some use OT, some use CRDTs, some persist state, some are basically websocket wrappers, and they have fairly different perf guarantees in both memory & latency, etc. The very different capabilities make it complicated to evaluate all the tools at once.
|
| Obviously I'm partial to the Fluid Framework, but not many realtime coauthoring libraries have made it as easy to get started as Replicache. Kudos to them!
|
| A few solutions with notes...
|
| - Fluid Framework - My old work... service announced at Microsoft Build '21 and will be available on Azure
| - yJS - CRDTs. Great integration with many open source projects (no service)
| - Automerge - CRDTs. Started by Martin Kleppmann, used by many at Ink & Switch (no service)
| - Replicache - Seen here, founder has done a great job with previous dev tools (service integration)
| - Codox.io - Written by Chengzheng Sun, who is super impressive and wrote one of my fav CRDT/OT papers
| - Chronofold - CRDTs. Oriented towards versioned text. I'm mostly unfamiliar
| - Convergence.io - Looks good, but I haven't dug in
| - Liveblocks.io - Seems to focus on live interactions without storing state
| - derbyjs - Somewhat defunct. Cool, early effort.
| - ShareJS/ShareDB - Somewhat defunct, but the code and thinking is very readable/understandable and there are good OSS integrations
| - Firebase - Not the typical model people think of for RTC, but frequently used nonetheless
|
| I should add... I talk to many folks in the space. People are very welcoming and excited to help each other. Really fun space right now.
| vyrotek wrote:
| Fluid Framework looks pretty cool! I somehow missed the Build announcement about this.
|
| Maybe it's just me, but it has a SignalR + Orleans sort of vibe to it when I think about the types of problems it solves. I will definitely be digging into this a bit more.
| BackBlast wrote:
| I'll have to look at some of these; I've reviewed some but not all. You are missing some I'm familiar with.
|
| PouchDB+CouchDB work well out of the box with minimal fuss, as open pieces you can just plug into this role. PouchDB handles state persistence and replication on the client; CouchDB is the reliable cloud service you can replicate to.
|
| Meteor, at least their pre-Apollo stack, had realtime-collab-type features with their mini-mongo client and oplog tailing.
| nikodunk wrote:
| I've used YJS and can strongly recommend. https://github.com/yjs/yjs
|
| Built a Google Docs like rich text collaborator for a client on Express/Psql and React. Worked like a charm. The hardest part was dealing with ports on AWS to be honest.
| memco wrote:
| Very nice writeup! However, the example did not fully work for me. I could perform CRUD on a single tab, but opening the list in multiple tabs did not replicate the list or actions. Seeing this in the console:
|
|     [Error] Could not connect to the server.
|     [Error] Fetch API cannot load https://damp-fire-554.fly.dev/replicache-pull?list_id=kx1I-gXPWwOxU9teRUJ_c due to access control checks.
|     [Error] Failed to load resource: Could not connect to the server. (replicache-pull, line 0)
|
| Safari 14.2 on macOS 10.15.7.
| timwis wrote:
| What about conflict resolution? If two users update the same record/field around the same time? Isn't that the trickiest part of real time?
| tommoor wrote:
| I believe it uses a CRDT hosted by a third party service.
| amelius wrote:
| Trickiest part is probably adding fine-grained access control rules.
| tmikaeld wrote:
| That's what Replicache[0] solves, it provides for Causal+ Consistency across the entire system.
|
| "This means that transactions are guaranteed to be applied atomically, in the same order, across all clients. Further, all clients will see an order of transactions that is compatible with causal history. Basically: all clients will end up seeing the same thing, and you're not going to have any weirdly reordered or dropped messages."
|
| [0] https://doc.replicache.dev/design
|
| Note: There's more in their links, but the linked sites are down..
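The ordering guarantee quoted above can be illustrated with a toy model. This is a minimal sketch in Python, not Replicache's API: the `Mutation` shape and the `apply_mutation` and `apply_log` helpers are hypothetical names invented for this example. It only shows why replaying one agreed-upon order of mutations leaves every client with the same state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """One named change to a to-do list (hypothetical shape, for illustration)."""
    name: str      # "create", "rename", or "delete"
    todo_id: str
    text: str = ""


def apply_mutation(state: dict, m: Mutation) -> dict:
    """Return a new state with one mutation applied; the input is not modified."""
    new_state = dict(state)
    if m.name == "create":
        new_state[m.todo_id] = m.text
    elif m.name == "rename" and m.todo_id in new_state:
        new_state[m.todo_id] = m.text
    elif m.name == "delete":
        new_state.pop(m.todo_id, None)
    return new_state


def apply_log(log: list) -> dict:
    """Replay a whole log, in order, starting from an empty state."""
    state: dict = {}
    for m in log:
        state = apply_mutation(state, m)
    return state


# Assumption: a central server has already decided one total order
# for the mutations it received from all clients.
server_log = [
    Mutation("create", "t1", "buy milk"),
    Mutation("rename", "t1", "buy oat milk"),
    Mutation("create", "t2", "call mom"),
    Mutation("delete", "t2"),
]

# Two clients that replay the same log in the same order converge.
client_a = apply_log(server_log)
client_b = apply_log(server_log)
```

Replaying the same log in a different order generally gives a different result, which is why the shared order matters more than the transport used to deliver it.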
| btown wrote:
| It appears Replicache doesn't use CRDTs since it has a central source of truth: https://news.ycombinator.com/item?id=22175530
|
| See also the commentary here: https://doc.replicache.dev/guide/local-mutations
|
| This sounds a lot like Operational Transform but without the transform part - it assumes that locally applied mutations can be undone and rebased without user interaction. But I feel like the Google Wave team would have a lot of objections to the idea that this can just be ignored. If your state is just a group of key value stores where last write wins and everyone can agree on who's last, that's fine, but text/token streams require a notion of transformation that I'm worried Replicache simply glosses over.
| aboodman wrote:
| I'm not sure if you are understanding that when Replicache rebases operations locally it actually re-executes code which can have arbitrary effects. This design yields a lot of flexibility to preserve intent: the function can look at the current state of the world and decide to do something different.
|
| Now, it is true that OT is considered the gold standard for certain kinds of collaborative editing, in particular unstructured text. But CRDTs are quickly catching up and I believe that any CRDT should by definition be implementable on top of Replicache.
|
| It's also quite a lot easier to implement a Replicache backend than an OT backend.
| tmikaeld wrote:
| I'd rather it was configurable, since there are different use cases for both and they can be in the same app. So you're definitely making a valid point.
| Chris_Newton wrote:
| Indeed, there can never be one universal solution to this, because the problem is one of specification rather than (only) implementation.
|
| For example, suppose we have an edit/delete conflict, where two clients concurrently interact with the same entity in your data model.
| In a simple case, we can decide to "resurrect" the affected entity and apply the edit, which is the option that never results in significant data loss and so might be a reasonable behaviour if no user interaction is involved.
|
| Now, what if there were other consequences of deleting that entity? Maybe the client that deleted the entity then created a new entity that would violate some uniqueness constraint if both existed simultaneously. Or maybe it wasn't the originally deleted entity that would violate that constraint, but some related one that was also deleted implicitly because of a cascade. How should we reconcile these changes, if simply allowing either one to take precedence means discarding data from the other?
|
| At least if all clients are communicating in close to real time, it's unlikely that any one of them will diverge far from the others before they get resynchronised, so the scope for awkward conflicts is limited. But in general, we might also need to support offline working for extended periods, when multiple clients might come back with longer sequences of potentially conflicting operations, and there's no general way to resolve that without the intervention of users who can make intelligent decisions about intent, or at least a set of automated rules that makes sense in the context of that specific application. And in the latter case, we'd still probably want to prove that our chosen rules were internally consistent and covered all possible situations, which might not be easy.
| soco wrote:
| The good old CAP theorem strikes again...
| tabtab wrote:
| How one wants to see them could depend; that's why I recommend using an RDBMS. One can "play back" transactions using different orders and filters. If teams get confused or accidentally "step on each other's toes", then one may need to review different scenarios to see what was intended by two or more parties.
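The rebase behaviour described by aboodman and the edit/delete conflict raised by Chris_Newton can be sketched together. The following is an illustrative Python model under stated assumptions, not Replicache's actual API: `rebase`, `MUTATORS`, `edit_text`, and `delete_item` are hypothetical names, and the "resurrect on edit" rule is just one application-specific policy among several that an app could choose.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, dict]
Mutator = Callable[[State, dict], None]


def edit_text(state: State, args: dict) -> None:
    """Edit an item's text. If the item was deleted in the meantime,
    resurrect it (one possible policy; an app could instead drop the edit)."""
    item = state.setdefault(args["id"], {"text": "", "resurrected": True})
    item["text"] = args["text"]


def delete_item(state: State, args: dict) -> None:
    """Delete an item; deleting a missing item is a no-op."""
    state.pop(args["id"], None)


# Registry of named mutators, so pending work can be stored as (name, args).
MUTATORS: Dict[str, Mutator] = {
    "edit_text": edit_text,
    "delete_item": delete_item,
}


def rebase(server_state: State, pending: List[Tuple[str, dict]]) -> State:
    """Start from the authoritative state and re-execute pending local
    mutations on top of it. The mutator code runs again, so it can
    inspect the new state and decide what to do."""
    state = {key: dict(value) for key, value in server_state.items()}  # copy
    for name, args in pending:
        MUTATORS[name](state, args)
    return state


# Client B edited item "n1" locally and has not yet synced.
pending_on_b = [("edit_text", {"id": "n1", "text": "draft v2"})]

# Meanwhile the server accepted client A's delete of "n1".
server_state: State = {"n2": {"text": "other note"}}

rebased = rebase(server_state, pending_on_b)
```

Because the mutator is re-run against the latest state rather than patched in as a diff, the conflict policy lives in ordinary application code. That does not remove the specification problem described above; it only gives the application a place to encode its answer.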
| Gabrielwxf wrote:
| I suppose, as mentioned in the essay, it's handled by Replicache.
| Zealotux wrote:
| Figma's blog has a few valuable articles on that subject: https://www.figma.com/blog/how-figmas-multiplayer-technology...
| [deleted]
| Wowfunhappy wrote:
| I remember listening to an episode of the Exponent podcast, in which Ben Thompson said something like (paraphrasing from memory):
|
| > People who love "native apps" can complain about Electron all they want--but there's simply no replacement for the real-time collaboration offered by web-based apps like Figma!
|
| As someone who's not exactly thrilled with Electron and its memory usage--is there a reason the two go together? Is there a reason we can't build collaborative apps in Cocoa and GTK? I think these systems are awesome, I just think they'd be even better if they weren't also running full web browsers!
| rl3 wrote:
| Figma's performance is excellent due in large part to the fact they compile a lot of native code to Wasm. Electron or not, it's still fast.
|
| To answer your question, collaborative apps ideally need to target the widest possible audience. Barring a massive budget, the best way to accomplish this is to also have a singular compile/build target. In most cases, that's the web platform.
| Wowfunhappy wrote:
| Figma's performance is impressive for an Electron app, but it does choke on very large files, which Sketch would have handled without a care. It's not great.
|
| If Sketch had had Figma's collaboration features, we wouldn't have switched. But during the pandemic it was necessary.
| BackBlast wrote:
| It could totally be done natively. The obstacle is how much of the stack you have to write and maintain. There are JS libraries that do most of this heavy lifting for you, and CRDTs are pretty new to most devs.
|
| It's just much, much easier and more cost-effective to build a single code base and hit many, many target platforms with it.
|
| Computing history has also shown that publishing efficient lean software doesn't help in the market. At least not compared with time to market, getting the key features right, and your ongoing costs.
| idontevengohere wrote:
| Really interesting...you can build a similar (websocket/db backed) app with LiveView out of the box, no? Any idea how well that'd hold up against this solution?
| paulgb wrote:
| Does LiveView have any conflict resolution, or would it just be last-write-wins?
| _virtu wrote:
| This was my first thought as well.
| sirtimbly wrote:
| A 225K gzipped .wasm file download for a client-side state management and persistence layer is not great. It is competitive with some similar solutions, but still a lot for any web app's performance budget.
| aboodman wrote:
| The release build is 100k brotli I believe. It's possible this site is using the dev binary.
| albertgoeswoof wrote:
| This stack reminds me of Meteor, which came out nearly a decade ago(!). https://meteor.com
|
| It never really took off in the mainstream - I think because it was before many developers really trusted JS on the server, and a "full stack" framework is quite a big commitment for a team to shift to. Also most CRUD apps don't need real time collab.
|
| I remember being amazed when changes were instantly propagated between my phone and laptop browsers with almost zero lag. This was the demo that sold it for me: https://www.youtube.com/watch?v=MGbmW9bwJh4
| eatonphil wrote:
| I haven't yet done this, but based on some research it seems to me like the core of any collaborative app today (that wants to avoid Firebase and the other hosted platforms, as Replicache seems to be) is most easily served by picking some CRDT library.
|
| There are a couple of open-source CRDT libraries that provide both clients and servers (yjs [0] and automerge [1] are two big ones for JavaScript I'm aware of).
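For readers unfamiliar with what a CRDT library provides, here is a deliberately tiny sketch of one of the simplest CRDTs, a last-writer-wins map. It is written in Python for brevity and is not the API of yjs or automerge, which implement far richer types such as collaborative text. The `set_value` and `merge` helpers, and the use of an integer timestamp plus a replica id as tie-breaker, are assumptions made for this example.

```python
from typing import Dict, Tuple

# Each key maps to (timestamp, replica_id, value). The pair
# (timestamp, replica_id) orders writes and breaks ties deterministically.
Entry = Tuple[int, str, str]
LWWMap = Dict[str, Entry]


def set_value(doc: LWWMap, key: str, value: str,
              timestamp: int, replica_id: str) -> LWWMap:
    """Return a copy of doc with key set, unless a newer write is already present."""
    new_doc = dict(doc)
    candidate = (timestamp, replica_id, value)
    if key not in new_doc or candidate[:2] > new_doc[key][:2]:
        new_doc[key] = candidate
    return new_doc


def merge(a: LWWMap, b: LWWMap) -> LWWMap:
    """Merge two replicas. For each key the newest write wins, so the
    result does not depend on merge order or on merging twice."""
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged


# Two replicas edit independently, then exchange state in either order.
replica_a = set_value({}, "color", "red", 1, "a")
replica_b = set_value({}, "color", "blue", 2, "b")
replica_b = set_value(replica_b, "text", "hello", 3, "b")

ab = merge(replica_a, replica_b)
ba = merge(replica_b, replica_a)
```

Both merge orders produce the same map, which is the property that lets replicas sync without a central arbiter. The trade-off is visible even here: the losing write to `color` is silently discarded, and every value carries extra metadata.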
|
| My basic assumption is that as long as you put all your relevant data into one of these data structures and have the CRDT library hook into a server for storing the data, you're basically done.
|
| This may be a simplistic view of the problem though. For example I've heard people mention that CRDTs can be space inefficient so you may want/have to do periodic compaction.
|
| [0] https://github.com/yjs/yjs
|
| [1] https://github.com/automerge/automerge
| brunoqc wrote:
| Would Chronofold work for this too?
| eatonphil wrote:
| If this [0] is what you're talking about, at the moment yjs and automerge are significantly more full-featured and used by many major companies.
|
| [0] https://github.com/dkellner/chronofold
| brunoqc wrote:
| Thanks!
| tabtab wrote:
| > _is easiest served by picking some CRDT library._
|
| RDBMS A.C.I.D. and transactions are also capable of much of the same.
| feanaro wrote:
| You probably don't want to use Automerge. See https://josephg.com/blog/crdts-go-brrr/ for a nice CRDT optimization story.
| eatonphil wrote:
| Interesting! I know there was a large performance refactor that was merged in May [0]. The post you link was written in June of this year. It's unclear whether that performance fix is related to the issues reported in the post, or whether those issues still exist.
|
| At the very least, the automerge maintainers seem to be very actively tackling performance problems.
|
| [0] https://github.com/automerge/automerge/pull/253
| Zealotux wrote:
| So far I've managed to keep the state in my side-project in sync with Websockets and Redux. Replicache sounds like the kind of solution I'd love to use, but boy, the pricing makes it impossible to even consider.
| ZeroCool2u wrote:
| I don't have any plans to use Replicache, but I went and looked at the pricing and I was kind of struck by your comment. Looking at it, it seems pretty fair to me? Especially under 10k MACs.
| It seems like a flat rate / month is pretty nice too. Plus, it's free for all non-commercial use.
|
| Am I wildly off base here? Is it just that middle tier jump to over 10k that is a no go?
|
| Again, I don't have a horse in this race or even my own startup, just trying to understand if my own judgement is way off.
| Zealotux wrote:
| I would quickly be in the $500/mo tier and that would be a significant cost to handle since I don't really make that kind of profit yet. But I have to agree anything beyond 10K is very reasonable given the features. I just kind of wish they had a more affordable bracket between 500 and 10K but they probably have reasons not to.
| craig_asp wrote:
| We implemented all that manually, more or less: in Swift (and SQLite), then React+Redux, and on the back end Postgres and Python+Flask. Works flawlessly so far. We do have the same setup more or less, with listeners triggering UI updates and push messages signalling the clients to fetch data from the server. Then, on the server, we have two dbs: one where we store each update or create message, in a Postgres-based queue, and another one, in a normalised format, which we use for login (it's way faster than replaying all messages from the queue). There are complexities when you move beyond one or two tables, though - like maintaining relations, ensuring things get done in the correct order, that they get merged (we merge all attributes of each item - e.g. one client can change the color, and if another changes the text content of the item these will get merged), etc.
|
| We gave up on the websocket part and implemented basic polling, because websockets were not supported by App Engine at the time (things might have moved on since then, which is a couple of years ago). Yet, for a note/todo/habit tracking app, it simply doesn't need to be real-time from our experience.
|
| Have a play at https://www.mindpad.io/app/.
| You can see how it works if you open up the web app in two incognito tabs, or on an iPhone and the web.
| davedx wrote:
| It's a nice summary of how to use these technologies, but considering it states avoiding vendor lock-in is a goal, I was surprised to see it using fly.io and a managed CockroachDB.
| mrkurt wrote:
| It didn't actually use CockroachDB, they ended up using Postgres + Read Replicas.
|
| I work on Fly.io, but there's very little vendor lock-in here. We can't afford to lock people in, we're too small. We need to make their existing stuff work with zero friction.
___________________________________________________________________
(page generated 2021-08-17 23:00 UTC)