[HN Gopher] Show HN: DriftDB - an open source WebSocket backend ...
___________________________________________________________________
Show HN: DriftDB - an open source WebSocket backend for real-time apps

Hey HN! I've written a bunch of WebSocket servers over the years to do simple things like state synchronization, WebRTC signaling, and notifying a client when a backend job was run. I realized that if I had a simple way to create a private, temporary, mini-redis that the client could talk to directly, it would save a lot of time. So we created DriftDB.

In addition to the open source server that you can run yourself, we also provide https://jamsocket.live where you can use an instance we host on Cloudflare's edge (~13ms round trip latency from my home in NY).

You may have seen my blog post a couple months back, "You might not need a CRDT"[1]. Some of those ideas (especially the emphasis on state machine synchronization) are implemented in DriftDB.

Here's an IRL talk I gave on DriftDB last week at Browsertech SF[2] and a 4-minute tutorial of building a cross-client synchronized slider component in React[3]

[1] https://news.ycombinator.com/item?id=33865672
[2] https://www.youtube.com/watch?v=wPRv3MImcqM
[3] https://www.youtube.com/watch?v=ktb6HUZlyJs

Author : paulgb
Score  : 302 points
Date   : 2023-02-03 11:12 UTC (11 hours ago)

(HTM) web link (driftdb.com)
(TXT) w3m dump (driftdb.com)

| globalise83 wrote:
| This looks just about perfect for powering all those team online
| games we all played a lot during lockdown (and still do), is that
| right?
| paulgb wrote:
| Yep, in fact, it was making a word game[1] for my family to
| play on zoom calls early in lockdown that sent me down the
| rabbit hole of synchronizing state in distributed systems.
|
| [1] https://word.red
| Thaxll wrote:
| WebSocket is not very good for online games because it's TCP
| based, also there are millions of WebSocket libraries in every
| language.
| paulgb wrote:
| Right, WebSocket is fine for a chess game, but you wouldn't
| use it for a first-person shooter.
|
| If you do want UDP from a browser (via a WebRTC data channel)
| you first need a side channel to establish the connection,
| and DriftDB is handy for that.
| alexisread wrote:
| I've not looked at DriftDB in depth (cloudflare worker running
| this is neat!), but can't MQTT handle this sort of workload?
|
| Obv. there's not a cloudflare worker running say an MQTT server
| over websockets, but you can scope topics with wildcards
| (https://www.hivemq.com/blog/mqtt-essentials-part-5-mqtt-topi...),
| replay missed messages on reconnection, last-will-and-testament,
| ACLs, dynamic topic creation, binary messages etc.
|
| I'm asking as many of these websocket projects seem to use custom
| protocols rather than anything standard aka interoperable.
| nine_k wrote:
| Maybe it's the richness of MQTT that makes it a worse choice
| for a startup. Offering a conformant MQTT broker is a lot of
| work, and the semantics come from elsewhere, not geared towards
| emphasizing your unique advantages.
|
| Building a much simpler, custom-tailored protocol allows you to
| ship faster, and improve gradually. If the point is to deploy
| on Cloudflare in a massively-parallel fashion (which is likely
| harder for a regular MQTT broker), the custom protocol allows
| you to concentrate on that special advantage, and not on standards
| conformance or interoperability with a bevy of existing
| libraries.
| paulgb wrote:
| The problem with MQTT is that most of the use cases I'm
| interested in involve a web browser as at least one party of
| the connection, and the browser doesn't support MQTT. I could
| wrap MQTT in a WebSocket, but then I'd lose the advantages of
| MQTT's compactness and interoperability (unless MQTT-over-WebSocket
| is a thing?)
|
| The other operation that I haven't seen elsewhere, but is vital
| to enabling stream compaction without a leader, is the idea of
| a stream rollup _up to a specific stream number_. NATS
| Jetstream, for example, has the ability to roll up an entire
| stream, but if another message hits the stream between when the
| rollup is computed and when it arrives at the server, that
| message too will be replaced (IIRC). So I thought about using
| NATS (which already has a WebSocket protocol), but ruled it
| out.
| alexisread wrote:
| MQTT-over-websocket does exist
| (https://github.com/mqttjs/MQTT.js), and most MQTT brokers
| support it (Mosquitto, AmazonMQ etc.). You're right about the
| compaction - MQTT doesn't have anything in its protocol
| about compaction, and I don't know of any brokers that
| implement it. Having said that, you could use an MQTT-kafka
| bridge.
|
| Something like Mosquitto +
| https://github.com/nodefluent/mqtt-to-kafka-bridge
| + Redpanda in a docker image would work,
| though obv. this might be a bit overkill for most. Having
| said that, it does open many new avenues for interaction at
| scale. You pays your money...
| fud101 wrote:
| What is compaction?
| paulgb wrote:
| Compaction is where you take a chunk of messages and
| replace them with a single message.
|
| For example, one of the DriftDB demos is a counter
| (https://demos.driftdb.com/counter). State is
| synchronized by putting increment/decrement events into a
| stream. When a new user connects, their client gets all
| the messages in the stream, plays them back, and arrives
| at the same state as everyone else.
|
| If that's all we did, over time, the stream would grow
| unruly. It would take ages to load the page because we'd
| have to load every state change. But we only really care
| about a single numeric value.
| Compaction takes a chunk of
| messages that look like this:
|
|     {"apply":"increment"}
|     {"apply":"increment"}
|     {"apply":"decrement"}
|     {"apply":"increment"}
|
| And replaces them with a message that looks like this:
|
|     {"reset":2}
|
| DriftDB doesn't know _how_ to compute the compaction, it
| relies on clients to do that. When a client does
| something that increases the length of the stream, the
| server sends back the new length of the stream, so that
| the client can decide whether to compact it (i.e. if it
| passes some threshold).
|
| The important part that I haven't seen elsewhere is that
| when a client compacts the stream, it includes a sequence
| number of the last message that's part of the compaction.
| The server will preserve messages greater than that
| sequence number, since they are not part of the
| compaction.
| chrisdalke wrote:
| Most MQTT implementations do support MQTT-over-Websocket. I
| use it extensively at work and it's been fairly reliable!
| elithrar wrote:
| > The problem with MQTT is that most of the use cases I'm
| interested in involve a web browser as at least one party of
| the connection, and the browser doesn't support MQTT. I could
| wrap MQTT in a WebSocket, but then I'd lose the advantages of
| MQTT's compactness and interoperability (unless MQTT-over-WebSocket
| is a thing?)
|
| We support MQTT over WS (or JSON over WS, or just HTTP) in
| Cloudflare Pub/Sub, FWIW -
| https://developers.cloudflare.com/pub-sub/learning/websocket...
|
| I also agree with the comments re: MQTT being well suited to
| a lot of these "broadcast" use cases, but that the IoT roots
| seem to hold it back. MQTT 5.0 is just a great protocol --
| clear spec, explicit about errors, flexible payloads -- that
| makes it well suited to these broadcast/fan-in/real-time
| workloads. The traditional cloud providers do MQTT (3.1.1) in
| their respective IoT platforms but never grew it beyond that.
| jconley wrote:
| Can I get in on that beta?
| Submitted the form yesterday.
| Currently building something that could use it. ;)
| manv1 wrote:
| Funny, the IoT space has bought into MQTT but the general
| internet space has not.
|
| MQTT scales and works. And it's easy, fast, and small.
|
| I've been trying to get our guys to do MQTT-based pub/sub, and
| they'd rather do their own thing with web sockets because MQTT
| is scary. <shrug>.
|
| That's the problem when front-end guys make decisions about
| tech sometimes, they choose stuff that seems easy to integrate
| without caring about things like deployment, scalability,
| capabilities, etc.
| manv1 wrote:
| I mean, it'd be trivial to write stream replay for MQTT. It's
| literally just stashing messages and sending them back on
| connect. Not sure what the issue is there.
| Scottopherson wrote:
| Jeez that's a big paint brush you're slinging around.
|
| That's the problem when non-front-end guys make decisions
| about tech sometimes, they choose stuff that seems easy to
| integrate without caring about things like accessibility,
| design scalability, client device capabilities, etc.
| nine_k wrote:
| How does a wire protocol relate to UX concerns like
| accessibility or design scalability?
|
| Client device capabilities are there, MQTT is neither
| rocket science nor a resource hog, since it was designed
| for underpowered IoT devices.
| jwilber wrote:
| Awesome stuff. Here's a short video talking about DriftDB at
| Browsertech SF (I believe this is an event put on by them,
| "Drifting in Space"): https://www.youtube.com/watch?v=wPRv3MImcqM
| ocimbote wrote:
| Plane.dev is mentioned.
|
| Has anyone experience with it? It seems quite interesting but I
| need more opinion on what they call "backend sessions"...
| BTBurke wrote:
| This is great. I'm going to use this with something I'm working
| on. The edge behavior is just what I need.
|
| When you say limitations are a "relatively small number of
| clients need to share some state over a relatively short period
| of time," I read in another comment about a dozen or so clients,
| but what about the time factor? Can it be on the order of hours?
| paulgb wrote:
| > but what about the time factor? Can it be on the order of
| hours?
|
| So far I've focused on use cases where clients are online for
| overlapping time intervals. When all the clients go offline,
| Cloudflare will shut down the worker after some period and the
| replay ability will be lost. The core data structure is
| designed such that it could be stored in the Durable Object
| storage Cloudflare provides, but I haven't wired it up yet.
| BTBurke wrote:
| One more thought - any consideration of hooking this to
| Cloudflare's queue? Then you could optionally connect another
| worker to that and e.g. persist everything in their D1 SQLite
| database.
| paulgb wrote:
| I haven't looked at the queue specifically, but Durable
| Objects have a nice key/value storage mechanism that
| happens to map nicely. It would take a bit of munging to
| make it work for a stream instead of a single value, but I
| have a design in mind.
| BTBurke wrote:
| That works perfectly for what I'm using it for. Thanks for
| building this!
| jcq3 wrote:
| I didn't find the use case section, the first thing I read before
| code, implementation example or whatever. Why is it always
| lacking in SaaS landing pages?
| paulgb wrote:
| Good feedback, here you go :)
| https://github.com/drifting-in-space/driftdb/commit/8d946217...
| jcq3 wrote:
| Beautiful, now it makes me want to use your tool because I
| can relate to use cases I might have...
| speps wrote:
| Reminds me of Colyseus: https://github.com/colyseus/colyseus
|
| Colyseus has support for persistence as well as matchmaking!
| mrtksn wrote:
| How are the race conditions handled?
| If one of the clients of the
| shared state delivers the input with a delay (network issue
| etc.), will it overwrite state of the other client once delivered
| or will be dismissed? Is there a concept of slave/master client?
|
| Edit:
|
| So, I played a bit and it appears that if a client is
| disconnected and changes of the state happens when offline, once
| connected these changes will be applied to the other client who
| was having its own changes in the state. So it's working on the
| "last message" basis? Also it seems like it can't detect the
| offline/online status?
|
| I'm curious because the interesting part of this kind of systems
| is the way races are handled.
| paulgb wrote:
| > So, I played a bit and it appears that if a client is
| disconnected and changes of the state happens when offline,
| once connected these changes will be applied to the other
| client who was having its own changes in the state. So it's
| working on the "last message" basis? Also it seems like it
| can't detect the offline/online status?
|
| From the server's point of view, it's just an ordered broadcast
| channel with replay. The conflict semantics are whatever you
| build on top of that.
|
| The `useSharedState` hook in the React bindings implements
| last-write-wins. For the `useSharedReducer` hook, the reducer
| itself determines the semantics, but in the voxel editor demo
| we also use last-write-wins.
|
| > Also it seems like it can't detect the offline/online status?
|
| Online/offline status is exposed in the client libraries, e.g.
| in the react bindings there is a useConnectionStatus hook:
| https://driftdb.com/docs/react#useconnectionstatus-hook
|
| > I'm curious because the interesting part of this kind of
| systems is the way races are handled.
|
| It's academically the interesting part, but I think it matters
| less than people assume it does.
| Here's a section from a blog
| post I wrote a couple months ago:
|
| > Developers may find it tempting to treat collaborative
| applications as any other distributed systems, and in many ways
| that's a useful way to look at them. But they differ in an
| important way, which is that they always have humans-in-the-loop.
| As a result, many edge cases can simply be deferred to
| the user.
|
| > For example, every multiplayer application has to decide how
| to handle two users modifying the same object concurrently. In
| practice, this tends to be rare, because of something I call
| social locking: the tendency of reasonable people not to
| clobber each other's work-in-progress, even in the absence of
| software-based locking features. This is especially the case
| when applications have presence features that provide hints to
| other users about where their attention is (cursor position,
| selection, etc.) In the rare times it does occur, the users can
| sort it out among themselves.
|
| > A general theme of successful multiplayer approaches we've
| seen is not overcomplicating things. We've heard a number of
| companies confess that their multiplayer approach feels naive
| -- especially compared to the academic literature on the topic
| -- and yet it works just fine in practice.
|
| https://driftingin.space/posts/you-might-not-need-a-crdt
| mrtksn wrote:
| Good point, in the case of users interacting it's probably a
| non issue. Thanks for the insight.
| Aldipower wrote:
| How can something be real-time, if there is a websocket
| connection in-between. How do you ensure real time? In real-time
| applications response times must be guaranteed. Seems impossible
| to me with websocket connections.
| paulgb wrote:
| I mean real-time apps in the colloquial sense - applications
| where two people see the same state nearly instantly. In the
| strict computation sense, it's true that you can't guarantee an
| upper bound for delivery of a message.
| This isn't just a
| limitation of WebSockets, it's a limitation of TCP/IP, which
| doesn't provide a way to reserve bandwidth along a path between
| hosts (IIRC).
| bufferoverflow wrote:
| SurrealDB was supposed to be a websocket real time DB, but it
| seems they never finished that websocket part.
|
| Glad there's an alternative.
|
| https://surrealdb.com/docs/integration/websockets
| winrid wrote:
| Reminds me of DerbyJS and ShareDB/Racer. It's a pretty productive
| stack, but came out at the wrong time. You can plug in different
| storage engines (mongo, postgres) and it handles conflicts via
| operational transform.
| JohnCClarke wrote:
| useState() --> useSharedState()
|
| My brain just exploded with how perfect this DX is! Love it!
| stmblast wrote:
| This is really cool!
|
| Looking forward to seeing how this progresses.
| ArtWomb wrote:
| Seems expensive no? To start a http container per request? But I
| suppose it does solve many server side persistence issues. And I
| love the power it affords you in creating virtual worlds. Awesome
| stuff ;)
|
| https://github.com/drifting-in-space/plane
| paulgb wrote:
| We created Plane, but we're actually not using it for this!
| DriftDB stemmed out of realizing that a lot of the use cases
| people were coming to Plane for were simple WebSocket servers
| for which spinning up a container is excessive.
|
| Plane is still great (I mean, I'm biased) if you want to run a
| WebSocket server that implements custom business logic, uses
| heavy compute, GPUs[1], or is stateful.
|
| [1] teaser: https://canvas.stream/
| ArtWomb wrote:
| Blender over WebRTC demo looks fast too ;)
| paulgb wrote:
| Thanks! My colleagues gave a talk last week on streaming
| data visualization that you might like:
| https://www.youtube.com/watch?v=0WyeZ9lKdSU
| quickthrower2 wrote:
| Would be fascinating if you could build Jitsi-like video on top of
| this.
|
| I think DB in the name is a little misleading due to there being
| no persistence (I assume?) but that is a small nitpick!
| paulgb wrote:
| > I think DB in the name is a little misleading due to there
| being no persistence (I assume?) but that is a small nitpick!
|
| Yes, I feel a bit guilty about that part. When I started it the
| design looked more like a traditional key/value or durable
| stream database with real-time capabilities, but over time I
| realized that the use cases I had in mind usually didn't
| actually need long-term persistence. The DB stuck, partly
| because it turns out if you add "db" as a suffix it's a lot
| easier to find available package names and domains :). If it's
| any consolation, I still do intend to support persistence
| eventually.
| quickthrower2 wrote:
| Thanks for the reply. Sounds like a neat bit of
| infrastructure. Well done for getting it done! I almost want
| to create a project as an excuse to use it ha ha! Also naming
| stuff is hard of course.
| dabeeeenster wrote:
| This is super interesting! Do you have any data on how well this
| scales when running on Cloudflare Edge? Can you run more than one
| instance and have them share state?
| paulgb wrote:
| Thanks! When hosted on Cloudflare, it uses their Durable
| Objects product. Rather than running multiple backend instances
| that share state, it's set up so that all users in the same
| "room" are connected to the same instance. The instances can
| then be scaled out horizontally (but Cloudflare takes care of
| that.)
|
| Within a room, things are a bit more constrained. We haven't
| found the limit yet, and I suspect it's pretty high, but our
| design goal was to support on the order of dozens of users in a
| room, not necessarily beyond that. (Targeting e.g.
| a shared
| whiteboard use case)
| tmikaeld wrote:
| We also looked at using Cloudflare, but it was prohibitively
| expensive, because you pay for the duration of each "room"
| (Connection, depending on how you use it).
|
| https://developers.cloudflare.com/workers/platform/pricing/#...
|
| Eventually we went with Centrifuge.
| paulgb wrote:
| Yeah, it remains to be seen whether it is economical for us
| to keep the hosted version on CF. I suspect that for users
| who want to run their own geographically distributed
| instance of it, CF will be the path that makes sense for
| the majority of them.
|
| Who did you end up going with as a hosting provider?
| (Centrifuge looks to be a library, if I'm looking at the
| right thing)
| unraveller wrote:
| CF edge wants you to be more one and done, very anti
| connection. Deno is the better priced edgejs compute for
| websockets last I checked.
|
| Probably still worthwhile for DriftDB SaaS if mainly short
| lived connections are used, even though similar
| functionality can be had with NATS bridge + an ordered
| streaming library in your fav language on fly.io
| e1g wrote:
| "Centrifuge" as in
| https://github.com/centrifugal/centrifugo ?
| [deleted]
| matt-attack wrote:
| > DriftDB is a real-time data backend that runs on the edge.
|
| What does "on the edge" mean in this context? Can I just run the
| server part on my own infrastructure? What if I have multiple
| pods for redundancy, and client web connections might get
| connected randomly to any of those pods? How would the pods all
| share state between each other?
| paulgb wrote:
| > What does "on the edge" mean in this context?
|
| DriftDB has a concept of "rooms", which are essentially
| broadcast channels. By "on the edge", what I mean is that the
| authoritative server for each room can be geographically
| located near the participants in that room. In practice, today
| that means that it can be compiled to WebAssembly and run as a
| Cloudflare Worker.
|
| > Can I just run the server part on my own infrastructure?
|
| Kinda. It includes a server that runs locally, but it's only
| useful as a development server at this point. Your question
| about multiple pods is exactly the reason -- unless you have a
| routing layer that is aware of DriftDB's "rooms", it won't work
| if you scale it up. We also make https://plane.dev which
| provides the routing layer, but it might be overkill for a
| DriftDB use case.
| avinassh wrote:
| This is really cool! But how are conflicts handled?
| paulgb wrote:
| As far as the server itself is concerned, it's just a broadcast
| channel with replay and compaction capabilities, so it's not
| directly concerned with conflict resolution. You could use it
| as a broadcast channel for CRDTs if you wanted to.
|
| The useSharedState react hook is more opinionated, it uses
| last-write-wins semantics in the case of a conflict. The
| useSharedReducer hook's behavior on conflict is up to the
| reducer provided.
| samhuk wrote:
| Looks interesting. Coincidentally, I've _just_ completed the bulk
| of work on a distributed Websocket network system to synchronize
| certain bits of state between multiple clients for my own kind of
| Storybook tool [0]. How interesting!
|
| This kind of tool is exactly what I would have needed, instead of
| the approach I've taken which is a bit kludgy and grass-roots.
|
| By far the most difficult part of it for me was ensuring that the
| web socket network can heal from outages of any of the clients or
| the server. E.g. If a client loses connection, how does it regain
| knowledge of state? If the server dies, what do clients do with
| state changes they want to upload? Etc. It was really difficult!
|
| Good work :)
|
| [0] https://github.com/samhuk/exhibitor/pull/22
| rlt wrote:
| Neat.
|
| > DriftDB is a real-time data backend that runs on the edge
|
| What does it mean for these backends to be "on the edge"?
| Do
| geographically dispersed clients connect to different backends? If
| so are messages synchronized between them? If so what's the point
| of them being on the edge?
| fernandopj wrote:
| OP must have meant it runs on Cloudflare Edge.
| scaredginger wrote:
| Please explain your reasoning here
| paulgb wrote:
| That's essentially what I meant. The core database is
| separate from the Cloudflare parts, so it could in theory
| run on other edges (I want to get it running on fly.io!),
| but for now "the edge" can be read as "Cloudflare Workers".
| paulgb wrote:
| By "on the edge", I mean that if you're in London and I'm in
| Amsterdam, and we want to exchange messages, the messages
| shouldn't have to do a round-trip through Virginia, they should
| go through a server closer to both of us. (Of course, if I'm in
| SF and you're in London, this is less of a win.)
|
| The way it works in DriftDB is that everything is siloed into
| "rooms", which are effectively broadcast channels. The room is
| started based on the geography of the person who first joins it
| (Cloudflare handles this part).
| trollitarantula wrote:
| Nice! Would love to see Cloudflare deployment guide.
| Cloudflare isn't mentioned in the docs.
| paulgb wrote:
| Ah, you're right, I haven't written that up yet. The tl;dr
| is something like:
|
|     cd driftdb-worker
|     npm i
|     npm run deploy
|
| You'll need to sign in to wrangler if you haven't already,
| and will need to have rustc/cargo available (wrangler will
| install some things and build it into a WebAssembly
| module).
| HighlandSpring wrote:
| Oh, cool! So kinda like IRC?
| paulgb wrote:
| Yes, the concept of rooms is analogous to rooms in a chat
| service. One difference from IRC as a protocol (besides
| being over websocket) is that each connection corresponds
| to exactly one room (since different rooms may be on
| different servers.)
| rlt wrote:
| > The room is started based on the geography of the person
| who first joins it
|
| Cool, makes a lot of sense because people using a given
| "room" are often likely to be geographically collocated.
| paulgb wrote:
| Exactly!
| atentaten wrote:
| Can this be used in the Dart/Flutter world?
| paulgb wrote:
| The server itself speaks a very simple WebSocket protocol[1],
| so it could be used by anything that can speak WebSocket.
|
| The JS/React bindings that implement the actual data sync
| patterns (shared state, shared reducers, presence) haven't been
| ported to Dart (yet?) though.
|
| [1] https://driftdb.com/docs/api
| rgbrgb wrote:
| > presence
|
| Congrats on the launch! You have a pointer to docs about
| presence? Use-case is an ephemeral chatroom where I want to
| show who's online.
| paulgb wrote:
| Good catch, this should be in the react docs but it's
| missing. Until then, it's pretty simple. You call `const
| presence = usePresence({})` and pass in any data you want,
| and the `presence` value that gets returned is an object
| that maps client IDs (a unique string for each client) to
| the values that _they_ passed in to `usePresence`.
|
| Here's an example from the voxel demo:
| https://github.com/drifting-in-space/driftdb/blob/af64f62b29...
|
| And from the canvas demo:
| https://github.com/drifting-in-space/driftdb/blob/af64f62b29...
| jaime-ez wrote:
| for those interested in open source websocket servers checkout
| deepstream.io ... data persistence, subscriptions, rpc calls,
| authorization, permissions, custom connectors..basically
| everything you need to develop an app.
___________________________________________________________________
(page generated 2023-02-03 23:00 UTC)
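
Editor's sketch: the compaction scheme paulgb describes in the thread (a rollup "up to a specific stream number", with later messages preserved) can be illustrated in a few lines of TypeScript. This is not DriftDB's code or API. The `Stream` class, the `CounterMessage` type, and the `push`, `compact`, `replay`, and `currentValue` names are all hypothetical, and the server is modeled as a plain in-memory array. Only the message shapes (`apply` and `reset`) come from the counter example quoted in the thread.

```typescript
// Hypothetical message shapes, modeled on the counter demo quoted in the thread.
type CounterMessage =
  | { apply: "increment" }
  | { apply: "decrement" }
  | { reset: number };

interface Entry {
  seq: number; // position in the stream, assigned by the (toy) server
  value: CounterMessage;
}

// Toy in-memory stand-in for one room's stream. An assumption for
// illustration only, not how the real server is implemented.
class Stream {
  private entries: Entry[] = [];
  private nextSeq = 1;

  // Append a message and return the new stream length, so the caller
  // can decide whether the stream has grown enough to compact.
  push(value: CounterMessage): number {
    this.entries.push({ seq: this.nextSeq++, value });
    return this.entries.length;
  }

  // Replace every entry with seq <= upTo by a single rollup entry.
  // Entries with a greater sequence number are kept, because the
  // client that computed the rollup never saw them.
  compact(upTo: number, rollup: CounterMessage): void {
    const newer = this.entries.filter((e) => e.seq > upTo);
    this.entries = [{ seq: upTo, value: rollup }, ...newer];
  }

  // What a newly connected client would receive.
  replay(): Entry[] {
    return [...this.entries];
  }
}

// Client-side reducer: fold one message into the counter value.
function reduce(state: number, msg: CounterMessage): number {
  if ("reset" in msg) return msg.reset;
  return msg.apply === "increment" ? state + 1 : state - 1;
}

function currentValue(stream: Stream): number {
  return stream.replay().reduce((s, e) => reduce(s, e.value), 0);
}

// Usage: a client has seen messages 1..4 and computes {"reset":2},
// while a fifth message lands before the compaction request arrives.
const stream = new Stream();
stream.push({ apply: "increment" });
stream.push({ apply: "increment" });
stream.push({ apply: "decrement" });
stream.push({ apply: "increment" });
stream.push({ apply: "increment" }); // arrives "late", seq 5
stream.compact(4, { reset: 2 });     // rollup covers seq 1..4 only
console.log(currentValue(stream));   // 3: the late increment survives
```

The design point is the `upTo` argument: because the rollup names the last sequence number it covers, the server can apply it without a leader or a lock, and a message that raced with the compaction is not silently dropped.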