[HN Gopher] The Inner Workings of Distributed Databases ___________________________________________________________________ The Inner Workings of Distributed Databases Author : bluestreak Score : 88 points Date : 2023-04-17 16:23 UTC (6 hours ago) (HTM) web link (questdb.io) (TXT) w3m dump (questdb.io) | marsupialtail_2 wrote: | In case people are interested, I wrote a post about fault | tolerance strategies of data systems like Spark and Flink: | https://github.com/marsupialtail/quokka/blob/master/blog/fau... | | The key difference here is that these systems don't store data, | so fault tolerance means recovering within a query instead of not | losing data. | hartem_ wrote: | What was the most interesting thing that you learned while | implementing the WAL? Have you thought about how WAL is going to | work in the multi-master setup? | olluk wrote: | We write to WAL and then register the transaction in the | transaction sequence registry. If a concurrent transaction | registered between the start and the end of the transaction, we | update the current uncommitted transaction data with concurrent | transactions and re-try registering it in the sequencer again. | To scale to multi-master we will move the transaction sequence | registry to a service with a consensus algorithm. | MuffinFlavored wrote: | When is the right time to "level up" from "I'm good with just | plain old Postgres" to QuestDB, InfluxDB, Patroni, etc.? | | > Unfortunately, automatic failover is solved neither by | PostgreSQL nor TimescaleDB, but there are 3rd-party solutions | like Patroni that add support for that functionality. PostgreSQL | describes the process of failover as STONITH (Shoot The Other | Node In The Head), meaning that the primary node has to be shot | down once it starts to misbehave. | | Does QuestDB do "Raft consensus"? I don't see Raft mentioned in | the article. | | Aren't all distributed databases basically really clever wrappers | around write-ahead log + really tight timestamp/clock syncing? | omneity wrote: | I wouldn't necessarily call it a level up. | | There's a lot of use cases for which Postgres works very well | _at scale_ , and the main benefit of a solution like these | specialized ones is more of a convenience layer. | hinkley wrote: | > failover as STONITH (Shoot The Other Node In The Head) | | What functional consensus protocol doesn't mandate attempted | murder? When a node becomes incoherent it can't be relied upon | to notice that it has done so and bow out gracefully. Like | cancer, there is always a change that 'cell death' will fail | and leave you in a pathological state. | olluk wrote: | Perhaps the multi-master approach is the example of system | where incoherent does not mean terminal illnesses. | grogers wrote: | If your consensus protocol requires that it is probably | broken. If you can't rely on a node to shut itself down then | you almost certainly can't rely on an external trigger to do | it. Paxos, raft, etc work just fine as long as failures are | non-byzantine. Achieving non-byzantine failures is definitely | not _always_ possible (e.g. someone hacking your server and | reprogramming it to subvert the protocol) but checksums on | disk and network go most of the way. ___________________________________________________________________ (page generated 2023-04-17 23:00 UTC)