[HN Gopher] The Inner Workings of Distributed Databases
       ___________________________________________________________________
        
       The Inner Workings of Distributed Databases
        
       Author : bluestreak
       Score  : 88 points
       Date   : 2023-04-17 16:23 UTC (6 hours ago)
        
 (HTM) web link (questdb.io)
 (TXT) w3m dump (questdb.io)
        
       | marsupialtail_2 wrote:
       | In case people are interested, I wrote a post about fault
       | tolerance strategies of data systems like Spark and Flink:
       | https://github.com/marsupialtail/quokka/blob/master/blog/fau...
       | 
       | The key difference here is that these systems don't store data,
       | so fault tolerance means recovering within a query instead of not
       | losing data.
        
       | hartem_ wrote:
       | What was the most interesting thing that you learned while
       | implementing the WAL? Have you thought about how WAL is going to
       | work in the multi-master setup?
        
         | olluk wrote:
         | We write to WAL and then register the transaction in the
         | transaction sequence registry. If a concurrent transaction
          | registered between the start and the end of our transaction, we
          | merge the concurrent transactions' changes into our uncommitted
          | transaction data and retry registering with the sequencer.
         | To scale to multi-master we will move the transaction sequence
         | registry to a service with a consensus algorithm.
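
          A minimal sketch of the write-then-register retry loop described
          above. All names here are hypothetical and illustrative; this is
          not QuestDB's actual sequencer, just the optimistic-concurrency
          shape of it:

          ```python
          import threading

          class SequencerConflict(Exception):
              """Raised when a concurrent transaction registered first."""

          class TxnSequencer:
              """Toy in-memory transaction sequence registry (illustrative)."""

              def __init__(self):
                  self._lock = threading.Lock()
                  self._next_seq = 0

              def head(self):
                  """The next sequence number to be assigned."""
                  with self._lock:
                      return self._next_seq

              def register(self, expected_seq):
                  """Claim expected_seq, unless another txn got there first."""
                  with self._lock:
                      if self._next_seq != expected_seq:
                          raise SequencerConflict(self._next_seq)
                      self._next_seq += 1
                      return expected_seq

          def commit(sequencer, wal_append, rebase):
              """Append to the WAL first, then loop until registration wins."""
              wal_append()                 # durable write precedes sequencing
              expected = sequencer.head()
              while True:
                  try:
                      return sequencer.register(expected)
                  except SequencerConflict as conflict:
                      new_head = conflict.args[0]
                      # A concurrent txn registered between our start and
                      # end: fold its changes into our uncommitted data,
                      # then retry at the new head of the sequence.
                      rebase(expected, new_head)
                      expected = new_head
          ```

          Moving this registry behind a consensus service (as olluk
          describes for multi-master) would replace the local lock with
          replicated agreement on the sequence head.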
        
       | MuffinFlavored wrote:
       | When is the right time to "level up" from "I'm good with just
       | plain old Postgres" to QuestDB, InfluxDB, Patroni, etc.?
       | 
       | > Unfortunately, automatic failover is solved neither by
       | PostgreSQL nor TimescaleDB, but there are 3rd-party solutions
       | like Patroni that add support for that functionality. PostgreSQL
       | describes the process of failover as STONITH (Shoot The Other
       | Node In The Head), meaning that the primary node has to be shot
       | down once it starts to misbehave.
       | 
       | Does QuestDB do "Raft consensus"? I don't see Raft mentioned in
       | the article.
       | 
       | Aren't all distributed databases basically really clever wrappers
       | around write-ahead log + really tight timestamp/clock syncing?
        
         | omneity wrote:
         | I wouldn't necessarily call it a level up.
         | 
          | There are a lot of use cases for which Postgres works very well
          | _at scale_, and the main benefit of specialized solutions like
          | these is more of a convenience layer.
        
         | hinkley wrote:
         | > failover as STONITH (Shoot The Other Node In The Head)
         | 
         | What functional consensus protocol doesn't mandate attempted
         | murder? When a node becomes incoherent it can't be relied upon
         | to notice that it has done so and bow out gracefully. Like
          | cancer, there is always a chance that 'cell death' will fail
          | and leave you in a pathological state.
        
           | olluk wrote:
            | Perhaps the multi-master approach is an example of a system
            | where incoherence does not have to be a terminal illness.
        
           | grogers wrote:
            | If your consensus protocol requires that, it is probably
            | broken. If you can't rely on a node to shut itself down, then
           | you almost certainly can't rely on an external trigger to do
            | it. Paxos, Raft, etc. work just fine as long as failures are
            | non-byzantine. Guaranteeing non-byzantine failures is not
            | _always_ possible (e.g. someone hacking your server and
           | reprogramming it to subvert the protocol) but checksums on
           | disk and network go most of the way.
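
            The checksum idea grogers mentions can be sketched as follows.
            These helper names are hypothetical, and Python's zlib CRC32
            stands in for whatever checksum a real database uses; the point
            is that corruption becomes a detectable (crash-stop) failure
            rather than a silent byzantine one:

            ```python
            import struct
            import zlib

            def frame(payload: bytes) -> bytes:
                """Prepend a CRC32 so corruption is detectable on read."""
                return struct.pack(">I", zlib.crc32(payload)) + payload

            def unframe(record: bytes) -> bytes:
                """Verify the CRC32 and return the payload, or raise."""
                stored_crc = struct.unpack(">I", record[:4])[0]
                payload = record[4:]
                if zlib.crc32(payload) != stored_crc:
                    # Detected corruption: fail loudly instead of serving
                    # (or replicating) a silently-mangled record.
                    raise ValueError("checksum mismatch: record corrupted")
                return payload
            ```

            Applied to every on-disk record and network message, this turns
            most would-be byzantine faults into plain detectable errors.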
        
       ___________________________________________________________________
       (page generated 2023-04-17 23:00 UTC)