[HN Gopher] Scalable but Wasteful, or why fast replication proto...
       ___________________________________________________________________
        
       Scalable but Wasteful, or why fast replication protocols are slow
        
       Author : hugofirth
       Score  : 40 points
       Date   : 2021-07-12 06:20 UTC (1 days ago)
        
 (HTM) web link (charap.co)
 (TXT) w3m dump (charap.co)
        
       | mistralefob wrote:
       | So, why?
        
         | Dylan16807 wrote:
         | They're fast when you limit each machine's CPU, but they're
         | slow when you limit total CPU.
        
         | eternalban wrote:
         | A leaderless protocol is more complex and requires greater IO
         | and CPU resources per unit of work. This is the hidden cost.
         | The question is if the cost is reasonably offset by higher
         | performance (throughput) offered by leaderless consensus. OP is
         | arguing that the benefit is not deemed worth the cost outside
         | of academia and this is why e.g EPaxos is not adopted.
        
           | nine_k wrote:
           | But doesn't a leaderless protocol also give you more
           | resilience against failures? Or, in other words, can it be
           | that the higher cost buys you not better throughput but
           | faster reconciliation when connectivity is poor? Not in a
           | data center but on a mobile network?
        
       | hugofirth wrote:
       | Another thing which makes the Raft/Paxos vs new-consensus-
       | algorithm comparisons complicated is caching.
       | 
       | If your raft state machines are doing IO via some write through
       | cache (which they often are) then having specific machines do
       | specific jobs can increase the cache quality. I.e. your leader
       | node can have a better cache for your write workload, whilst your
       | follower nodes can have better caches for your read workload.
       | 
       | This may lead to higher throughput (yay) but then also leave you
       | vulnerable to significant slow-downs after leader elections
       | (boo).
       | 
       | What makes sense will depend on your use case, but I personally
       | agree with the author that multiple simple raft/paxos groups
       | scheduled across nodes by some workload aware component might be
       | the best of both worlds.
        
       | LAC-Tech wrote:
       | > The protocol presents a leader-less solution, where any node
       | can become an opportunistic coordinator for an operation.
       | 
       | Does leader = master here? My first reaction is that this is a
       | multi-master system but I can't quite unpack "opportunistic
       | coordinator".
        
       | toolslive wrote:
       | None of this actually matters. Consensus algorithms allow you to
       | achieve consensus. Period. There's no requirement whatsoever on
       | what you're getting consensus on. A consensus value could be
       | _one_ database update, but it doesn't need to be. It can also
       | consist of 666 database transactions across 42 different
       | namespaces.
        
       | luhn wrote:
       | Honestly I think the answer is simpler: People don't _need_
       | better algorithms. Paxos and Raft are generally used to build
       | service discovery and node coordination, these are not demanding
       | workloads and overwhelmingly read-heavy. Even the largest
       | deployments can probably be serviced by a set of modestly-sized
       | VMs. Paxos and Raft are well-understood algorithms with a choice
       | of battle-tested implementations, why would anyone choose
       | different?
       | 
       | The whole section on "bin-packing Paxos/Raft is more efficient"
       | is strange, because people don't generally bin-pack Paxos/Raft--
       | The bin-packing orchestrators are built off of Paxos/Raft!
        
       | rdw wrote:
       | The hypothesis in the article could be correct (that industry is
       | not adopting new academic innovations because they fail in the
       | real world). Based on my experience in this industry, though, it
       | could just be that there isn't a super strong connection between
       | academia and the people implementing these kinds of systems. I've
       | had many conversations with my academically-minded friends where
       | they're astonished that we haven't jumped on some latest
       | innovation, and I have to let them down by saying that the
       | problem that paper was addressing is super far down our list of
       | fires to put out. Maybe there are places where teams of top-tier
       | engineers are free to spend 6 months every year rewriting
       | critical core systems use un-battle-scarred new algorithms that
       | might have 20% performance improvements, but most places I've
       | worked would achieve the same result for far less money by
       | spending 20% more on hardware.
        
         | anonymousDan wrote:
         | Yes for that kind of improvement I think academia would be
         | better served trying to see if the techniques can be
         | incorporated into an existing well tested implementation.
        
       ___________________________________________________________________
       (page generated 2021-07-13 23:00 UTC)