[HN Gopher] Scalable but Wasteful, or why fast replication proto...
___________________________________________________________________
 
Scalable but Wasteful, or why fast replication protocols are slow
 
Author : hugofirth
Score  : 40 points
Date   : 2021-07-12 06:20 UTC (1 day ago)
 
(HTM) web link (charap.co)
(TXT) w3m dump (charap.co)
 
| mistralefob wrote:
| So, why?
| 
| Dylan16807 wrote:
| They're fast when you limit each machine's CPU, but they're slow
| when you limit total CPU.
| 
| eternalban wrote:
| A leaderless protocol is more complex and requires greater IO and
| CPU resources per unit of work. This is the hidden cost. The
| question is whether that cost is reasonably offset by the higher
| performance (throughput) offered by leaderless consensus. OP is
| arguing that outside of academia the benefit is not deemed worth
| the cost, and this is why e.g. EPaxos is not adopted.
| 
| nine_k wrote:
| But doesn't a leaderless protocol also give you more resilience
| against failures? Or, in other words, can it be that the higher
| cost buys you not better throughput but faster reconciliation
| when connectivity is poor? Not in a data center but on a mobile
| network?
| 
| hugofirth wrote:
| Another thing which makes the Raft/Paxos vs. new-consensus-
| algorithm comparisons complicated is caching.
| 
| If your Raft state machines are doing IO via some write-through
| cache (which they often are), then having specific machines do
| specific jobs can increase the cache quality. I.e. your leader
| node can have a better cache for your write workload, whilst your
| follower nodes can have better caches for your read workload.
| 
| This may lead to higher throughput (yay) but then also leave you
| vulnerable to significant slow-downs after leader elections
| (boo).
| 
| What makes sense will depend on your use case, but I personally
| agree with the author that multiple simple Raft/Paxos groups,
| scheduled across nodes by some workload-aware component, might be
| the best of both worlds.
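Dylan16807's and eternalban's trade-off (fast per machine, slow in total) can be made concrete with a back-of-envelope message count. This is a rough sketch, not the exact EPaxos quorum arithmetic: the fast-quorum size below is an illustrative assumption, and real leaderless protocols also spend extra CPU on dependency tracking that a simple message count doesn't capture.

```python
# Sketch of coordination cost per committed operation. A leader-based
# protocol concentrates all coordination on one node; a leaderless one
# spreads coordination across replicas but pays more total messages per op.

def leader_based_cost(n):
    """Steady-state phase 2 only: the leader sends Accept to the other
    n-1 replicas and collects acks -> roughly 2*(n-1) messages, with all
    coordination CPU concentrated on the leader."""
    return 2 * (n - 1)

def leaderless_cost(n):
    """An opportunistic coordinator sends PreAccept to a fast quorum,
    collects replies (carrying dependency metadata), then broadcasts
    Commit -> more total messages, but any replica can coordinate."""
    fast_quorum = (3 * n) // 4 + 1  # illustrative fast-quorum size
    return 2 * (fast_quorum - 1) + (n - 1)

for n in (3, 5, 7):
    print(f"n={n}: leader-based {leader_based_cost(n)} msgs/op, "
          f"leaderless ~{leaderless_cost(n)} msgs/op")
```

With these assumptions, the leaderless path always costs more messages per operation (e.g. 10 vs. 8 for n=5), which is the "wasteful" half of the article's title: aggregate resource use per unit of work goes up even as the per-node bottleneck goes away.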
| LAC-Tech wrote:
| > The protocol presents a leader-less solution, where any node
| > can become an opportunistic coordinator for an operation.
| 
| Does leader = master here? My first reaction is that this is a
| multi-master system, but I can't quite unpack "opportunistic
| coordinator".
| 
| toolslive wrote:
| None of this actually matters. Consensus algorithms allow you to
| achieve consensus. Period. There's no requirement whatsoever on
| what you're getting consensus on. A consensus value could be
| _one_ database update, but it doesn't need to be. It can also
| consist of 666 database transactions across 42 different
| namespaces.
| 
| luhn wrote:
| Honestly I think the answer is simpler: people don't _need_
| better algorithms. Paxos and Raft are generally used to build
| service discovery and node coordination; these are not demanding
| workloads, and they are overwhelmingly read-heavy. Even the
| largest deployments can probably be serviced by a set of
| modestly-sized VMs. Paxos and Raft are well-understood algorithms
| with a choice of battle-tested implementations; why would anyone
| choose differently?
| 
| The whole section on "bin-packing Paxos/Raft is more efficient"
| is strange, because people don't generally bin-pack Paxos/Raft --
| the bin-packing orchestrators are built off of Paxos/Raft!
| 
| rdw wrote:
| The hypothesis in the article could be correct (that industry is
| not adopting new academic innovations because they fail in the
| real world). Based on my experience in this industry, though, it
| could just be that there isn't a super strong connection between
| academia and the people implementing these kinds of systems. I've
| had many conversations with my academically-minded friends where
| they're astonished that we haven't jumped on some latest
| innovation, and I have to let them down by saying that the
| problem that paper was addressing is super far down our list of
| fires to put out.
| Maybe there are places where teams of top-tier engineers are free
| to spend 6 months every year rewriting critical core systems to
| use un-battle-scarred new algorithms that might have 20%
| performance improvements, but most places I've worked would
| achieve the same result for far less money by spending 20% more
| on hardware.
| 
| anonymousDan wrote:
| Yes, for that kind of improvement I think academia would be
| better served by trying to see if the techniques can be
| incorporated into an existing well-tested implementation.
___________________________________________________________________
(page generated 2021-07-13 23:00 UTC)