[HN Gopher] Jepsen: Redis-Raft 1b3fbf6
___________________________________________________________________
Jepsen: Redis-Raft 1b3fbf6

Author : aphyr
Score  : 138 points
Date   : 2020-06-23 16:05 UTC (6 hours ago)

(HTM) web link (jepsen.io)
(TXT) w3m dump (jepsen.io)

| ris wrote:
| So, "work in progress".

| tlhunter wrote:
| 20 of the identified 21 issues have been fixed.

| agustif wrote:
| I'm not even a DBAdmin or something, and I probably get 20% of
| these, but I enjoy reading them thoroughly

| rantwasp wrote:
| 20% is better than 0% and having the curiosity to learn about
| is what's essential

| shay_ker wrote:
| It's awesome to see this being done in the development phase.
| There's so much to learn, even just from the feedback cycle
| between aphyr and RedisLabs.

| ses1984 wrote:
| Thank you for this, love this work.

| LeafMeAlone wrote:
| Interesting read as always!
|
| Small typo, I believe the link in the sentence _Tangentially, we
| were surprised to discover that Redis Enterprise's claim of "full
| ACID compliance"..._ was copy/pasted incorrectly

| tlhunter wrote:
| Another typo is:
|
| > In future work, we believe it be prudent to explore other
| types of operations: GET and PUT, perhaps, or operations on
| sets.
|
| Should say GET and SET.

| DevKoala wrote:
| The scrutiny this release is going through makes me confident
| that the Redis Labs team will deliver in the end.
|
| Also, if you are looking for a linearly scalable distributed
| pub-sub with strong guarantees around consensus and message
| persistence, it might be worth looking at Apache Pulsar.

| benschulz wrote:
| It's unbelievable how hard to grasp distributed systems are. I
| recently implemented Paxos in Rust and at certain points I
| literally thought I was losing my mind.
|
| When you read Paxos Made Simple it really all seems so, well,
| simple.
| But then you get inconsistent commits and look at the
| traces of what happened and just go "_How?!_"

| aphyr wrote:
| One of the things that surprised me about this analysis was
| just how many bugs we found that had to do with the actual Raft
| implementation. Usually when I test Raft-based systems the bugs
| are at the edges--like the coupling of the system to the Raft
| library, treating it like an externally-queryable log rather
| than the driver of a state machine, and so on. We found
| integration bugs here too, but also a fair number of issues in
| the Raft library itself--and this is despite Redis-Raft having
| existing integration tests!
|
| This stuff is hard!

| Diggsey wrote:
| Was the C Raft implementation in use a pre-existing library,
| or was it developed specifically for Redis-Raft?

| aphyr wrote:
| Pre-existing! It's a fork of willemt's
| https://github.com/willemt/raft/, which has been around
| since 2013, _and_ has property-based fuzz testing! It
| really does look like it's got its own extensive tests;
| I'm surprised we found issues.

| Serow225 wrote:
| Has anyone approached Jepsen about running an analysis on the
| Erlang Ra implementation? I believe they've been running
| Jepsen tests internally, just curious if they're thinking
| about getting an official analysis at some point. Thanks for
| all that you folks do!! * https://github.com/rabbitmq/ra

| aphyr wrote:
| No, we haven't talked yet, but I would like to someday. :)

| andoriyu wrote:
| Well, Paxos is way more complex than Raft. I'm not saying
| building on top of Raft is easy, I'm saying making an MVP Raft
| implementation is easier than Paxos.
|
| One thing I wish Raft had - a learner role which acts like a
| follower that can't start an election until it has caught up
| with the rest of the cluster. etcd has it, but I wish it was
| part of Raft itself instead, as well as bulk log transfer.
|
| The article points out a very common issue that anyone who has
| tried implementing Raft runs into:
|
| Letting a follower forward requests to the leader on the
| client's behalf is not easy to implement correctly, that's why
| most popular Raft-based software (HashiCorp stack) doesn't do
| that. Not worth it.

| jeffbee wrote:
| I don't see anything in this blog that even touches on
| "distributed systems are hard". Every issue in here should be
| filed under "Redis has no tests". If you follow basic software
| engineering principles, you'll find distributed systems easier
| to approach.

| benschulz wrote:
| My reading of the article's introduction is that Redis is
| adding this feature and are (among other things I'm sure)
| paying jepsen to test it. So this is them having tests.
|
| > If you follow basic software engineering principles, you'll
| find distributed systems easier to approach.
|
| When I implemented Paxos I had tests and when they failed
| they spit out an exact trace of what happened in what order
| and on what node. Sometimes it was still excruciating to
| figure out what happened. Here's[1] a comment which you can
| think of as a bug tombstone. It took me half a day to figure
| out _after_ I had a trace to analyze the issue.
|
| [1]: https://github.com/benschulz/paxakos/blob/ee051ff67b5da6f287...

| jeffbee wrote:
| Sure, but now imagine you have no confidence that any part
| of your paxos implementation works at all, never mind the
| paxos part. That's my impression of issue #13 from the
| article: not only did the software not pass the test, it's
| clear that nobody ever even tried to use it, at all!
|
| Full-scale blackbox testing of a database system is similar
| to dogfooding. You only use it when you have high
| confidence that you have exhausted the possibilities of
| unit and integration tests. It's clear this project did not
| start with exhaustive unit tests.
|
| It reminds me a bit of FoundationDB, which is also a
| terrible program nobody should entrust with data they ever
| want to see again. The first time I tried to use it it ran
| out of memory and crashed in about ten seconds. I found the
| problem, which was that their huge-page-aware allocator,
| which has no tests, had never actually been used by anybody
| on a machine with huge pages. It was a core library of a
| released database which had never been executed by anyone.
| This Redis thing is the same: nobody had ever said
| "RAFT SET foo bar", if they had done they would have seen the
| problem right away.

| aphyr wrote:
| > It's clear this project did not start with exhaustive
| unit tests.
|
| I can't speak to "exhaustive", but Redis-Raft _did_ have
| an extant unit and integration test suite prior to our
| collaboration. Here's what they looked like:
| https://github.com/RedisLabs/redisraft/tree/ff9fb28c74db880c...
|
| I'm hesitant to draw too strong a conclusion here, and I
| can't speak for the Redis Labs team, but I do suspect
| that this is somewhere where... having an outside tester,
| like Jepsen (or a suitably adversarial QA team) can help
| detect missing-stairs sorts of problems. Coming from the
| perspective of a prospective operator (and having some
| experience with testing distributed systems), I
| immediately said "of course I want proxy mode by
| default", when this wasn't how the Redis-Raft designers
| necessarily intended things to be used--they intended
| smart clients to make it so that users wouldn't actually
| _need_ proxy mode, so they hadn't focused on testing it
| that way.

| benschulz wrote:
| Fair enough. I think I misinterpreted the "easier to
| approach" part of your original answer. Sorry if my
| answer came across as defensive. My wounds are still
| fresh.
| ;)

| karlding wrote:
| I'm curious if there's research into "better" primitives in
| Programming Languages in order to simplify writing distributed
| systems, analogous to how concurrency primitives beyond
| Mutexes, Semaphores, and Condition Variables (like Futures,
| Monitors, etc. or approaches such as Actors) can greatly
| simplify logic and enhance one's ability to reason about code.
| Or things like the Rust borrow checker.
|
| The closest thing I'm aware of is TLA+.

| aphyr wrote:
| There's a lot of research into this, actually! Folks have
| been working on ways to extract executable code from Alloy,
| TLA+, Isabelle/HOL, and Coq specifications. That doesn't help
| with implementations which _don't_ use codegen though--and
| it doesn't help you with the parts of the program that
| _aren't_ formalized.

| jnwatson wrote:
| There are two general issues here: there's the nuts and
| bolts, and then there's the emergent properties of the
| protocol.
|
| In my experience, async (which includes futures/promises, and
| actor-like mechanisms) makes the nuts-and-bolts problems of
| avoiding variable race conditions, avoiding deadlock,
| managing multiple things going on, way easier.
|
| You still need fuzzing and model checking to make sure you
| got the strategic stuff right.
|
| That said, the team I work on is about to release our first
| Raft-based product, so I might have a different opinion in a
| few months.
___________________________________________________________________
(page generated 2020-06-23 23:00 UTC)