[HN Gopher] Jepsen: Redpanda 21.10.1 ___________________________________________________________________ Jepsen: Redpanda 21.10.1 Author : aphyr Score : 134 points Date : 2022-04-29 14:02 UTC (8 hours ago) (HTM) web link (jepsen.io) (TXT) w3m dump (jepsen.io) | dhshshdhdgfff wrote: | The first half is jepsen team trying to divine some actual | testable guarantees from a pile of blog posts and a random Google | doc. What a mess. | jeffbee wrote: | Total mess. It's a real indictment of Kafka, more than it is | anything about redpanda in the first half. | [deleted] | excuses_ wrote: | I wonder if Redpanda thinks about or offers some alternative | protocol that would be better defined in terms of transaction | guarantees. At this point it looks like Kafka's protocol was a | nice try but it needs a major refactoring. | rystsov wrote: | Documentation is a bit confusing: the protocol was evolved | over time (new KIPs) and there is mismatch between the | database model and kafka model. But we see a lot of potential | in the Kafka transactional protocol. | | At Redpanda we were able to push to 5k distributed | transactions cross replicated shard. It's a mind-blowing for | a database to achieve the same result. | | Also Kafka transactional protocol works at low level it's | very easy to build systems on top of it. For example, it's | very easy to build a Calvin inspired system | http://cs.yale.edu/homes/thomson/publications/calvin- | sigmod1... | rystsov wrote: | The mess is mostly the result of the mismatch between the | classic database transactional model and kafka transactional | model (G0 anomaly). If you read the documentation without the | database background it seems ok, but when you notice the | differences between the models it becomes hard to understand if | it's a bug or property of the Kafka protocol. | | There is a lot of research happening around this area even in | the database world. The list of the isolation levels isn't | final and some of the recent developments include PC-PSI and | NMSI which also seem to "violate" the order. I hope one day we | get the formal academic description of the Kafka model. It | looks very promising. | btown wrote: | Are there good research groups or journals to follow to keep | apprised of the state of the art here? | rystsov wrote: | I've created this list a while ago | https://github.com/redpanda-data/awesome-distributed- | transac.... Maybe it's time to update it. | | Usually I start with a couple of seed papers then follow | the references, look at the other papers the authors wrote. | When a phd student explores an area they write several | paper on the topic so there is a lot material to read. But | the real gem is the thesis, it has depth, context and a lot | of links to other work in the area. | mandevil wrote: | I was unfamiliar with Redpanda, and now I know and trust it. | Whatever marketing budget Redpanda spent to get a Jepsen report | was well worth it. | belter wrote: | Agree. Nowadays, I see anything that did not go through Jepsen | with suspicion. Forces me to do the triple of technical due | diligence. | hardwaresofton wrote: | One of the clearest indications prices for a service should be | raised I've ever seen. | | Can we get patio11 in here to say the thing? | titanomachy wrote: | Do you have any information on what Jepsen charges? For all | we know, it could be precisely the right amount. | agallego wrote: | kyle is very friendly and I recommend reaching out. we | can't and wouldn't disclose any pricing that is not public | information. would be unethical on my part. all i can say | is we wish to continue our work with him indefinitely as | long as we keep making progress on the product :) | agallego wrote: | interested! :D | divan wrote: | I happened to know RedPanda founder back in the days he was at | Concord.io (as a founder and a main dev). The level of | obsession with performance and optimization of this guy was | insane. He's not only extremelly skilled with C++, but also | very passionate about rethinking large and complex systems and | rebuilding them to enable 10-100x speed improvements. It's like | his personal hobby - take a piece of software everyone use, and | optimize it to the limits of physics, usually by implementing | better version from scratch himself :) Plus, he's an excellent | communicator. Watching how their team was working I always | thought that successful companies can be built only with that | level of passion and expertise as a single package. | debarshri wrote: | Thing that got my attention was that it has inline transform | functions that can be added as wasm binary | gigatexal wrote: | If your DB doesn't pass the Jepsen tests it's not worth using. | Kudos to both teams. | newman314 wrote: | Redpanda (back when they were VectorizedIO) spammed my work email | after I starred one of their repos, denied it after I called them | out on it and I just noticed that they had deleted their response | to me. | | Pretty sneaky to go back and delete the tweets first denying and | then apologizing. | | Receipts: https://twitter.com/d11cc3s/status/1447573471152656389 | https://twitter.com/d11cc3s/status/1450906855115354116 | agallego wrote: | hi newman314 - i mentioned in the tweet this was a mistake and | offered an apology there, an sdr reached out to you, when i | realized that i apologize. no ill intent. feel free to test | this with a fake github account. my tweets automatically delete | after 6mo, all of them on a rolling window. nothing special | about this interaction. there is no sneaky-ness, though feel | free to disagree. | staticassertion wrote: | Sounds like you have a personal, singular issue with them that | I can't imagine anyone else cares about. | doommius wrote: | Always great to read this. I preformed a jenkins test on | Microsoft internal infra and it's a huge insight. From an | academic side it's just as interesting looking into the lack of | standards within consistently and the definitions of them. | rystsov wrote: | Cool! What did you test? I've played with Jepsen and Cosmos DB | when I was at Microsoft but we had to ditch ssh, write custom | agent and inject faults with PowerShell command lets. | titanomachy wrote: | The level of intellectual discipline and competence on display | here is inspiring. | | I'd love to take one of the Jepsen courses, but it seems they're | offered only as corporate training. Maybe my employeer will agree | to bring them in. | | For now I'll have to satisfy myself with the YouTube videos. | rystsov wrote: | Hey folks, I was working with Kyle Kingsbury on this report from | the Redpanda side and I'm happy to help if you have questions | cgaebel wrote: | Thanks for working with Jespen. Being willing to subject your | product to their testing is a huge boon for Redpanda's | credibility. | | I have two questions: | | 1. How surprising were the bugs that Jepsen found? | | 2. Besides the obvious regression tests for bugs that Jepsen | found, how did this report change Redpanda's overall approach | to testing? Were there classes of tests missing? | rystsov wrote: | It wasn't a big surprise for us. Redpanda is a complex | distributed system with multiple components even at the core | level: consensus, idmepotency, transactions so we were ready | that something might be off (but we were pleased to find that | all the safety issues were with the things which were behind | the feature flags at the time). | | Also we have internal chaos test and by the time partnership | with Kyle started we already identified half of the | consistency issues and sent PRs with fixes. The issues got in | the report because by the time we started the changes weren't | released yet. But it is acknowledged in the report | | > The Redpanda team already had an extensive test suite-- | including fault injection--prior to our collaboration. Their | work found several serious issues including duplicate writes | (#3039), inconsistent offsets (#3003), and aborted | reads/circular information flow (#3036) before Jepsen | encountered them | | We missed other issues because haven't exercised some | scenario. As soon as Kyle found the issues we were able to | reproduce them with the in-house chaos tests and fix. This | dual testing (jepsen + existing chaos harness) approach was | very beneficial. We were able to check the results and give | feedback to Kyle if he found a real thing or if it looks more | like an expected behavior. | | We fixed all the consistency (safety) issues, but there are | several unresolved availability dips. We'll stick with Jepsen | (the framework) until we're sure we fixed then too. But then | we probably rely just on the in house tests. | | Clojure is very powerful language and I was truly amazed how | fast Kyle for able to adjust his tests to new information but | we don't have clojure expertise and even simple tasks take | time. So it's probably wiser to use what we already know even | it it a bit more verbose. | polio wrote: | A complete nit, but the testimonial from the CTO of The Seventh | Sense on https://redpanda.com/ spells Redpanda as "Redpand". | northstar702 wrote: | Thank you. Fixed. | CJefferson wrote: | This isn't anything against Redpanda, but I'm always amazed how | badly all these distributed databases do in Jepsen. | | What would one use them for in practice, which wouldn't be better | suitable by a (the thing I've used), say postgresql and streaming | replication in case the server goes down? (I'm not suggesting | there isn't a good application, just I'm not knowledgeable enough | to know of one). | agallego wrote: | totally different approaches tho. people have tried what you | proposed many times before and for some scale succeeded. hard | to compare at all when you dig into the details. | | expect a companion post. this was super fun to partner with | kyle on this. +1 would recommend to anyone building a storage | system. | jandrewrogers wrote: | When a distributed database is designed, you must navigate and | optimize several complex technical tradeoffs to meet the | architecture and product objectives. The specific set of | tradeoffs made -- and they are different for every platform -- | will determine the kinds of data models and workloads that the | database will be suitable for, especially if performance and | scalability are critical as in this case. | | The reason distributed databases tend to be buggy, especially | in the first iterations, is straightforward if not simple to | address. While it is convenient to describe technical design | tradeoffs as a set of discrete, independent things, in real | implementation they are all interconnected in subtle, complex, | nuanced ways. Modifying one design tradeoff in code can have | unanticipated consequences for other intended tradeoffs. In | other words, there isn't a _set_ of simple tradeoffs, there is | a single _extremely high-dimensionality_ tradeoff that is being | optimized. Not only are complex high-dimensionality design | elements difficult to reason about when writing code the first | time, any changes to the code may shift how the tradeoffs | interact in non-obvious ways. Humans have finite cognitive | budgets, so unless it is obvious that a code change has the | potential to have unintended side effects, we generally don 't | spend the time to fully verify this fact. | | I can't tell you how many times I've seen tiny innocuous code | changes alter the behavior of distributed databases in | surprising ways. This is also why once the core code seems to | be correct, people are reluctant to modify it if that can be | avoided at all. | rystsov wrote: | Different systems solve different problems and have different | functional characteristics. Actually one of the thing which | Kyle highlighted in his report is write cycles (G0 anomaly), it | isn't a problem of the Redpanda implementation but a | fundamental property of the Kafka protocol. Records in Kafka | protocol don't have preconditions and they don't overwrite each | other (unlike the database operations) so it doesn't make sense | to enforce order on the transactions and it's possible to run | them in parallel. It gives enormous performance benefits and | doesn't compromise safety. | georgelyon wrote: | I'm constantly surprised more folks don't use FoundationDB, I'm | pretty sure the Jepsen folks said something to the tune of the | way FoundationDB is tested is far beyond what Jepsen does (Good | talk on FDB testing: | https://www.youtube.com/watch?v=4fFDFbi3toc). | | My read is that most use cases just need something that works | _enough_ at scale that the product doesn't fall over and any | issues introduced by such bugs can be addressed manually (i.e. | through customer support, or just sufficient ad-hoc error | handling). Couple that with the investment some of these | databases have put into onboarding and developer-acquisition, | and you have something that can be quite compelling even | compared to something which is fundamentally more correct. | staticassertion wrote: | Having looked at FoundationDB a bit it wasn't clear why I | would choose it. It has transactions, which is nice, but not | that big of a deal despite how much time they put into | talking about it. I actually don't even need transactions | since all of my writes commute, so it's particularly | uninteresting to me. | | They say they're fast, but I didn't find a ton of information | about that. | | Ultimately the sell seemed to be "nosql with transactions" | and I just couldn't justify putting more time into it. I did | watch their excellent talk on testing, and I respect that | they've put that level of effort into it, and it was why I | even considered it, but yeah, what am I missing? | jwr wrote: | As someone who is switching to FoundationDB: because it's not | easy. It doesn't look like other databases, it isn't in | fashion (yes, these things matter), and it requires thinking | and adapting your application to really use it to its full | potential. It could also benefit from a bit more developer | marketing. | | But it's the best thing out there. | claytonjy wrote: | There's a lot of different ways to answer this, but I think | about it as a different architectural paradigm. Yes you can do | stream-ish things with Postgres but at some level of scale | you'd be putting a square peg in a round hole. | | What opened my eyes to this world is this post from Martin | Kleppman on turning the database inside out: | https://martin.kleppmann.com/2015/03/04/turning-the-database... | antonmry wrote: | This report seems to have some wrong insights. Auto-commit | offsets doesn't imply dataloss if records are processed | synchronously. This is the safest way to test Kafka instead of | commit offsets manually | rystsov wrote: | Can you clarify what you mean? AFAIK with manual commit you | have the most control over when the commit happens | | Look at this blog post describing a data loss caused by auto- | commit: https://newrelic.com/blog/best-practices/kafka- | consumer-conf... | | Also there also may be more subtle issues with auto-commit: | https://github.com/edenhill/librdkafka/issues/2782 | dstroot wrote: | > A KafkaConsumer, by contrast, will happily connect to a jar of | applesauce14 and return successful, empty result sets for every | call to consumer.poll. This makes it surprisingly difficult to | tell the difference between "everything is fine and I'm up to | date" versus "the cluster is on fire", and led to significant | confusion in our tests. | | This tickled my funny bone. Never expected humor in a Jepsen | writeup. Kudos! | staticassertion wrote: | > Never expected humor in a Jepsen writeup | | Jepsen reports are often pretty funny, some famously so | cwillu wrote: | Wait until you find out why it's called "Jepson" | toolz wrote: | please tell me it has something to do with carly jepsens song | "call me maybe" ___________________________________________________________________ (page generated 2022-04-29 23:00 UTC)