[HN Gopher] Jepsen: Redpanda 21.10.1
       ___________________________________________________________________
        
       Jepsen: Redpanda 21.10.1
        
       Author : aphyr
       Score  : 134 points
       Date   : 2022-04-29 14:02 UTC (8 hours ago)
        
 (HTM) web link (jepsen.io)
 (TXT) w3m dump (jepsen.io)
        
       | dhshshdhdgfff wrote:
       | The first half is jepsen team trying to divine some actual
       | testable guarantees from a pile of blog posts and a random Google
       | doc. What a mess.
        
         | jeffbee wrote:
         | Total mess. It's a real indictment of Kafka, more than it is
         | anything about redpanda in the first half.
        
         | [deleted]
        
         | excuses_ wrote:
         | I wonder if Redpanda thinks about or offers some alternative
         | protocol that would be better defined in terms of transaction
         | guarantees. At this point it looks like Kafka's protocol was a
         | nice try but it needs a major refactoring.
        
           | rystsov wrote:
           | The documentation is a bit confusing: the protocol evolved
           | over time (new KIPs) and there is a mismatch between the
           | database model and the Kafka model. But we see a lot of
           | potential in the Kafka transactional protocol.
           | 
           | At Redpanda we were able to push to 5k distributed
           | transactions across replicated shards. It would be
           | mind-blowing for a database to achieve the same result.
           | 
           | Also, the Kafka transactional protocol works at a low level,
           | so it's very easy to build systems on top of it. For example,
           | it's very easy to build a Calvin-inspired system
           | http://cs.yale.edu/homes/thomson/publications/calvin-
           | sigmod1...
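
A minimal sketch of the Calvin-style idea mentioned in this comment, assuming the core of it is "agree on a transaction order first, then execute deterministically". The Python list `log` stands in for a totally ordered partition; `apply_log`, `deposit`, and `transfer` are hypothetical names for illustration, not Redpanda or Kafka APIs.

```python
def apply_log(log):
    """Replay an ordered log of transactions against an empty key-value state."""
    state = {}
    for txn in log:
        txn(state)
    return state


def deposit(account, amount):
    """Build a transaction that credits `account`."""
    def txn(state):
        state[account] = state.get(account, 0) + amount
    return txn


def transfer(src, dst, amount):
    """Build a transaction that moves funds only if `src` can cover them."""
    def txn(state):
        if state.get(src, 0) >= amount:
            state[src] = state.get(src, 0) - amount
            state[dst] = state.get(dst, 0) + amount
    return txn


# The log order is the only thing replicas need to agree on (assumption:
# in a real system this order would come from the replicated log).
log = [
    deposit("alice", 100),
    transfer("alice", "bob", 30),
    transfer("bob", "carol", 50),  # rejected: bob only has 30
]

# Two replicas replaying the same log independently reach the same state.
replica_a = apply_log(log)
replica_b = apply_log(log)
```

Because every transaction reads and writes only the state it is given, replicas do not need to coordinate during execution; they only need the same log.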
        
         | rystsov wrote:
         | The mess is mostly the result of the mismatch between the
         | classic database transactional model and kafka transactional
         | model (G0 anomaly). If you read the documentation without the
         | database background it seems ok, but when you notice the
         | differences between the models it becomes hard to understand if
         | it's a bug or property of the Kafka protocol.
         | 
         | There is a lot of research happening around this area even in
         | the database world. The list of the isolation levels isn't
         | final and some of the recent developments include PC-PSI and
         | NMSI which also seem to "violate" the order. I hope one day we
         | get the formal academic description of the Kafka model. It
         | looks very promising.
        
           | btown wrote:
           | Are there good research groups or journals to follow to keep
           | apprised of the state of the art here?
        
             | rystsov wrote:
             | I've created this list a while ago
             | https://github.com/redpanda-data/awesome-distributed-
             | transac.... Maybe it's time to update it.
             | 
             | Usually I start with a couple of seed papers then follow
             | the references, look at the other papers the authors wrote.
             | When a PhD student explores an area they write several
             | papers on the topic, so there is a lot of material to read.
             | But the real gem is the thesis: it has depth, context and a
             | lot of links to other work in the area.
        
       | mandevil wrote:
       | I was unfamiliar with Redpanda, and now I know and trust it.
       | Whatever marketing budget Redpanda spent to get a Jepsen report
       | was well worth it.
        
         | belter wrote:
         | Agree. Nowadays, I view anything that did not go through Jepsen
         | with suspicion. It forces me to do triple the technical due
         | diligence.
        
         | hardwaresofton wrote:
         | One of the clearest indications prices for a service should be
         | raised I've ever seen.
         | 
         | Can we get patio11 in here to say the thing?
        
           | titanomachy wrote:
           | Do you have any information on what Jepsen charges? For all
           | we know, it could be precisely the right amount.
        
             | agallego wrote:
             | kyle is very friendly and I recommend reaching out. we
             | can't and wouldn't disclose any pricing that is not public
             | information. would be unethical on my part. all i can say
             | is we wish to continue our work with him indefinitely as
             | long as we keep making progress on the product :)
        
           | agallego wrote:
           | interested! :D
        
         | divan wrote:
         | I happened to know the Redpanda founder back in the days he was
         | at Concord.io (as a founder and a main dev). The level of
         | obsession with performance and optimization of this guy was
         | insane. He's not only extremely skilled with C++, but also
         | very passionate about rethinking large and complex systems and
         | rebuilding them to enable 10-100x speed improvements. It's like
         | his personal hobby - take a piece of software everyone uses, and
         | optimize it to the limits of physics, usually by implementing
         | a better version from scratch himself :) Plus, he's an excellent
         | communicator. Watching how their team was working I always
         | thought that successful companies can be built only with that
         | level of passion and expertise as a single package.
        
         | debarshri wrote:
         | The thing that got my attention was that it has inline
         | transform functions that can be added as a wasm binary
        
       | gigatexal wrote:
       | If your DB doesn't pass the Jepsen tests it's not worth using.
       | Kudos to both teams.
        
       | newman314 wrote:
       | Redpanda (back when they were VectorizedIO) spammed my work email
       | after I starred one of their repos, denied it after I called them
       | out on it and I just noticed that they had deleted their response
       | to me.
       | 
       | Pretty sneaky to go back and delete the tweets first denying and
       | then apologizing.
       | 
       | Receipts: https://twitter.com/d11cc3s/status/1447573471152656389
       | https://twitter.com/d11cc3s/status/1450906855115354116
        
         | agallego wrote:
         | hi newman314 - i mentioned in the tweet this was a mistake and
         | offered an apology there. an sdr reached out to you; when i
         | realized that, i apologized. no ill intent. feel free to test
         | this with a fake github account. my tweets automatically delete
         | after 6mo, all of them on a rolling window. nothing special
         | about this interaction. there is no sneaky-ness, though feel
         | free to disagree.
        
         | staticassertion wrote:
         | Sounds like you have a personal, singular issue with them that
         | I can't imagine anyone else cares about.
        
       | doommius wrote:
       | Always great to read this. I performed a Jepsen test on
       | Microsoft internal infra and it was a huge insight. From an
       | academic side it's just as interesting looking into the lack of
       | standards around consistency and the definitions of them.
        
         | rystsov wrote:
         | Cool! What did you test? I've played with Jepsen and Cosmos DB
         | when I was at Microsoft but we had to ditch ssh, write a custom
         | agent and inject faults with PowerShell cmdlets.
        
       | titanomachy wrote:
       | The level of intellectual discipline and competence on display
       | here is inspiring.
       | 
       | I'd love to take one of the Jepsen courses, but it seems they're
       | offered only as corporate training. Maybe my employer will agree
       | to bring them in.
       | 
       | For now I'll have to satisfy myself with the YouTube videos.
        
       | rystsov wrote:
       | Hey folks, I was working with Kyle Kingsbury on this report from
       | the Redpanda side and I'm happy to help if you have questions
        
         | cgaebel wrote:
         | Thanks for working with Jepsen. Being willing to subject your
         | product to their testing is a huge boon for Redpanda's
         | credibility.
         | 
         | I have two questions:
         | 
         | 1. How surprising were the bugs that Jepsen found?
         | 
         | 2. Besides the obvious regression tests for bugs that Jepsen
         | found, how did this report change Redpanda's overall approach
         | to testing? Were there classes of tests missing?
        
           | rystsov wrote:
           | It wasn't a big surprise for us. Redpanda is a complex
           | distributed system with multiple components even at the core
           | level: consensus, idempotency, transactions, so we were
           | prepared for something to be off (but we were pleased to find
           | that all the safety issues were in things which were behind
           | feature flags at the time).
           | 
           | Also, we have internal chaos tests, and by the time the
           | partnership with Kyle started we had already identified half
           | of the consistency issues and sent PRs with fixes. The issues
           | got into the report because the fixes weren't released yet
           | when we started. But it is acknowledged in the report:
           | 
           | > The Redpanda team already had an extensive test suite--
           | including fault injection--prior to our collaboration. Their
           | work found several serious issues including duplicate writes
           | (#3039), inconsistent offsets (#3003), and aborted
           | reads/circular information flow (#3036) before Jepsen
           | encountered them
           | 
           | We missed other issues because we hadn't exercised some
           | scenarios. As soon as Kyle found the issues we were able to
           | reproduce them with the in-house chaos tests and fix them.
           | This dual testing (Jepsen + existing chaos harness) approach
           | was very beneficial. We were able to check the results and
           | tell Kyle whether he had found a real issue or whether it
           | looked more like expected behavior.
           | 
           | We fixed all the consistency (safety) issues, but there are
           | several unresolved availability dips. We'll stick with Jepsen
           | (the framework) until we're sure we've fixed them too. After
           | that we'll probably rely just on the in-house tests.
           | 
           | Clojure is a very powerful language and I was truly amazed how
           | fast Kyle was able to adjust his tests to new information, but
           | we don't have Clojure expertise and even simple tasks take
           | time. So it's probably wiser to use what we already know even
           | if it is a bit more verbose.
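
To make the kinds of issues named in this comment concrete (duplicate writes, lost writes, inconsistent offsets), here is a minimal sketch of a history check. It is not Jepsen's checker and not Redpanda's chaos harness; `check_history` and its input shapes are hypothetical and only illustrate the general approach of comparing acknowledged writes with what a consumer later reads.

```python
from collections import Counter


def check_history(acked_writes, consumed):
    """Compare acknowledged writes against what a consumer read.

    acked_writes: values for which the producer received a success response.
    consumed: list of (offset, value) pairs read from a single partition.
    """
    counts = Counter(value for _, value in consumed)
    offsets = [offset for offset, _ in consumed]
    return {
        # Acknowledged but never observed: a lost write.
        "lost": sorted(set(acked_writes) - set(counts)),
        # Observed more than once: a duplicate write.
        "duplicated": sorted(v for v, n in counts.items() if n > 1),
        # Offsets should strictly increase within one partition.
        "offsets_monotonic": all(a < b for a, b in zip(offsets, offsets[1:])),
    }


history = check_history(
    acked_writes=[1, 2, 3, 4],
    consumed=[(0, 1), (1, 2), (2, 2), (3, 4)],
)
```

In this made-up history, value 3 was acknowledged but never read, and value 2 shows up at two offsets.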
        
         | polio wrote:
         | A complete nit, but the testimonial from the CTO of The Seventh
         | Sense on https://redpanda.com/ spells Redpanda as "Redpand".
        
           | northstar702 wrote:
           | Thank you. Fixed.
        
       | CJefferson wrote:
       | This isn't anything against Redpanda, but I'm always amazed how
       | badly all these distributed databases do in Jepsen.
       | 
       | What would one use them for in practice that wouldn't be better
       | served by (the thing I've used), say, PostgreSQL with streaming
       | replication in case the server goes down? (I'm not suggesting
       | there isn't a good application, just I'm not knowledgeable enough
       | to know of one).
        
         | agallego wrote:
         | totally different approaches tho. people have tried what you
         | proposed many times before and for some scale succeeded. hard
         | to compare at all when you dig into the details.
         | 
         | expect a companion post. this was super fun to partner with
         | kyle on this. +1 would recommend to anyone building a storage
         | system.
        
         | jandrewrogers wrote:
         | When a distributed database is designed, you must navigate and
         | optimize several complex technical tradeoffs to meet the
         | architecture and product objectives. The specific set of
         | tradeoffs made -- and they are different for every platform --
         | will determine the kinds of data models and workloads that the
         | database will be suitable for, especially if performance and
         | scalability are critical as in this case.
         | 
         | The reason distributed databases tend to be buggy, especially
         | in the first iterations, is straightforward if not simple to
         | address. While it is convenient to describe technical design
         | tradeoffs as a set of discrete, independent things, in real
         | implementation they are all interconnected in subtle, complex,
         | nuanced ways. Modifying one design tradeoff in code can have
         | unanticipated consequences for other intended tradeoffs. In
         | other words, there isn't a _set_ of simple tradeoffs, there is
         | a single _extremely high-dimensionality_ tradeoff that is being
         | optimized. Not only are complex high-dimensionality design
         | elements difficult to reason about when writing code the first
         | time, any changes to the code may shift how the tradeoffs
         | interact in non-obvious ways. Humans have finite cognitive
         | budgets, so unless it is obvious that a code change has the
         | potential to have unintended side effects, we generally don't
         | spend the time to fully verify this fact.
         | 
         | I can't tell you how many times I've seen tiny innocuous code
         | changes alter the behavior of distributed databases in
         | surprising ways. This is also why once the core code seems to
         | be correct, people are reluctant to modify it if that can be
         | avoided at all.
        
         | rystsov wrote:
         | Different systems solve different problems and have different
         | functional characteristics. Actually, one of the things which
         | Kyle highlighted in his report is write cycles (G0 anomaly). It
         | isn't a problem of the Redpanda implementation but a
         | fundamental property of the Kafka protocol. Records in Kafka
         | protocol don't have preconditions and they don't overwrite each
         | other (unlike the database operations) so it doesn't make sense
         | to enforce order on the transactions and it's possible to run
         | them in parallel. It gives enormous performance benefits and
         | doesn't compromise safety.
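
A minimal sketch of what a write cycle looks like in this setting, assuming each partition is treated as one object and a write-write edge points from the transaction that appended earlier to the one that appended later. The partition contents and function names are hypothetical; this is an illustration of the anomaly, not the checker used in the report.

```python
from collections import defaultdict


def write_order_edges(partitions):
    """Add an edge A -> B when A appended before B in some partition."""
    edges = defaultdict(set)
    for log in partitions.values():
        for i, earlier in enumerate(log):
            for later in log[i + 1:]:
                if earlier != later:
                    edges[earlier].add(later)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE

    def visit(node):
        color[node] = GREY
        for nxt in edges.get(node, ()):
            if color[nxt] == GREY:
                return True  # back edge: found a cycle
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(edges))


# Each list names the transaction that appended each record, in offset order.
interleaved = {"p0": ["T1", "T2"], "p1": ["T2", "T1"]}  # parallel appends
serial = {"p0": ["T1", "T2"], "p1": ["T1", "T2"]}       # one after the other

g0_interleaved = has_cycle(write_order_edges(interleaved))
g0_serial = has_cycle(write_order_edges(serial))
```

In the interleaved case T1 precedes T2 in `p0` but follows it in `p1`, which is the cycle. Since appended records carry no preconditions and never overwrite each other, both orders leave every record intact, which is the point the comment makes.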
        
         | georgelyon wrote:
         | I'm constantly surprised more folks don't use FoundationDB, I'm
         | pretty sure the Jepsen folks said something to the tune of the
         | way FoundationDB is tested is far beyond what Jepsen does (Good
         | talk on FDB testing:
         | https://www.youtube.com/watch?v=4fFDFbi3toc).
         | 
         | My read is that most use cases just need something that works
         | _enough_ at scale that the product doesn't fall over and any
         | issues introduced by such bugs can be addressed manually (i.e.
         | through customer support, or just sufficient ad-hoc error
         | handling). Couple that with the investment some of these
         | databases have put into onboarding and developer-acquisition,
         | and you have something that can be quite compelling even
         | compared to something which is fundamentally more correct.
        
           | staticassertion wrote:
           | Having looked at FoundationDB a bit it wasn't clear why I
           | would choose it. It has transactions, which is nice, but not
           | that big of a deal despite how much time they put into
           | talking about it. I actually don't even need transactions
           | since all of my writes commute, so it's particularly
           | uninteresting to me.
           | 
           | They say they're fast, but I didn't find a ton of information
           | about that.
           | 
           | Ultimately the sell seemed to be "nosql with transactions"
           | and I just couldn't justify putting more time into it. I did
           | watch their excellent talk on testing, and I respect that
           | they've put that level of effort into it, and it was why I
           | even considered it, but yeah, what am I missing?
        
           | jwr wrote:
           | As someone who is switching to FoundationDB: because it's not
           | easy. It doesn't look like other databases, it isn't in
           | fashion (yes, these things matter), and it requires thinking
           | and adapting your application to really use it to its full
           | potential. It could also benefit from a bit more developer
           | marketing.
           | 
           | But it's the best thing out there.
        
         | claytonjy wrote:
         | There's a lot of different ways to answer this, but I think
         | about it as a different architectural paradigm. Yes you can do
         | stream-ish things with Postgres but at some level of scale
         | you'd be putting a square peg in a round hole.
         | 
         | What opened my eyes to this world is this post from Martin
         | Kleppmann on turning the database inside out:
         | https://martin.kleppmann.com/2015/03/04/turning-the-database...
        
       | antonmry wrote:
       | This report seems to have some wrong insights. Auto-committing
       | offsets doesn't imply data loss if records are processed
       | synchronously. This is the safest way to test Kafka, rather than
       | committing offsets manually.
        
         | rystsov wrote:
         | Can you clarify what you mean? AFAIK with manual commit you
         | have the most control over when the commit happens
         | 
         | Look at this blog post describing a data loss caused by auto-
         | commit: https://newrelic.com/blog/best-practices/kafka-
         | consumer-conf...
         | 
         | There may also be more subtle issues with auto-commit:
         | https://github.com/edenhill/librdkafka/issues/2782
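
A toy simulation of the two positions in this exchange. It assumes a simplified auto-commit model in which each `poll()` first commits the position reached by the previous `poll()`; `ToyConsumer` is hypothetical and does not talk to Kafka, and real clients may commit on a different schedule (for example on a background timer), so treat this as a sketch of the reasoning only.

```python
class ToyConsumer:
    """Toy auto-commit model: poll() commits what earlier polls handed out."""

    def __init__(self, log, committed=0, batch=2):
        self.log, self.batch = log, batch
        self.position = committed
        self.committed = committed

    def poll(self):
        self.committed = self.position  # auto-commit happens here
        records = self.log[self.position:self.position + self.batch]
        self.position += len(records)
        return records


LOG = ["a", "b", "c", "d"]


def run_sync():
    """Process each batch fully before polling again, then crash mid-batch."""
    processed = []
    c = ToyConsumer(LOG)
    processed += c.poll()        # a, b are processed before the next poll
    batch = c.poll()             # commits offset 2, returns c, d
    processed.append(batch[0])   # crash after c, before d
    c = ToyConsumer(LOG, committed=c.committed)  # restart at offset 2
    while (batch := c.poll()):
        processed += batch
    return processed


def run_async():
    """Hand batches to an in-memory buffer, then crash before it drains."""
    processed, buffer = [], []
    c = ToyConsumer(LOG)
    buffer += c.poll()           # a, b queued, not processed yet
    buffer += c.poll()           # commits offset 2 while a, b are unprocessed
    # crash: the in-memory buffer is gone
    c = ToyConsumer(LOG, committed=c.committed)  # restart at offset 2
    while (batch := c.poll()):
        processed += batch
    return processed
```

Under these assumptions the synchronous loop reprocesses `c` but loses nothing, while the buffered variant never processes `a` and `b`. Manual commits move that decision into application code, which is the control rystsov refers to.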
        
       | dstroot wrote:
       | > A KafkaConsumer, by contrast, will happily connect to a jar of
       | applesauce and return successful, empty result sets for every
       | call to consumer.poll. This makes it surprisingly difficult to
       | tell the difference between "everything is fine and I'm up to
       | date" versus "the cluster is on fire", and led to significant
       | confusion in our tests.
       | 
       | This tickled my funny bone. Never expected humor in a Jepsen
       | writeup. Kudos!
        
         | staticassertion wrote:
         | > Never expected humor in a Jepsen writeup
         | 
         | Jepsen reports are often pretty funny, some famously so
        
         | cwillu wrote:
         | Wait until you find out why it's called "Jepsen"
        
           | toolz wrote:
           | please tell me it has something to do with carly rae jepsen's
           | song "call me maybe"
        
       ___________________________________________________________________
       (page generated 2022-04-29 23:00 UTC)