[HN Gopher] Jepsen: MongoDB 4.2.6 ___________________________________________________________________ Jepsen: MongoDB 4.2.6 Author : aphyr Score : 526 points Date : 2020-05-24 11:42 UTC (11 hours ago) (HTM) web link (jepsen.io) (TXT) w3m dump (jepsen.io) | pier25 wrote: | > Normally we downweight follow-up posts | | So you manually moderate the content? | VonGuard wrote: | I mean, this was kind of an exception case, where there is a | big old technical war of words back and forth. Almost a "He | said She said" except here, He is an absolute expert, and She | is just some marketing dorks at Mongo. | | I, for one, welcome this by-hand moderation because it keeps | this issue alive, and allows Kyle to keep the discussion going. | | As I commented in a previous post, Kyle is the Chef Ramsey of | database testing, and here, he's in a position where some idiot | has just served him an undercooked hamburger. Bits will fly, | marketing people will be flayed alive, and Kyle will be the | only one left standing at the end. | | Without this by-hand moderation, we'd be missing out on the | second act of this intense thriller! | pier25 wrote: | I'm totally ok with the moderation/curation/whatever! | dang wrote: | Oh yes. HN has always been moderated/curated/whatever term you | prefer. Many past explanations can be found through these | links: | | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... | | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... | | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... | | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... | | (I've detached this subthread from | https://news.ycombinator.com/item?id=23294048 to prevent the | top comment from being too distracting.) | pier25 wrote: | Thanks for the links! | | It's totally fine with me, but I just wasn't aware of it. | [deleted] | DoreenMichele wrote: | They use a combination of algorithms and human intervention, to | generally good effect. 
| | No clue if this "downweighting" in this case is an algorithm or | a manual thing. I would assume algorithm for the downweighting | and human intervention for reversing it, but that's sort of a | guess or inference. | baq wrote: | of course they do? given the quality of discussion here it's a | hard requirement to preserve a high SNR. | rmdashrfstar wrote: | The main argument for using a document-oriented database: | https://martinfowler.com/bliki/AggregateOrientedDatabase.htm... | nevi-me wrote: | Friendly question: did you update anything on the findings since | https://news.ycombinator.com/item?id=23191439 ? | aphyr wrote: | Nope! Something weird happened to that post; it got a lot of | upvotes and some comments, but never made it to frontpage. | After the InfoQ article took off yesterday, an HN mod got in | touch and asked if I'd like to resubmit it. | lllr_finger wrote: | Mongo has been associated with everything from "perpetual | irritation" up to "major production issue" at all three of my | last companies. | | For as easy as it is to use jsonb in Postgres, or Redis, or | RocksDB/SQLite, or whatever else depending on your use case - I | can't find any reason to advocate its use these days. In my | anecdotal experience, the success stories never happen, and | nearly every developer I know has an unpleasant experience they | can share. | | Big thanks to aphyr and the Jepsen suite (and unrelated blog | posts like Hexing the Interview) for inspiring me to do thorough | engineering. | StavrosK wrote: | I find that using JSON for things you don't need to | query/validate (like big blobs you just want to store) and | breaking the rest out to columns works well enough. Plus, you | can always migrate the data out to a field anyway. | emerongi wrote: | Postgres 12 has generated columns, so you can throw your data | in a jsonb column and have Postgres pull data out of it into | separate columns for indexing, for example.
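The store-JSON-but-index-an-extracted-field idea discussed above can be sketched concretely. This is an illustration only: it uses SQLite's bundled JSON functions so it runs anywhere, while in Postgres the equivalent would be a jsonb column with a generated column or an index on an expression such as ((data->>'name')).

```python
import sqlite3

# Store whole documents as JSON, but index one extracted field so that
# lookups on it don't have to scan and re-parse every document.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (data TEXT)")  # the JSON document blob
conn.execute(
    "CREATE INDEX docs_by_name ON docs (json_extract(data, '$.name'))"
)
conn.execute("""INSERT INTO docs VALUES ('{"name": "ada", "age": 36}')""")

# The indexed expression can now back ordinary WHERE clauses:
row = conn.execute(
    "SELECT json_extract(data, '$.age') FROM docs "
    "WHERE json_extract(data, '$.name') = ?",
    ("ada",),
).fetchone()
print(row[0])  # the extracted age, as a number
```

The same trade-off applies in either engine: the document stays schemaless, while the handful of fields you actually query get real indexes.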
| magnushiie wrote: | Generated columns are not necessary for indexing in | Postgres, you can create an index on any expression based | on the record (supported for many versions now). | mtrycz2 wrote: | > I can't find any reason to advocate its use these days. | | Don't you know? It's web-scale. | rmdashrfstar wrote: | If I was a moderator on HN, I would instantly ban commenters | who continue to make these asinine posts. Is this Reddit, or | is HN striving to be Reddit? | reese_john wrote: | https://news.ycombinator.com/newsfaq.html | | __" Please don't post comments saying that HN is turning | into Reddit. It's a semi-noob illusion, as old as the | hills." __ | rmdashrfstar wrote: | Interesting taste of my own medicine. Will do, thanks for | the reminder! | ep103 wrote: | Is Postgres what most people would suggest as a MongoDB | replacement? | | Anyone have any suggestions for a true non-MongoDB jsonDocument | based noSql option? | jfkebwjsbx wrote: | The first question you must ask yourself is: do I really need | a document store? | | Because the answer is "no" in the overwhelming majority of | cases, especially if your product is mature. | zozbot234 wrote: | It depends on what you're using it for. Postgres is a very good | all-around choice these days (compared to when the whole | 'noSql' thing got started) and also supports document-based | scenarios quite well via JSON/JSONB columns and its support | for these datatypes in queries, updates, indexing etc. | Sharding and replication can also be set up via fairly | general mechanisms, as described in pgSQL documentation. (For | instance, the FDW facility is often used to set up sharding, | but it could also support e.g. aggregation.) | threeseed wrote: | Note that there is no Jepsen test for those | sharding/replication features. | threeseed wrote: | As has been mentioned above, PostgreSQL does not come out of | the box with a supported, tested clustering solution.
| | Given that this is a pretty popular part of MongoDB, it seems | like an important thing that people continuously fail to | mention. | tester756 wrote: | Why has this been here every day for the last 3 days? | judofyr wrote: | This is not directly related to this report or Jepsen, but since | you're here I've got to ask: Aphyr, are there any recent | papers/research in the realm of distributed databases which | you're excited about? | aphyr wrote: | Calvin and CRDTs aren't new, but I still think they're | dramatically underappreciated! Heidi Howard's recent work on | generalizing Paxos quorums is super intriguing, and from some | discussion with her, I think there are open possibilities in | making _leaderless_ single-round-trip consensus systems for | log-oriented FSMs, which is what pretty much everyone WANTS. | | I'm also excited about my own research with Elle, but we're | still working on getting that through peer review, haha. ;-) | thramp wrote: | > I think there are open possibilities in making leaderless | single-round-trip consensus systems for log-oriented FSMs, | which is what pretty much everyone WANTS. | | Woah, that's wild. Are there any pre-prints/papers/talks that | you can link to on this subject? I'd _love_ to read this. | | > I'm also excited about my own research with Elle, but we're | still working on getting that through peer review, haha. ;-) | | I read over bits of Elle; the documentation in it is | absolutely top-notch. You and Peter Alvaro knocked it out of | the park! | aphyr wrote: | _I think there are open possibilities in making leaderless | single-round-trip consensus systems for log-oriented FSMs, | which is what pretty much everyone WANTS._ | | This is based on her presentation and some dinner | conversation at HPTS 2019, so I don't know if there's | actually a paper I can point to. The gist of it is that Paxos | normally involves an arbitration phase where there are | conflicting proposals, which adds a second pair of message | delays.
But if you relax the consensus problem to agreement | on a _set_ of proposals, rather than a single proposal, you | don't need the arbitration phase. Instead of "who won", it | becomes "everyone wins". Then you can impose an order on | that set via, say, sorting, and iterate to get a replicated | log. | | _I read over bits of Elle; the documentation in it is | absolutely top-notch. You and Peter Alvaro knocked it out | of the park!_ | | Thank you! Could I... hang on, just let me grab reviewer #1 | quickly, I'd like them to hear this. ;-) | judofyr wrote: | > _This is based on her presentation and some dinner | conversation at HPTS 2019, so I don't know if there's | actually a paper I can point to. The gist of it is that | Paxos normally involves an arbitration phase where there | are conflicting proposals, which adds a second pair of | message delays. But if you relax the consensus problem to | agreement on a set of proposals, rather than a single | proposal, you don't need the arbitration phase. Instead | of "who won", it becomes "everyone wins". Then you can | impose an order on that set via, say, sorting, and | iterate to get a replicated log._ | | This sounds very similar to _atomic broadcast_ | (https://en.wikipedia.org/wiki/Atomic_broadcast) where | each node sends a single message and the process ensures | that all nodes agree on the same set of messages. Not | sure how it would fit with a log-oriented FSM, but it | certainly sounds interesting. | senderista wrote: | It's really pretty trivial to implement RSM given an | atomic broadcast protocol. But you can implement many | other things, like totally ordered ephemeral messaging | with arbitrary fanout, or a replicated durable log a la | Kafka.
Here's my current favorite atomic broadcast | protocol (from 2007 or so), which is leaderless, has | write throughput saturating network bandwidth, and read | throughput scaling linearly with cluster size: | | https://os.zhdk.cloud.switch.ch/tind-tmp- | epfl/394a62dd-278f-... | thramp wrote: | > This is based on her presentation and some dinner | conversation at HPTS 2019, so I don't know if there's | actually a paper I can point to. | | Thanks for the explanation! I just found | http://www.hpts.ws/papers/2019/howard.pdf; I'm reading | through it now :) | | > Thank you! Could I... hang on, just let me grab | reviewer #1 quickly, I'd like them to hear this. ;-) | | Do as you please with my praise! | zzzeek wrote: | How many more years do we have to keep evaluating, studying, and | reading about MongoDB's ongoing failures? It would appear this | product has been a great burden on the community for many years. | aphyr wrote: | I like to keep in mind that MongoDB's existing feature set is | maturing--occasional regressions may happen, but by and large | they're making progress. The problems in this analysis were in | a transaction system that's only been around for a couple | years, so it's had less time to have rough edges sanded off. | zzzeek wrote: | there are _so_ _many_ _great_ _databases_ out there. There's | no need for one that has been mediocre for years and | continues to make false claims. This is an issue of years of | super-aggressive marketing of an inferior product making it | hard on engineers. | aphyr wrote: | Hi folks! Author of the report here. If anyone has questions | about detecting transactional anomalies, what those anomalies are | in the first place, snapshot isolation, etc., I'm happy to answer | as best I can.
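The "agree on a set, then sort" relaxation described in the subthread above can be shown with a toy sketch. This is not Howard's actual protocol, only the shape of the idea: every proposal for a log slot is accepted ("everyone wins"), and a deterministic sort imposes the order, so no arbitration round is needed.

```python
# Toy illustration (not a real consensus implementation): instead of
# arbitrating between conflicting proposals for a log slot, accept the
# whole set and order it deterministically. Any replica that received the
# same proposals, in any order, derives the same log.
def decide_slot(proposals):
    """Proposals received from replicas for one log slot -> ordered values."""
    return sorted(set(proposals))  # "everyone wins", then a canonical order

def build_log(rounds):
    """Iterate slot decisions to build the replicated log."""
    log = []
    for proposals in rounds:
        log.extend(decide_slot(proposals))
    return log

# Two replicas that saw the same proposals in different orders still agree:
replica_1 = build_log([["a", "b"], ["c"]])
replica_2 = build_log([["b", "a"], ["c"]])
assert replica_1 == replica_2
```

The interesting part happens entirely in `decide_slot`: because no proposal has to "lose", there is no second round of messages to pick a winner.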
| rystsov wrote: | Hi Kyle, thanks for Elle :) I want to use Elle to check | long histories of transactions over a small set of keys with a | read-dominant workload. The paper recommends using lists over | registers, but when the history becomes long, on the one hand it | becomes too wasteful to read the register's history on each | request, and on the other hand Elle's input becomes very large. | E.g. when each read must return the whole register's history, | the size of the history grows O(n^2) compared to the case where | reads return just the head. | | So I'm curious how you would describe Elle's ability to find | violations using read-write registers with unique values vs. | append-only lists? | aphyr wrote: | _E.g. when each read must return the whole register's | history, the size of the history grows O(n^2) compared to the | case where reads return just the head._ | | If you look at Elle's transaction generators, you can cap the | size of any individual key, and use an uneven (e.g. | exponential) distribution of key choices to get various | frequencies. That way keys stay reasonably small (I use 1-10K | writes/key), some keys are updated frequently to catch race | conditions, and others last hundreds of seconds to catch | long-lasting errors. | | _So I'm curious how you would describe Elle's ability to find | violations using read-write registers with unique values vs. | append-only lists?_ | | RW registers are significantly weaker, though I don't know | how to quantify the difference. I've still caught errors with | registers, but the grounds for inferring anomalies are a.) | less powerful and b.) can only be applied in certain | circumstances--we talk about some of these details in the | paper. | eternalban wrote: | "3.4 Duplicate Effects" | | This section seems to describe the most worrying result in your | report, Kyle, with no workaround. Did I read that correctly?
| aphyr wrote: | Yeah, there's no workaround that I can find for 3.4 | (duplicate effects), 3.5 (read skew), 3.6 (cyclic information | flow), or 3.7 (read own future writes). I've arranged those | in "increasingly worrying order"--duplicating writes doesn't | feel as bad as allowing transactions to mutually observe each | other's effects, for example. The fact that you can't even | rely on a single transaction's operations taking place (or, | more precisely, appearing to take place) in the order they're | written is especially worrying. All of these behaviors | occurred with read and write concerns set to | snapshot/majority. | | That's not to say that workarounds don't exist, just that I | didn't find any in the documentation or by twiddling config | flags in the ~2 weeks I was working on this report. :) | devit wrote: | Have you considered presenting the data in a concise manner in | addition to the in-depth analyses? | | That is, a table on the jepsen.io frontpage, or at least on | each product's review page, with database products and | configuration on rows and consistency properties on columns, | and a nice "Yay!" or "Nope!" mark in the cell, plus links on | how to achieve the database configurations in the table (esp. | how to configure each database to have the most guarantees). | | Also, ideally the analyses should be rerun automatically (or | possibly after being paid, but making it easy for the company | to do so) every time a new major release happens rather than | being done once and then being stale. | | Finally, there should be tests for the non-broken databases | (PostgreSQL for instance, both in single-server mode, deployed | with Stolon on Kubernetes and using the multimaster projects) | as well to confirm they actually work. | eloff wrote: | Oh man this would be useful.
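For readers unfamiliar with read skew (anomaly 3.5 discussed above), a minimal illustration: suppose an invariant x + y = 100 holds in every committed state; a reader whose reads straddle another transaction's commit can still observe a total that never existed in any committed state.

```python
# Toy illustration of read skew: the invariant x + y == 100 holds in every
# committed state, but a reader that sees x from before a transfer and y
# from after it observes a "fractured" state.
def transfer(state, amount):
    """Atomically move `amount` from x to y, producing a new committed state."""
    return {"x": state["x"] - amount, "y": state["y"] + amount}

before = {"x": 60, "y": 40}
after = transfer(before, 10)

# Every committed state preserves the invariant:
assert before["x"] + before["y"] == 100
assert after["x"] + after["y"] == 100

# A skewed read mixes the two committed states:
skewed = {"x": before["x"], "y": after["y"]}
print(skewed["x"] + skewed["y"])  # 110: a total no committed state ever had
```

Snapshot isolation is supposed to rule this out by serving every read in a transaction from one consistent snapshot, which is why finding it under snapshot/majority settings is notable.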
| aphyr wrote: | _That is, a table on the jepsen.io frontpage, or at least on | each product's review page, with database products and | configuration on rows and consistency properties on columns, | and a nice "Yay!" or "Nope!" mark in the cell, plus links on | how to achieve the database configurations in the table (esp. | how to configure each database to have the most guarantees)._ | | This is a wonderful idea, and I've got no idea how to | actually do it in a standardized, rigorous way. Vendor claims | are often contradictory, it's hard to get a good idea of | anomaly frequency, availability is... a rabbit hole, and it's | hard to come up with a standard taxonomy of anomalies--most | of the analyses I do wind up finding something I've never | really seen before, haha. With that in mind, I've wound up | letting the reports speak for themselves. | | _Also, ideally the analyses should be rerun automatically | (or possibly after being paid, but making it easy for the | company to do so) every time a new major release happens | rather than being done once and then being stale._ | | I don't know a good way to do this either. Each report is | typically the product of months of experimental work; it's | not like Jepsen is a pass-fail test suite that gives | immediately accurate results. There is, unfortunately, a lot | of subtle interpretive work that goes into figuring out if a | test is doing something meaningful, and a lot of that work | needs to be repeated on each test run. Think, like... staring | at the logs and noticing that a certain class of exception is | being caught more often than you might have expected, and | realizing that a certain type of transaction now triggers a | new conflict detection mechanism which causes higher | probabilities of aborts; those aborts reduce the frequency | with which you can observe database state, allowing a race | condition to go unnoticed. That kinda thing.
| | If I'm lucky and the API/setup process haven't changed, I can | re-run an analysis in about a week or so. If I'm unlucky, | there's been drift in the OS, setup process, APIs, client | libraries, error handling, etc. It's not uncommon for a | repeat analysis to take months. :-( | X6S1x6Okd1st wrote: | It's probably more snarky than helpful, but it'd be great | to have a section where it's just marketing materials or | docs that you've corrected with a red pen | bcrosby95 wrote: | It's probably better to keep it professional. Your | average employee can afford some snark. But when | companies hire you for this sort of consulting, you could | turn off a lot of potential clients by including it in | materials you produce, even when they didn't pay for it. | Because it is a representation of the product they would | be paying for. | | It would be kinda like you including this sort of thing | on your resume. Which would also be a bad idea. | ashtonkem wrote: | For those who don't know, Kyle makes a living offering | these types of analysis to database companies directly. | While a lot of us love to dunk on Mongo (myself | included), it would be silly to expect Kyle to risk his | livelihood. | jka wrote: | If done accurately and professionally, something like what | you're suggesting could be really useful to aid people | and organizations during vendor selection. | | https://web.hypothes.is/about/ or similar could be used | to develop commentary overlays on top of marketing | materials. | HappyDreamer wrote: | > _consistency properties on columns, and a nice "Yay!" or | "Nope!" mark in the cell_ | | Plus maybe a column indicating what [the company behind the | database] claims? | teskk123 wrote: | hello :) Where did you find out all the information about how to | do such testing? | aphyr wrote: | I was lucky to have a good education: my B.A. involved | courses in contemporary experimental physics and independent | research in nonlinear quantum dynamics (esp.
proofs, | experimental design, writing), cognitive and social | psychology (more experiment design and stats), math | structures (proof techniques), philosophy (metaphysics, | philosophy of science), and English (rhetoric). All of those | helped give me a foundation for doing this kind of | experimental work and communicating it to others. | | Jepsen draws inspiration from a long line of work on | property-based testing, especially Quickcheck & co. It also | draws on roughly 10 years of experience building & running | distributed systems in production. A lot of Jepsen I invented | from whole cloth, but some of the checkers in Jepsen are | derived from specific research papers, like work by Wing, | Gong, and Lowe on linearizability checking. | | Then it's just... a lot of thinking, experimenting, and | writing. Jepsen's the product of ~6 years of full-time work. | Elle, the system which detected the anomalies in this report, | was a research project I've been puzzling over for roughly | two years. | | I write the Jepsen series, and open-source all of the code | for these tests, partly as a resource so that other people | can learn to do this same kind of work. :-) | teskk123 wrote: | Wow, thanks a lot for the quick and full answer! | throwaway_pdp09 wrote: | I guess you've answered my question, but to be clear, you | do not instrument/analyse the code, you treat it as a black | box which you hammer on externally, is that right? | aphyr wrote: | Pretty much, yeah. There are some cases where Jepsen | reaches into the guts of a database or lies to it via | LD_PRELOAD shims, but generally these are Just Plain Old | Binaries provided by vendors; no instrumentation | required. | monstrado wrote: | Huge fan of your work! I was curious if you've ever attempted | to run your Mongo test suite (or part of it) against FoundationDB | using their DocumentLayer since it's supposed to be Mongo API | compatible.
| robterrell wrote: | IIRC one of the FoundationDB engineers tested with Jepsen and | found that it passed in its default configuration, but the | blog post seems to have disappeared. | | https://web.archive.org/web/20150312112556/http://blog.found. | .. | monstrado wrote: | Thanks for firing up the time machine! I've been using FDB | for a little over a year now and can't recommend it enough. | Such a solid piece of meticulous engineering. | aphyr wrote: | No, I haven't! You can see a full list of analyses here: | http://jepsen.io/analyses | rclayton wrote: | Hi Kyle! I've really enjoyed your work over the years. I was | wondering, with all of your testing and experimentation, is | there any system that has really impressed you? | zbentley wrote: | I don't presume to speak for him, but his writeup on | ZooKeeper was among the most positive in the Jepsen series: | https://aphyr.com/posts/291-jepsen-zookeeper | | My bias: I like and heavily use ZooKeeper in production. HN | seems not to like it as much. | aphyr wrote: | I'm kind of impressed _any_ distributed system gets off the | ground. These things are hard to write! | dilandau wrote: | You're doing very, very valuable work. Thanks fam, keep those | vendors honest, and help us make informed decisions. | politician wrote: | Thank you for all of your work over the years. Your reports | have helped me and others stand up to bizdev hype and make | better decisions for our companies and customers. | | Postgres is widely understood to be a robust database with safe | defaults. I, and perhaps others, would love to see you aim your | array of weapons at Postgres. Do you have any plans to look at | stock Postgres? | aphyr wrote: | It's been on my list for a long time, but I've also struggled | to find out like... what, exactly, is the right way to do | postgres replication?
Every time I go into the docs I wind up | with a laundry list of different mechanisms for replication | and failover, and no idea which one would be most appropriate | for a test. I gotta get on this! | takeda wrote: | Well, the built-in way is the right way to do it. But | given that PostgreSQL is quite conservative about it, it | will be hard to find issues there (the replicas are read | only, so at worst it will be just a replication delay, | unless you use synchronous replication, which will remove | the replication delay at the cost of slower performance). | | All the tooling that provides extra distributed | functionality not present in postgres (auto failover, multi- | master replication, sharding etc) will surely have issues, | but then you aren't testing PostgreSQL itself, but the | tooling, so to be fair, the article should evaluate | these tools, and any shortcomings shouldn't be attributed to | PostgreSQL (unless it really is a PostgreSQL issue). | didip wrote: | It is true that PG has only recently had a standard way of | replicating. But even then, PG is not a distributed | database by default. | | However, if I may suggest, Stolon, Patroni, Postgres XL or | Citus Data might be interesting to you. | bsaul wrote: | i feel like this is the reaction of everyone who has ever | tried to set up postgres replication. With your audience, | your deciding on a particular setup will probably help a | LOT of people, and ultimately the postgres project as well. | takeda wrote: | If you worry about data, you should not use automatic | failover. It's nearly impossible for a standby to know why | the master stopped responding. Maybe there was a hardware | failure, or maybe the master is just busy. This is why manual | failover is better, because you can know the real reason | and decide whether you should perform failover or just | wait. | | With tools like repmgr it is just a single command | invoked on the standby.
| | If you absolutely don't want to lose any data, you should | have two masters in close proximity (so the latency isn't | high) set up with synchronous replication, then have one | or two standbys with asynchronous replication. This will | reduce throughput, but then you can be sure that the | other machine has all the same transactions. If something | happens to both, you can then fall back to the asynchronous | one, which might be a bit behind. | feike wrote: | One of the authors of Patroni here. | | Automatic failover for PostgreSQL works great and can be | done safely if combined with synchronous replication. | | Multiple tools will implement this correctly: | | https://patroni.readthedocs.io/en/latest/replication_mode | s.h... https://github.com/sorintlab/stolon/blob/master/do | c/syncrepl... | | Quoting a former colleague here, but "if it hurts, do it | more often". That is what you should do with your | PostgreSQL failovers. | | I have clusters running on timelines in the hundreds | without a byte of data loss due to using synchronous | replication, tools that help out with leader election, | and just doing it often. | takeda wrote: | Can Patroni tell if the master node is unresponsive because | it is busy vs. dead? GitHub (I believe) had a few outages | that caused data loss because their auto failover | mechanism kicked in when it shouldn't. | | I would actually be interested in aphyr's analysis of | Patroni and other distributed add-ons to PostgreSQL. | pcl wrote: | I think that it'd be super-valuable to do an analysis of an | RDS Postgres deployment. Amazon is doing some dark magic | with RDS that sits at this really interesting "distributed, | but not _that_ distributed" inflection point, which | impacts the basic assumptions of lots of distributed | database design. | | I believe RDS Postgres is probably the right answer for | lots of applications, especially for those that already | depend on AWS for baseline availability.
I'd love to see if | that holds up against a rigorous analysis. | aeyes wrote: | Are you talking about Aurora? Because in RDS the | replication is just what you get out of the box. | aphyr wrote: | I'd like this too, but I'm not sure how to do fault | injection against an Amazon-controlled service. | ashtonkem wrote: | You'd probably have to work directly with AWS on that | one, either to get a custom harness in AWS, or to find | out how they configure RDS replication. | elesbao wrote: | the setup would probably be a pgsql primary, aurora | secondaries in diff zones, and something changing cross- | zone or cross-region vpc settings to try to break | replication? never tried that but was hurt by rds pure | pgsql cross-region replication in a network outage | situation. | zbjornson wrote: | It'd be especially interesting given that MongoDB claims | this: | | > Postgres has both asynchronous (the default) and | synchronous replication options, neither of which offers | automatic failure detection and failover [12]. The | synchronous replication only waits for durability on one | additional node, regardless of how many nodes exist [13]. | Additionally, Postgres allows one to tune these durability | behaviors at the user level. When reading from a node, | there is no way to specify the durability or recency of the | data read. A query may return data that is subsequently | lost. Additionally, Postgres does not guarantee clients can | read their own writes across nodes. | | From http://www.vldb.org/pvldb/vol12/p2071-schultz.pdf | takeda wrote: | > It'd be especially interesting given that MongoDB | claims this: | | > > Postgres has both asynchronous (the default) and | synchronous replication options, neither of which offers | automatic failure detection and failover [12]. The | synchronous replication only waits for durability on one | additional node, regardless of how many nodes exist [13].
| Additionally, Postgres allows one to tune these | durability behaviors at the user level. When reading from | a node, there is no way to specify the durability or | recency of the data read. A query may return data that is | subsequently lost. Additionally, Postgres does not | guarantee clients can read their own writes across nodes. | | > From http://www.vldb.org/pvldb/vol12/p2071-schultz.pdf | | This is like those commonly seen tables comparing your | product with others where your product has checkmarks in | all categories, and of course competitors are missing a | bunch of them. The problem is that the categories were | picked by you, and are often irrelevant to the other | product. This is the case here. | | PostgreSQL is not a distributed database; the master is | the one doing all writes. The replicas are read only. By | default replicas are asynchronous, which means they won't | affect master performance, at the cost of the data there | being a few seconds behind. Since you can't write to | replicas, this won't cause data corruption, only delay, | which is often acceptable. If you design your | applications in such a way that they have two database | endpoints, one for writes and one just for reads, you can | then decide based on context which endpoint you want to | use. The read-only endpoint is easy to scale, but as | mentioned earlier it is read only and might be slightly | delayed. | | Now, for failover, you might also opt to use | synchronous replicas; this will add extra latency, but | then you always have at least one machine that has the | same data. They mentioned that if you have multiple | synchronous standbys then only one needs to confirm the | write. Actually that's configurable: you can specify a | group of synchronous machines, and how many and which of | them need to be synchronized; the remaining ones are a | backup in case the ones you specified aren't available.
| | Besides, the writes don't work the same way as in mongo: | when a standby node is in sync it isn't just in sync for | that particular write, it is completely in sync, so their | following argument about not being able to specify | durability/recency of data on read is moot. If you | contact the master or a synchronous replica, you will | always get the most recent state. If you don't mind a | slight delay you should query asynchronous replicas (in | fact you should prefer them whenever you can, since they | are cheap to add) | zbjornson wrote: | I'm not sure I understand your point. | | > the master is the one doing all writes. The replicas | are read only. By default replicas are asynchronous | | The same is true with MongoDB's defaults in an unsharded | cluster. | zzzcpan wrote: | Postgres is not a distributed database and doesn't have a | single safe default for running it in a distributed | configuration, including talking to it over the network. It | can't claim any consistency guarantee, so there is nothing for | aphyr to test it for. | | Even common highly available configurations take the route of | no consistency guarantees by doing primitive async | replication and primitive failover. | bsaul wrote: | i'm not sure what you mean by pg not being a "distributed" | database. it has replication and sharding functionalities | that let it run in various clustering configurations. This | looks enough to me to qualify it for aphyr tests. | takeda wrote: | Replication is read only, so at worst there's only delay | when it is set up asynchronously, but ultimately it will | be the same as the master. The sharding part, do you mean | FDW? I don't think PostgreSQL gives any consistency | guarantees if you use them. | bsaul wrote: | ha, my bad. I had the feeling pg provided some solution | for sharding, but it seems they're all third-party | extensions (like citus/pg-shard) | politician wrote: | Postgres supports multi-master replication, among other | replication models.
This could provide an interesting | target. | | In a classic single-node configuration, a confirmation that | its transaction isolation levels exhibit only their expected | anomalies would be valuable. | | So I think there's value in this ask. | samdk wrote: | Postgres doesn't natively support multi-master. (Although | there are a variety of open source/proprietary offerings | that add support for it to various degrees.) | [deleted] | takeda wrote: | PostgreSQL doesn't offer multi-master replication. There | are extensions that do, but if aphyr evaluates them | he should emphasize that he is testing the extension, not | PostgreSQL (unless he finds a bug in PostgreSQL itself). | | I think he did something similar for MySQL when | evaluating the Galera cluster. | politician wrote: | Jepsen reports often include two distinct types of | analyses: correctness in a distributed storage system | under a variety of failure scenarios, and in-depth | analysis of consistency claims. Both examinations are | extremely helpful. | | In a single-write-master configuration, Postgres runs | transactions concurrently, so the consistency analysis is | still quite relevant. | | I don't think it's a stretch to say that everyone expects | Postgres to get top marks in this configuration and it | would be worth confirming that this is the case. | takeda wrote: | Actually he already did analyze PostgreSQL: | https://aphyr.com/posts/282-call-me-maybe-postgres | | But it was long ago, and maybe needs to be redone? | | Edit: after re-reading it he treats it as a distributed | system because client and server communicate over a network. And | that is true; it can also be thought of as a distributed | system because, as you said, transactions are concurrent | and are running as separate processes. Although in these | cases you can't have a partition (which aphyr uses to | find weaknesses), or maybe there is something equivalent | that happens?
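takeda's closing question, whether a single-node setup has anything equivalent to a partition, has a concrete answer: the client-server link itself can fail at the worst moment. A minimal Python sketch of the resulting indeterminate-commit ambiguity (the `Server`/`client_commit` names are invented for illustration, not any real driver API):

```python
class Server:
    """Toy single-node database: applies a write, then sends an ack."""
    def __init__(self):
        self.log = []

    def commit(self, value, ack_lost):
        self.log.append(value)             # the write is durable on the server...
        return None if ack_lost else "ok"  # ...but the ack may never arrive

def client_commit(server, value, ack_lost=False):
    """Returns 'ok', or 'indeterminate' when the network drops the ack.
    An indeterminate result does NOT mean the write failed."""
    resp = server.commit(value, ack_lost)
    return "indeterminate" if resp is None else resp

s = Server()
assert client_commit(s, "a") == "ok"
# Partition-like failure: the commit applied, but the client can't know.
assert client_commit(s, "b", ack_lost=True) == "indeterminate"
assert s.log == ["a", "b"]  # the "failed" write is actually there
```

This is exactly the class of ambiguity Jepsen probes: a client that treats "indeterminate" as "failed" (or that blindly retries) will under- or over-count committed writes.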
| zozbot234 wrote: | > PostgreSQL doesn't offer multi-master replication. | | Not in itself, but it does offer a PREPARE TRANSACTION - | COMMIT PREPARED / ROLLBACK PREPARED extension that could | be used to add such support in the future. This would not | be unprecedented, as the simpler case of db sharding is | already supported via the PARTITION BY feature, | combined with "FOREIGN" database access. | danpalmer wrote: | Not a question necessarily about the technical side, but I'm | interested in your opinion as to the root cause - is it a desire | to achieve certain results for marketing purposes, lack of | understanding/training in the team about distributed systems, | just bugs and a lack of testing...? Alternatively, does most of | this come down to one specific technical choice, and why might | they have made that choice? | | Very happy for (informed) speculation here; I recognise we'll | probably never know for certain, but I'm interested to avoid | making similar mistakes myself. | aphyr wrote: | There's a few things at play here. One is talking only about | the positive results from the previous Jepsen analysis, while | not discussing the negative ones. Vendors often try to | represent findings in the most positive light, but this was a | particularly extreme case. Not discussing default behavior is | a significant oversight, and it's especially important given | ~80% of people run with default write concern, and 99% run | with default read concern. | | The middle part of the report talks about unexpected but | (almost all) documented behavior around read and write | concern for transactions. I don't want to conjecture too much | about motivations here, but based on my professional | experience with a few dozen databases, and surveys of | colleagues, I termed it "surprising".
The fact that there's | explicit documentation for what I'd consider Counterintuitive | API Design suggests that this is something MongoDB engineers | considered, and possibly debated, internally. | | The final part of the report talks about what I'm pretty sure | are bugs. I'm strongly suspicious of the retry mechanism: | it's possible that an idempotency token doesn't exist, isn't | properly used, or that MongoDB's client or server layers are | improperly interpreting an indeterminate failure as a | determinate one. It seems possible that all 4 phenomena we | observed stem from the retry mechanism, but as discussed in | the report, it's not entirely clear that's the case. | danpalmer wrote: | Thanks for the thoughts. | | I get the impression that MongoDB may have hyped themselves | into a corner in the early days with poorly made (or | misleading) benchmarks. Perhaps they have customers with a | lot of influence determining how they think about | performance vs consistency. | | Maybe this combined with patching, re-patching, re-patching | again their replication logic/consistency algorithm means | that they'll be stuck in this sort of position for a long | time. | aphyr wrote: | Possibly! You're right that path dependence played a role | in safety issues: the problems we found in 3.4.0-rc3 were | related to grafting the new v1 replication protocol onto | a system which made assumptions about how v0 behaved. | That said, I don't want to discount that MongoDB _has_ | made significant improvements over the years. Single-document | linearizability was a long time in the works, | and that's nothing to sneeze at! | | http://jepsen.io/analyses/mongodb-3-4-0-rc3 | staticassertion wrote: | I've wanted to try building a toy database to learn more about | how they work - any suggestions for good resources? | [deleted] | lmilcin wrote: | I am tech lead for a project that revolves around multiple | terabytes of trading data for one of the top ten largest banks in the | world.
My team has three 3-node, 3TB-per-node MongoDB clusters | where we keep a huge number of documents (mostly immutable, 1kB to | 10kB in size). | | Majority write/read concern is exactly so that you don't lose | data and don't observe stuff that is going to be rolled back. It | is important to understand this fact when you evaluate MongoDB | for your solution. That it comes with additional downsides is | hardly a surprise, otherwise there would be no reason to specify | anything other than majority. | | You just can't test lower levels of guarantees and then complain | you did not get what higher levels of guarantees were designed to | provide. | | It is also obvious, when you use majority concern, that some of | the nodes may accept the write but then have to roll back when | the majority cannot acknowledge the write. It is obvious this may | cause some writes to fail that would succeed should the | write concern be configured to not require majority | acknowledgment. | | The article simply misses the mark by trying to create sensation | where there is none to be found. | | The MongoDB documentation explains the architecture and | guarantees provided by MongoDB well enough that you should be able | to understand the various read/write concerns and that anything below | majority does not guarantee much. This is a tradeoff which you | are allowed to make provided you understand the consequences. | lllr_finger wrote: | > The article simply misses the mark by trying to create | sensation where there is none to be found. | | As someone who is a tech lead for a large database install, I'd | urge you to read the rest of the Jepsen reports. They aren't | intended to be hit pieces on technology - they're deep dives | into the claims and guarantees of each database. IIRC MDB has | explicitly reached out to OP in the past (I doubt they'll | continue to do so after this).
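The majority-acknowledgment behavior described above, minority-acked writes being rolled back after a failover, can be sketched as a toy model (the `ReplicaSet` class and its methods are invented for illustration, not MongoDB code):

```python
class ReplicaSet:
    """Toy model of w:majority acknowledgment in an n-node replica set."""
    def __init__(self, n):
        self.n = n
        self.committed = []   # writes acknowledged by a majority
        self.pending = []     # writes seen by a minority only

    def write(self, value, acks):
        # A write is durable only once more than half the nodes ack it.
        if acks > self.n // 2:
            self.committed.append(value)
            return "acknowledged"
        self.pending.append(value)
        return "unacknowledged"

    def failover(self):
        """On election of a new primary, minority-only writes roll back."""
        rolled_back, self.pending = self.pending, []
        return rolled_back

rs = ReplicaSet(3)
assert rs.write("a", acks=2) == "acknowledged"    # primary + one secondary
assert rs.write("b", acks=1) == "unacknowledged"  # primary only
assert rs.failover() == ["b"]                     # 'b' is lost on failover
assert rs.committed == ["a"]
```

The point of contention in the report is not this rule itself but that the default write concern did not require the majority path at all, so the `"b"` case above was the out-of-the-box behavior.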
| | Why that matters to the rest of us: once I learn all those | dials and knobs I'm left wondering why I would choose Mongo | over another technology, and how much the design of the default | behavior and complexity of said dials/knobs are influenced by | their core business. | lmilcin wrote: | I agree. MongoDB has a large number of peculiarities that you | had better know before you buy in. It is definitely not as rosy | as advertised. In particular it seems the product is not | mature (especially if you come from the Oracle world) and the | features seem slapped on as they go and not thought through. | nosequel wrote: | Since you just leaned all the way in, while repeatedly proving | you either will not, or cannot, read the posted article at all: | will you let us know what bank you support, so at least I can | make sure I never use that bank? | | Thanks, those of us who care about our banking and investing | data. | aphyr wrote: | _You just can't test lower levels of guarantees and then | complain you did not get what higher levels of guarantees were | designed to provide._ | | Gently, may I suggest that you read the report, or at least the | abstract? This is addressed in the second sentence. :-) | lmilcin wrote: | To quote from the report: "Moreover, the snapshot read | concern did not guarantee snapshot unless paired with write | concern majority--even for read-only transactions." | | Of course, it doesn't work when you don't pair it with | majority read/write concern. You can't expect to get a | snapshot of data that wasn't yet acknowledged by a majority of | the cluster. | | As to the quote you probably are referring to: | | "Jepsen evaluated MongoDB version 4.2.6, and found that even | at the strongest levels of read and write concern, it failed | to preserve snapshot isolation." | | I did not find any proof of this in the rest of the report. | It seems this is mostly a complaint about what happens when you | mix different read and write concerns.
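The hazard of mixing weaker read concerns with majority writes, debated above, can be illustrated with a toy model (all names here, `Node`, `write_local`, and so on, are invented for illustration):

```python
class Node:
    """Toy node: 'local' reads see unreplicated writes; 'majority' reads don't."""
    def __init__(self):
        self.majority_committed = {}
        self.local = {}

    def write_local(self, k, v):
        self.local[k] = v             # applied locally, not yet majority-acked

    def replicate(self, k):
        self.majority_committed[k] = self.local[k]

    def rollback(self, k):
        del self.local[k]             # minority-only write undone after failover

    def read(self, k, concern="local"):
        store = self.local if concern == "local" else self.majority_committed
        return store.get(k)

n = Node()
n.write_local("x", 1)
assert n.read("x", "local") == 1        # visible at read concern local...
assert n.read("x", "majority") is None  # ...but not durable
n.rollback("x")
assert n.read("x", "local") is None     # the observed value never "happened"
```

This is why a read at a weak concern can observe state that is later rolled back: the report's stronger claim, which the toy above does not model, is that anomalies appeared even when both concerns were set to their strongest levels.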
| | I would also suggest thinking a little about the concept of a | snapshot in the context of a distributed system. It is not | possible to have the same kind of snapshot that you would get | with a single-node application with the architecture of | MongoDB. MongoDB is a distributed system where you will get | different results depending on which node you are asking. | | The only way you could get close to having a global snapshot | is if all nodes agreed on a single truth (for example a single | log file, blockchain, etc.), which would preclude reads/writes | with a concern level below majority. | aphyr wrote: | > I did not find any proof of this in the rest of the | report. | | May I suggest sections 3.4, 3.5, 3.6, 3.7, 4.0, and 4.1? | lmilcin wrote: | Quoting half the report is bad for the discussion as it | makes it impossible for the reader to follow. | inglor wrote: | > May I suggest an alternative perspective on the matter? | | Can't reply to that since it's too nested so I'll reply | here. I warmly recommend getting off the tree you climbed on | and actually reading the article, because if you do you | will see you are not disagreeing on that part. | | The article is a mostly technical analysis of the | transaction isolation levels and where they hold. The | main criticism is how MongoDB _advertises_ itself. If | they didn't claim the database is "fully ACID" then the | article would have just been a technical analysis :] | aphyr wrote: | Chief, it does _not_ have to be this hard. 3.4 clearly | states: | | _This anomaly occurred even with read concern snapshot | and write concern majority_ | | 3.5: _In this case, a test running with read concern | snapshot and write concern majority executed a trio of | transactions with the following dependency graph_ | | 3.6: _Worse yet, transactions running with the strongest | isolation levels can exhibit G1c: cyclic information | flow._ | | 3.7: _It's even possible for a single transaction to | observe its own future effects.
In this test run, four | transactions, all executed at read concern snapshot and | write concern majority, append 1, 2, 3, and 4 to key 586 | --but the transaction which wrote 1 observed [1 2 3 4] | before it appended 1._ | | Like... if you had read any of these sections--or even | their very first sentences--you wouldn't be in this | position. They're also summarized both in the abstract | and discussion sections, in case you skipped the results. | | 4.0: _Finally, even with the strongest levels of read and | write concern for both single-document and transactional | operations, we observed cases of G-single (read skew), | G1c (cyclic information flow), duplicated writes, and a | sort of retrocausal internal consistency anomaly: within | a single transaction, reads could observe that | transaction's own writes from the future. MongoDB appears | to allow transactions to both observe and not observe | prior transactions, and to observe one another's writes. | A single write could be applied multiple times, | suggesting an error in MongoDB's automatic retry | mechanism. All of these behaviors are incompatible with | MongoDB's claims of snapshot isolation._ | | It's OK to stop digging now! | lmilcin wrote: | May I suggest an alternative perspective on the matter? | | Compared to a product like Oracle, transactions on | MongoDB are very new, very niche functionality. Even | MongoDB consultants openly suggest not using it. | | MongoDB is really meant to store and retrieve documents. | That's where the majority read/write concern guarantees | come from. | | As long as you are storing and retrieving documents you | are pretty safe, functionally. | | Your article presents the situation as if MongoDB did not | work correctly at all. That is simply not true; the most | you can say is that a single (niche) feature doesn't | work. | | Have you ever tried distributed transactions with | relational databases?
Everybody knows these exist but | nobody of sound mind would ever architect their | application to rely on them. | | Any person with a bit of experience will understand that | things don't come free and some things are just too good | to be true. MongoDB marketing may be a bit trigger-happy | with their advertisements but it does not mean the | product is unusable; they just probably promised a bit too | much. | JohnBooty wrote: | This comment will rightfully be downvoted, but I'm going | to break HN decorum for once in my long posting history | here and simply say: | | Holy _shit_, buddy. Stop. | threeseed wrote: | At least he is contributing something to the | discussion. | | He may be right. He may be wrong. But it helps everyone | learn. | | Your comments contribute nothing. So how about you stop? | lmilcin wrote: | The world does not revolve around HN votes. If your first | urge is to wonder whether the post gets downvoted or not you might | want to rethink your life a little bit. | | So don't worry about me. | dang wrote: | Please stop. We don't want flamewars here. | JohnBooty wrote: | I'm not "worried" nor experiencing an "urge." Please skip | the concern trolling. | | What I do have an interest in is HN's accepted decorum, | which I admittedly stepped outside of when I implored you | to stop digging yourself such a hole. | | HN is far from perfect but there is a culture of | respectful discourse here, which is part of the reason | for its value IMO. | speedgoose wrote: | You may want to delete this comment too. | [deleted] | jiofih wrote: | May I suggest the tiniest bit of consideration (such as | reading the report) before jumping to conclusions and | low-key offending the author? You should be embarrassed. | aphyr wrote: | _Have you ever tried distributed transactions with | relational databases?_ | | I am delighted to say that yes: checking safety | properties of distributed systems, including those of | relational databases, is literally my job.
See | https://jepsen.io/analyses for a comprehensive list of | prior work, or http://jepsen.io/analyses/tidb-2.1.7, | http://jepsen.io/analyses/yugabyte-db-1.1.9, | http://jepsen.io/analyses/yugabyte-db-1.3.1, or | http://jepsen.io/analyses/voltdb-6-3 for recent examples | of Jepsen analyses on relational databases. | [deleted] | logicchains wrote: | Did you see the part about "Operations in a transaction use | the transaction-level read concern. That is, any read | concern set at the collection and database level is ignored | inside the transaction."? | | "Transactions without an explicit read concern downgrade any | requested read concern at the database or collection level | to a default level of local, which offers "no guarantee | that the data has been written to a majority of replicas | (i.e. may be rolled back)."" | | The big problem is that, even if somebody correctly sets | the read and write concerns to something sensible, the | moment they use a transaction these guarantees fly out the | window, unless they read the docs carefully enough to | realise they have to set the read and write concern for the | transaction too. The defaults are very unintuitive; I | can't imagine that the case of somebody needing snapshot | isolation in general but being fine with arbitrary data | loss in transactions is a common case, compared to wanting | to avoid data loss both generally and in transactions. | lmilcin wrote: | It is one thing to complain about unclear documentation | and unintuitive guarantees, and another to say that it just doesn't | work. | | Yes it works. Yes, you have to read the documentation | very carefully.
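The downgrade rule quoted above can be captured in a one-line toy function, purely to make the documented behavior explicit (the function name and signature are invented, not a MongoDB API):

```python
def effective_read_concern(txn_rc=None, coll_rc=None):
    """Toy model of the documented rule: inside a transaction, the
    collection/database-level read concern (coll_rc) is ignored, and an
    unset transaction-level read concern defaults to 'local'."""
    return txn_rc if txn_rc is not None else "local"

# A collection-level 'snapshot' setting does NOT carry into the transaction:
assert effective_read_concern(txn_rc=None, coll_rc="snapshot") == "local"
# Only an explicit transaction-level setting takes effect:
assert effective_read_concern(txn_rc="snapshot", coll_rc="local") == "snapshot"
```

The first assertion is the surprise logicchains is pointing at: carefully configured collection-level guarantees silently weaken to "local" the moment a transaction starts.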
As an anecdotal data point - | we've read the docs (carefully) and spoke to MongoDB | quite a bit when implementing transactions including | their highest paid levels of support and still ran into | this issue: | | > transactions running with the strongest isolation | levels can exhibit G1c: cyclic information flow. | | As well as the Node.js API issue (I just checked randomly | and their Python API has the same bug lol) listed above. | bronson wrote: | If a database advertises attributes that aren't a part of its | default setup, you can expect its docs to make it very simple | and clear how to get them. | | If not, that's misrepresentation. | lmilcin wrote: | The documentation states that very clearly and the attributes | are part of every call to the database (as long as you are | using native driver). | | In any case any person that has some experience with | distributed systems will understand what it roughly means to | get an acknowledgment from just a single node vs. waiting for | the majority. | | Oracle also does not use serializable as its default | isolation level, yet it advertises it. | | This is all part of the product functionality. Whenever you | evaluate product for your project you have to understand | various options, functionalities and their tradeoffs. | | Defaults don't mean shit. In a complex clustered product you | need to understand all important knobs to decide the correct | settings and configurable guarantees are most important knobs | there are. | bronson wrote: | Good point. If there's a database that rivals Mongo for | shady sales tactics, it's Oracle. 
| dang wrote: | All: there was a big thread about this yesterday | (https://news.ycombinator.com/item?id=23285249) but because it | didn't focus on the technical content, and because there were | glitches with a previous submission of this report (described at | https://news.ycombinator.com/item?id=23288120 and | https://news.ycombinator.com/item?id=23287763 if anyone cares), | we invited aphyr to repost this. Normally we downweight follow-up | posts that have such close overlap with a recent discussion | (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...), so | the exception is probably worth explaining. | sorokod wrote: | I suppose there are reasons why the defaults are the way they | are. Can anyone comment on the implications, performance or | otherwise, of bumping up the read/write concerns? | aphyr wrote: | Latency is a big one--you've got to wait an extra round-trip | for secondaries to acknowledge primary writes, and primaries | (assuming you don't have reliable clocks) need to check in with | secondaries to confirm they have the most recent picture of | things if you want to do a linearizable read. Snapshot isolated | reads _shouldn't_ require that, at least in theory--it's legal | to read state from the past under SI, so there's no need to | establish present leadership. That's why I'm surprised that | MongoDB requires snapshot reads to go through write concern | majority--it doesn't _seem_ like it'd be necessary. Might have | something to do with sharding--maybe establishing a consistent | cut across shards requires a round of coordination. Even then I | feel like that's a cost you should be able to pay only at write | time, making reads fast, but... apparently not! I'm sure the | MongoDB engineers who designed this system have good reasons; | they're smart folks and understand the replication protocol | much better than I do. | | MongoDB's also published a writeup (which is cited a few times | in the Jepsen report!)
talking about the impact of stronger | safety settings and why they chose weak defaults: | http://www.vldb.org/pvldb/vol12/p2071-schultz.pdf | goatinaboat wrote: | In general, MongoDB's defaults fall into two categories. The | first could possibly be justified as making it easy for | inexperienced devs to get started, but it means that people | rely on those defaults and then try to promote to production, | and unless there is an experienced traditional DBA with the | power to veto it, it will go ahead. This is how they "backdoor" | their way into companies. The second category is whatever will | look good on a benchmark, regardless of any corners cut. | | Compare and contrast with the highly ethical Postgres team, who | encourage good practices from the start and who get a feature | right first before worrying about performance. That may harm | their adoption in the short term but over the long term, that's | why they're the gold standard. And with their JSONB datatype | they have a better MongoDB than MongoDB anyway! And have a | million other features besides! | logicchains wrote: | >And have a million other features besides! | | Yeah, but in spite of that their performance still sucks | compared to writing directly to /dev/null, and that's where | Mongo steals their thunder. | threeseed wrote: | > Compare and contrast with the highly ethical Postgres team | | You do know that PostgreSQL had issues with not fsyncing data | as well? It's technology. Bugs will be made. Design | decisions will be wrong. | | I think it's really disappointing and inappropriate to be | labelling MongoDB engineers as unethical for simply having | incorrect defaults, which in their history they have often changed | after being made aware of them. | junon wrote: | I wanted to incorporate MongoDB into a C++ server at one point. | | Their C/C++ client is literally unusable.
I went to look into | writing my own that actually worked and their network protocols | are almost impossible to understand. BSON is a wreck and | basically the whole thing discouraged me from ever trying to | interact with that project again. | bbulkow wrote: | mongodb's business model, forever, has been to get developers to | write code, be damned the fact that you can't support it reliably | on a cloudy day. | jtdev wrote: | Now do DynamoDB. | aphyr wrote: | I'd like to, but I don't have any way to do fault injection on | a system someone else owns. :( | jtdev wrote: | Would love to see AWS agree to facilitating this. Appreciate | your work very much! | petrikapu wrote: | They have a downloadable version of it: | https://docs.aws.amazon.com/amazondynamodb/latest/developerg... | [deleted] | chousuke wrote: | This article reinforces my stance that bad defaults are a bug. | Defaults should be set up with the least number of pitfalls and | safety tradeoffs possible so that the system is as robust as it | can be for the majority of its users, since the vast majority of | them aren't going to change the defaults. | | Sometimes you end up with bad defaults simply by accident but I | feel like for MongoDB the morally correct choice would be to own | up to past mistakes and change the defaults rather than maintain | a dangerous status quo for "backwards compatibility", even if you | end up looking worse in benchmarks as a result. | aphyr wrote: | I think this is a good way to look at things, and there are | vendors who do this! VoltDB, for instance, changed their | defaults to be strict serializable even though it imposed a | performance hit, following their Jepsen analysis. | https://www.voltdb.com/blog/2016/07/voltdb-6-4-passes-offici... | mtrycz2 wrote: | aphyr, you are a great inspiration as an engineer and as a | human. | | Your attitude of "a tool I need doesn't exist, so I'll just go | ahead and create it" blew my mind and changed me for the better.
| | I'm dedicating my next test framework to you. Thank you for | everything. | aphyr wrote: | Aw shucks, thank you! <3 | inglor wrote: | Without going into details due to NDAs, the experience in the OP | matches those of several Fortune 500 companies I had gigs | with. | sam1r wrote: | Extremely well written! I learned a lot. | | I wonder if someone can type up a well-manicured post-mortem of | the recent Triplebyte incident? | fastball wrote: | At this point I think we might be going a bit overboard with | title changes. | | Now that it's just "MongoDB 4.2.6", the title makes me think that | this is a release announcement, not an analysis of the software. | | The first title (that specifically referenced a finding of the | analysis) was best, imo. Mildly opinionated or whatever, but at | least it quickly communicated the gist of the post. On the other | hand: | | "Jepsen: MongoDB 4.2.6" - not super helpful if you're not already | familiar with the Jepsen body of work. | | "MongoDB 4.2.6" - as stated above, sounds like a release | announcement. | | If you want a suggestion, maybe something like "Jepsen evaluation | of MongoDB 4.2.6"? Not overly specific (/ negative) like the | first title, but at least provides some slight amount of context. | | @dang | dang wrote: | Please read the site guidelines: | https://news.ycombinator.com/newsguidelines.html.
They say: " | _If the title includes the name of the site, please take it | out, because the site name will be displayed after the link._ " | That's why a moderator changed it: the submitted title was | "Jepsen: MongoDB 4.2.6". | | I don't mind making an exception, since exceptions are things | sometimes. Jepsen is famous on HN, so the current title is not | an issue. Indeed, referencing a specific finding would arguably | be misleading, since this article _is_ the Jepsen report about | MongoDB 4.2.6. Btw, I don't know what you mean by "The first | title (that specifically referenced a finding of the analysis) | was best". The submitted title was "Jepsen: MongoDB 4.2.6" and | it has only ever rotated between two states, one with "Jepsen: | " and one without. Are you confusing this thread with | https://news.ycombinator.com/item?id=23285249? | | It's very silly to have this be the top comment on the page | (I've since downweighted it, but that's where it was when I | looked in). Yesterday I briefly swapped the URL of this article | into the other thread, but then reversed that because it seemed | that thread couldn't support a more technical discussion | (https://news.ycombinator.com/item?id=23288120). I invited | aphyr to repost it instead, which was quite a break from our | standard practice of downweighting follow-up posts, but seemed | like the best solution at the time. What technical discussion | was our reward? Bickering about title policy! | aphyr wrote: | This... usually happens on Jepsen HN threads. The full title, | as in the page metadata, and as originally submitted, is | "Jepsen: MongoDB 4.2.6". At some point a mod drops the "Jepsen:" | part, then we have this discussion, and it comes back. :) | | "Why don't you put 'Jepsen:' on the same line as the database | name and version?" | | Space concerns, and also, it's immediately above the DB name in | giant letters. | | "Why don't you give them more creative names?"
| | Clients _love_ to argue about the titles of these analyses; | having a concise, predictable policy for titling is how I get | past those discussions. | fastball wrote: | As another commenter pointed out, it might be worth making | the titles "An evaluation of X" going forward - better for HN | and probably better everywhere else this is shared too. | aphyr wrote: | Not sure how many ways I can say this: the titles are | _already_ "Jepsen: X". HN's got a policy in place that | means sometimes mods change the title to just "X". That's | not something I have control over, sorry. | fastball wrote: | Right, but "Jepsen: X" doesn't really mean anything to | anyone that isn't familiar with your work. "An Evaluation | of X" is much more informative. | aphyr wrote: | I've invested seven years of my life into this brand, and | my choices are carefully considered. | Ecco wrote: | It's the article's title... | fiddlerwoaroof wrote: | A generic "Mongo 4.2.6" title doesn't help me decide whether | to click on the link (especially with how light the domain | is). I thought it was a release announcement and only clicked | through to the comments because of yesterday's discussion. | dang wrote: | An HN title needs to be read along with the site name to | the right of it. | fiddlerwoaroof wrote: | The styling of the site name makes it hard to scan. If | it's so essential, the font should be darker and bigger. | dang wrote: | That's a fair point, but people have a lot of | contradictory preferences about things like that. I think | I'd rather address this by allowing more customization of | the site. Still thinking about | https://news.ycombinator.com/item?id=23199264. | fiddlerwoaroof wrote: | As I said there, I'd like to see that added | Fiveplus wrote: | No context doesn't help. | cromulent wrote: | Well... it's the article's second H1 header. The title is | "Jepsen: MongoDB 4.2.6". | petepete wrote: | And taken out of context it makes little sense.
| simias wrote: | "An evaluation of MongoDB 4.2.6" might be neutral and | informative enough, I suppose. | | But then again ultimately the blame is on the author of the | article; it's a terrible title for this type of article. I can | understand if the moderators here don't want to go through the | trouble of dealing with editorialized titles (with all the | controversies it could generate) when clearly the original | author didn't care enough to come up with a decent title. | takeda wrote: | Why? His site is about evaluating distributed data stores. In | the context of his site, that title makes perfect sense; HN | should just add the missing context to its title. | fastball wrote: | Because as can be seen from the fact that most people only | found this article because it was posted on HN (and not | because they were browsing the site), the context of the | overall site isn't super relevant. | | Site context isn't a given when most of us are finding | content via 3rd party sources. | inglor wrote: | I also want to point out that their Node.js transactions API is | wrong and it looks like they have no idea how promises or async | code work in JS. | | In mongo, you have a `withTransaction(fn)` helper that passes a | session parameter. Mongo can call this function multiple times | with the same session object. | | This means that if you have an async function with a reference to a | session and a transaction gets retried - you very often get "part | of one attempt + some parts of another" committed. | | We had to write a ton of logic around their poor implementation | and I was shocked to see the code underneath. | | It was just such a stark contrast to products that I worked with | before that generally "just worked", like postgres, elasticsearch | or redis. Even tools people joke about a lot like mysql never | gave me this sort of data corruption.
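A toy sketch of the retry hazard inglor describes: a callback that mutates state defined outside itself leaks that state across retry attempts. This is not the driver's actual code; `with_transaction` and the failure injection here are invented for illustration (in Python rather than Node.js):

```python
def with_transaction(fn, fail_first_attempt=True):
    """Toy retry helper in the spirit of the driver's withTransaction:
    on a transient error, the SAME callback is simply invoked again."""
    committed = []
    for attempt in range(2):
        txn = []                  # each attempt starts a fresh transaction
        fn(txn)
        if fail_first_attempt and attempt == 0:
            continue              # simulate a transient error: retry fn
        committed.extend(txn)
        return committed

docs_to_insert = ["a", "b"]       # state shared across attempts -- the bug

def callback(txn):
    while docs_to_insert:         # mutates the shared state!
        txn.append(docs_to_insert.pop())

# The first (aborted) attempt drains docs_to_insert, so the retried
# attempt commits an EMPTY transaction, silently dropping both writes.
assert with_transaction(callback) == []
```

The safe pattern is a callback that is idempotent with respect to everything outside the transaction: re-running it from scratch must produce the same writes, because the helper is free to do exactly that.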
| | Edit: I was kind of angry when writing this so I didn't provide a | source and I'm a bit surprised this got so many upvotes without a | source (I guess this community is more trusting than I assumed :] | ). Anyway, for good measure and to behave the way I'd like others | to when making such accusations, here is where they pass the same | session object to the transaction https://github.com/mongodb/node-mongodb-native/blob/e5b762c6... (follow from withTransaction in | that file) - I can add examples of code easily introducing the | above-mentioned bug if people are interested. | jfkebwjsbx wrote: | > Even tools people joke about a lot like mysql never gave me | this sort of data corruption. | | People rightfully joked about MySQL when they had the non-ACID | engine. | | Same for MongoDB. A database that loses data when properly used | is a joke. | | Yes, there are use cases out there for fast non-guaranteed | writes. No, 99% of companies don't have them. | Something1234 wrote: | Can you name a use case for a fast non-guaranteed write? | throwaway744678 wrote: | Analytics: you don't want to slow down your app, and you | don't care if you lose a few records in the process. | zbentley wrote: | Importantly, in analytics workloads, it is very important | to know roughly _how many_ writes aren't making it. | Otherwise your analytics system sucks. | rocho wrote: | Interesting. How would one know that? | zbentley wrote: | Good question. You'd need some accurate-enough data | source telling you about failed writes. Which eventually | comes back around to needing a consistent database and | indications of client disconnects. | why-el wrote: | Even more elementary than the sibling comments, this also | happens in gaming all the time. You are recording live | results, say in FIFA, but if you unplug your device, your | results are gone, since they were memory only.
The game | simply cannot afford to write to disk; the write is "non-guaranteed" in the true sense of the word, but it is | _fast_. | | You then "checkpoint" when the game is over. | | You might object that this is not a "non-guaranteed" write, | because in fact the write did occur, but I simply want to | allude to the concept of a "non-secured" write, in that it | vanished without an fsync. | threeseed wrote: | Telemetry. | | I work for a telco where we log large amounts of network | requests using MongoDB. | jfkebwjsbx wrote: | The number of likes in a given post in your favorite | $social-network-of-the-year. | twic wrote: | Caches. If you lose a write, you just get a cache miss. | | Periodic snapshots of state held elsewhere. If you lose a | write, you just get stale data until the next update. | | Firm realtime work. If you lose a write, that sucks, but a | slow write sucks just as much. | Jweb_Guru wrote: | Sure. Data that people don't care about enough to be | worried about losing--for example, time series data from an | unimportant remote sensor. Should this data be recorded at | all? Maybe not, but if it should, then a best-effort recording | may be fine. It may even be all that's possible. | mbreese wrote: | I wouldn't go as far as to say an "unimportant" remote | sensor... but I think you're correct in spirit. | | I could think of an instance where you'd like to log | data, but the occasional datapoint being missing wouldn't | be terrible. Maybe something like a temperature monitor | -- you'd like to have a record of the temperature by the | minute, but if a few records dropped out, you'd be able | to guess the missing values from context. Something like | the data monitoring equivalent of UDP vs TCP. | xeromal wrote: | I just want to remind people that this video exists. | | https://www.youtube.com/watch?v=b2F-DItXtZs | vorticalbox wrote: | > Mongo can call this function multiple times with the same | session object. | | isn't that the point?
you can use a session to do multiple actions | within that session. | inglor wrote: | If you have code that looks like this:
|
|     withTransaction(async session => {
|       await Promise.all([someOp(session), someOtherOp(session)]);
|     });
|
| Mongo may retry running it (calling the function again) if a | "TransientTransactionError" is raised (the transaction is | retried from the client side rather than at the cluster). | | However, when the driver calls your function again it doesn't | invalidate the `session` object - so previous calls to the | same function can make updates to the database. | | Let's say `someOp` does something that causes the transaction | to retry and `someOtherOp` is doing something non-mongo-related in the meantime (like pulling a value from redis). | Now `someOtherOp` reaches the mongo part of its code and | executes it happily with the same session object (so | operations succeed although they really shouldn't). | | The point of transactions, like you said, is to perform | multiple operations atomically and for them to happen | "exactly once or not at all". With Mongo in practice it is | very easy to get "Once and some leftovers from a previous | attempt". | IgorPartola wrote: | Sorry, I haven't had my coffee yet. If I am reading this | correctly, either someOp() or someOtherOp() may execute | first, no? And if you introduce an external database, why | do you expect Mongo to handle that rollback? Say | someOtherOp() increments a Redis value by 1. If that part | executed first since both are asynchronous here, what would | a Mongo session have to do with it? | | What exactly would invalidating that session object do | here? And what would the session object do after it was | invalidated? | [deleted] | waheoo wrote: | It sounds like the old session object is reused and | becomes live again or something.
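The retry/interleaving hazard described above can be reproduced with a toy mock. To be clear, this is NOT the real MongoDB driver: `withTransaction`, `someOp`, `someOtherOp`, and the session shape are all invented here to illustrate the failure mode of retrying a callback while work from the previous attempt is still in flight.

```javascript
// Toy mock of a retrying transaction helper. The key (mis)behavior:
// on a transient error the callback is called again with the SAME
// session object, without waiting for or invalidating the first attempt.

const committed = []; // stands in for writes that actually reached the DB

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withTransaction(fn) {
  const session = { attempt: 0 }; // one session, reused across retries
  for (let attempt = 1; attempt <= 2; attempt++) {
    session.attempt = attempt;
    try {
      await fn(session); // on retry, fn is simply called again...
      return;
    } catch (err) {
      if (err.message !== 'TransientTransactionError') throw err;
      // ...while work from the previous attempt may still be pending
    }
  }
}

async function someOp(session) {
  // Fails transiently on the first attempt only.
  if (session.attempt === 1) throw new Error('TransientTransactionError');
  committed.push(`someOp (attempt ${session.attempt})`);
}

async function someOtherOp(session) {
  const startedOn = session.attempt;
  await sleep(20); // off doing non-mongo work, e.g. talking to redis
  committed.push(`someOtherOp (started on attempt ${startedOn})`);
}

withTransaction(async (session) => {
  await Promise.all([someOp(session), someOtherOp(session)]);
}).then(() => setTimeout(() => console.log(committed), 50));
```

Running this logs three committed writes instead of two: attempt 2's `someOp` and `someOtherOp`, plus the leftover `someOtherOp` that was started on attempt 1 and landed anyway — exactly the "once and some leftovers from a previous attempt" outcome described above.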
| Namari wrote: | I think this is the expected behaviour of the transaction, | but the problem comes from the fact that you wrap all DB | operations inside a Promise.all. | | Because you wrap the DB operations inside a Promise.all, it | will run them all BUT it will not revert them if one fails | (it's not atomic; it just says that one has failed and you | need to catch it). It will reject them but not revert them | (the CUD operations will already have changed the data). The | problem, I believe, is the transaction is considering the | Promise.all and not what's inside of it, so it will run it | again despite the fact that some operations have already | succeeded earlier. | | I think you just have to resolve each of them outside a | Promise.all. In your case, because the Promise.all has been | rejected it will redo the transaction, therefore it will | redo the ones that have already worked in the first call. | | I'm no expert but this is how I understand it. | gabrieledarrigo wrote: | This is right. Are you sure inglor that you know how to | write code? | bambataa wrote: | Thanks for this explanation. So if I understand correctly, | `someOp` has thrown an error but this doesn't affect | `someOtherOp`? So `someOtherOp` will end up being called | twice? | inglor wrote: | Correct, the easy workaround is not to use that | transaction API and write your own disposer instead of | using withTransaction. | [deleted] | takeda wrote: | MySQL is less of a joke than MongoDB is. It was similarly started | by someone who didn't know anything about databases and learned | about them on the go. Actually, both of them started as much | faster alternatives to other databases, and both ended up having a | complete rewrite of their engine done by someone from outside | who knew their stuff. MySQL: ISAM, then MyISAM, and then InnoDB | (written by an outsider). Similarly, MongoDB got WiredTiger.
| | The thing is that MySQL is older so it went through all of it | earlier, but it still suffers from poor decisions from the | past. This contrasts with PostgreSQL, where correctness | and reliability were #1 from the beginning. It started as an | awfully slow database, but performance improved and we now | have a correct, reliable and fast database. | berns wrote: | MySQL is no joke; nothing is perfect and PostgreSQL is not | 100% reliable. Remember: | | Transaction ID wraparound: | https://twitter.com/bcantrill/status/1110647418008133632 | | Incorrect use of fsync: | https://news.ycombinator.com/item?id=19119991 | goatinaboat wrote: | _MySQL is no joke_ | | If you were around back in the day you will remember the | MySQL team claiming that no one needed transactions or | referential integrity, that you should just do it yourself | in the application... | beatrobot wrote: | And still no transactional DDL in MySQL | mathnode wrote: | No, but it does support online DDL for some operations in | InnoDB. | | Very few database systems support online DDL, which, | unlike a transaction, does not require undo or rollback | resources. Of course one must have a rollback procedure | if something fails, but you need one for transactions | too, just in case. | | An online rollback is far less costly than a | transactional rollback, because an online rollback is | just undoing what you did. Added a column you didn't want | in one query? Remove it again in another, very quickly. | | TokuDB (a MySQL/MariaDB storage engine) supported all DDL | as an online operation. But Percona killed it in favour | of TokuMX, the MongoDB equivalent. | | TokuMX has no upgrade path to WiredTiger, only one major | customer at Percona (I can't say who it is) and no | engineers. | | Any kind of DDL is tricky and requires users to RTFM for | the intricacies of their chosen database. One size rarely | fits all.
| edw wrote: | MySQL's rise IMO cannot be considered without also | looking at the rise of Ruby on Rails and other CRUD-optimized platforms and frameworks. Also ORMs. These | things denigrated the idea of using an RDBMS as anything | but a dumb table store. Features like stored procedures | and views were seen as pointless. MySQL was the perfect | database for people who had no respect for databases. | twic wrote: | Does MySQL support check constraints yet? | dnissley wrote: | It does finally! In 8.0.16+ | [deleted] | hodgesrm wrote: | > Even tools people joke about a lot like mysql never gave me | this sort of data corruption. | | That's about a decade out of date at this point. MySQL/InnoDB | is the standard table engine and corruption is exceedingly | rare. As of 2014, when I last directly worked on MySQL prod | systems, there was no practical difference from PostgreSQL in | terms of transactional guarantees. That includes APIs like JDBC, | which we used for billions of transactions. | morelisp wrote: | MySQL still has no transactional DDL (and I think still even | autocommits if you try). This is a major difference from | Postgres, which I believe supports everything short of | dropping tables. | yobert wrote: | Every month, we do an external database import into our | production PostgreSQL database. We drop dozens of tables, | create new ones with the same names, insert hundreds of | thousands of rows, and recreate indexes, all in a single | transaction. It works flawlessly. | takeda wrote: | I wouldn't use that particular thing against MySQL. DDL is | normally supposed to be outside of a transaction; it's | just a PostgreSQL feature that you can use it inside one | and be able to roll back. BTW I'm convinced you can also | drop a table within a transaction in PostgreSQL. | morelisp wrote: | No, MySQL stands out here. Postgres, SQL Server, DB2, and | Firebird all give at least some way to do some major DDL | transactionally.
Usability varies (e.g. Oracle supports a | very specific kind of change that is not its normal DDL | statements), but it's at least possible. | | https://wiki.postgresql.org/wiki/Transactional_DDL_in_Postgr... | | That MySQL autocommits is also even worse than just | "doesn't support it." | dragonwriter wrote: | > DDL is normally supposed to be outside of a | transaction | | A basic element of the relational model is that metadata | is stored as relational data and that the same guarantees | that apply to manipulating main data in the database | apply to manipulating the schema metadata. | | It's true that many real relational databases compromise | on this element in various ways at times, but it is | absolutely not the case that DDL "is supposed to be" non-transactional. | Carpetsmoker wrote: | The biggest issue with MySQL/MariaDB isn't so much data | corruption at the InnoDB level but stuff like:
|
|     MariaDB [test]> create table test ( i int );
|     Query OK, 0 rows affected (0.06 sec)
|
|     MariaDB [test]> insert into test values (''), ('xxx');
|     Query OK, 2 rows affected, 2 warnings (0.01 sec)
|
|     MariaDB [test]> select * from test;
|     +------+
|     | i    |
|     +------+
|     |    0 |
|     |    0 |
|     +------+
|     2 rows in set (0.01 sec)
|
| There's a bunch of other similar caveats as well, and this | can really take you by surprise. I've seen it introduce data | integrity issues more than once. | | That's a new MariaDB 15.1 with the default settings I just | installed the other day to test some WordPress stuff. I know | there are warnings, and that you can configure this by adding | STRICT_ALL_TABLES to SQL_MODE, but IMO it's a dangerous | default. | | This is also an issue with using MongoDB as a generic | database: every time I've seen it used there were these kinds | of data integrity issues: sometimes minor, sometimes bringing | everything down.
Jepsen reports aside, this alone should make | people double-check if they really want or need MongoDB, | because it turns out that most of the time you don't really want | this. | mathnode wrote: | 15.1 is not a version. Since MariaDB 10.2, this is not | possible as strict_trans_tables is enabled by default in | sql_mode. | hintymad wrote: | Just curious, what was the reason that your team decided to | work around the problem instead of migrating away from MongoDB? | inglor wrote: | We have a complicated system and migration is ~3 months during | which we won't be shipping features. | | We have a roadmap we need to meet, and so far we have been | trying to throw money at it rather than developer time (paying | for Mongo Atlas) and adding features incrementally as Mongo gets | them (like transactions). | | If this wasn't a startup we would probably rewrite. | capableweb wrote: | Not the author but I've done similar things (patching something | rather than migrating away from it). Usually it's way more | work to migrate away than just patching it again to fit your | use-case. Once you find yourself having to patch it too | often, you start thinking about migrating away. Then the | research slowly begins ad-hoc until it hits "seems we need to | migrate away now, otherwise we're spending too much time | working around something / fixing their broken shit"; that's | when you sit down and decide to migrate away from it. | | It also depends on how long you think the application | will be around. You're building an MVP to evaluate something? | Just hack together whatever will work (then throw away). | You're maintaining software for a library/archive that will most | likely stick around for a long time, even if they say it's | just temporary? Make decisions that will help in the future, | always.
| | I just don't want to be called to the office on a weekend | anymore for this sort of BS. | | Production incidents with MongoDB last year: 15. Production | incidents with Redis, Elasticsearch and MySQL combined last | year: 2 (and with much less severity). | | Edit: just to add: I didn't pick Mongo, I was just the engineer | called to clean that mess. I created enough of my own messes to | not resent the person who made that call. We are | constantly on the verge of rewriting the MongoDB stuff since a | database that small (~250GB) should really not have this many | issues (in previous workplaces I ran ~10TB PostgreSQL | deployments with much more complicated schemas and queries with | far fewer issues). It's also expensive, and support at Mongo | Atlas hasn't been great (we should probably self-host but I am | not used to small databases being so problematic). | brianwawok wrote: | This is why most of us don't use mongo in production. It's | just not worth it. Postgres is a tank and supports JSON when | you really need it. | Quekid5 wrote: | I was actually amazed that a big CMS/E-commerce vendor | _proudly_ proclaimed in a sales meeting that they were on | MongoDB. | | I suppose salespeople probably aren't into the nitty-gritty, but their tech people should have warned them about | this. Maybe they were just trying to pull our collective | leg, but I suppose that's why I was at that meeting. | | It was obviously an instant 'No'. | leviathant wrote: | There aren't a lot of CMS/Ecommerce vendors that sit on | MongoDB, so maybe we were in a meeting together! | | Even if we weren't - as a sales engineer on a large | CMS/ECommerce platform with merchants running $150M+ in | annual revenue, with an average client retention of seven | years, and two decades of agency experience behind the | decisions around building that platform, if you instantly | said no just because of MongoDB, maybe you don't know as | much about MongoDB as you think you do.
| | I came from a SQL background myself, and had reservations | based on all the things I'd read about MongoDB as we | decided to build a platform after doing things bespoke | for two decades, but time has proven our architecture | choices out. It's easy to be proud of something that | works well. | hetspookjee wrote: | The Guardian posted quite a nice blog in 2018 about their | switch to Postgres from MongoDB. Especially interesting | because they intended to use Postgres as a replacement | document store. Here's the link: | https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-h... | guanzo wrote: | > Automatically generating database indexes on | application startup is probably a bad idea. | | aw crap. oh well it probably doesn't matter for my small-ish application. | [deleted] | Carpetsmoker wrote: | _I didn't pick Mongo, I was just the engineer called to | clean the mess._ | | My only experience with MongoDB is being "the engineer called | to clean the mess". I'm sure you can effectively use MongoDB | in production if you're knowledgeable and careful, but most | people aren't and they shouldn't have to know the detailed | inner workings to not create a mess. | goatinaboat wrote: | _My only experience with MongoDB is being "the engineer | called to clean the mess"._ | | It's always the same: | | 1. Newbie webdev (aren't they all) uses MongoDB because | it's easy to use according to blogs and twitter | | 2. Somehow it makes it into production | | 3. A dozen experienced engineers spend years trying to keep | it running | lossolo wrote: | When I was evaluating MongoDB a couple of years ago (around the | time they were switching to the WiredTiger engine), I found a | memory leak in their Node.js client on day one. I submitted | a ticket on their Jira and at the same time had a look at other | issues they had there. I saw memory leak after memory | leak, memory corruption everywhere, data disappearing without | any reason, segfaults etc.
After that, MongoDB was dropped as a | candidate for a DB in the project I was working on; we went with | Postgres and never regretted it. | loeg wrote: | Aphyr is such a competent professional. What a relatively | thorough and polite response to Mongo's inaccurate claims. "We | also wish to thank MongoDB's Maxime Beugnet for inspiration." is | a nice touch. | bithavoc wrote: | > Clients observed a monotonically growing list of elements until | [1 2 3 5 4 6 7], at which point the list reset to [], and started | afresh with [8]. This could be an example of MongoDB rollbacks, | which is a fancy way of saying "data loss". | | I hope they learned the lesson: don't fuck with aphyr. | amenod wrote: | That's... not the lesson they need to learn. Databases are app | foundations. Make sure you do them right and don't overpromise. | baq wrote: | I agree but maybe it's the only lesson they are able to | understand at this time. Their attitude was asking for | somebody to call them out, which aphyr is maybe the best | positioned to do. | | I'd love to read a roasting like that authored by Leslie | Lamport for a different perspective, but aphyr's works | absolutely stand on their own. | | Any ideas how to get Jepsen and TLA to work together? :) | azernik wrote: | Ouch. This is what you get when you order up a third-party review | and then misrepresent it in advertising. | taywrobel wrote: | I'm still waiting for Jepsen to put Confluent's "Kafka provides | exactly once delivery semantics" claim to the test. | | Since they're claiming something provably false, it'd be nice | to have some empirical evidence as such. | aphyr wrote: | I'm not convinced it _is_ false--IIRC their claim is | specifically w.r.t. other Kafka side effects, and those they | _can_ control.
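The distinction aphyr draws — "exactly-once" as a property of side effects the system itself controls — usually reduces to idempotent application of messages: redeliveries after a retry are harmless if applying the same message twice is a no-op. A toy sketch of that idea (no Kafka client involved; `makeApplier` and the offset-keyed dedup are invented for illustration, not Kafka's actual implementation):

```javascript
// Idempotent apply: deduplicate messages by offset so that a
// redelivered message changes nothing. "At-least-once delivery +
// idempotent processing" behaves like exactly-once for this state.
function makeApplier() {
  const seen = new Set();
  let total = 0;
  return {
    apply({ offset, value }) {
      if (seen.has(offset)) return total; // duplicate delivery: no-op
      seen.add(offset);
      total += value;
      return total;
    },
  };
}

const applier = makeApplier();
applier.apply({ offset: 0, value: 5 });
applier.apply({ offset: 1, value: 7 });
applier.apply({ offset: 1, value: 7 }); // redelivered after a retry
console.log(applier.apply({ offset: 2, value: 1 })); // 13 -- not 20
```

The guarantee only covers state the applier controls; a non-idempotent side effect inside `apply` (say, sending an email) would still fire twice — which is exactly why the scope of the claim matters.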
| egeozcan wrote: | The general mood I observed about MongoDB was that it used to be | inconsistent and unreliable, but they fixed most, if not all, of | those problems and they now have a stable product but bad word of | mouth among developers. Personally, I've treated it as "legacy" | and migrated everything that I had to touch since 2013 [0], and | luckily (just read the article so hindsight 20/20 -- a transaction | running twice and seeing its own updates? holy...) never gave it | another try. | | [0]: https://news.ycombinator.com/item?id=6801970 (BTW: no, my | dream of simple migration never materialized, but exporting and | dumping data to Postgres JSONB columns and rewriting queries | turned out to be neither buggy nor hard). | cyphar wrote: | > MongoDB was that it used to be inconsistent and unreliable | but they fixed most, if not all of those problems and they now | have a stable product but bad word of mouth among developers. | | This report is 9 days old, and tests the latest stable release | of MongoDB. The problems it discusses are present on modern | MongoDB. | egeozcan wrote: | If it wasn't clear, I said "mood" (which you conveniently | ignored), referring to chit-chat I heard recently, and was | underlining how wrong it has been. I totally | understand what the report says and know what version it | tests. | cyphar wrote: | In my defense, it wasn't clear that's what you were saying | in your original comment. "Mood" has become a filler word | at this point -- hence why I omitted it from the quote -- | and can mean anything from the traditional meaning of "mood | in the room" to "incredibly relatable/factual statement". | How I originally understood your comment was that you were | saying that you felt that most of the issues are in the | past, but you still decided to migrate away from it. | egeozcan wrote: | English is not my mother tongue and given the downvotes, probably it's my wording at fault here - sorry.
| | I'm glad now that it's been clarified :) | koishikomeiji wrote: | Fuck me gently with a chainsaw | depr wrote: | >Sometimes, Programs That Use Transactions... Are Worse | | I understood that reference ___________________________________________________________________ (page generated 2020-05-24 23:00 UTC)