[HN Gopher] The database ruins all good ideas
___________________________________________________________________

The database ruins all good ideas

Author : kristianp
Score  : 102 points
Date   : 2021-07-17 05:59 UTC (17 hours ago)

(HTM) web link (squarism.com)
(TXT) w3m dump (squarism.com)

| rrdharan wrote:
| If you really want or need to go active-active but also don't
| want to think about it much and can live with MySQL, my
| preference would probably be AWS Aurora multi-master, if you can
| live with its limitations:
|
| https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...
|
| As noted there and elsewhere, though, you are often better off
| just scaling up.
|
| If you _do_ want to really think about it, then NewSQL (Cloud
| Spanner or others) could be the way to go, but the cost,
| performance, compatibility/portability and other concerns make
| this a much more complex decision that isn't easy to capture in
| a blog post (or HN comment ;).

| edelweis wrote:
| Databases scale just fine. It's called sharding. Azure has
| Hyperscale (which is Citus), AWS has Aurora. The actual problem
| with databases is upgrading to major versions without downtime.
| All the other problems are just noise.

| sonthonax wrote:
| Agreed. But one impediment to sharding is bad DB design, which
| is rife right now because everyone starts with
| Django/Rails/Laravel and treats the DB like a document store
| that happens to have some relational features.

| robertlagrant wrote:
| I think that being able to have separate databases for different
| domains is pretty good, although it does rely partly on the
| application layer to keep (UU)IDs consistent.
|
| I'd be curious to know if there's a way to have databases talk
| to each other to just sync up primary keys for referential
| integrity. That could maximise the benefit of decoupled
| databases while still having good referential integrity. And a
| network disconnection would still mean existing PKs would be in
| place; it just wouldn't be aware of new ones. Not perfect, but
| not bad, perhaps.

| k__ wrote:
| Aren't there quite a few solutions today?
|
| Fauna, Upstash, DynamoDB, Cosmos DB, Firestore, PlanetScale.

| zzzeek wrote:
| Just because it's the thing I know best at the moment: it's
| worth checking out Galera, which actually does do multi-master
| replication. Of course you have to live in the MySQL/MariaDB
| world, which is a non-starter for a lot of peeps.

| simonw wrote:
| The title of this piece is great: very catchy. But I don't think
| the content supports the title - by the end of it I wasn't at
| all clear /why/ the database ruins all good ideas.
|
| A few other points. First, horizontally scaling database reads
| is actually reasonably straightforward these days: one leader,
| multiple replicas, load balance reads to the replicas, and use a
| mechanism such that users who have just performed a write have
| their read traffic sent to the leader for the next 10 seconds or
| so.
|
| Not trivial, but also not impossibly difficult - plenty of
| places implement this without too much trouble.
|
| Scaling writes is a lot harder - but a well-specced relational
| database server will handle tens of thousands of writes per
| second, so the vast majority of projects will never have to
| solve this.
|
| When you do need to solve this, patterns for horizontally
| sharding your data exist. They're not at all easy to implement,
| but it's not an impossible problem either.
|
| The article talks briefly about mocking your database:
| definitely never do this. How your database behaves should be
| considered part of your application code under test. Running a
| temporary database for your tests is a solved problem for most
| development frameworks these days (Django supports this out of
| the box).
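|
| A minimal sketch of that approach, using only the standard
| library with an in-memory SQLite database standing in for a
| throwaway test database (the schema and helper are made up;
| swap in your real engine for closer production parity):
|
|     import sqlite3
|     import unittest
|
|     def create_user(conn, email):
|         # Application code under test: the UNIQUE constraint is
|         # real database behaviour we want exercised, not mocked.
|         conn.execute("INSERT INTO users (email) VALUES (?)",
|                      (email,))
|         conn.commit()
|
|     class UserTest(unittest.TestCase):
|         def setUp(self):
|             # Fresh, disposable database per test.
|             self.conn = sqlite3.connect(":memory:")
|             self.conn.execute(
|                 "CREATE TABLE users ("
|                 "id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
|
|         def tearDown(self):
|             self.conn.close()
|
|         def test_duplicate_email_rejected(self):
|             create_user(self.conn, "a@example.com")
|             with self.assertRaises(sqlite3.IntegrityError):
|                 create_user(self.conn, "a@example.com")
|
|     if __name__ == "__main__":
|         unittest.main()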
|
| Overall, my experience is that the database /enables/ all good
| ideas. Building stateful applications without a relational
| database in the mix is usually the wrong choice.

| tremon wrote:
| _horizontally scaling database reads is actually reasonably
| straightforward these days: one leader, multiple replicas_
|
| There's one feature that I'd like to see in that area:
| partitioned writable replicas. In the same vein that you can
| partition the table storage across an index, I'd like it to be
| possible to assign different writers to different parts of a
| table/database. Of course, you'd still need a single primary
| replica to handle the transactions that transcend the configured
| partitions, but we already have routing engines that can
| transparently redirect an incoming query to any available
| replica, so it's partly there.
|
| There are probably corner cases lurking that I can't even think
| of, but in my mind it's the only thing missing from building a
| truly web-scale (multiple zones, multiple datacenters)
| ACID-preserving relational database.
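|
| A toy sketch of the routing half of that idea (all names are
| hypothetical; the writers could be any DB-API connections):
|
|     import hashlib
|
|     class PartitionedRouter:
|         def __init__(self, writers, primary):
|             self.writers = writers  # one writer per partition
|             self.primary = primary  # cross-partition fallback
|
|         def _owner(self, key):
|             # Stable hash so a key always maps to the same writer.
|             digest = hashlib.sha1(str(key).encode()).digest()
|             return self.writers[digest[0] % len(self.writers)]
|
|         def route(self, keys):
|             owners = {self._owner(k) for k in keys}
|             if len(owners) == 1:
|                 # Single-partition transaction: the owning
|                 # writer can commit it independently.
|                 return owners.pop()
|             # Transaction transcends the configured partitions:
|             # fall back to the single primary replica.
|             return self.primary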
| taffer wrote:
| This is essentially the idea of VoltDB [1]: all tables are
| partitioned and each partition is assigned to a single thread.
| As long as each query only touches a single partition, you get
| linear horizontal scalability and serializable isolation. The
| drawback is that if a query touches multiple partitions, only
| that one query can be executed on the entire cluster, so this
| architecture only works for very specific OLTP workloads.
|
| [1] https://www.voltdb.com/

| benlivengood wrote:
| And this is why Google wrote Spanner. I think CockroachDB tries
| to solve the same problems.
|
| If you need ACID compliance and you need a lot of it,
| everywhere, all the time, there are now better options than
| giant Sun/IBM boxes.
|
| Databases are not the problem.

| AdrianB1 wrote:
| A dockerized database server is something I cannot understand. I
| understand bare metal, I can accept virtualized, but I cannot
| find a good use case for dockerizing a mid-sized or large server
| (dozens of gigabytes to dozens of terabytes), and I don't know
| what problem it solves for a smaller one.

| kevml wrote:
| A docker image is simply packaging. There's a small, almost
| always negligible, performance hit. But the value comes in being
| able to ship a consistent application and configuration pairing.
| This becomes really valuable when shipping multiple databases
| for multiple services.

| AdrianB1 wrote:
| I know what docker does; I have just never seen database servers
| deployed that way in large numbers. One of my teams manages over
| 250 database servers, but nobody ever supported the idea of
| using docker for that (all the servers run on Windows).
|
| Shipping multiple databases for multiple services does not
| necessarily mean multiple database servers. In the database
| world I saw large servers with many databases more often than
| multiple servers with one database each; in the second case it
| was vendor laziness (I cannot give the name), not a reasonable
| business or technical reason. When I asked about it, the answer
| was "we'll consolidate in the next release".

| da39a3ee wrote:
| I'm not convinced the author knows what they're talking about.
| The answer to the question they posed is basically "ACID", so
| it's unclear what all the verbiage is for. It's also unclear why
| they expect all readers to have an architecture featuring
| multiple application servers all sharing a single database,
| since that is neither classic monolith nor classic
| microservices.

| samatman wrote:
| SQLite is a remarkably good solution to most of these problems,
| if deployed correctly.
|
| For your main line-of-business database? Of course not. But a
| deployment of rqlite [0] for your service workers in a
| read-heavy workload? It cuts a round trip out of the VM, mocking
| is trivial; there's a lot to like there.
|
| [0]: https://github.com/rqlite/rqlite

| eloff wrote:
| You're missing the whole point of the article. If you set this
| up in an active-active configuration (if that's even supported),
| you have the exact same distributed-systems problems as the
| author.

| l0k3ndr wrote:
| I have been working on rewriting a monolith into individual
| services during the past year at work, and what we finally
| implemented for consistency across the services' different
| databases was a signal queue: whenever we encounter an
| inconsistency, a doc is pushed containing the info, and a
| corrective service is always reading from that queue and making
| the necessary updates to tables. We made a hierarchy of
| source-of-truthness and we use that to decide what to do with
| inconsistent data. Users sometimes have to see an error message
| ("We had an error and we are fixing it in XXX time", based on
| current queue load), but it has mostly been working fine for us.
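|
| A hedged sketch of that corrective service (every name here is
| hypothetical; the real thing would read from a durable queue and
| issue real corrective writes):
|
|     import queue
|
|     # Hierarchy of source-of-truthness: lower rank wins.
|     AUTHORITY = {"billing_db": 0, "orders_db": 1,
|                  "search_index": 2}
|
|     signal_queue = queue.Queue()
|
|     def copy_record(src, dst, key):
|         # Stand-in for the real corrective update.
|         print(f"repairing {dst}[{key}] from {src}")
|
|     def reconcile(report):
|         # Rank the disagreeing stores; the most authoritative
|         # one is treated as the truth, the rest get repaired.
|         stores = sorted(report["stores"], key=AUTHORITY.get)
|         truth, stale = stores[0], stores[1:]
|         for store in stale:
|             copy_record(src=truth, dst=store, key=report["key"])
|
|     def run_worker():
|         while True:
|             report = signal_queue.get()  # blocks for next doc
|             reconcile(report)
|             signal_queue.task_done()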
| icakir wrote:
| Weird, nobody mentions NuoDB.

| yowlingcat wrote:
| Am I just dense, or was the idea the CEO had just not a good
| idea, and simply garden-variety premature optimization?
|
| Also, it seems unfair to say that the database ruins all good
| ideas when there continues to be significant, impressive
| innovation in the space in cloud services. Like Amazon Aurora,
| for one. If you really want to treat your RDBMS like an
| elastically scalable (with significant caveats) resource priced
| by the request, you can.

| justsomeuser wrote:
| I think the title implicitly meant "good" as in "you think it's
| a good idea until you press it against the grindstone that is
| reality".
|
| Also, I think these issues arise at the database because that's
| where the write concurrency is pushed.

| thrill wrote:
| Use CockroachDB.

| bobthebuilders wrote:
| CockroachDB is a terrible technology that causes almost
| guaranteed data corruption due to its lack of ACID guarantees,
| and is written in a language with a GC, which contributes to GC
| pauses. The dev team refuses to listen to feedback to port their
| code to C ;(.

| Guvante wrote:
| Because "just rewrite your program in C to avoid GC pauses" is
| such a flawed argument when it comes to any production system
| that it isn't even worth discussing.
|
| The reality is there are limited resources to work on any given
| project, and "rewrite" is generally not the correct way to fix a
| given problem.
|
| Especially since the first thing you are going to need to do is
| show that a networked system isn't resilient to GC pauses, or
| that GC pauses are frequent enough to impact throughput (very,
| very few applications are actually materially impacted by
| sub-second spikes in latency).

| pkolaczk wrote:
| GC pauses are not the only GC issue. Thrashing the caches and
| blowing up memory usage by an order of magnitude can also be
| very bad for performance. In a database system memory is very
| precious - the more of it you can use for caching/buffering user
| data, the better the performance.
|
| As for the sub-second spikes in latency, these tend to multiply
| in a distributed system. If serving a client request takes N
| internal requests, the likelihood of hitting a GC pause
| somewhere is much larger than if you did only one local request.
|
| Not sure about Go, but none of Java's "free" GCs guarantees low
| pauses. There is a stop-the-world fallback even in the most
| advanced ones, like ZGC, so you never know when it will stop for
| more than a second.
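|
| A quick back-of-the-envelope illustration of that multiplying
| effect (the 0.1% pause rate is an assumed number, not a
| measurement):
|
|     # Probability that a client request hits at least one GC
|     # pause when it fans out to n internal requests, each with
|     # independent pause probability p.
|     p = 0.001  # assumed: 0.1% of internal requests hit a pause
|     for n in (1, 10, 100):
|         print(n, 1 - (1 - p) ** n)
|     # n=1   -> 0.1%
|     # n=10  -> ~1.0%
|     # n=100 -> ~9.5%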
| redis_mlc wrote:
| > Because "just rewrite your program in C to avoid GC pauses" is
| such a flawed argument
|
| False. All popular databases are written in C, and continue to
| be because of the GC issue. I would not use a general-purpose
| database written in Java because of GC, for example. We'll see
| how well Go works in practice.
|
| > sub second spikes in latency
|
| Go is supposed to have sub-second GC pause latency, but
| understand that most SQL queries are sub-millisecond, so GC
| latency is still a significant issue compared to query times.
|
| Go might be acceptable now for niche databases like column
| stores for certain use cases, though.
|
| Also, see the excellent comment above about distributed systems
| and server cache issues. You can't do application performance
| analysis with GC literally everywhere.
|
| The puerile knee-jerk hatred for C on HN has to stop - almost
| every piece of software you use is written in C, from scripting
| languages to operating systems to web servers to databases.
|
| Source: DBA who's worked with current databases, as well as a
| custom database written in Java with significant (i.e. brutal)
| GC problems that required a total refactor (rewrite) to "work"
| at all.

| andreimatei1 wrote:
| > its lack of ACID guarantees
|
| Our transactions implementation is our crown jewel. You might
| want to check your sources.

| AdrianB1 wrote:
| To solve what problem?

| hughrr wrote:
| No it doesn't. Databases scale amazingly well if you throw money
| at the problem. Most people never get there. When you do, you
| will know. I've been there. When you're spending $3 million a
| year on hardware and licenses you either have a viable business
| or fucked up badly. That's the real decider.
|
| The answer is to start siloing customers or application concerns
| out into separate clusters depending on your operating model.
|
| If you spent the last decade writing shitty cross-database SQL
| then you'll have to silo by customer. That's the only real
| constraint.

| tremon wrote:
| I'd phrase it differently: the database is the litmus test of
| your supposed "good idea". If your good idea doesn't survive
| referential integrity or even data consistency, your idea
| doesn't deserve the label "good".
|
| Of course, the next "good idea" is to sacrifice data integrity
| in the name of performance. That can work, but usually it's just
| a technical form of growth hacking. Sacrificing data integrity
| without understanding the problem space is a disaster waiting to
| happen (but luckily, it may not happen on your watch, so you can
| continue hacking).

| hughrr wrote:
| Very good points there.
|
| I have directly experienced the problems associated with blindly
| assuming that you can write your own consistency guarantees in
| Redis. I didn't do it, but I had to write some tooling to
| rewrite the AOF (append-only file) to get some data back...

| HWR_14 wrote:
| I'm sorry, are you saying that a good idea should survive the
| loss of referential integrity or data consistency?
|
| I don't feel that's what you mean, and it's probably my fault
| for misreading your comment.

| erik_seaberg wrote:
| I read it the other way: if abandoning correctness is the only
| way to make your idea feasible, your idea kinda sucks.

| tremon wrote:
| Yeah, I probably could have phrased that better. The sibling
| comment is correct: I meant that if your idea requires
| abandoning referential integrity or data consistency, it's
| probably not a good idea. Since it's the database that actually
| enforces those constraints, it may seem that the database causes
| the problem. But in most cases, the problem is the data model or
| the idea itself.

| Ozzie_osman wrote:
| > The answer is to start siloing customers or application
| concerns out into separate clusters depending on your operating
| model.
|
| This is such an underrated solution (siloing or sharding your
| data in some way). I think people don't do it because:
|
| 1. The tooling doesn't make it super-easy (e.g. good luck
| sharding Postgres unless you're willing to pay for Citus).
|
| 2. "Trendy" companies in the past decade have been network-type
| products (social networks, etc.), where the structure of the
| data makes it much harder to silo (people need to follow each
| other, interact with each other's content, etc.).
|
| 3. We as an industry took a several-year detour over to NoSQL
| land as a promised solution to scalability.
|
| Life would be a lot easier if you could say something like:
|
| * I want a Postgres node.
|
| * I'm happy to shard my data by some key (customerId, city,
| etc.) and am willing to accept responsibility for thinking
| through that sharding key.
|
| * My application has some logic that easily knows which DB to
| read/write from depending on shard key.
|
| * There's some small amount of "global application data" that
| might need to live on a single node.

| simonw wrote:
| For your "life would be easier" version, isn't that pretty much
| what Citus does? And Citus is open source, so paying for the
| managed version is optional.

| cellis wrote:
| Saying "Citus is open source" doesn't mean it's as simple as
| running `yarn install citus; yarn run citus`. It is much, much
| more complicated than typical open source. You pay for support
| for a very good reason.

| jarym wrote:
| The mistake is thinking half the system is doing nothing. It's a
| bit like saying "I'm paying to store my backups offsite but most
| of the time they're not really used, so it's a waste of money."
| It just doesn't hold.
|
| Most SQL problems that most of us have to deal with stem from
| inadequate indexes and/or poorly written queries. No fancy
| active-active setup will ever solve those issues.
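|
| A tiny, self-contained demonstration of the index point using
| SQLite's query planner (standard library only; the table is made
| up):
|
|     import sqlite3
|
|     conn = sqlite3.connect(":memory:")
|     conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY,"
|                  " customer_id INTEGER, total REAL)")
|     conn.executemany(
|         "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
|         [(i % 1000, float(i)) for i in range(10000)])
|
|     q = "SELECT SUM(total) FROM orders WHERE customer_id = ?"
|
|     # Without an index: the planner reports a full table SCAN.
|     print(conn.execute("EXPLAIN QUERY PLAN " + q,
|                        (42,)).fetchall())
|
|     # With an index: the same query becomes a b-tree SEARCH.
|     conn.execute("CREATE INDEX idx_orders_customer"
|                  " ON orders (customer_id)")
|     print(conn.execute("EXPLAIN QUERY PLAN " + q,
|                        (42,)).fetchall())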
| renewiltord wrote:
| Can't be bothered. Aurora in Postgres/MySQL, and then just scale
| that vertically till it stops.

| trollied wrote:
| The article is talking about Oracle RAC, and the "crossover" he
| is talking about was usually really quick InfiniBand, not a
| crossover cable (so, a separate NIC for good reason!). The
| person who wrote the article doesn't seem to understand RAC well
| enough to comment properly. A properly tuned RAC instance will
| scale horizontally very, very well.

| don-code wrote:
| Former (recovering?) Exadata user here. I'll go as far as to say
| that configuring a best-practices RAC cluster is easier than
| configuring a best-practices MySQL cluster. RAC solves some
| additional problems, like discovery and load balancing, that
| MySQL kicks out to the user (e.g. you have to manage DNS and run
| a tool like ProxySQL separately).
___________________________________________________________________
(page generated 2021-07-17 23:00 UTC)