[HN Gopher] Postgres scaling advice ___________________________________________________________________ Postgres scaling advice Author : BrentOzar Score : 310 points Date : 2021-01-26 12:25 UTC (10 hours ago) (HTM) web link (www.cybertec-postgresql.com) (TXT) w3m dump (www.cybertec-postgresql.com) | orlovs wrote: | I am totally pro-hosted solutions. However, hosted Postgres is | never one of them. In my experience, on a machine of the same size, | some queries took 30x longer on hosted than on self-hosted. | StreamBright wrote: | >> A single PostgreSQL instance can easily do hundreds of | thousands of transactions per second | | For example, on my (pretty average) workstation, I can do ca. 25k | simple read transactions per 1 CPU core on an "in memory" pgbench | dataset...with the default config for Postgres v13! With some | tuning (by the way, tuning reads is much harder in Postgres than | tuning writes!) I was able to increase it to ~32k TPS per core, | meaning: a top-notch, dedicated hardware server can do about 1 | million short reads! With reads, you can also usually employ | replicas - so multiply that by 10 if needed! You then need to | somehow solve the query routing problem, but there are tools for | that. In some cases, the new standard LibPQ connection string | syntax (target_session_attrs) can be used - with some shuffling. | By the way, Postgres doesn't limit the number of replicas, though | I personally have never witnessed more than 10 replicas. With | some cascading, I'm sure you could run dozens without bigger | issues. | | This sort of hand-wavy "benchmark" is not really good for anybody | other than the author's satisfaction. Real world scenarios are | not like that. | kevincox wrote: | I wonder when using a distributed database (like CockroachDB) | will be the default for new applications. Right now it seems that | they have fewer features and are harder to set up than traditional | RDBMSes, but I can only assume that this gap will narrow and at | some point in the future things will be "scalable by default". | (Of course no DB is going to prevent all ways to shoot yourself | in the foot) | andreimatei1 wrote: | I think the "default" will evolve with whatever offers the best | "serverless" experience in the public clouds. In particular, | the cheapest and most granularly-billed option. | WJW wrote: | This question is similar to asking on a car forum when using a | 40 foot lorry will be the default starter car for everyone. The | answer is "probably never" because while it does offer superior | cargo transport scalability, the tradeoffs are not worth it for | the vast majority of users. The question is posed like | distributed databases have no disadvantages over non- | distributed databases, but that is simply not the case. | Clustering introduces all sorts of problems, from network splits | to slow(er) joins across nodes. | __s wrote: | To back up how far one server can go, Stack Overflow used a | single database for a long time | | https://nickcraver.com/blog/2016/02/17/stack-overflow-the- | ar... (2013 post had much less redundancy, but even their | 2016 architecture is pretty undistributed in terms of being | able to recreate everything from the single source of truth | database) | bcrosby95 wrote: | Yeah, it's been a while since I've been in a high traffic | situation, but back in the late 00s we had a couple sites | with 3-5 million daily unique users. Each site had a single | master, a couple read replicas, and some queries were | cached in memcached.
| | The problem in MySQL we eventually ran into was the read | replicas fell behind master during peak load (by 30-60 | minutes depending upon the day). So we still had to hit the | master for certain queries. I left before we took the next | step to fix this issue. | kevincox wrote: | I don't think it is quite the same. | | - Switching from a car to a van to a lorry is fairly low | cost. You don't need to recreate your product (probably). | | - You don't need to run distributed databases in a cluster to | start. | | But I think most importantly the decrease in dev speed and | performance is an investment in future scalability. And I | only imagine that this difference will shrink over time to | where for example a 1 node "distributed" DB isn't that | different to work with than a 1 node "traditional" DB. And | that small difference pays off because adding a second node | doesn't have huge tradeoffs. | | I agree that right now it doesn't make sense. If I was | starting a new product I would still fire up a PostgreSQL | instance. However I think that the day will come where the | difference is small enough and software support is complete | enough that we will start with something distributed, much | like people don't often start with SQLite today, even though | it is quicker to get going (also not a great comparison | because it has fewer features). | andreimatei1 wrote: | > I agree that right now it doesn't make sense. | | This CRDB engineer respectfully disagrees. This thread | takes it as a given that a non-distributed DB is better if | you don't need to scale up (i.e. if you run a single | "node"). Let me offer a counterpoint: it's easier to embed | CRDB into some software you're distributing than it is to | embed Postgres. This is to say, we do try to compete at | every scale (well, perhaps not at SQLite scale). CRDB | doesn't have all the SQL features of PG, but it does have | its pluses: CRDB does online schema changes, it is | generally simpler to run, comes with a UI, comes with more | observability, can backup to the cloud, can be more easily | embedded into tests. | | Online schema changes are a big deal; the other thing that | I hope will help us win small-scale hearts and minds is the | ever-improving observability story. I hope CRDB will | develop world-class capabilities here. Other open-source | databases traditionally have not had many capabilities out | of the box. | yuribro wrote: | > But I think most importantly the decrease in dev speed | and performance is an investment in future scalability. | | It makes sense only if you'll ever need this scalability. | And you take a hit on other fronts too: infra cost and | deployment complexity. And both deployment complexity and | code complexity also increase QA cost and instability, and | hurt product and company reputation. | | >> much like people don't often start with SQLite today | | Maybe they should start with SQLite by default | lrem wrote: | For personal projects, I start with SQLite by default. | Never came close to its scalability limit too ;) | WJW wrote: | Making investments in future scalability at the cost of dev | speed and performance is exactly the wrong tradeoff given | that 90% of startups fail. At the start, when you have very | little income, you want to allow for as much speed and | flexibility as possible so that you can get to | product/market fit ASAP.
By the time your company gets big | enough that plain MySQL/Postgres can't handle the load any | longer, you will have more than enough money to afford a | few experts that can help you migrate. | cuu508 wrote: | To me it looks like there are not many affordable options right | now. | | CockroachDB understandably wants you to use their Cloud or | Enterprise products: the OSS version is quite limited. For | example it doesn't support row-level partitioning | (https://www.cockroachlabs.com/docs/stable/configure- | replicat...). Which means, if I understand correctly, it is not | of much help for scaling writes to a single big table. | nvanbenschoten wrote: | Hi cuu508, CockroachDB engineer here. You are correct that | row-level partitioning is not supported in the OSS version of | CRDB. However, it sounds like there's a bit of confusion | about where manual table partitioning is and is not needed. | The primary use-case for row-level partitioning is to control | the geographic location of various data in a multi-region | cluster. Imagine a "users" table where EU users are stored on | European servers and NA users are stored on North American | servers. | | If you are only looking to scale write throughput then manual | partitioning is not needed. This is because CRDB | transparently performs range partitioning under-the-hood on | all tables, so all tables scale in response to data size and | load automatically. If you are interested in learning more, | https://www.cockroachlabs.com/docs/stable/architecture/distr... | discusses these concepts in depth. | cuu508 wrote: | Thanks for explaining, and sorry -- looks like I jumped to | conclusions too quickly! | kevincox wrote: | Oh, I didn't realize that the free version was that limited. | I guess I need another name to use as the default open source | distributed database. | berns wrote: | No, you didn't understand correctly. The feature that isn't | supported is row level _replication zones_. Replication zones | allow you to define the location of replicas. | kryptiskt wrote: | Why would it ever be default? Very few applications need a | distributed database. These days you can get a single machine | with hundreds of terabytes of storage and terabytes of RAM. Add | a spare machine for failover and you have a reliable setup for | any but the biggest tasks. And the tasks that that setup isn't | sufficient for will certainly demand more thought than a | cookie-cutter setup of some random distributed DB. | jandrewrogers wrote: | What we need is a database that can both scale-up _and_ scale- | out. Most distributed databases offer poor efficiency and | performance on a per node basis, which has a high operational | cost. This is why people avoid using distributed databases | unless they need it. A scale-up database can serve as much | workload as a pretty large scale-out database in practice. This | discontinuity creates the market for scale-up systems. | | There is literally nothing preventing distributed databases | from having excellent scale-up performance too. Unfortunately, | people who design distributed databases have a strong bias | toward unnecessarily throwing hardware at performance and | scalability problems. This is partly because very few people | know _how_ to design a modern scale-up database; making | something (trivially) distributed is easier. | FpUser wrote: | >"try all the common vertical scale-up approaches and tricks.
Try | to avoid using derivative Postgres products, or employing | distributed approaches, or home-brewed sharding at all costs - | until you have, say, less than 1 year of breathing room | available." | | Very healthy approach. I've always followed the idea of vertical | scalability when writing my modern C++ app servers with local | Postgres. Since I do not sell those to FAANG, I've never failed to | find a decent, very reasonably priced piece of hardware, be it | dedicated hosting or on prem, that would satisfy the client's needs | for any foreseeable future. More than that. I've never needed | even top of the line hardware for that. I concentrate on features | and robustness instead. Using C++ also gives nice speedup. | isoprophlex wrote: | Pretty solid advice, nice article. | | One thing: | | > For example, on my (pretty average) workstation, I can do ca. | 25k simple read transactions per 1 CPU core on an "in memory" | pgbench dataset...with the default config for Postgres v13! | | Forget about reaching those numbers on managed DB-as-a-service | instances, specifically Azure managed postgres. In my experience | these have comparatively poor peak performance, with high | variability in latency to boot. | | Bare metal all the way, if you can spare the allocation of a | dedicated DBA. | | Also... if rapid reads and steady performance are what you're | after, provision a replica for the occasional analytical | workloads. | samf wrote: | Good info; I came here looking for experience with managed | databases. Does anyone have experience with managed instances | on other platforms? | jandrewrogers wrote: | The assertion that PostgreSQL can handle dozens of TB of data | needs to be qualified, as this is definitely not the case in some | surprising and unexpected cases that are rarely talked about. | | PostgreSQL's statistics collection, which is used by the query | planner, _doesn't scale with storage size_. For some ordinary | data distributions at scale, the statistical model won't reflect | any kind of reality and therefore can produce pathological query | plans. It is quite difficult to get around this scaling behavior. | Consequently, I've moved away from using PostgreSQL for data | models with distributions such that these query planner | limitations will occur. These pathologies occur well before the | "dozens of TB" range. | | The statistics collector is not a scalable design generally, but | that is another matter. In its current state it does not degrade | gracefully with scale. | aeyes wrote: | Vacuum is a way bigger problem, if you have a few million | updates per day on a 5TB table you are going to have a hard | time keeping index and page bloat down. | | Sure, if your tables are insert only you might be fine but | doing any kind of DDL or maintenance (analyze after version | upgrades) is going to ruin your day. | laurenz_albe wrote: | This seems to be unfounded criticism. When statistics are | gathered, PostgreSQL samples a certain percentage of the table, | so that obviously scales. The number of "most common values" | and histogram buckets scales up to 10000, which should be good | even for large tables. While I'll readily admit that not all | aspects of cross-column dependencies are dealt with, and cross- | table distributions are not considered, that has nothing to do | with size. I guess you hit a problem somewhere, couldn't solve | it and jumped to unwarranted conclusions. | jandrewrogers wrote: | It is ironic that you accuse me of "unwarranted conclusions".
| I've been customizing and modifying PostgreSQL internals for | almost two decades; I know how to read the source. You aren't | as familiar with PostgreSQL as you think you are. | | This wasn't my problem; I was asked by a well-known company | with many large PG installations and enterprise support | contracts to look at the issue because no one else could | figure it out. The limitations of the statistics collector | are not only evident in the source _but they are documented | there_. There are also deep architectural reasons why you | can't trivially modify the statistics collector -- I looked into | this option -- to work for larger tables without introducing | other serious issues. | | If you have a uniform distribution of values, the statistics | collector will work fine. In the above case, the values | followed a power law distribution which created extreme | biases in the statistical model due to necessary restrictions | on sampling. Other distributions can have similar effects | once you exceed the ability of the statistics collector to | acquire a representative sample. | brasetvik wrote: | > I know how to read the source. You aren't as familiar | with PostgreSQL as you think you are. | | Oh, maybe you have read some of Laurenz Albe's many | contributions to Postgres, then. | https://git.postgresql.org/gitweb/?p=postgresql.git&a=search... | srcreigh wrote: | Naive question. Wouldn't sampling a power law dataset be | straightforward? The idea is there's only a few outlier | values, and the rest are uncommon. This distribution seems | extremely common. Ie column with mostly NULL values and the | rest somewhat unique non null strings. | | I'm curious what data you saw and why the sampling didn't | work? | fabian2k wrote: | You can change the statistics target, see | https://www.postgresql.org/docs/current/runtime-config-query... | | You can also create more advanced statistics over multiple | columns: https://www.postgresql.org/docs/current/planner-stats.html | | But if your statistics are bad, it will certainly mess up some | of your query plans. | jandrewrogers wrote: | Yes, everyone knows these things; it does not address the | issue. Due to its architecture, there are difficult to change | internal limits on how PostgreSQL samples the data. | | Under some conditions, it is not possible for the sampling | mechanics to build a representative model -- the actual | statistics used for query planning will be quasi-random. One | of the ways this manifests is that every time you rebuild the | statistics you get a completely different statistical model | even if the data has barely changed. | gher-shyu3i wrote: | > PostgreSQL's statistics collection, which is used by the | query planner, doesn't scale with storage size. | | Out of curiosity, do statistics collectors for other offerings | (e.g. MySQL, SQL Server, Oracle) scale with storage size? | boomer918 wrote: | Theoretically your argument would make sense, but practically | your data distribution should be uniform enough for the | statistics collector to work for terabytes and terabytes. | | I've seen query plans get messed up with partial indexes, but | if you have a regular index, you needn't worry.
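|
| For anyone who wants to check whether the planner's statistics are the culprit, a minimal
| sketch of the usual first steps; the table, column names, and target values below are
| hypothetical placeholders rather than recommendations:
|
|     -- compare the planner's estimated row counts against reality
|     EXPLAIN (ANALYZE, BUFFERS)
|     SELECT * FROM orders WHERE customer_id = 42;
|
|     -- what the planner currently believes about the column
|     SELECT attname, n_distinct, most_common_vals
|     FROM pg_stats WHERE tablename = 'orders';
|
|     -- raise the per-column sample target (default 100, max 10000) and add
|     -- extended multi-column statistics (PostgreSQL 10+), then re-gather
|     ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
|     CREATE STATISTICS orders_cust_status (dependencies) ON customer_id, status FROM orders;
|     ANALYZE orders;
|
| If the estimated and actual row counts still differ by orders of magnitude after this, that
| is the kind of situation the comments above are describing.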
| natmaka wrote: | Even when using as many stat samples as possible (3000000) for | each of those tables: ALTER TABLE table_name ALTER column_name | SET STATISTICS 10000; See | https://www.postgresql.org/docs/current/sql-altertable.html | | AFAIK PostgreSQL's approach is based upon | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.... | fanf2 wrote: | Does statistics collection scale better with partitioned | tables? TFA mentions that partitioning helps with vacuum, and | AIUI vacuum workers also collect statistics. | jajool wrote: | I have seen many PostgreSQL benchmarks having solid performance | with TB data but my real world experience is the complete | opposite. | | Here are some of the main issues that I have encountered so | far: | | 1. Queries on large tables (around 10 GB) are slow even when | "index only scan" is used because of MVCC and the way | PostgreSQL manages concurrency. | | 2. Hot-standby instances can't be used for anything serious | since all queries are dropped regularly (I believe it's not | safe to use "hot_standby_feedback" config to overcome this | issue). | | 3. It is not possible to have tables with heavy "update" | workflows (because of simultaneous autovacuum execution). | | I would be very happy if anyone could show me that I am wrong. | YorickPeterse wrote: | > 2. Hot-standby instances can't be used for anything serious | since all queries are dropped regularly (I believe it's not | safe to use "hot_standby_feedback" config to overcome this | issue). | | Hot-standby instances are fine. We've been using these for | GitLab.com for a number of years now, and never have they | caused any issues. We do have some code in place that retries | queries when they are cancelled by the hot-standby, but IIRC | this doesn't happen often. | tommyzli wrote: | 1. You'll have to define "slow" - I have a 3TB table where an | index only scan takes under 1ms | | 2. hot_standby_feedback is absolutely safe. I've got 5 hot | standbys in prod with that flag enabled | | 3. Again, it depends on how "heavy" your update throughput | is. It is definitely tough to find the right balance to | configure autovacuum between "so slow that it can't keep up" | and "so fast that it eats up all your I/O" | singron wrote: | 1 mostly applies to update heavy tables since index only | scans use the visibility map, which would be frequently | invalidated. | | 3 is definitely true, especially the larger the table. I've | had success splitting frequently updated columns out into | their own much smaller table, or any other trick to | concentrate updates into a small table. Also MVCC bookkeeping | requires updating pages, so an UPDATE that doesn't change any | field and SELECT FOR UPDATE will cause the same problem. | andy_ppp wrote: | This sounds really in-depth and useful, are there more in-depth | articles about how to avoid these types of queries? | claytonjy wrote: | This is intriguing; could you give an example of a distribution | Postgres struggles with at scale, and an application that | produces such a distribution? | mkilling wrote: | We've recently been surprised by wrong estimates made by the | planner resulting in inefficient query plans. | | It seems that power law distributions over multiple tables | aren't handled very well: | https://stackoverflow.com/questions/65861780/is-there-a-way-... | silon42 wrote: | Is it easy to disable it? Personally I would not want any | unpredictable behavior from query planner anyway, especially at | scale.
| Tostino wrote: | You would get predictably bad behavior and performance | without the statistics collected. | ben509 wrote: | Predictably bad won't result in a sudden spike in IO usage | that brings down a production system. This is why Oracle | has SQL profiles. | Tostino wrote: | Then use something like pg_hint_plan: | https://pghintplan.osdn.jp/pg_hint_plan.html (never used | it myself) | | I've been bitten by bad optimizer choices which took way | too long to figure out how to debug / fix, so I know the | pain. I do hope the PG optimizer continues improving. | It's one of the weakest areas in comparison to commercial | DB's IMO. | eatonphil wrote: | Neither of you have given a reproducible example one way or the | other! I could share that I know of TB of core analytics data | running postgresql and it does alright. But it would be nicest | if there were a more tangible study to talk about. | jjice wrote: | In the opinion of a last semester CS student who has never | written an application from scratch that needed more than a | SQLite DB (so take me with a half grain of salt), it seems like | premature optimization, while always talked about, is very | common. I see people talking about using Kubernetes for internal | applications and I just can't figure out why. If it's a hobby | project and you want to learn Kubernetes, that's a different | situation, but in the case of making a real application that | people will use, it seems like a lot of us can get away with a | single DB, a few workers of our app, and maybe a cache. | | I'm speaking out of little experience though. I just think that a | lot of us can get away with traditional vertical scaling and not | think too much about it. | pg_bot wrote: | What you're describing is called resume driven development. It | happens every few years when people want to cash in on | trends/buzzwords that people believe will be disruptive to all | industries but are just tools to have in the toolbox for most. | New tools pop up all the time that fit this mould. Over the | past ten years I can think of Hadoop (Big data), MongoDB | (NoSQL), Kubernetes, "Serverless" computing, and TensorFlow. | While all these tools have legitimate use cases, they are often | overused due to marketing or industry hype. | | Adding artificial intelligence to your recipe application is | unlikely to make any sense, but people do it because they want | to have AI software engineer on their resume. | nforest wrote: | For artificial intelligence, I think it's more often | marketing driven development. It's easier to seem disruptive | if you claim to have AI in your product. Easier to get | funding and have people talk about your company. I feel like | it comes more often from business executives than technical | people. | kodah wrote: | > but in the case of making a real application that people will | use, it seems like a lot of us can get away with a single DB, a | few workers of our app, and maybe a cache. | | A couple of points: | | 1. Kubernetes can run monoliths. It's certainly not exclusive | to microservices or SOA. It's just a compute scheduler, quite | similar to AWS's EC2 reservations and auto-scaling groups | (ASG's). | | 2. I can't speak for every corporation, but if you already have | patterns for one platform (note: "platform" in this context | means compute scheduling. eg: AWS, GCP, Kubernetes, Serverless) | then you will inevitably try to copy patterns you already | implement internally. 
A lot of times, for better or for worse, | it's not what fits best unless what fits best and what you have | available are highly conflicting. | | 3. A lot of times "scaling" is actually code for multi-tenancy. | As an industry, we should probably be explicit when we're | scaling for throughput, redundancy, and/or isolation. They are | not the same thing and at times at odds with each other. | | 4. I don't really like your use of "real application" here as | it implies some level of architectural hierarchy. My main | takeaway after 10+ years of professional development is that | architectures are often highly contextual to resource | availability, platform access, and personal preferences. | Sometimes there's a variable of languages too, because some | languages make microservice architecture quite easy while | others make it a royal PITA. | wejick wrote: | I know one of the biggest Ecommerce shops in Asia was using 1 | big DB with multiple read-only slaves in a monolithic architecture | for more than 5 years. | | However, driven not only by DB performance but also by the need to | organize hundreds of engineers, they adopted a microservice | architecture. Then they slowly migrated to per-domain DBs; it is | just the classic microservice migration story. | | While a single DB may bring us a pretty long way, designing the | system with disciplined, domain-level logical segregation will | help when there's a need to move to microservices. | | *Looks like HN readers are quite sensitive to microservice | migration comments; this kind of comment usually gets downvoted | easily. | collyw wrote: | Stack Overflow runs what is essentially a monolithic | architecture. Though they do have a few services, it isn't | what I would describe as a micro-service architecture. | | https://stackexchange.com/performance | pas wrote: | Monzo (UK bank) has 1600+ microservices, but mandates a | common framework/library and uses Cassandra. (Which is | basically a shared nothing, bring your own schema | "database".) | | So it makes sense to combine advantages of different | approaches. | mumblemumble wrote: | It's resume-driven development, and it's also entertainment- | driven development. Bringing in new technologies gives you a | chance to play with a new toy. That's an effective way to make | your job more interesting when the thing you're supposed to be | working on is boring. Which, in business applications, is more | often than not the case. | spaetzleesser wrote: | In today's job market resume driven development is a very | rational choice. I work in medical devices so we are pretty | conservative and generally way behind the cutting edge. This | makes it really hard to find jobs at non medical companies. I | would recommend anybody who has the chance to use the latest | and shiniest stuff to do so because it's good for your career. | mumblemumble wrote: | Very good point. Seems like yet another example of how | carefully optimizing all the individual parts of a system | can paradoxically de-optimize the overall system. | zdragnar wrote: | In big enough organizations, it is very easy to lose track of | who owns what, especially when it is those little ad-hoc | internal tools. Manually managing the infrastructure for them | is a recipe for them to become permanently enshrined in the | wasteland of "services we think we use, but do not maintain | because we don't remember who needed it or put it up or how to | configure it".
| | K8s isn't the only answer, but if you are already using it for | your large applications, it isn't much work to reuse the | existing tooling and infrastructure, and now you at least have | the dockerfile as a reference if nothing else. | | OTOH, if you have an existing tooling setup / pipeline that is | _not_ K8s, there isn't a good reason to use it for a small | application. | corty wrote: | Having a Dockerfile that copies a few binary blobs into an | age-old distro image isn't an improvement, it's a huge | liability. And most of that stuff that no one knows anything | about anymore is like that. Same as with an old VM or PM. | | I'd rather have that old crap as a physical machine. Why? | Because the hardware lifetime "naturally" limits the lifetime | of such applications. If the hardware dies, it forces a | decision to spend some money to keep it running or throw it | away, which, given that hardware is expensive, usually | results in throwing it away. | vlovich123 wrote: | Set up your docker file to be part of your CI so that your | binary blobs are built from source with regularity? That's | typically the solution I've seen work well. Manually | maintained stuff (especially for stuff that may not be the | thing everyone is primarily doing) generally doesn't scale | well without automation (speaking as someone who's seen | organizations grow). This is also true of "getting started" | guides. Can't tell you how much maintenance and run time | I've saved converting those wikis to Python scripts. | corty wrote: | Yes, of course. That would be ideal. That's what we do | for everything we can control. | | But as someone in the IT dept., far too often you get | some container that either was built by someone who long | left the company or an external consultant who got paid | to never return. Sourcecode is usually unavailable, and | if it is available, will only build on that one laptop | that the consultant used. The IT department gets left | with the instruction "run it, it was expensive" and "no, | you don't get any budget to fix it". That results in the | aforementioned containers of doom... | | Yes, I'm bitter and cynical. Yes, I'm leaving as soon as | I can :) | striking wrote: | I hate to raise a seemingly obvious point, but this | doesn't seem like a problem with Docker. | wdb wrote: | Can still remember the multinational that wanted to host | the web application we developed for them at their data | centre. They wanted to charge us (not the business unit for | some odd reason) nearly EUR 20,000 per year to host a RoR | web application. | | Ended up hosting it ourselves, and in the last year they | were paying EUR 100k per year for it, as we would just sell | the same setup for each deployment for their customers. | They would probably have been cheaper off hosting it | themselves. | giantrobot wrote: | In theory that works. In practice it rarely does. Docker et | al gained popularity because they made it way more | practical for projects to be managed as the world works | rather than as it should be, for good or ill. Before Docker | it was moving old applications to VMs and before that it | was running them in chroot horror shows. | yowlingcat wrote: | > the hardware lifetime "naturally" limits the lifetime of | such applications | | Oh, my sweet summer child. Of all the things that | "naturally" limits the lifetime of such applications, that | is not it. Consider the case of the mainframes running the | US banking system, for example.
| waynesonfire wrote: | docker and k8s adoption can force a company that over years | has developed and perfected a standard practice of deploying | applications to replace it with a half-ass'd solution that ends | up solving the wrong problems and costing way more. The "shipping beats | perfecting" mantra is very much at play here. This is due to | the amount of time it would take to achieve parity. At the | end of the day, the new solution ends up looking like a step- | back. | | Such a practice combined with the mentality that software | engineers should do their own dev-ops can easily lead to an | environment of spaghetti applications where every developer | working on the new platform does things slightly different | because the replacement solution wasn't complete and had to | be addressed by countless band-aids by engineers across the | band of talent. | | Furthermore, for the features that were able to achieve | parity, you're now forcing your entire organization to re- | learn the new development process. The docs are abysmal and | the software engineers that developed the original solution | have moved on, since the hard work of "productionalization" | remains and they're not interested in that. | akiselev wrote: | _> docker and k8s adoption can force a company that over | years has developed and perfected a standard practice of | deploying applications to replace it with a half-ass'd solution | that ends up solving the wrong problems and costing way more._ | | docker and k8s adoption can also force a company that over | years has developed and perfected a half-ass'd solution for | deploying applications to replace it with a single(-ish) source | of truth that ends up solving the right organizational problems | instead of the wrong technical ones (and ends up costing way | more, at least in the short term). | avereveard wrote: | Kubernetes isn't only about scaling. The repeatability of the | deployment process is a great asset to have as well. | Silhouette wrote: | But there are much simpler ways than K8s to achieve | automated/repeatable deployments, if that is your goal. | objektif wrote: | Can you please name a few? | kitd wrote: | _I see people talking about using Kubernetes for internal | applications and I just can't figure out why._ | | There is benefit in having established platforms for running | your code, and this is especially true for large orgs where the | people who run the systems are an entirely different group from | those that developed or assembled it. And people (+ their | skills) are what cost the most money in any business. | | It's true that many/most systems don't require a full | Kubernetes stack (for instance), but if a critical mass of the | business IT is going that way, doing the same with your own | makes sense from an economies of scale PoV. | mvanbaak wrote: | > There is benefit in having established platforms for | running your code | | You do know k8s is very new, there's a constant stream of | changes and updates to it, etc etc? It's not established. | It's known, but that's it. | thraxil wrote: | I think what they may mean is more that there is an | established platform within the organization. One that you | have expertise and experience with, | monitoring/backup/security tools that work with it, etc. | K8s might not have as long a pedigree as VMs, but if you | already have a setup to run K8s, people who know how to use | it, documentation and tooling that allow devs to run their | apps on it securely and efficiently, etc.,
it's pretty | reasonable to want to encourage devs to "just" use k8s if | they want to stand up a new service rather than spinning | up whatever random collection of technologies that dev | happens to know better. | pas wrote: | 1.0 was released in 2015. There are stable LTS vendors for | it. | | It's pretty established. And much saner than cobbling | together Ansible/Puppet/Chef playbooks for everything. | majewsky wrote: | Saying that 2021's Kubernetes is established because 1.0 | was released in 2015 is like saying that 1991's Linux is | stable because Unix had existed for 20 years at that | point. Kubernetes 1.0 and 1.20 share the same name, | design principles and a certain amount of API | compatibility, but it's impossible to take a nontrivial | application running on 1.20 and just `kubectl apply` it | on 1.0. Too much has changed. | | Kubernetes is _just now_ entering the realm of "becoming | stable". Maybe in five years or so it'll finally be | boring (in the best sense of the word) like Postgres. | collyw wrote: | Speaking as someone with 20 years in the industry, what you say | is correct. Most applications would be fine on a single server | and a classic LAMP stack. But that ain't cool these days. | ccmcarey wrote: | I think it's easier to run a small k8s cluster than it is to | attempt to recreate a lot of the functionality provided | manually, especially if you're running in a cloud where the | control plane is handled for you. | | It provides unified secrets management, automatic service | discovery and traffic routing, controllable deployments, | resource quotas, incredibly easy monitoring (with something | like a prometheus operator). | | Being able to have your entire prod environment defined in | declarative yml is just so much better than running something | like a mishmash of ansible playbooks. | | If your application runs on a single host and you don't really | care about SLAs or zero downtime deploys, sure, use some adhoc | deploy scripts or docker compose. Any more than that, and I | think k8s pays for itself. | jayd16 wrote: | Agreed. What is it that people are doing (or not doing) where | a simple managed k8s cluster is more work than the minimum | way to do this? | | Are teams not even setting up automated builds and just | sshing into a box? | hectormalot wrote: | For me that's Heroku. I just push my app (RoR) and done. | I've actually moved it once to k8s for a few months and | decided to move it back after I understood how easy heroku | made the ops side of the business. | | Note: it's a side project, 50 business users, $5k annual | revenue, 4h per month as target for the time spent on | customer support, admin and maintenance. So it's both easy | to pay heroku and important for me to not spend too much | time on Ops. | Silhouette wrote: | Very many successful applications can indeed run on a single DB | server (modulo redundancy in case of failures). Vertical | scaling isn't trendy, but it is effective, until it's not. | | I have yet to encounter a real situation where it _suddenly_ | became impossible to run a production DB on a single, high-spec | server, without knowing far enough in advance to plan a careful | migration to a horizontally scaled system if and when it was | necessary. | vlovich123 wrote: | I'm not saying in all companies, but as you grow you have lots | of different teams with different needs. So then you spin up a | tools team to manage the engineering infrastructure since you | can't do it ad-hoc anymore (CI, source control, etc).
So to | make that team more efficient, you let them force one size fits | all solutions. While this may feel constraining for a given | problem domain, it actually makes engineers more portable | between projects within the company which is valuable. Thus | having one DB or cloud thing that's supported for all teams for | all applications is valuable even if sometimes it isn't | necessarily the absolute best fit (and the complexity is | similarly reduced as good companies will ensure there's tooling | to make those complex things easy to configure in consistent | ways). Your tools team and the project team will work together | to smooth out any friction points. Why? Because for larger | numbers of engineers collaborating this is an efficient | organization that takes advantage of specialization. A | generalist may know a bit about everything (useful when | starting) but a domain expert will be far more equipped to | develop solutions (better results when you have the headcount). | gowld wrote: | Kubernetes is a container system, mostly orthogonal to 3- or | 4-tier application design. | h0l0cube wrote: | And typically a single DO droplet would suffice for a toy | project or POC, for which Ansible is probably the more | expedient option. But maybe they're not in a rush, and | learning K8s is just another feather in their cap . | barrkel wrote: | Kubernetes is for when you need to allocate CPU like you | allocate RAM, _and_ you don 't want to be tied to a higher | level API sold by a vendor. | smoyer wrote: | > I see people talking about using Kubernetes for internal | applications. | | I think the important issue when first starting a project is to | create a "12 Factor App" so that if and when you create a | Docker image and/or run the application in Kubernetes, you | don't have to rewrite the entire application. Most of the tools | I write run on the CLI but I am in fact a fan of Kubernetes for | services, message processing and certain batch jobs simply | because I don't have to manage their life-cycles. | Deadron wrote: | 12 factor apps sacrifice performance and simplicity of your | environment for scalability. Unless you are guaranteed to | start with a worldwide audience its complete overkill. A | better solution is to write your application with the rules | in mind with the goal of making it easy to transition to a 12 | factor style app when its needed. Scale up then scale out | will result in the best performance for your users. | vp8989 wrote: | The thinking on efficiency vs scalability is largely | influenced by the US tech company meta where founders give | up equity so they don't have to worry about profitability | for a very long time. | | In that case, it is preferred to burn piles of cash on AWS | instead of potentially needing to sacrifice revenue because | you can't scale quickly enough. | | An architecture that is not scalable is considered a | failure whereas one that is complex and inefficient is much | more tolerated (as long as it can scale out) ... at least | until the funding dries up or Wall Street activists get | involved. | sudhirj wrote: | The 12 factors are mostly common sense that apply in pretty | much any situation - they help with fancy deployments but | also with single servers on DO or even on-Prem servers. | Turbots wrote: | For one application, kubernetes is obviously giant overkill. 
| Companies with hundreds or thousands of applications need a | platform that can offer a standardised deployment pattern, | runtime pattern and scaling pattern for those apps. Kubernetes | is a great start for managing all those containers in a | consistent, secure, multi-tenant environment. I know customers | with literally 10000s of VMs that are moving to more cloud-like | environments, either on premise in their own datacenter or in | the public cloud. They need these kinds of platforms to automate | away the repetitive stuff and have a secure posture, throughout | the company. | rc_hackernews wrote: | I haven't worked at a FAANG or any other company even close to | that level of scale, so you can take me with half a grain of | salt too. | | But what you said is absolutely true. It's also something you | will very much experience once you start working | professionally. | | I'm in no position to give you advice, and I think I might be | giving advice to myself...just don't let it get to you. | btilly wrote: | I have worked at FAANGs before. | | I am in firm agreement. I think that far too many people are | trying to solve the problems that they wish that they had, | rather than making it easy to solve the ones that they do | have. Going to something like Kubernetes when there is no | particular reason for it is a good example of that trend. | | When you really need distributed, there is no substitute. But | far more think that they need it than do. | laurencerowe wrote: | Micro-service architectures are an absurd overcomplexity for | smaller internal apps. They only really make sense when a | monolithic system becomes too large for a single team to | manage, at which point the micro-service boundaries reflect | team boundaries. | SPBS wrote: | If you are going to allocate sharded databases per client with | identical schemas, might as well give each of them an sqlite | database? Since you're massively cutting down the writes if it's | one database per customer. | brandmeyer wrote: | And then _just one_ query comes along where you need to make an | update that should be globally visible, and not just visible to | the shard. | | I can see why you would stick with full Postgres for as long as | practicable. | bmcahren wrote: | Postponing sharding and complex replication as late as possible | is very smart with any database (mysql, postgres, mongodb). It | can be very fragile or fail in unexpected or unusual ways and | most importantly it can take _much_ longer to fix. E.g. | 18 hours instead of 2 hours of downtime. | evanelias wrote: | Once your data grows very large, a successfully-implemented | sharding solution actually improves availability, rather than | reducing it. | | With a huge monolithic database, a failure causes downtime for | your entire product/company. Replica cloning and backups are | slow. Major version upgrades are stressful because it's all-or- | nothing. | | With a sharded environment, a single shard failure only impacts | a portion of your userbase, and smaller databases are faster to | perform operational actions on. | | There are definitely major downsides to sharding, but they tend | to be more on the application side in my experience. | codyb wrote: | Wouldn't sharding generally result in overlap between the key | ranges so that a database shard going down doesn't have to | result in any downtime? | | Then your issue is replication of writes I suppose. Probably | depends on use case what configuration you choose.
| Tostino wrote: | Totally depends on how you shard. In my case (b2b), i'd be | sharding by tenant. Having a single tenant go down would | not have the same impact as every single tenant going down. | wejick wrote: | Honestly I never found the case when this happens; data | always falls into 1 shard according to the key. Then comes | the concept of shard replica where the shard can live in | several nodes and form a redundancy. | | However I'm not sure how it's usually set up on Postgres. | evanelias wrote: | With a typical sharded relational database setup, each | sharding key value maps to exactly one shard. | | There should be replicas of that shard, which can be | promoted in case of a master failure. But in rare cases all | replicas may also be degraded or inconsistent and therefore | non-promotable. When this happens to a giant monolithic | non-sharded database, the impact is far more catastrophic | than when it happens to a single smaller shard. | | In any case, replication is a separate concern from | sharding. Each sharded database replica set (traditionally | 1 master + N replicas) has a data set that is unique / | independent of other sharded database replica sets. | | That said, some of the "NewSQL" distributed databases may | arrange their data in more complex ways. [Edit to clarify: | I mean in respect to the _combination_ of both sharding and | replication. With a traditional sharded relational | database, you have replica sets where each replica contains | the exact same shard or shards; with a NewSQL distributed | DB, data may be replicated in a more complex arrangement] | zinclozenge wrote: | Most, if not all, commercial newsql distributed databases | do range based splitting of the data, with each range | managed by a raft group. Raft groups get migrated between | nodes to avoid hot spots, among other scheduling | criteria. TiDB does this for sure, I'd be surprised if | CockroachDB (and yugabyte and dgraph) doesn't do it. | bmcahren wrote: | Shards increase the number of failure modes and increase the | complexity of those failure modes. For _most_ businesses, the | recommendation holds true... keep it simple, don't shard | until you _need_ to. | | I find it somewhat concerning that MongoDB has a better | architecture for upgrades than Postgres. You add a replica to | the cluster running the new major version and then switch | that replica over as your primary once you've replaced enough | instances in the cluster. Having worked in Oracle and MySQL | for years then having switched to a company with a MongoDB | framework I forgot how stressful upgrades would be with such | archaic limitations. | evanelias wrote: | > Shards increase the number of failure modes and increase | the complexity of those failure modes. | | I would only agree with this during the initial | implementation of sharding. Once deployed and stable, I | have not found this to be the case, at all. | | I say this as someone who has directly architected a | sharded database layer that scaled to over a trillion rows, | and later worked on core automation and operations for | sharded databases that scaled to an incalculable number of | rows (easily over 1 quadrillion). | | In both cases, each company's non-sharded databases were | FAR more operationally problematic than the sharded ones. | The sharded database tiers behave in common ways with | relatively uniform workloads, and the non-sharded databases | were each special snowflakes using different obscure | features of the database.
| | > keep it simple, don't shard until you need | | I would have wholeheartedly agreed with this until a few | years ago. Cloud storage now permits many companies to run | monster 10+ TB monolithic relational databases. Technically | these companies no longer "need" to shard, possibly ever. | But at these data sizes, many operations become extremely | painful, problematic, and slow. | throwdbaaway wrote: | One of those companies is facebook right? I think this | blog post provides a much more balanced view: | http://yoshinorimatsunobu.blogspot.com/2017/11/towards-bigge... | numlock86 wrote: | I like how the article starts with the metrics of what a single | node can already do. The trend appears to be to scale right at | the start, even before you have your first real customer ... the | implications and results are obvious. | AdrianB1 wrote: | I have experience with other relational DB products, but the | principle should be similar; I have a few databases over 1 TB, | but scaling to tens of TB would require more RAM than what a | typical 1S or 2S can support. CPU's are not that problematic with | the huge number of cores in AMD Epyc, but RAM is a serious | limitation, in my world I need between 15% and 60% RAM to | database size ratio, depending on whether it is transactional, | reporting or somewhere in between. Taking a 50 TB database as an | example, it is too much for a 4TB RAM 2 socket Epyc system. | beck5 wrote: | When it comes to replication & automatic failover in PG what is | the 2021 gold standard setup? | __s wrote: | pg_auto_failover is good | https://www.citusdata.com/blog/2019/05/30/introducing-pg-aut... | | Disclosure: I work for Citus | ksec wrote: | In 2016 I was expecting that sane defaults like MySQL's would | have arrived by 2018 or 2019. | | Looks like 2021 isn't that much different to 2016. There is | work being done, but it doesn't seem to be anywhere close to the | level of MySQL. | dijit wrote: | wow, are you kidding? MySQL replication is possibly the worst | I've ever seen. | | There is almost no consideration for the target being up to | date, I have personally experienced missed inserts, | replication lags, and the replica being set to read/write. | | PGs integrated replication is far superior, it even has | support for bootstrapping a new replica node without rsync | and a command to `promote` the replica to write master. | claytonjy wrote: | What about Percona? I haven't used it, but a common | sentiment on HN has been that Postgres doesn't have | something quite on par with it. | | Citus and, more recently, Patroni, seem to be the dominant | Postgres analogues; have they caught up? Where do they | dominate? | evanelias wrote: | Replication issues in modern MySQL are caused by user | error, not inherent bugs. For example if you're configuring | replication manually, you must ensure that replicas are set | to read_only. Clustering options introduced over the past | few years do this automatically though. | | > PGs integrated replication is far superior, it even has | support for bootstrapping a new replica node without rsync | and a command to `promote` the replica to write master | | MySQL has this functionality too, e.g. | https://dev.mysql.com/doc/refman/8.0/en/clone-plugin.html | and https://dev.mysql.com/doc/mysql-shell/8.0/en/working- | with-re... | | To be clear, I am not claiming MySQL is superior to PG. | However, I have found that many PG users are unfamiliar | with the huge improvements introduced in MySQL over the | past ~8 years.
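|
| On the Postgres side, the knobs mentioned in this sub-thread (hot_standby_feedback, replica
| lag, promotion) are all reachable from SQL. A minimal sketch -- the setting and queries are
| illustrative, not a recommended configuration:
|
|     -- on a streaming-replication standby, after putting hot_standby_feedback = on
|     -- in postgresql.conf:
|     SELECT pg_reload_conf();                            -- apply the change without a restart
|     SELECT now() - pg_last_xact_replay_timestamp();     -- rough replay lag, measured on the standby
|
|     -- on the primary: one row per connected replica
|     SELECT client_addr, state, replay_lsn FROM pg_stat_replication;
|
|     -- promote a standby to primary (PostgreSQL 12+); older versions use pg_ctl promote
|     SELECT pg_promote();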
| hintbits wrote: | Use built-in physical replication, it works and it's good. | | Patroni is widely used for automatic failover, at least where | you don't want a possibility of a split brain. | mmacedo_en wrote: | I'm building an app using Postgres for the first time. Naturally | I was a bit worried about performance and scaling if the not | launched yet app becomes a major success. | | I began simulating a heavy use scenario. 100k users creating 10 | records daily for three years straight. | | 100000 x 10 x 365 x 3 ~= 1 billion rows or about 200 GB with a | record's length of 200 bytes. This is peanuts for modern | databases and hardware. | | Seems like a single node can support it for a long way before I | have to worry about performance... | majewsky wrote: | You classifying 11 writes per second as "heavy use" reminds me | of how people on average completely underestimate how fast | computers actually are (when they're not bogged down by crappy | programs). | mixmastamyk wrote: | I don't believe the grandparent's simulation actually took | three years, it was likely operations with a particular data | size that was tested. | | Still, your main point stands. Around 2001 I wrote a | C-program to record every file and size on a large hard disk. | We were all amazed that it finished (seemingly) before the | enter key had come back up. Must be a bug somewhere, right? | Nope. | | Much earlier I wrote a Pascal program on a 486 in school that | did some calculations over and over again, writing the output | to the screen. It blew my mind then how fast the computer | could do it. | sharadov wrote: | Over- engineering has become a disease, and has reached pandemic | -level proportions, primarily driven by resume-padders and | clueless management folk. | marcus_holmes wrote: | TLDR: As always, don't fix a problem until just before you have | it. | fabian2k wrote: | One point I found very interesting was the following paragraph: | | > For example, on my (pretty average) workstation, I can do ca. | 25k simple read transactions per 1 CPU core on an "in memory" | pgbench dataset...with the default config for Postgres v13! | | In my own very unscientific experiments I never got values as | high as that, but in the area of around 4k transactions per | second total on multiple cores. Of course I'm comparing very | different things, I was looking at more than just the database | and there are lots of other aspects I probably did in a different | way. | | I find this interesting as it probably means that my bottleneck | wasn't entirely where I thought it was. I have to play around | with that some time again. | | If I wanted to find out the baseline Postgres performance on my | hardware for trivial read queries like looking up individual rows | by primary key with a database that fits entirely in RAM, how | would I do that? I know there is pgbench, but that performs more | complex queries as far as I understand. | ahachete wrote: | As this is a public reference: GitLab's Postgres cluster | handles peaks of 300K tps, where the master node alone supports | around 60K-80K. And this is not in-memory (datasize in the | order of 8TB, RAM 600GB). | | https://about.gitlab.com/blog/2020/09/11/gitlab-pg-upgrade/ | | And there's still room for vertical scaling. | | Disclaimer: we provide Postgres support for GitLab. 
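|
| Regarding the pgbench question a few comments up: pgbench can replay a custom script with
| -f, so a primary-key lookup benchmark against an in-memory dataset is only a few lines. A
| minimal sketch -- the table, column, row count, and client counts below are hypothetical
| placeholders, not tuned values:
|
|     -- pk_lookup.sql: one random single-row lookup per transaction
|     \set id random(1, 1000000)
|     SELECT * FROM accounts WHERE id = :id;
|
|     -- run with something like:
|     -- pgbench -n -f pk_lookup.sql -c 8 -j 8 -T 60 mydb
|
| The -n flag skips the pre-run vacuum of pgbench's own standard tables (needed when they
| don't exist), and the built-in select-only mode (pgbench -S) is a reasonable ready-made
| approximation of the same workload.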
| mmacedo_en wrote: | HN is missing an user follow and user tagging button :-) | ahachete wrote: | If that's for me, I really appreciate that :) | | There's obviously that option on Twitter (same username as | here), if you want to follow there ;) | __s wrote: | You can give pgbench files to have it run the query you want | (see -f flag) | | https://www.postgresql.org/docs/current/pgbench.html | wilsonrocks wrote: | This was an interesting read for a database novice. It seems like | a lot of the quoted stats are about in memory datasets - is that | realistic? | dap wrote: | It depends a lot on what you're doing. In my last role, keeping | the Postgres database in memory was absolutely not an option, | and we ran into major issues related to physical I/O. | bmcahren wrote: | Yes. Memory can reach 768GB on a single instance today and I | imagine that to expand. From there you can scale by sharding. | | In memory provides real-time transactions you can't guarantee | when using disk-based storage. | ahachete wrote: | EC2's u-24tb1.metal is 224/448 cores/hyperthreads and 24TB | RAM ;) | | https://aws.amazon.com/sap/instance-types/ | wiredfool wrote: | And nearly 2 million for a 3 year reserved instance. | atonse wrote: | This reminded me of a story from 15 years ago. | | I once worked for a company that was writing a proposal | for a US Homeland Security IT system. This was 2006. I | wasn't involved in it but my office-mate was. He randomly | turned his chair around and said "hey, can you go on | Dell.com and see if you can build the most expensive | server imaginable" - so I did and I ended up at around | $350k. I don't remember what it was, but at the time the | stats just felt absolutely obscene. I do remember that, | at that scale we were buying a closet, not just one box. | | And I told him the price, and he said, "They have set | aside $20 million for server hardware, and what you | configured is way way overkill for what they think they | need" - so we were both so damn perplexed because, at | least from what he told me, this didn't need that much. | And it wasn't one of those classified projects where they | couldn't tell you what it was for. It was a pretty boring | line of business type thing (like HR or building access | control or something). | | Maybe that's probably more a story of just how much cash | was being poured into that agency during the years after | 9/11. | jarym wrote: | They got to the end of their financial year and had $$$ | left to spend... happens all the time. | dekimir wrote: | > In memory provides real-time transactions you can't | guarantee when using disk-based storage. | | This is changing as we discover better ways of coding IO | operations. | | https://www.globenewswire.com/news- | release/2019/11/05/194114... | | https://itnext.io/modern-storage-is-plenty-fast-it-is-the- | ap... | bcrosby95 wrote: | How long does it take to get the data off disk and into | memory after coming online? A decade ago filling just 64GB of | memory with our hot dataset was a painful process. I can't | imagine it's any nicer with 768GB. | fabian2k wrote: | You can get more than that on common servers, see e.g. the | recent post by Let's Encrypt on their new database server: | | https://letsencrypt.org/2021/01/21/next-gen-database- | servers... | | They have 2TB RAM in there, and I suspect that is not the | largest possible amount if you're willing to spend more money | (though probably the largest possible amount for that | particular server). 
| | I played around a bit on the Dell website, and the largest | server I could find supported up to 6TB RAM, at a price of | ~300k EUR (I assume nobody pays list price for these). | tyingq wrote: | 4TB seems to be the limit for normal-ish servers: 32 slots | filled with 128GB DIMMs. You can find servers with more | slots (48 is common, 6TB), but you're going past most | people's definition of "commodity". | [deleted] | 0x10c0fe11ce wrote: | Yes. That's why the OP talked about separating hot/cold data | and scaling up instead of scaling out. | f430 wrote: | Anybody know if this applies to Temporal Tables? I've thought of | using Datomic, but it seems like Temporal Tables does the job; not | sure about immutability and how that might impact performance, | because you are now dealing with keeping a timestamp record of | every transaction. | ComodoHacker wrote: | It's not surprising to hear such advice from a PostgreSQL | consultancy shop: don't bother and relax until the day you | desperately need PostgreSQL consultancy! /s | | Seriously though, the main point stands. PostgreSQL isn't | designed with "distributed" in mind, so try vertical scaling | tricks first. | [deleted] | mumblemumble wrote: | The skepticism is well-founded, but there's some truth to it. | | PostgreSQL (and traditional RDBMSes) aren't built with | horizontal scaling in mind. That's both a blessing and a curse. | The curse is obvious. The blessing is that it means that they | will let you get _much_ further with vertical scaling. | Especially if you take the time to learn how to tune them. | | Support for horizontal scaling comes at a cost. When I've done | performance analyses, I regularly find that parallel and | distributed implementations spend most of their wall-clock time on | synchronization and communication. Occasionally more than 90%. | That creates a big up-front cost you need to overcome before | you see a net benefit. It also leads to a sort of horizontal | scaling version of the rocket equation where returns can | diminish rapidly. | ccleve wrote: | The difficulty with this advice is that it assumes that you have | many small transactions. | | Yes, of course, you should try to build your application so | queries and transactions are very short. That solves a great many | problems. | | But sometimes you can't. Sometimes you just have to do joins | across large tables. There just isn't any other way. Your query | is going to run for 5, 10, maybe 30 seconds. That's a huge burden | on your server, and it will slow down the other tenants. In that | case, the _only_ answer is to distribute the queries across other | boxes. | | I agree with the advice in general -- delay moving to a | distributed system as long as possible -- but sometimes you just | can't. | btilly wrote: | Point #1, there is a world of difference between a reporting | database and a transactional database. If you need a reporting | database, ship logs and set up a reporting database separate | from your transactional one. That solves most of the problem. | | Point #2, the fact that you've hit performance problems does | not mean that you need to distribute. Every real system that | I've seen has had order-of-magnitude performance improvements | left when it first looked like it was topping out. Add a | caching layer, better indexes, more targeted queries, | appropriate use of window functions, etc. Give those a try | before biting the bullet on distributed data. | | Point #3, here is a good rule of thumb for distributed systems.
| Whatever scale you hit on a single machine, you can probably | gain 1-2 orders of magnitude of performance by switching to a | faster language and carefully optimizing. If you switch to | distributed, you'll LOSE at least an order of magnitude of | performance due to the overhead of RPCs, but are able to scale | indefinitely afterwards. | | If you're distributing for reliability, great. But if you're | distributing for performance and you have fewer than 20 machines | in your system, either your problem is embarrassingly parallel | or you likely aren't achieving a net win. | | I've seen a lot of people prematurely distribute, run into | performance challenges, solve them, then pat themselves on the | back for being smart enough to have distributed their code, | while failing to recognize that they were addressing a | self-inflicted injury. | ajsharp wrote: | My thoughts exactly on point #1. Nothing in a hot path should | take multiple seconds. | giantrobot wrote: | Postgres has materialized views for the case of expensive | queries. Instead of doing a view query in real time, it | physically caches the results and has the same accessibility as | a regular view. | | They don't solve all situations with expensive queries, but they | help a lot. The fact they behave like dynamic views means you | can migrate a dynamic view to materialized if you run into | performance issues, without making changes to the client | application. With views in general you can shuffle around | tables and relationships without the client application knowing | or caring. | mmacedo_en wrote: | I've seen queries running for 1 minute, 2 minutes, raising user | complaints. Then we looked at it, and with a few changes in | indexes and query hints brought it down to sub-second | execution. | | Before thinking about distributed systems, there is an entire | database optimization toolkit to make use of: primary key | review, secondary index creation, profiling, view or stored | procedure creation, temporary tables, memory tables and so on. | ozkatz wrote: | This. Using any database requires building up expertise | and understanding how to use its capabilities properly. If | you hit a wall with a non-distributed database and your | solution is to replace it with a distributed one - you will | have a bad time. The surface area of what you need to know to | use it properly still includes all the basic data modeling, | indexing, and troubleshooting that was required before, plus a | whole lot of networking, consensus protocols and consistency | models to worry about (just to name a few). | CharlesW wrote: | > _Then we looked at it, and with a few changes in indexes | and query hints brought it down to sub-second execution._ | | This is exactly what the parent comment said: "you should try | to build your application so queries and transactions are | very short". | | If you're claiming that the parent is incorrect about | "sometimes, the only answer is to distribute the queries across | other boxes", my guess is that you probably don't work at | a scale where you've learned that query optimization can't | solve every database performance problem. | btilly wrote: | Based on my past experiences, I'd be happy to take an | even-money bet that more than 95% of organizations that go | distributed for performance reasons actually caused | themselves more problems than they solved. | | This does NOT mean that there are no use cases for | distributed - I've seen Google's internals and it would be | impossible to do that any other way.
But it does mean that | when someone is telling you with a straight face that they | needed horizontal scalability for performance, you should | assume that they were probably wrong. (Though probably | saying that is not the wisest thing - particularly if the | person you're talking to is the architect whose beautiful | diagrams you'd be criticizing.) | | So yes, there are problems that require a distributed | architecture for performance problems. That doesn't | contradict the point that every other option should be | explored first. | CharlesW wrote: | > _Based on my past experiences, I 'd be happy to take an | even money bet that more than 95% of organizations that | go distributed for performance reasons actually caused | themselves more problems than they solved._ | | Oh, I'm absolutely positive you're right! And as you | know, even in distributed architectures there's a huge | range of solutions from "let's do these _n_ reasonably | simple things " to "let's just rebuild our system to | solve every conceivable future problem". I don't know of | a scenario where the latter has ever worked. | jeffbee wrote: | Indexes are not free, they take up space and they make | mutations more costly. Also, building the index may not even | be possible while your application is running, because | postgresql and other RDBMS have inadequate facilities for | throttling index construction such that it doesn't harm the | online workload. You might have to build indexes at midnight | on Sundays, or even take your whole system offline. It can be | a nightmare. | | This isn't just a problem for SQL databases. Terrible-but- | popular NoSQL systems like MongoDB also rely heavily on | indexes while providing zero or few safety features that will | prevent the index build from wrecking the online workload. | | I personally prefer databases that simply do not have | indexes, like bigtable, because they require more forethought | from data and application architects, leading to | fundamentally better systems. | btilly wrote: | I've seen https://www.postgresql.org/docs/12/sql- | createindex.html#SQL-... work well in practice on busy | transactional databases. I'd be interested in knowing about | cases where it doesn't work well. | | My experience on systems without indexes differs strongly | from yours. Yes, they can work well. But if you have | multiple use cases for how your data is being queried, they | push you into keeping multiple copies of your data. And | then it is very, very easy to lose consistency. And yes, I | know about "eventual consistency" and all that - I've found | that in practice it is a nightmare of special cases that | winds up nowhere good. | jeffbee wrote: | Just from personal experience, if they can build gmail on | top of a database (bigtable) having neither indexes nor | consistency, then probably it will also be suitable for | the purposes of my far smaller, much less demanding | products. | | On the other hand I've seen, and am currently suffering | through, products that have desperate performance | problems with trivial amounts (tens of GB) of data in | relational databases with indexes aplenty. | mcherm wrote: | > Sometimes you just have to do joins across large tables. | There just isn't any other way. Your query is going to run for | 5, 10, maybe 30 seconds. 
| | While that is true, I would speculate (based on my own | experience, not any actual research) that it is far more common | that users have an unoptimized query or database schema | (perhaps it just needs an index to be created) which is taking | tens of seconds to run but doesn't NEED to, than it is that | users have a need for a complex query that cannot run faster | than that. | | So for MOST users, the best advice is to learn how to use | EXPLAIN ANALYZE and other database tools to optimize execution. | Only if you ALREADY know that should you be considering moving | to a read replica or some sort of distributed database. | cliftonk wrote: | You could always start with postgresql and use the transaction | log to materialize views for particular use cases (i.e. | differential dataflow). This post is clickbaity in that it | doesn't give any administration advice, but it is right in the broad | sense that postgres/mysql are still the best low-latency single | sources of truth available. | petr25102018 wrote: | For anyone interested in general advice regarding scaling | traditional SQL databases, check out my article [0], where I | collected various techniques while investigating this topic at my | last workplace. | | [0] https://stribny.name/blog/2020/07/scaling-relational-sql- | dat...
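A couple of commenters above suggest materialized views for caching expensive queries. A rough, minimal sketch of that approach follows; the table, view, and column names are invented for illustration, and the refresh cadence is left to the application:

    -- Physically cache an expensive aggregation instead of recomputing it
    -- on every request.
    CREATE MATERIALIZED VIEW daily_order_totals AS
        SELECT customer_id,
               date_trunc('day', ordered_at) AS day,
               sum(amount)                   AS total
        FROM   orders
        GROUP  BY customer_id, date_trunc('day', ordered_at);

    -- A unique index is required before REFRESH ... CONCURRENTLY can be used,
    -- which rebuilds the view without locking out readers.
    CREATE UNIQUE INDEX ON daily_order_totals (customer_id, day);

    -- Re-run periodically (cron, pg_cron, an application job, ...).
    REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_totals;

    -- Clients query it like any other relation:
    SELECT total FROM daily_order_totals
    WHERE  customer_id = 42 AND day = date '2021-01-25';

The trade-off is staleness between refreshes, which is often acceptable for reporting-style queries.
___________________________________________________________________
(page generated 2021-01-26 23:00 UTC)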