[HN Gopher] Updating a 50 terabyte PostgreSQL database (2018)
       ___________________________________________________________________
        
       Updating a 50 terabyte PostgreSQL database (2018)
        
       Author : Maksadbek
       Score  : 99 points
       Date   : 2022-01-13 17:06 UTC (5 hours ago)
        
 (HTM) web link (www.adyen.com)
 (TXT) w3m dump (www.adyen.com)
        
       | jfbaro wrote:
        | Thanks for sharing this. Really interesting. But a basic
        | question: why not upgrade to PG 13 instead? I am curious about
        | the reasons for staying on an older version of PostgreSQL.
        
         | jfbaro wrote:
         | I see it now that it's a 2018 article. Makes sense!
        
         | hawk_ wrote:
          | Interesting. Does PG13 let them do this without downtime?
        
           | CubsFan1060 wrote:
           | Not from 9.4, but now there is a path using logical
           | replication between versions:
           | https://www.percona.com/blog/2019/04/04/replication-
           | between-...
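
For context on the logical-replication path mentioned above: PostgreSQL 10 and later have the primitives built in (CREATE PUBLICATION / CREATE SUBSCRIPTION), while an older source like 9.4 would need an extension such as pglogical instead. A minimal sketch of the built-in route, with placeholder host, database, and role names, and assuming wal_level = logical on the source:

```sql
-- On the old (source) cluster (PG 10+; requires wal_level = logical):
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new (target) cluster, after loading the schema
-- (e.g. with pg_dump --schema-only):
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-primary dbname=payments user=replicator'
    PUBLICATION upgrade_pub;

-- Once the subscriber has caught up, cut traffic over to the new
-- cluster and drop the subscription:
-- DROP SUBSCRIPTION upgrade_sub;
```

The initial table sync still copies all the data, which is why this path is slow for very large databases, but it avoids the hard downtime of pg_upgrade.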
        
       | simonw wrote:
       | I didn't understand how they avoid downtime during the upgrade -
       | they start with "Stop traffic to the database cluster" and then
       | perform a bunch of steps that each take 5-10 minutes or longer
       | before starting traffic again - so do they just accept the
       | downtime and schedule it in advance, or did I miss a detail?
        
         | rattray wrote:
         | Yeah, they architect their application to accept DB downtime -
         | but I'm sure their services are still degraded to some degree
         | or another during this, and they aren't clear how much total DB
         | downtime they need for this (and how that time scales across
         | various axes).
         | 
         | Overall my takeaway is basically "if you want to upgrade a
         | large Postgres db, you'll need like an hour of planned downtime
          | and a lot of careful work" which... doesn't make me excited
          | to upgrade big Postgres DBs in production.
        
           | znpy wrote:
           | > Overall my takeaway is basically "if you want to upgrade a
           | large Postgres db, you'll need like an hour of planned
           | downtime and a lot of careful work"
           | 
           | HA is one of those things where MySQL wins hands down, sadly.
           | Sadly in the sense that PostgreSQL HA still looks like a
           | couple of hacks held together with duct tape, at least when
           | compared to MySQL solutions.
           | 
           | The Percona MySQL distribution has multi-master HA (Percona
           | Xtradb Cluster) which is great for HA, and vanilla MySQL 8.x
           | has group replication which iirc also is multi-master.
           | MariaDB instead has multi-master based on Galera Cluster
            | (although I'm not very knowledgeable about MariaDB).
           | 
           | In PostgreSQL world there are many HA solutions, none of
           | which is both multi-master and open source (sadly).
        
           | rattray wrote:
            | I did a little googling, and the answers still seem to be
            | "logical replication, which is slow for large DBs" or
            | "pg_upgrade, which requires nontrivial downtime".
           | 
           | Gitlab wrote a much more in-depth rundown in 2020 of a
           | pg_upgrade, which in their case took 2h of downtime and 4h
           | total time: https://about.gitlab.com/blog/2020/09/11/gitlab-
           | pg-upgrade/ (submitted to HN here:
           | https://news.ycombinator.com/item?id=29926316). They included
           | a video of the whole thing which is pretty incredible.
        
         | gbrown_ wrote:
          | Whilst others have mentioned that their application is
          | architected to cope with this, I wanted to echo the
          | sentiment. This post feels very much like a "draw the rest of
          | the owl" meme.
        
         | vonnieda wrote:
         | > One other detail to note is that we built our software
         | architecture in such a way that we can stop traffic to our
         | PostgreSQL databases, queue the transactions, and run a
         | PostgreSQL update without affecting payments acceptance.
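
The stop-traffic / queue-transactions / replay pattern quoted above can be sketched as a toy, assuming some db object with an execute() method; this is purely illustrative, not Adyen's actual implementation:

```python
from collections import deque


class QueueingWriter:
    """Buffers writes while the database is paused for an upgrade and
    replays them, in order, once the database comes back."""

    def __init__(self, db):
        self.db = db          # any object exposing execute(stmt)
        self.paused = False
        self.queue = deque()

    def pause(self):
        # Stop sending traffic to the database (e.g. before pg_upgrade).
        self.paused = True

    def write(self, stmt):
        if self.paused:
            self.queue.append(stmt)   # queue the transaction instead
        else:
            self.db.execute(stmt)

    def resume(self):
        # Database is back: drain the queue in order, then go live again.
        while self.queue:
            self.db.execute(self.queue.popleft())
        self.paused = False


class FakeDB:
    """Stand-in database that just records statements."""

    def __init__(self):
        self.log = []

    def execute(self, stmt):
        self.log.append(stmt)
```

The important property is the one foobiekr raises below: the database itself is down, but the service keeps accepting work because writes land in a durable queue and are replayed afterwards.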
        
           | foobiekr wrote:
            | That's downtime by any reasonable measure, though it
            | depends on the exact specifics. Is writing to a journal
            | while the DB is down "downtime"?
            | 
            | Site-wise no, but DB-wise yes.
        
       | throwaway29303 wrote:
       | (2018)
        
         | wiz21c wrote:
         | was wondering why they were using old PG's :-)
        
       | LoungeFlyZ wrote:
       | Irony. I think the HN post took down the Slony website.
        
         | mulmen wrote:
         | http://howfuckedismydatabase.com/postgres/slony.php
        
       | leifg wrote:
        | One of the things I always wonder about giant relational
        | databases is: how much of the "typical relational stuff" are
        | they actually using?
       | 
       | Do they have constraints on rows? Are they using views or do they
       | just denormalize and duplicate? Do they use joins at all? Are
       | they even doing more than 1 thing in a transaction?
        
         | mulmen wrote:
         | My only experience with databases of that size is for data
         | analysis so yeah, constraints are relaxed. But even at that
         | point ideas like normalization are critical to extracting
         | performance out of large datasets.
         | 
         | Normalization _is_ a performance optimization. Denormalization
         | is a development shortcut. Neither is right or wrong but I
         | would be surprised if a 50TB OLTP database wasn't already
         | highly normalized.
         | 
         | If it isn't then my next guess is that it could be made smaller
         | or more performant if it was.
         | 
         | We used to be proud of server uptime, back when we gave them
         | names. Today if you have a server up for 900 days you're going
         | to be accused of malpractice.
         | 
         | Similar for data. We used to be proud of being able to keep
         | "big" data online, but I'm no longer impressed. You're just
         | spending money. Did you actually solve the business case as
         | efficiently as possible given available resources? Do you
         | _need_ 50TB in the first place?
        
           | merb wrote:
           | > Normalization is a performance optimization
           | 
            | That is a stupid statement because it's way too generic,
            | and it depends on the use case. Read-heavy data that needs
            | a lot of joins is most often denormalized IF updating a lot
            | is not a problem. Sometimes you need to create views that
            | pull in different stuff with different queries, which would
            | make them not really performant, especially on Postgres,
            | which is just super slow when it comes to listing/filtering
            | data.
        
           | Kranar wrote:
           | Normalization is pretty independent of optimization strategy
           | and generally that's not presented as one of its advantages.
           | Normalizing data can improve write performance but will make
            | reads slower, so if you have a write-intensive database then
           | you will see some performance gains.
        
           | newlisp wrote:
           | Normalization improves data integrity.
        
       | paozac wrote:
       | 50TB is not so big these days. I read that in 2008 (!) Yahoo had
       | a 2+ PB PG database. What is the largest you know of, 14 years
       | later?
        
         | Gigachad wrote:
         | How are people dealing with databases this large? At work we
         | have a mysql db with a table that has 130M records in it and a
         | count(*) on the table takes 100 seconds. Anything but a simple
         | look up by id is almost unworkable. I assumed this was normal
            | because it's too big. But am I missing something here? Are SQL
         | databases capable of actually working fast at 50TB?
        
           | Something1234 wrote:
            | count(*) is always going to be slow. Postgres doesn't store
            | the exact number of live tuples, only an estimate, so it's
            | a full table scan. The secret is to use indexes to get down
            | to the small subset you care about. If you're filtering on
            | 3 columns, the goal is for the index to wipe out at least
            | half of the results you don't care about, and so on.
           | 
           | A 130M record table with no indexes is going to be crazy
           | slow. Although if all you need are primary key updates, then
           | that's the way to go.
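
On the Postgres side, the estimate mentioned above lives in the pg_class catalog and can be read directly when an approximate count is good enough; table and column names here are placeholders:

```sql
-- Fast approximate row count from the planner's statistics
-- (kept reasonably fresh by autovacuum / ANALYZE):
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'my_big_table';

-- An exact count(*) still requires a scan. For the filtered lookups
-- discussed above, an index on the filter columns is what keeps
-- targeted queries fast at this scale:
CREATE INDEX CONCURRENTLY idx_big_table_status_created
    ON my_big_table (status, created_at);
```

The estimate can drift between ANALYZE runs, so it suits dashboards and sanity checks rather than anything that needs an exact figure.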
        
         | ok_dad wrote:
         | Around ~2005 I took a tour of the [a well known government
         | organization] and they were bragging about several-PB-sized
         | databases at the time. Interestingly, there was a TON of server
         | racks there in a bomb-proof building with tons of security, and
         | they were all IBM servers (a supercomputer maybe?), if I
         | remember correctly. Also, there was one small server rack that
         | was painted differently from the rest (it looked like something
         | made in-house), and we asked what it was, and the tour guide (a
         | PhD computer scientist) said that technically it doesn't exist
         | and he couldn't talk about it even though it was super cool.
         | Now that I know what they were doing around that time (and
         | probably still today) I am kinda scared at the implications of
         | that tour guide's statement and what that one tiny rack was
         | for. I'm glad I never went to work in their organization, since
         | that tour was meant to recruit some of us a few years down the
         | road.
        
         | ddorian43 wrote:
         | Was it a single server?
        
         | GordonS wrote:
         | 50TB is big. Bigger is possible I'm sure, but I'd guess
         | 99.something% of all PG databases are less than 50TB.
         | 
         | If someone here commented they had a 2PB database, I
         | _guarantee_ someone else here would be like  "pfft, that's not
         | big"...
        
           | aantix wrote:
           | With 50TB, and if you were doing a full text search, wouldn't
           | the entirety of the index have to be held in memory?
        
           | Groxx wrote:
           | it's more than big enough to cause big problems / risk days
           | of downtime to change, yea. 50GB is not big. 50TB is at least
           | touching big - you can do it on one physical machine if
           | needed, but it's the sort of scale that benefits from bigger-
           | system architecture. 50PB would be world-class big, hitting
           | exciting new problems every time they do something.
        
           | mritun wrote:
            | The OP's message could have better said that 50TB databases
            | are common these days, when a single bare-metal server or a
            | 24xl I3en or I4* instance on AWS can hold 60TB raw.
        
       | lelandbatey wrote:
       | This article mentions that a key piece of software they use in
       | all this is Slony[0], software which allows replicating one
       | primary/master to many secondaries/slaves, even when those
       | replicas have different versions. That's pretty cool, but I'd
       | like to instead draw your attention (dear reader) to the
        | _phenomenal_ writing happening in Slony's description of
        | itself. It describes, in plain but technical language, what
        | Slony is, then immediately jumps into addressing "why should
        | you use this?" and does so very directly. It reiterates the
        | problem it solves,
       | mentions other potential solutions _by name_ to contextualize the
       | current solution space, then explains in a nice bulleted list
       | exactly the situations and niches that Slony is tailored for.
       | 
       | Oh my heavens, I wish every piece of technology described itself
       | this way!
       | 
       | [0] - Slony -- https://www.slony.info/
        
         | xblau wrote:
         | Archived link, since it appears to be down at the moment:
         | https://web.archive.org/web/20211031124351/https://www.slony...
        
       ___________________________________________________________________
       (page generated 2022-01-13 23:00 UTC)