[HN Gopher] Zero-downtime schema migrations in Postgres using views
___________________________________________________________________
Zero-downtime schema migrations in Postgres using views
Author : fabianlindfors
Score  : 82 points
Date   : 2021-06-16 18:39 UTC (4 hours ago)
(HTM) web link (fabianlindfors.se)
(TXT) w3m dump (fabianlindfors.se)
| nartz wrote:
| Very interesting. I like the idea of "virtualizing" the hard schema with views to abstract away some things.
|
| View performance can become an issue at larger scale for OLTP workloads. The solution you propose also adds complexity, since you now have two schemas instead of one and, as you rightly point out, complexity with the views themselves. The question becomes: when is this added complexity worth it?
| johbjo wrote:
| Views are equivalent to "query templates" unless, for example, they do operations on subquery aggregate columns (for example GROUP BY with SUM()).
|
| So views don't add complexity relative to giving the query explicitly. It might even help, since the incoming queries can be smaller, thus less parsing/processing.
| fabianlindfors wrote:
| I think you hit the nail on the head regarding complexity. I'm experimenting with creating a migration tool which handles all the views for you. I believe some of the process can be automated and the complexity reduced, but it remains to be seen how well it works!
| ThrustVectoring wrote:
| I looked into this recently, and views cover like 98% of the functionality that the client app needs from Postgres. One issue I ran into was that Postgres forgets that the primary key is a primary key when pulling from a view, which breaks some queries that rely on grouping by the primary key.
|
| https://dba.stackexchange.com/questions/195104/postgres-grou... has some more info on this
| e12e wrote:
| Doesn't DISTINCT work in this case?
| SELECT DISTINCT ON (vt.id) row_to_json(vt.*)
| FROM vt JOIN vy ON vt.id = vy.tid
| WHERE vt.id = 1 AND vy.amount > 1;
| ThrustVectoring wrote:
| The SQL in question which was problematic for me (tables renamed):
|
| SELECT posts.*,
|        MAX(COALESCE(comments.created_at, posts.created_at)) AS latest_timestamp
| FROM posts
| LEFT OUTER JOIN comments ON posts.id = comments.post_id
| GROUP BY posts.id
| ORDER BY latest_timestamp DESC
|
| In short, this is sorting posts by the most recent comment, with a fallback to the post date if the post has no comments on it. Hard to get rid of the grouping here and get the same data back.
| e12e wrote:
| I see, that makes more sense.
| fabianlindfors wrote:
| Interesting link, thanks! It would be really nice if Postgres were to close that gap and make them fully equivalent (if that is even possible).
| munk-a wrote:
| I recently transitioned a number of tables over to views as part of a data model rearrangement, and I absolutely loathed my previous self for leveraging that primary-key trick. I don't think there would be anything unsafe about them allowing any column covered by a single-column unique index on the table to serve this role, and it'd help make things a fair bit more logical.
|
| That all said, until that happens, I'd strongly suggest avoiding that functionality since it can lay down some real landmines.
| [deleted]
| julian37 wrote:
| This might work to some extent for renaming things but doesn't for any other kind of migration I can think of:
|
| - Dropping a column doesn't work (assuming the point of dropping it is actual deletion, rather than just hiding it)
|
| - Adding a column doesn't work either
|
| - Changing a column's nullability, default value, or data type doesn't work
|
| - Doesn't help with adding or changing constraints or indexes
| fabianlindfors wrote:
| With this model, migrations would be two-stepped.
| First the migration would be applied, creating a new schema and adding the new information to the underlying table. Once the version is old and no longer used by any clients, the migration would be "deleted", removing the schema and the underlying information.
|
| So when dropping a column it would go like this: 1. Migration is applied. The column is removed from the new view but remains in the backing table. 2. Migration is no longer needed. The column is removed from the backing table.
|
| This design should be extendable to cover changing a data type (introduce a new column, move the values over, and keep values in sync using a trigger), changing a default, or changing the contents of a full column.
|
| Constraints or indices are a bit trickier, but it might work to create a new column duplicated from the old one and add constraints to that instead.
| kroolik wrote:
| Adding a column, changing a column's nullability and adding/changing constraints are already zero-downtime in PG.
|
| Not sure about a default value change, but I would also say it's zero-downtime, as adding a column with a default is zero-downtime. Haven't checked, though.
|
| Dropping a column and changing the data type are not zero-downtime.
|
| Renaming a column is not zero-downtime, as you might have multiple readers/writers using both names.
| mikeklaas wrote:
| Adding a NOT NULL constraint isn't zero downtime, since PG scans the whole table to ensure that the constraint is not violated.
| dariusj18 wrote:
| I don't remember which, but there was a DB server I worked with where the tables weren't directly accessible and aliases had to be created, thus enforcing portability.
| fabianlindfors wrote:
| Author here, thanks for reading. As has been mentioned in some comments, the article only covers half the story of how this can be used for migrations. The greater idea is to have every migration be short-lived and two-phased: apply and cleanup.
| A migration is first applied, and once it's no longer used by any clients, it's cleaned up.
|
| 1. Apply: the new schema is created and any new data is added to the backing table.
|
| 2. Cleanup: the schema is deleted and any data that is no longer needed is removed from the backing table.
|
| For example, to perform a column deletion:
|
| 1. Apply: create the new schema without the column which is being deleted. Don't change anything in the backing table.
|
| 2. Cleanup: delete the schema and the column from the backing table.
| brycelarkin wrote:
| I think views are a great way of abstracting out business logic. The one thing keeping me from using them more frequently is that they don't work well with Row Level Security.
| xemoka wrote:
| I was having issues with this as well; it seemed my problem was that I was creating the view as a superuser. Since view creation is essentially WITH SECURITY DEFINER (to take a term from UDFs), any user accessing the view to see the underlying RLS-enabled table would see all the results (since the `postgres` user bypasses all RLS).
|
| Creating a separate "data owner" and "api owner" to own the api and data schema+tables respectively alleviates this problem. Your API schema owner != your table owner unless you `ALTER TABLE FORCE ROW LEVEL SECURITY` in addition to just enabling it.
|
| What issues are you running into with RLS?
| kroolik wrote:
| Interesting approach, with a pretty dark side-effect: the tech debt silently creeping in and staying forever. Apart from that, you can use any abstraction over the base schema, depending on the use-case. Renaming a column can be as simple as renaming a field in the DTO structure.
|
| The interesting part is when you actually get to pay the debt back. Yes, this is not a free lunch. You move fast, but you eventually have to pay the cost of the underlying storage containing all the historical schemas combined at once.
|
| This is the difference between carefully planned schema migrations and abstraction-based migrations. The former takes more time to plan and execute, but the end state is the reference schema. The latter keeps on extending the underlying schema and building projections over it. Over time, the system accumulates tech debt in the form of an ever-expanding base schema that contains all the historical columns.
| johbjo wrote:
| The point here is to do migrations live while clients randomly connect and stay connected for random amounts of time. When all old clients have disconnected, the old schema can be removed.
|
| The point is not to let old clients work indefinitely.
|
| In other versions of this idea, all views/procedures are prefixed with some revision number, but that looks messy.
| kroolik wrote:
| But what happens when schema Xn adds a column N, and schema Xn+1 removes it? The base schema, the source for all the views, will contain all the columns Xn... from all the migrations until purged.
|
| That purging is what has a lot of operational complexity. And renaming a column. All the rest is zero-downtime in PG already.
| johbjo wrote:
| It would have to be in steps:
|
| Add Xn+1: create new views without the column. When no Xn clients remain: drop the column from the tables and drop schema Xn.
|
| The point as I see it is to not break live client connections which expect the column to exist.
| kroolik wrote:
| Yes, this is a very important insight! I think the author proposes this approach as a replacement for regular schema migrations, whilst both approaches complement one another.
|
| You can use views to make migrations that were previously tricky zero-downtime.
|
| If that's not the case, then I must've read the article wrong!
|
| Edit: although when I think of it - if you want to eventually materialize old migration schemas into the base schema, you need to do the rename, too.
| Which is not zero-downtime because the new migration schemas do the renaming automatically, meaning changing the views' definitions, meaning lots of locking.
|
| So, you still need maintenance windows to merge all the changes. Just not on every change. Otherwise the base schema will eventually be completely out of sync and contain tons of old, unused columns.
| fabianlindfors wrote:
| You're right that the article doesn't mention how this can actually be practically used for migrations, but that is exactly what I had in mind. Maybe I'll write another one some day about the two-phase migrations combined with views.
| kroolik wrote:
| Looking forward to your future blog posts! The rename thing got me really interested.
| fabianlindfors wrote:
| This is precisely the way I was imagining it! I'm experimenting with creating a migration tool which will do this fully automatically.
| zozbot234 wrote:
| Views can perform arbitrary queries. They can achieve _any_ kind of inference starting from the base data that's stored in tables. (Keep in mind that modern SQL is fully Turing-complete.) This makes any view-augmented database a viable source of rich, complex knowledge, not just mere information.
| kroolik wrote:
| But I doubt unleashing the full expressive power of SQL is the point here. It would easily turn a moderately complicated "remove column" migration into a real maintenance hell.
|
| In the simplest form, when you add a column to some schema, it should be materialized in the base schema and exposed via the migration view.
|
| The problems start when you add and remove 20 columns, because even though they are no longer visible in migration schemas, they take up space in the base schema.
| lallysingh wrote:
| That sounds like problems upstream of this technique. Are you complaining that it isn't a silver bullet, or that it's so powerful that it'll be abused?
| kroolik wrote:
| @lallysingh, sorry, can't reply directly.
|
| All I'm saying is that I don't think we need a full Turing-complete cannonball to hit the (relatively) small fly of no-downtime migrations.
|
| Is it a silver bullet? It's Turing-complete, so chances are high that it is. But for me it carries a high risk of causing silver poisoning.
|
| Personally, I would stick with simpler solutions.
| squiggleblaz wrote:
| Butting in to say: maximum reply depth is relative to the view. Just click on the comment's time and you will get a form with a reply field.
| fabianlindfors wrote:
| Yes, this approach might actually decrease the power of your database, as some things become much trickier. It might be worth trading off some expressive power of the database for smoother, less hands-on migrations though, which is what I'm going for.
| fabianlindfors wrote:
| This is definitely a valid concern. My plan for building migrations on top of this is to have older migrations automatically be deleted as they become unused. Preferably you wouldn't actually keep using the old schemas, but rather keep them as you roll out changes to your application servers. Migrations could then be deleted (in reverse order) based on some condition, for example time since the schema was created or time since the last query.
| kroolik wrote:
| But in order to support the "add column" migration you have to add the column to the base schema, right? It has to be materialized somewhere.
|
| Now, when you remove a column, you just create a new view without it being available. But the column in the base schema must stay because: you support the old schema, and because you want to avoid the operational complexity of physically removing it, right?
|
| Now, you can obviously split a table into a list of tables, each storing the pk and some column value.
| But you have just invented columnar storage in a row-based RDBMS :D
|
| A different approach is to have a scheduled maintenance window, for example quarterly or every 6 months, when you materialize all the changes at once. The amortized operational cost is way lower :)
| luhn wrote:
| I thought this wouldn't work with INSERTs and UPDATEs, but it turns out newer versions of Postgres automatically support updating simple views. [1]
|
| [1] https://www.postgresql.org/docs/12/sql-createview.html (See "Updatable Views")
| munk-a wrote:
| The information on this is, admittedly, hard to dig out of the primary docs, but there also exists an INSTEAD OF trigger mode - so you can CREATE TRIGGER "actually_save_stuff" INSTEAD OF INSERT, which will intercept an attempted insert on the table and write the rows elsewhere.
|
| Generally I've found these useful in the short term for letting application logic continue to work blissfully ignorant of the fact that the "table" it is writing to has been moved, merged or refactored into a set of tables that the trigger will bridge logic over to. In the long term I've found that the automatic logic covers the most common case: a view that's filtering out some rows for some reason (permissions, soft deletion) before presenting it to the application logic.
___________________________________________________________________
(page generated 2021-06-16 23:00 UTC)
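The two-phase, schema-per-version flow discussed in the thread (apply, then clean up once old clients are gone) can be sketched roughly as follows for a column rename; this is a simplified illustration, and every table, schema, and column name here is made up rather than taken from the article:

```sql
-- Backing table; clients never query it directly.
CREATE TABLE users (id serial PRIMARY KEY, name text);

-- Version 1 of the schema exposes the table through a view.
CREATE SCHEMA migration_1;
CREATE VIEW migration_1.users AS SELECT id, name FROM users;

-- Apply phase of a rename: version 2 exposes "name" as "full_name".
-- The backing table is untouched, so clients on migration_1 keep working.
CREATE SCHEMA migration_2;
CREATE VIEW migration_2.users AS SELECT id, name AS full_name FROM users;

-- Each client pins the schema version it was deployed against, e.g.:
--   SET search_path TO migration_2, public;

-- Cleanup phase, once no clients use migration_1 any more:
DROP VIEW migration_1.users;
DROP SCHEMA migration_1;
ALTER TABLE users RENAME COLUMN name TO full_name;
-- migration_2.users survives the rename: Postgres resolves view
-- columns internally by attribute number, not by name, so the view's
-- stored definition simply follows the renamed column.
```

Both views are simple single-table views, so per the "Updatable Views" documentation linked above they accept INSERTs and UPDATEs directly, without any INSTEAD OF triggers.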