[HN Gopher] How not to blow up the production database ___________________________________________________________________ How not to blow up the production database Author : pyrolistical Score : 43 points Date : 2021-10-15 18:23 UTC (4 hours ago) (HTM) web link (blog.battlefy.com) (TXT) w3m dump (blog.battlefy.com) | tnorgaard wrote: | Answer: Materalized Views. | | On a unrelated note: Still hoping for those automatically | refreshed materalized views in PostgreSQL, ala what VoltDB has. | marcosdumay wrote: | Given how well they work on any non-specialised DBMS, I prefer | Postgres to take their time and do it right (AKA, differently | from everybody else). | aledalgrande wrote: | Another thing I'm waiting for in Postgres is lifting and | decoupling from the connection limit... | sojournerc wrote: | TimescaleDB (psql extension) has these, specific to time-series | however. | | https://docs.timescale.com/timescaledb/latest/how-to-guides/... | BikiniPrince wrote: | Yes, views are nice, but there is also a fair concept of not | needlessly bogging down a table. Sure, they were making up | data, but a flat table with stats, profile data and other | easily external data is just bloat. Once you have an id then | static fields can be retrieved from other services/data stores. | tomrod wrote: | I'm not sure I am following. Aren't materialized views just | formal, cached results of a query? That wouldn't bog down a | table. | lazide wrote: | I think their point is more 'don't store all that junk in | your primary database and then do all your work on it there | too if you can just stuff it somewhere else'. Which has | pros and cons and depends a lot on various scaling factors. | manquer wrote: | Perhaps he means it will bog down on refresh. | tomrod wrote: | Maybe? Not sure. | ericHosick wrote: | > Still hoping for those automatically refreshed materialized | views in PostgreSQL, ala what VoltDB has. | | Not exactly what you're hoping for and you probably already | follow this pattern. pg_cron can help (and is now available in | AWS RDS). | | ```sql CREATE EXTENSION IF NOT EXISTS pg_cron; | | CREATE MATERIALIZED VIEW IF NOT EXISTS | activeschema.some_thing_cached AS ...; | | SELECT cron.schedule('some_thing_cached', ' _/ 5 _ * * *', | $CRON$ REFRESH MATERIALIZED VIEW some_thing_cached; $CRON$ ); | ``` | tibiapejagala wrote: | I think that the problem is when you have a materialized view | which takes hours to refresh. We are lucky that 99% of our | traffic is during 7-19 on weekdays, so we can just refresh at | night, but that won't work for others. | | I don't know much about how postgresql works internally, so I | just probably don't understand the constraints. Anyway as I | understand, there are two ways to refresh. You either refresh | a view concurrently or not. | | If not, then postgres rebuilds the view from its definition | on the side and at the end some internal structures are | switched from the old to the new query result. Seems | reasonable, but for some reason, which I don't understand due | to my limited knowledge, an exclusive access lock is held for | the entire duration of the refresh and all read queries are | blocked, what doesn't work for us. | | If you refresh concurrently, postgres rebuilds the view from | its definition and compares the old and the new query result | with a full outer join to compute a diff. The diff is then | applied to the old data (like regular table | INSERT/UPDATE/DELETE I assume), so I think you get away with | just an exclusive lock and read access still works. There are | two downsides to this, first that it requires a UNIQUE | constraint for the join, second that the full outer join is a | lot of additional work. | | I never had the time to test Materialize, but it seems to do | what I want with its continuous refresh. | | I also thought about splitting the materialized view into | two, one for rarely changing data and another one for smaller | part of the data which changes daily. Then I would only have | to refresh the smaller view and UNION ALL both materialized | views in a regular view. Not sure how well will that work | with postgres query planner. | nkozyra wrote: | If it's a one shot data compilation, you could use | something like postgres' NOTIFY to trigger a listening | external app. | kccqzy wrote: | I've been looking at Materielize for a while | (https://materialize.com/). It can handle automatically | refreshed materialized views. Last time I checked, it didn't | support some Postgres SQL constructs that I use often, but I'm | really looking forward to it. ___________________________________________________________________ (page generated 2021-10-15 23:00 UTC)