[HN Gopher] Ephemeral Postgres Databases
___________________________________________________________________
 
Ephemeral Postgres Databases
 
Author : sac2171
Score  : 29 points
Date   : 2021-11-17 01:05 UTC (21 hours ago)
 
(HTM) web link (eradman.com)
(TXT) w3m dump (eradman.com)
 
| mborch wrote:
| Another trick is to use (possibly nested) savepoints and roll
| back after each test, never actually committing any data.
| 
|   | Foobar8568 wrote:
|   | SQL Server has database snapshots for something similar, and
|   | you don't have to play with transactions (which may or may
|   | not break everything in case of errors).
| 
|   | ccakes wrote:
|   | The catch with this is that you can't test any behaviour
|   | involving deferred constraints. Any activity that doesn't
|   | take place until the commit will never run.
| 
|     | mborch wrote:
|     | Should be possible with "SET CONSTRAINTS".
 
| pquerna wrote:
| At a $previous_job I basically also did what the post is
| describing.
| 
| The "best" thing we did was actually using a "template
| database":
| https://www.postgresql.org/docs/14/manage-ag-templatedbs.htm...
| 
| We would start a Postgres process, create a new database, and
| run all of our migrations and basic data bring-up. Then we would
| create a new database per test suite, using the one we had just
| run migrations on as the template.
| 
| This meant the initial bring-up took a few seconds, but each
| test suite then got a new database in a dozen milliseconds
| (IIRC).
| 
|   | mborch wrote:
|   | That's probably more or less the same thing in terms of what
|   | actually happens in PostgreSQL.
|   | 
|   | All things considered, an actual database probably gives you
|   | the least gray hair, but with some careful test setup I have
|   | had good success using the savepoint/rollback trick (and it
|   | trivially supports nested fixtures as well).
 
| latch wrote:
| > For some time proponents of unit testing have asserted that
| > unit tests should not touch a real database
| 
| Is that still a belief in some circles? I feel like the shift
| away from this started about 15 years ago (largely because of
| RoR, in my mind).
| 
| Anyways, this essentially launches a pg instance with a
| postgresql.conf that is tuned for speed (at the risk of possible
| data loss/corruption). DO NOT DO THIS IN PRODUCTION, but I just
| bake the following into local/test/CI:
| 
|     fsync = off
|     synchronous_commit = off
| 
| Some other things I've seen / done in the name of testing with
| PG:
| 
| - Use unlogged tables. Maybe it's a bit faster; it never really
| seemed to make much of a difference.
| 
| - Drop all foreign keys. This has significant non-performance
| impact. On the downside, it materially changes the environment
| your tests run in versus your real system: tests could now pass
| for otherwise invalid code. On the plus side, it makes setting
| up fake data _a lot_ easier.
| 
| - Run tests with a pool_size of 1. This catches cases where you
| start a transaction but forget to use the transaction object,
| e.g.:
| 
|     db.begin_transaction(tx ->
|         tx.query("update x")
|         db.query("delete y")  // should have been tx.query
|     )
| 
| - A robust factory library (NIH or OSS) is probably the most
| important thing you can have.
| 
| - If you can't pre-seed all the test data, try to write _all_
| your tests without ever having to truncate/delete tables between
| tests. This (a) avoids the slow delete/truncate and (b) lets you
| run tests in parallel. It means using random keys to avoid
| duplicates (obviously uuid is the simple answer here) and not
| selecting all data from a table when verifying/asserting (since
| who knows what's going to be in there).
| 
|   | gregplaysguitar wrote:
|   | > write _all_ your tests without ever having to
|   | > truncate/delete tables between tests
|   | 
|   | This is exactly what we do, and it works really well. We
|   | essentially spin up a new tenant for each test suite.
|   | Forcing all tests to pass regardless of what other unrelated
|   | data is in the db is a great way to ensure cross-tenant data
|   | isolation.
|   | 
|   | The other underrated benefit is that you don't need a
|   | separate test database for local development - the local dev
|   | instance and the tests can use the same db. This means you
|   | can work on db changes in one place and see their impact on
|   | both tests and the running application.
 
| rubenv wrote:
| Or in Go, with full databases: https://github.com/rubenv/pgtest/
 
| stansler wrote:
| Another approach is to use Database Lab
| (https://gitlab.com/postgres-ai/database-lab). Our tool allows
| deploying disposable Postgres databases in seconds using a REST
| API, CLI, or GUI. The difference is that we provide clones with
| snapshots of full-sized data. For example, you may have dozens
| of clones of your production databases (with masked sensitive
| data) to use as staging servers, for database migration
| verification, and for tests - all provisioned in a couple of
| seconds.
| 
| More about using full-sized Postgres clones for migration
| testing and why that is important:
| https://postgres.ai/products/database-migration-testing
 
| kardianos wrote:
| I also do something like this, but with SQL Server. Except
| rather than going from a seed or backup, the schema and default
| data are defined in the application; the data is entered in the
| test setup, the tables are sorted topologically and inserted all
| at once, and then the unit test runs (this can be combined with
| a table-driven test).
| 
| Robust, highly recommended.
 
| tln wrote:
| My test suite has the schema and seed data in a template
| database, then runs:
| 
|     CREATE DATABASE testX TEMPLATE seed;
| 
| ...which takes about 1 second per test run. The seed data and
| schema are baked into a docker image and recreated whenever
| there are new migration files. Starting the docker image is
| slow, but that doesn't happen on every test run.
| 
|   | brasetvik wrote:
|   | That's a pretty reasonable approach, and similar in spirit
|   | to what I find to work well:
|   | 
|   | - Create a template with the necessary extensions, or just
|   | install them into "template1", which is the template used if
|   | you don't explicitly specify one. Installing PostGIS, for
|   | example, takes a few seconds - which is annoying if you
|   | create the schema from scratch in test runs. (`create
|   | extension if not exists postgis` can still be around in your
|   | schema; it'll just return right away.)
|   | 
|   | - Create a template for the test session, based on the
|   | template you've pre-installed extensions in, and apply the
|   | schema there.
|   | 
|   | - Create a database based on the second template for
|   | whatever scope makes sense for your test.
|   | 
|   | If your Postgres cluster is only serving test workloads,
|   | `fsync=off` can speed things up as well. (A stock
|   | postgresql.conf will point out that this can cause
|   | irrecoverable data loss, which I don't care about for test
|   | data.)
___________________________________________________________________
(page generated 2021-11-17 23:00 UTC)
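[Editor's note] brasetvik's two-level template setup can be sketched as follows. The `\c` lines are psql meta-commands, and the database names (`session_template`, `test_42`) are illustrative, not from the thread:

```sql
-- Level 0: install slow-to-create extensions once per cluster.
-- Every CREATE DATABASE copies template1 by default, so this cost
-- is paid once, not per test run.
\c template1
CREATE EXTENSION IF NOT EXISTS postgis;

-- Level 1: a per-session template that inherits the extensions
-- (TEMPLATE template1 is implicit here); apply the schema to it.
CREATE DATABASE session_template;

-- Level 2: a throwaway database per test (or per suite), cloned
-- from the session template.
CREATE DATABASE test_42 TEMPLATE session_template;
DROP DATABASE test_42;
```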