[HN Gopher] SQLedge: Replicate Postgres to SQLite on the Edge ___________________________________________________________________ SQLedge: Replicate Postgres to SQLite on the Edge Author : clessg Score : 257 points Date : 2023-08-09 14:29 UTC (8 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | maxpert wrote: | Very interesting! I have a question (out of my experience in | https://github.com/maxpert/marmot): how do you get around the boot | time, especially when the change log of a table is pretty large in | Postgres? I've implemented a snapshotting mechanism in Marmot as | part of quickly getting up to speed. At some level I wonder if we | can just feed this PG replication log into a NATS cluster and | Marmot can just replicate it across the board. | maxfurman wrote: | This is pretty neat! One question: if all your queries have to be | SQLite-compatible, doesn't that defeat the purpose of using PG in | the first place? Maybe SQLite supports more PG features than I | thought, but if, for example, your app uses pgvector or pgcrypto, | you might have issues here. | zknill wrote: | Yes, absolutely, and this is going to be one of the hardest | tech challenges to solve. I've thought a little about it, and | it's probably unrealistic to think that we can translate every | single PG statement into a SQLite one, especially when PG has | extensions. So we're probably destined to use the local SQLite | database for queries we can parse and understand, and | forward all the others (both reads and writes) to the | upstream PG server. | | This slightly breaks the model of having a local copy serve data | faster, but if only a minority of queries use a format that | we don't understand in SQLite, then only that minority of | queries will suffer from the full latency to the main PG | server. | mvaliente2001 wrote: | For one moment I thought this was the equivalent of SQLite for | EdgeDB, and the idea made me very happy. | hrdwdmrbl wrote: | This space is starting to get crowded.
Can anyone compare this | with some of the other solutions coming out recently? | DANmode wrote: | List a couple? | arve0 wrote: | A few I know: rqlite, dqlite, SQLite wasm, litestream. | otoolep wrote: | rqlite[1] creator here, happy to answer any questions. | | [1] https://www.rqlite.io | westurner wrote: | #. SQLite WAL mode | | From https://www.sqlite.org/isolation.html | https://news.ycombinator.com/item?id=32247085 : | | > [sqlite] _WAL mode permits simultaneous readers and | writers. It can do this because changes do not overwrite the | original database file, but rather go into the separate | write-ahead log file. That means that readers can continue to | read the old, original, unaltered content from the original | database file at the same time that the writer is appending | to the write-ahead log_ | | #. superfly/litefs: a FUSE-based file system for replicating | SQLite https://github.com/superfly/litefs | | #. sqldiff: https://www.sqlite.org/sqldiff.html | https://news.ycombinator.com/item?id=31265005 | | #. dolthub/dolt: https://github.com/dolthub/dolt : | | > _Dolt is a SQL database that you can fork, clone, branch, | merge, push and pull just like a Git repository._ [...] | | > _Dolt can be set up as a replica of your existing MySQL or | MariaDB database using standard MySQL binlog replication. | Every write becomes a Dolt commit. This is a great way to get | the version control benefits of Dolt and keep an existing | MySQL or MariaDB database._ | | #. github/gh-ost: https://github.com/github/gh-ost : | | > _Instead, gh-ost uses the binary log stream to capture | table changes, and asynchronously applies them onto the ghost | table. gh-ost takes upon itself some tasks that other tools | leave for the database to perform. As result, gh-ost has | greater control over the migration process; can truly suspend | it; can truly decouple the migration's write load from the | master's workload._ | | #.
vlcn-io/cr-sqlite: https://github.com/vlcn-io/cr-sqlite : | | > _Convergent, Replicated SQLite. Multi-writer and CRDT | support for SQLite_ | | > _CR-SQLite is a run-time loadable extension for SQLite and | libSQL. It allows merging different SQLite databases together | that have taken independent writes._ | | > _In other words, you can write to your SQLite database | while offline. I can write to mine while offline. We can then | both come online and merge our databases together, without | conflict._ | | > _In technical terms: cr-sqlite adds multi-master | replication and partition tolerance to SQLite via conflict | free replicated data types (CRDTs) and/or causally ordered | event logs._ | | yjs also does CRDTs (Jupyter RTC) | | #. pganalyze/libpg_query: | https://github.com/pganalyze/libpg_query : | | > _C library for accessing the PostgreSQL parser outside of | the server environment_ | | #. Ibis + Substrait [ + DuckDB ] https://ibis-project.org/blog/ibis_substrait_to_duckdb/ : | | > _ibis strives to provide a consistent interface for | interacting with a multitude of different analytical | execution engines, most of which (but not all) speak some | dialect of SQL._ | | > _Today, Ibis accomplishes this with a lot of help from | `sqlalchemy` and `sqlglot` to handle differences in dialect, | or we interact directly with available Python bindings (for | instance with the pandas, datafusion, and polars backends)._ | | > [...] _`Substrait` is a new cross-language serialization | format for communicating (among other things) query plans. It's | still in its early days, but there is already nascent | support for Substrait in Apache Arrow, DuckDB, and Velox._ | | #. ibis-project/ibis-substrait: https://github.com/ibis-project/ibis-substrait | | #. tobymao/sqlglot: https://github.com/tobymao/sqlglot : | | > _SQLGlot is a no-dependency SQL parser, transpiler, | optimizer, and engine.
It can be used to format SQL or | translate between 19 different dialects like DuckDB, Presto, | Spark, Snowflake, and BigQuery. It aims to read a wide | variety of SQL inputs and output syntactically and | semantically correct SQL in the targeted dialects._ | | > _It is a very comprehensive generic SQL parser with a | robust test suite. It is also quite performant, while being | written purely in Python._ | | > _You can easily customize the parser, analyze queries, | traverse expression trees, and programmatically build SQL._ | | > _Syntax errors are highlighted and dialect | incompatibilities can warn or raise depending on | configurations. However, it should be noted that SQL | validation is not SQLGlot's goal, so some syntax errors may | go unnoticed._ | | #. benbjohnson/postlite: | https://github.com/benbjohnson/postlite : | | > _postlite is a network proxy to allow access to remote | SQLite databases over the Postgres wire protocol. This allows | GUI tools to be used on remote SQLite databases which can | make administration easier._ | | > _The proxy works by translating Postgres frontend wire | messages into SQLite transactions and converting results back | into Postgres response wire messages. Many Postgres clients | also inspect the pg_catalog to determine system information | so Postlite mirrors this catalog by using an attached in-memory | database with virtual tables. The proxy also performs | minor rewriting on these system queries to convert them to | usable SQLite syntax._ | | > _Note: This software is in alpha. Please report bugs. | Postlite doesn't alter your database unless you issue | INSERT, UPDATE, DELETE commands so it's probably safe. If | anything, the Postlite process may die but it shouldn't | affect your database._ | | #. > "Hosting SQLite Databases on GitHub Pages" (2021) re: | sql.js-httpvfs, DuckDB | https://news.ycombinator.com/item?id=28021766 | | #.
>> - bittorrent/sqltorrent | https://github.com/bittorrent/sqltorrent | | >> _Sqltorrent is a custom VFS for sqlite which allows | applications to query an sqlite database contained within a | torrent. Queries can be processed immediately after the | database has been opened, even though the database file is | still being downloaded. Pieces of the file which are required | to complete a query are prioritized so that queries complete | reasonably quickly even if only a small fraction of the whole | database has been downloaded._ | | #. simonw/datasette-lite: | https://github.com/simonw/datasette-lite datasette, *-to-sqlite, | dogsheep | | "Loading SQLite databases" [w/ datasette] | https://github.com/simonw/datasette-lite#loading-sqlite-data... | | #. awesome-db-tools: https://github.com/mgramin/awesome-db-tools | | Lots of neat SQLite/vtable/pg/replication things | markhalonen wrote: | We're currently grappling with trying to build a system similar | to https://stripe.com/sigma with a single multi-tenant Postgres | db as the source and an SQLite read replica per tenant. Currently | we re-generate the world every night. We need fine-grained | control over the SQLite schema (it's a public api we let users | write sql against). Any related projects would be great to point | me towards! | kobieps wrote: | [dead] | hinkley wrote: | We need a standardized WAL format. Too many people trying to | write software to simulate it. | hamandcheese wrote: | I'm a bit confused by this. Since SQLEdge is a proxy, you lose | all the IO benefits of running an embedded in-process DB. | | At that point, why not replicate to a real Postgres on the edge? | | Maybe the expectation is that the application also opens the | SQLite file directly? (But in that case, what is the point of the | SQLEdge proxy?) | e12e wrote: | This looks neat. Any support for postgres schemas?
Would be cool | if this supported each SQLite shard writing to a separate schema - | giving each "shard" a single-tenant "view" while giving access to | all data in the postgres instance? | | Or is this only the public schema, or all schemas? | packetlost wrote: | Why SQLite instead of a standard hub and spoke replication? What | benefit does this provide? Being able to run your database on the | client? That seems like it would be risky. | tptacek wrote: | Because it's running in the app's address space, SQLite is | obscenely fast for reads, so much so that you can often just | write N+1 queries with it. | hamandcheese wrote: | In this case though it appears that SQLEdge is running as its | own process, distinct from the app, and the app sends all | queries to the SQLEdge proxy. | tptacek wrote: | That makes less sense to me then, though the IPC cost for | those reads is still much lower than a typical n-tier | Postgres setup. | luckystarr wrote: | How would mutually incompatible upstream changes from multiple | SQLite edge instances be resolved? You'd need user input for | that, right? | zknill wrote: | The writes via SQLedge are sync; that is, we wait for the write | to be processed on the upstream Postgres server. So it operates | as if SQLedge wasn't in the request path from application to | Postgres. The writes to Postgres are only reflected in SQLite | when the data is received back from the Postgres server on the | replication slot. | | This means writes are eventually consistent, currently, but I | intend to include a feature that allows waiting for that write | to be reflected back in SQLite, which would satisfy the 'read | your own writes' property. | | SQLedge will never be in a situation where the SQLite database | thinks it has a write, but that write is yet to be applied to | the upstream Postgres server. | | Basically, the Postgres server 'owns' the writes, and can | handle them just like it would if SQLedge didn't exist.
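[Editor's note: the routing zknill describes - reads served from the local SQLite copy, writes forwarded synchronously to Postgres and only reflected locally once they return over the replication slot - can be sketched roughly as below. This is an illustrative sketch, not SQLedge's actual code; the statement classifier and the `forward_to_postgres` callable are invented for the example.]

```python
import sqlite3

# Illustrative sketch of SQLedge-style routing (not the real code):
# reads hit the local SQLite replica; writes are forwarded upstream
# and only appear locally once replication delivers them back.

WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

def is_write(sql: str) -> bool:
    return sql.lstrip().upper().startswith(WRITE_PREFIXES)

class EdgeProxy:
    def __init__(self, local: sqlite3.Connection, forward_to_postgres):
        self.local = local                  # replica, written only by replication
        self.forward = forward_to_postgres  # stand-in for the upstream connection

    def execute(self, sql: str, params=()):
        if is_write(sql):
            # Synchronous forward: we wait on Postgres and do NOT touch
            # the local replica here.
            return self.forward(sql, params)
        return self.local.execute(sql, params).fetchall()
```

Note how this exhibits the read-your-writes gap discussed in the thread: a forwarded INSERT is invisible to local SELECTs until the replication stream delivers it back.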
| luckystarr wrote: | So, the advertised performance is just for the read-part, not | the write-part? | | As far as I understand: writing would still exhibit the same | characteristics as before, while reads would never be | affected by other users. | zffr wrote: | > SQLedge serves reads from it's local sqlite database, and | forwards writes to the upstream postgres server that it's | replicating from. | | I don't think SQLite data is ever written back to the postgres | DB, so this shouldn't be an issue. | simonw wrote: | I don't think that's a problem here, because the write queries | aren't handled by the local SQLite databases - instead, the | proxy forwards them directly to the PostgreSQL instance, so | presumably the resulting changes show up in SQLite when they | are replicated back down again. | sgt wrote: | A commercial offering (although also open source): | https://www.powersync.co/ | kobieps wrote: | [dead] | kiitos wrote: | And how do you manage conflicts? | | _edit_ | | > The writes via SQLedge are sync, that is we wait for the write | to be processed on the upstream Postgres server | | OK, so, it's a SQLite read replica of a Postgres primary DB. | | Of course, this does mean that it's possible for clients to fail | the read-your-writes consistency check. | singingwolfboy wrote: | Does anyone know of a tool that will export a Postgres database | to a SQLite database file? Seems like a handy way of exporting | and passing around smallish DBs. I feel like this tool must | exist, but I haven't found it yet. (Supporting imports and data | transformations would be even better!) | simonw wrote: | I wrote a tool to do that: https://datasette.io/tools/db-to-sqlite | | You can see an example of it in use here: | https://github.com/simonw/simonwillisonblog-backup/blob/main... | samanator wrote: | How does this work with custom postgres types? | karakanb wrote: | this seems neat, I'll definitely give it a look.
this seems like | a very suitable trade-off for a lot of applications I have worked | on. | | does anyone know if there is a postgres-postgres version of this | that is easy to run in ephemeral environments? ideally I'd like | to be able to run Postgres sidecars alongside my application | containers and eliminate the network roundtrip using the sidecar | as a read replica, but haven't seen this being done anywhere. | maybe it wouldn't be fast enough to run in such scenarios? | djbusby wrote: | With PG only, one can use logical or streaming replication. Then | all reads go to the replica and writes to main. The app needs two | connections - one for reads, one for writes. | durkie wrote: | I'm super excited for this -- it seems like it's perfect as an | app-local cache of things that can be a drop-in replacement for | some high-cost queries. | | Are there any plans to support choosing which tables get copied over? | The main postgres database is too big to replicate everywhere, but | some key "summary" tables would be really nice to have locally. | simonw wrote: | Following the PostgreSQL logical replication stream to update a | local SQLite database copy is definitely a neat trick, and feels | very safe to me (especially since you track the Log Sequence | Number in a postgres_pos table). | | The bit that surprised me was that this thing supports writes as | well! | | It does it by acting as a PostgreSQL proxy. You connect to that | proxy with a regular PostgreSQL client, then any read queries you | issue run against the local SQLite copy and any writes are | forwarded on to "real" PostgreSQL. | | The downside is that now your SELECT statements all need to be in | the subset of SQL that is supported by both SQLite and | PostgreSQL. This can be pretty limiting, mainly because | PostgreSQL SQL is a much, much richer dialect than SQLite. | | Should work fine for basic SELECT queries though. | | I'd find this project useful even without the PostgreSQL | connection/write support though.
| | I worked with a very high-scale feature flag system a while ago - | thousands of flag checks a second. This scaled using a local | memcached cache of checks on each machine, despite the check | logic itself consulting a MySQL database. | | I had an idea to improve that system by running a local SQLite | cache of the full flag logic on every frontend machine instead. | That way flag checks could use full SQL logic, but would still | run incredibly fast. | | The challenge would be keeping that local SQLite database copy | synced with the centralized source-of-truth database. A system | like SQLedge could make short work of that problem. | ericraio wrote: | One use case I can see this being valuable for is a client-based | application with Postgres as a centralized database. | The client would just query SQLite and not need to write | Postgres SQL. | Omnipresent wrote: | Honest question: why is SQLite needed for local? Why would you | not have PG at the edge that replicates data with central PG? That | way the SQL dialect problem you mentioned wouldn't exist. | DaiPlusPlus wrote: | > I worked with a very high-scale feature flag system a while | ago - thousands of flag checks a second. | | May I ask why the flags are checked that frequently? Couldn't | they be cached for at least a minute? | | > It does it by acting as a PostgreSQL proxy. [...] and any | writes are forwarded on to "real" PostgreSQL. | | What happens if there's a multi-statement transaction with a | bunch of writes sent off to the mothership - which then get | returned to the client via logical replication, but _then_ | there's a ROLLBACK - how would that situation be handled such | that both the SQLite edge DBs and the mothership DB are able to | roll back okay - would this impact other clients?
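[Editor's note: the per-machine SQLite flag cache simonw describes can be sketched like this. The schema, flag names, and bucketing scheme here are all hypothetical; the `bucket` helper imitates the common trick of hashing a user ID with a per-flag seed so gradual rollouts stay stable per user.]

```python
import hashlib
import sqlite3

# Hypothetical sketch of a local-SQLite feature-flag cache: flag rules
# live in a replicated SQLite table, and each check is an in-process
# query plus a deterministic hash bucket. Schema is invented.

SCHEMA = """
CREATE TABLE IF NOT EXISTS flags (
    name TEXT PRIMARY KEY,
    enabled INTEGER NOT NULL,      -- kill switch
    rollout_pct INTEGER NOT NULL   -- 0..100 gradual rollout
);
"""

def bucket(user_id: str, flag: str) -> int:
    """Deterministically map (user, flag) to 0..99 so rollouts are stable."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100

def check_flag(conn: sqlite3.Connection, user_id: str, flag: str) -> bool:
    row = conn.execute(
        "SELECT enabled, rollout_pct FROM flags WHERE name = ?", (flag,)
    ).fetchone()
    if row is None:
        return False  # unknown flags default to off
    enabled, pct = row
    return bool(enabled) and bucket(user_id, flag) < pct
```

Because the flags table is just replicated data, a SQLedge-style stream would update it continuously, avoiding the minute-long cache staleness discussed below.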
| rockostrich wrote: | Feature flag systems are usually based on a set of rules that | could be serialized and evaluated locally (this is how pretty | much every open source feature flag system and feature flag | SaaS works). Usually it's based on some kind of UUID being | hashed with a per-flag seed and bucketed after some set of | targeting rules are applied to other properties passed in for | that user. There are added features where you can store | large cohorts to do specific targeting, and usually there's | some kind of local cache added to make that look-up faster | for recent users. | | I'm not sure what the original commenter was doing, but it | sounds like they had some kind of targeting that was almost | entirely based on cohorts, or maybe they needed to have | stability over time, which would require a database. We did | something similar recently except we just store a "session | ID" with a blob for look-up and the evaluation only happens | on the first request for a given session ID. | paulddraper wrote: | > Couldn't they be cached for at least a minute? | | Only per feature+per user. (Though 1000s per second does seem | high unless your scale is gigantic.) | | > What happens if there's a multi-statement transaction with | a bunch of writes sent-off to the mothership - which then get | returned to the client via logical replication, but then | there's a ROLLBACK | | Nothing makes it into the replication stream until it is | committed. | simonw wrote: | They were being cached for at least a minute (maybe even more | than that, I can't remember the details) - that's what the | local memcached instance was for. | | This was problematic though because changing a feature flag | and then waiting for a minute plus to see if the change | actually worked can be frustrating, especially if it relates | to an outage of some sort. | vasco wrote: | > May I ask why the flags are checked that frequently? | Couldn't they be cached for at least a minute?
| | Not in that project, but feature flags don't have to be all or | nothing. You can apply flags to specific cohorts of your | users for example, so if you have a large user base, even if | you cache them per-user, it still translates into many checks | a second for large systems. | mikepurvis wrote: | I guess the cost of doing it precisely isn't terribly high, | but if the majority of the flags were "off" (e.g., subsets of | users being opted into a beta or something), I wonder if | you could cut a bunch of the queries by sending down a bloom | filter. | | So basically you check the filter first, and if that says | the feature is enabled, only then do you actually ask the | real DB. | runeks wrote: | My limited understanding of logical replication is that | writes only happen at COMMIT. I.e. nothing is replicated until | it's committed. | zknill wrote: | The logical replication protocol sends a series of messages | that essentially follow the flow that a database transaction | would. | | i.e. a stream of messages like: "BEGIN", "[the data]", | ["COMMIT" or "ROLLBACK"]. | | So any application that listens to the Postgres replication | protocol can handle the transaction in the same way that | Postgres does. Concretely you might choose to open a SQLite | transaction on BEGIN, apply the statements, and then COMMIT | or ROLLBACK based on the next messages received on the stream | replication protocol. | | The data sent on the replication protocol includes the state | of the row after the write query has completed. This means | you don't need to worry about getting out of sync on queries | like "UPDATE field = field + 1" because you have access to | the exact resulting value as stored by Postgres. | | TL;DR - you can follow the same begin/change/commit flow that | the original transaction did on the upstream Postgres server, | and you have access to the exact underlying data after the | write was committed.
| | It's also true (as other commenters have pointed out) that | for not-huge transactions (i.e. not streaming transactions, | new feature in Postgres 15) the BEGIN message will only be | sent if the transaction was committed. It's pretty unlikely | that you will ever process a ROLLBACK message from the | protocol (although possible). | [deleted] | paulddraper wrote: | Logical replication never includes uncommitted data, unless | you have written a custom output plugin. | | EDIT: | https://stackoverflow.com/questions/52202534/postgresql-does... | zknill wrote: | It's true for protocol version 1 that only the committed | data is sent on the replication connection. | | In protocol version 2 (introduced in postgres 14) large | in-progress transactions can appear in the new "Stream" | messages sent on the replication connection: StreamStart, | StreamStop, StreamAbort, StreamCommit, etc. In the case | of large in-progress transactions uncommitted data might | end up in the replication connection after a StreamStart | message. But you would also receive a StreamCommit or | StreamAbort message to tell you what happened to that | transaction. | | I've not worked out what qualifies as a "large" | transaction though. But it is _possible_ to get | uncommitted data in the replication connection, although | unlikely. | | https://www.postgresql.org/docs/15/protocol-logicalrep-messa... | jmull wrote: | > May I ask why the flags are checked that frequently? | Couldn't they be cached for at least a minute? | | Not the previous poster, but it appears in the scenario, the | SQLite database _is_ the cache. | aeyes wrote: | How many flags are we talking here? I implemented a similar | system and we just replace the whole sqlite DB file by | downloading it from the centralized storage whenever it | changes. | | Even with 1M flags it's still only a few 100 kB compressed. | | I wouldn't replicate per-user flags to the edge to keep size | under control.
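[Editor's note: the BEGIN / changes / COMMIT-or-ROLLBACK flow zknill describes can be sketched as a small applier loop over a local SQLite replica. The tuple-based message format below is invented for illustration; a real consumer would decode pgoutput messages from the replication connection. It also assumes tables keyed by an integer `id` column.]

```python
import sqlite3

# Sketch of applying a logical-replication stream to a local SQLite
# replica, mirroring the transaction flow described above. The message
# format is hypothetical, not the actual pgoutput wire format.

def apply_stream(conn: sqlite3.Connection, messages):
    conn.isolation_level = None  # manage transactions explicitly
    for msg in messages:
        kind = msg[0]
        if kind == "BEGIN":
            conn.execute("BEGIN")
        elif kind == "UPDATE":
            # The stream carries the full row image after the write, so
            # "UPDATE field = field + 1" can't drift: we simply store the
            # exact values Postgres committed.
            _, table, pk, row = msg
            assignments = ", ".join(f"{col} = ?" for col in row)
            conn.execute(
                f"UPDATE {table} SET {assignments} WHERE id = ?",
                (*row.values(), pk),
            )
        elif kind in ("COMMIT", "ROLLBACK"):
            conn.execute(kind)
```

An aborted transaction (the rare ROLLBACK case above) simply unwinds the local SQLite transaction, leaving the replica exactly as Postgres left itself.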
| hinkley wrote: | Does it though? If it's a proxy it can support the SQLite read | and the Postgres write syntax. If reads only ever go to SQLite | they don't need to work on Postgres. | rbranson wrote: | We replicated our MySQL database to a SQLite edge at Segment in | ctlstore: https://github.com/segmentio/ctlstore | | We considered tailing binlogs directly but there's so much cruft | and complexity involved trying to translate between types and | such at that end, once you even just get past properly parsing | the binlogs and maintaining the replication connection. Then you | have to deal with schema management across both systems too. | Similar sets of problems using PostgreSQL as a source of truth. | | In the end we decided just to wrap the whole thing up and | abstract away the schema with a common set of types and a limited | set of read APIs. Biggest missing piece I regret not getting in | was support for secondary indexes. | eddd-ddde wrote: | Is there a reason you didn't add them when making the shared | API? | rbranson wrote: | If you're asking about secondary indexes, it was just seen as | a "later" feature we'd implement as a follow-up. It was | definitely asked for, just never prioritized before I moved | off the project. ___________________________________________________________________ (page generated 2023-08-09 23:00 UTC)