[HN Gopher] Show HN: Distributed SQLite on FoundationDB
       ___________________________________________________________________
        
       Show HN: Distributed SQLite on FoundationDB
        
       Hello HN! I'm building mvsqlite, a distributed variant of SQLite
       with MVCC transactions, that runs on FoundationDB. It is a drop-in
       replacement that just needs an `LD_PRELOAD` for existing
       applications using SQLite.  I made this because Blueboat
       (https://github.com/losfair/blueboat) needs a native SQL interface
       to persistent data. Apparently, just providing a transactional key-
       value store isn't enough - it is more easy and efficient to build
       complex business logic on an SQL database, and it seems necessary
       to bring a self-hostable distributed SQL DB onto the platform.
       Since FoundationDB is Blueboat's only stateful external dependency,
       I decided to build the SQL capabilities on top of it.  At its core,
       mvsqlite's storage engine, mvstore, is a multi-version page store
       built on FoundationDB. It addresses the duration and size limits (5
       secs, 10 MB) of FDB transactions, by handling multi-versioning
       itself. Pages are fully versioned, so they are always snapshot-
       readable in the future. An SQLite transaction fetches the read
       version during `BEGIN TRANSACTION`, and this version is used as the
       per-page range scan upper bound in future page read requests.  For
       writes, pages are first written to a content-addressed store keyed
       by the page's hash. At commit, hashes of each written page in the
       SQLite transaction is written to the page index in a single FDB
       transaction to preserve atomicity. With 8K pages and ~60B per key-
       value entry in the page index, each SQLite transaction can be as
       large as 1.3 GB (compared to FDB's native txn size limit of 10 MB).
       mvsqlite is not yet "production-ready", since it hasn't received
       enough testing, and I may still have a few changes to make to the
       on-disk format. But please ask here if you have any questions!
        
       Author : losfair
       Score  : 72 points
       Date   : 2022-07-28 19:49 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | [deleted]
        
       | dilyevsky wrote:
       | I think foundationdb uses sqlite as its tablet kv engine? It's
       | sqlite all the way down
        
       | learndeeply wrote:
       | > But a group of N sqlite databases is an N-writer database. And
       | mvsqlite provides the necessary mechanisms to do serializable
       | cross-database transactions without additional overhead.
       | 
       | I'm confused, are these databases planned to be replicated? Or is
       | it expected for the databases to have separate schemas?
        
         | losfair wrote:
         | Replication is handled by FDB so you don't need to care about
         | it on the application level. These databases can contain
         | partitioned data of your application, like one DB per user, so
         | that a transaction on only user A and another one on only user
         | B won't conflict.
        
           | metadat wrote:
           | Did you intend "shard", or s/shard/partition/?
        
             | losfair wrote:
             | I guess "partition" is the better wording. Updated my
             | comment.
        
       | mping wrote:
       | FDB is a nice piece of technology if you know how to go around
       | its constraints. Congrats on the project.
        
       | LAC-Tech wrote:
       | So the idea is that a small business could start on SQLite, and
       | then switch over to this when it's time to scale, without re-
       | writing it in the Postgres dialect?
       | 
       | Regardless, it's very very cool. Would love to see it get turned
       | into a product.
        
       | metadat wrote:
       | This sounds really cool, do you have any source code in a
       | workable state yet or is this project still in the formulative
       | ideation phase?
       | 
       | If there's anything concrete so far, I'd love to take a look
       | and/or try it out!
        
         | losfair wrote:
         | It already works! There are steps to try it in readme.
        
           | metadat wrote:
           | Nice, I dug a bit and found the git repository:
           | 
           | https://github.com/losfair/mvsqlite
           | 
           | Would be great to add this link to the body of your story to
           | make it easy for HNers to get to the thing :)
           | 
           | If no longer editable, consider emailing moderator Dang
           | (hn@ycombinator.com).
        
       | faizshah wrote:
       | I love the idea of distributed SQLite but I'm having a hard time
       | understanding which parts of FoundationDB and which parts of
       | SQLite are available in this implementation.
       | 
       | I'm guessing virtual table extensions work with this since you're
       | just replacing the storage engine? So we could in theory use FTS5
       | and even OSQuery and other extensions right?
       | 
       | However since this is using FoundationDB I'm also guessing we
       | can't use this as a serverless embedded DB since since you'll
       | probably need a foundation db cluster to use this. Is that right?
       | 
       | So if I understand correctly this is a SQLite query engine on top
       | of FoundationDB with distributed transactions and we can
       | theoretically use SQLite ecosystem stuff like FTS5 and datasette
       | on top of it.
        
         | losfair wrote:
         | Yes it's correct! mvsqlite integrates as a custom VFS
         | underlying SQLite's query engine, and SQLite ecosystem stuff
         | can be used on top of it.
        
           | robertsdionne wrote:
           | "The SSD storage engine stores the data in a B-tree based on
           | SQLite." XD https://apple.github.io/foundationdb/architecture
           | .html#stora...
        
       | fnord123 wrote:
       | Comdb2: distributed sqlite: http://comdb2.org/
        
         | wener wrote:
         | A better link https://github.com/bloomberg/comdb2
        
       | liuliu wrote:
       | This really, IMHO (as someone implements things on top of SQLite
       | too https://dflat.io) pushes SQLite too far as the implementation
       | of cross-db transactions have some big issues:
       | https://www.sqlite.org/limits.html (the number of attached
       | databases cannot exceed 10 or 125 (if you compile your own))
       | https://www.sqlite.org/wal.html (in WAL mode, there is no
       | transactional guarantee for cross database transactions (atomic
       | per database, but not cross database))
        
       | [deleted]
        
       | cmrdporcupine wrote:
       | If you've written your own multi-versioning, what does FDB bring
       | to the table that you couldn't have gotten out of other
       | distributed but non-transactional KV stores? E.g. Cassandra, etc.
       | Isn't there an overhead to the MVCC aspect of FDB? And it sounds
       | like you've had to jump hoops around things to get past its
       | duration and size limits, as well...
        
         | LAC-Tech wrote:
         | Cassandra is a column store, surely that would be a very un-
         | natural mapping for a SQL Engine backend to use? An orthogonal
         | mapping, to be precise!
        
       ___________________________________________________________________
       (page generated 2022-07-30 23:00 UTC)