[HN Gopher] Show HN: WunderBase - Serverless OSS database on top...
       ___________________________________________________________________
        
       Show HN: WunderBase - Serverless OSS database on top of SQLite,
       Firecracker
        
       Author : jensneuse
       Score  : 135 points
       Date   : 2022-09-15 14:22 UTC (8 hours ago)
        
 (HTM) web link (wundergraph.com)
 (TXT) w3m dump (wundergraph.com)
        
       | azebazenestor wrote:
       | Another way to do sqlite over S3 is:
       | 
       | - https://github.com/uktrade/sqlite-s3vfs (Read/Write)
       | - https://github.com/michalc/sqlite-s3-query (Read Only)
        
       | avl999 wrote:
       | SQLite, KVM, Firecracker, GraphQL, Serverless... if this was
       | written in Rust it would hit the holy trinity of all the HN
       | buzzwords that pull a post to the frontpage ;)
        
         | Sujan wrote:
         | The Prisma Query Engine is indeed written in Rust :D
        
       | tptacek wrote:
       | I'd love to say that the most interesting thing here is Fly
       | Machines (I am biased) but really it's SQLite, which, no matter
       | what platform you're using, makes it architecturally simpler to
       | scale up and down, since you don't have to scale a database
       | server up and down with your workload; the database is embedded
       | in the app, which already had the scaling logic.
       | 
       | People have been sleeping on SQLite and are starting to wake up
       | and I'm kind of psyched to see what else they come up with
       | (another very cool example of a software tool that really plays
       | to SQLite's strengths is Datasette: https://datasette.io/).
        
         | nijave wrote:
         | As soon as you start scaling the app beyond 1 replica, you have
         | to handle data replication again
        
           | tptacek wrote:
           | Of course.
        
       | gregwebs wrote:
       | This just runs a request proxy that turns off after 10 seconds of
       | no activity and starts it up (with a half second delay) when
       | there is a new request. It runs SQLite with Prisma. Prisma is an
       | API server that puts a GraphQL API in front of a DB.
       | 
       | It's a nice blog post about gluing technology and I can see how
       | this could be a really nice way to run some lower-cost databases
       | in a non-demanding development environment. However, it is not a
       | reliable way of operating a database. For me it isn't really
       | serverless since it only scales between 0 and 1 instance whereas
       | a serverless DB ideally would scale-out but should at least have
       | some ability to scale to greater load in response to demand,
       | along with higher reliability and availability, and backing up
       | data to object storage.
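
The start-on-request, stop-when-idle behaviour described above can be
sketched in a few lines. This is only a toy model, not WunderBase's actual
proxy: the class name, the start/stop callbacks and the injectable clock are
all assumptions made for illustration.

```python
import time


class ScaleToZeroProxy:
    """Toy proxy that starts a backend on demand and stops it when idle.

    Every name here is hypothetical. start and stop are plain callbacks
    standing in for whatever API actually boots or halts the VM.
    """

    def __init__(self, start, stop, idle_seconds=10.0, clock=time.monotonic):
        self._start = start
        self._stop = stop
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._running = False
        self._last_request = None

    @property
    def running(self):
        return self._running

    def handle(self, request, backend):
        """Wake the backend if needed, record activity, forward the request."""
        if not self._running:
            self._start()  # the cold-start delay would be paid here
            self._running = True
        self._last_request = self._clock()
        return backend(request)

    def tick(self):
        """Call periodically; stops the backend once the idle window elapses."""
        if not self._running:
            return
        if self._clock() - self._last_request >= self._idle_seconds:
            self._stop()
            self._running = False
```

A real proxy would call tick() from a timer and would hold incoming
requests while the backend boots. The sketch also shows the point made
above: the instance count only ever moves between 0 and 1.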
        
         | ithrow wrote:
         | Couldn't find anything in the Prisma docs about it exposing a
         | GraphQL API.
        
           | jensneuse wrote:
           | https://github.com/prisma/prisma-engines#query-engine
        
         | lbhdc wrote:
         | I was pretty interested to see how this worked as well. I think
         | you are right, this is a toy. It will be interesting to see if
         | they can solve scaling.
         | 
         | It would be cool if you could ditch the graphql layer. It seems
         | like there are other alternatives that go the vfs route so you
         | still get to use a standard sqlite client.
        
         | tptacek wrote:
         | You can, of course, scale to >1 with the Fly Machines API. I
         | don't know enough about how they're managing the SQLite part of
         | this to say more about how this scales, except that I think
         | scaling out SQLite is about to get a lot more interesting.
         | 
         | But I mostly agree that we need a better term than "serverless"
         | for this kind of stuff. The big things people seem to want from
         | "serverless" solutions are "not managing long-running server
         | instances" and "true usage-metered billing".
         | 
         | Whether or not there are servers, like, at all has not all that
         | much to do with things.
        
           | shabbatt wrote:
           | For many, fly.io is not a viable alternative, especially for
           | those that are already on the AWS train.
           | 
           | I think AWS already offers Aurora Serverless v2, which
           | pretty much accomplishes what a lot of these me-too
           | serverless services do, and they won't integrate as well as
           | something offered by AWS itself.
           | 
           | Even if you insisted on a cloud-agnostic mandate (which is
           | really not logical, since there are at best 3 public clouds
           | to choose from, all of them also vulnerable to targeted
           | cyberattacks and faultlines), it would be hard to convince a
           | large organization to switch to using SQLite on Fly.io.
        
             | vosper wrote:
             | > I think AWS already offers Aurora Serverless v2, which
             | pretty much accomplishes what a lot of these me-too
             | serverless services do, and they won't integrate as well as
             | something offered by AWS itself.
             | 
             | People should just be aware that Aurora Serverless v2 won't
             | scale to zero, and you'll pay for it even if you never use
             | it.
             | 
             | https://www.lastweekinaws.com/blog/no-aws-aurora-
             | serverless-...
        
               | shabbatt wrote:
               | Ah, thanks for pointing that out. Since this is new, it's
               | bound to change. I'd be surprised if it didn't eventually
               | scale to zero, and if I had to bet I would back AWS here
               | going forward. This is way too critical for its
               | serverless stack to get the Cognito treatment.
        
             | tptacek wrote:
             | Forget the Fly.io part; SQLite is what's interesting here.
             | I agree in advance that it's unlikely anyone's going to
             | convert a large app from Postgres to SQLite; if full-stack
             | SQLite succeeds as a trend, it'll be with new apps that
             | grow up using it.
        
               | shabbatt wrote:
               | Interested to know what you see in SQLite here. Why is
               | there so much interest in this all of a sudden? Am I
               | missing something?
        
               | tptacek wrote:
               | It's a database that in full-stack culture has been
               | relegated to "unit test database mock" for about 15 years
               | that is (1) surprisingly capable as a SQL engine, (2) the
               | simplest SQL database to get your head around and manage,
               | and (3) can embed directly in literally every application
               | stack, which is especially interesting in latency-
               | sensitive and globally-distributed applications.
               | 
               | Reason (3) is clearly our ulterior motive here, so we're
               | not disinterested: our model user deploys a full-stack
               | app (Rails, Elixir, Express, whatever) in a bunch of
               | regions around the world, hoping for sub-100ms responses
               | for users in most places around the world. Even within a
               | single data center, repeated queries to SQL servers can
               | blow that budget. Running an in-process SQL engine neatly
               | addresses it. Conveniently, most applications are read-
               | heavy, and most performance-sensitive app requests are
               | reads.
        
               | shabbatt wrote:
               | Hmm, but how would replication and sync be handled if
               | you have many SQLite instances on edge locations around
               | the world? If someone inserts a row with id 234 and
               | somebody on the other side of the world does the same,
               | wouldn't this type of logic involve reaching into a
               | central source of truth to compare the diff?
               | 
               | Trying to wrap my head around this architecture. It is
               | quite interesting, but it is concerning that the data is
               | now sharded into close-to-local SQLite instances located
               | near the user.
        
               | tptacek wrote:
               | Yes: the model topology you should have in your head is
               | "single writer, multiple readers" --- exactly the same
               | way it would work with a conventional Postgres setup.
               | What you're getting with SQLite here is that the reads
               | themselves are served out of the app process rather than
               | round-tripping over the network; otherwise, it's the same
               | architecture.
               | 
               | (You're not generally "reaching back to the central
               | source of truth to compare" things, so much as
               | "satisfying the write centrally and shipping out the new
               | database pages back to the read replicas at the edges").
               | 
               | More on this model:
               | https://fly.io/blog/globally-distributed-postgres/
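
A minimal sketch of the "single writer, multiple readers" topology, using
only Python's standard sqlite3 module. The class and method names are
hypothetical, and one simplification is assumed: each replica is refreshed
with a full copy via Connection.backup(), whereas the comment above
describes shipping only the new database pages.

```python
import sqlite3


class SingleWriterCluster:
    """One primary SQLite database takes writes; replicas serve local reads.

    Assumption: replicas are refreshed with a full copy (Connection.backup),
    not by shipping only the changed pages.
    """

    def __init__(self, replica_count):
        self.primary = sqlite3.connect(":memory:")
        self.primary.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        self.primary.commit()
        self.replicas = [
            sqlite3.connect(":memory:") for _ in range(replica_count)
        ]
        self.ship()

    def write(self, name):
        """All writes go through the primary, so ids are assigned in one place."""
        cur = self.primary.execute(
            "INSERT INTO items (name) VALUES (?)", (name,)
        )
        self.primary.commit()
        return cur.lastrowid

    def ship(self):
        """Push the primary's current state out to every replica."""
        for replica in self.replicas:
            self.primary.backup(replica)

    def read(self, replica_index):
        """Reads are answered by one replica, with no trip to the primary."""
        rows = self.replicas[replica_index].execute(
            "SELECT id, name FROM items ORDER BY id"
        )
        return rows.fetchall()
```

Because every insert is satisfied by the primary, two users on opposite
sides of the world cannot both be handed the same row id. What they give up
is freshness: a replica only sees a write after the next ship().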
        
               | shabbatt wrote:
               | Interesting, do you have plans to support GPU as well? I
               | can see this is a bottom up approach: put a low load
               | instance close to the user for reads and have a globally
               | synced write that should handle race conditions etc
               | 
               | Are there cold start delays? From the moment I type
               | domain.com is it going to spin up a fly instance closest
               | to me and serve the SQLite database reads?
               | 
               | I'm gonna give this a go this weekend to see what it can
               | do
        
               | tptacek wrote:
               | This is getting into Fly.io stuff and not WunderBase or
               | SQLite stuff. GPU is a ways off for us: the programming
               | interface for GPUs is tricky to implement with full
               | isolation between VMs. The post we're commenting on talks
               | a bit about cold start delays (a couple hundred
               | milliseconds).
        
           | gregwebs wrote:
           | This is SQLite, so how would you scale it to > 1? Certainly
           | you can put an app tier in front of this DB tier and scale
           | the app tier to infinity.
        
             | tptacek wrote:
             | By replicating the SQLite transactions to other SQLite
             | databases.
        
               | soamv wrote:
               | Hey, I like SQLite and I also like fly.io, but
               | "distributed DB built with sqlite as storage" is really a
               | very different beast from just "sqlite".
        
               | tptacek wrote:
               | I'm mostly interested in the SQLite part of this, and we
               | don't have an SQLite offering, only Postgres, just to be
               | clear. So you can't hurt my feelings here.
               | 
               | When does SQLite become "a distributed DB built with
               | sqlite as storage"? Did Postgres stop being Postgres when
               | someone plugged log shipping into it? That's basically
               | what we're talking about here --- not stuff like rqlite,
               | which I'm also pretty interested in, but which really is
               | a new database built on SQLite.
        
               | gregwebs wrote:
               | Users generally don't plug ad-hoc log shipping solutions
               | into Postgres. They generally use the built-in battle-
               | tested Postgres replication features, and they can set up
               | synchronous replication to avoid data loss. Shipping a
               | log is trivial but synchronous replication and failover
               | are quite difficult to get right (see jepsen.io), and
               | setting up failover for Postgres is still quite
               | difficult. Newer DBs have been built from the ground up
               | (CRDB, TiDB, etc) in part because of the difficulty of
               | attempting to operate traditional DBs as reliable
               | distributed systems.
        
               | tptacek wrote:
               | They do now, but that wasn't always the case, and people
               | didn't say that you weren't running Postgres when you did
               | that.
               | 
               | Cockroach is not the same thing we're talking about here;
               | it's a much more ambitious design, just like rqlite is
               | much more ambitious than shipping SQLite transactions.
               | What we're talking about here is the tooling needed to generate
               | a single-writer multi-reader cluster the way you would
               | for Postgres, but for SQLite instead. I don't know if
               | single-writer multi-reader clusters for Postgres qualify
               | as "easy", but they're not science projects.
               | 
               | If it's not obvious: we love Cockroach. Our commercial
               | bias is that we built a platform that is especially
               | useful for distributed services and clusters, and
               | Cockroach is very much that.
        
         | jensneuse wrote:
         | As discussed in the post, the next steps are to add read
         | replicas. Regarding backups, that's possible with Litestream:
         | https://github.com/benbjohnson/litestream
        
           | gregwebs wrote:
           | Are you going to use LiteFS then for replication [1]? LiteFS
           | replication is asynchronous, meaning failover can lose the
           | latest data. Will LiteFS scale down to 0? Does scaling down
           | to zero mean electing a leader when scaling back up to 1 and
           | will that have a delay? Will the read replicas scale down to
           | 0 along with LiteFS when the primary scales down to 0?
           | 
           | [1] https://github.com/superfly/litefs
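
Why asynchronous replication can lose the latest data on failover can be
shown with a small simulation. This is not LiteFS code; the classes below
are hypothetical and model each node as a plain in-memory log.

```python
class Node:
    """A database node, reduced to an append-only list of records."""

    def __init__(self):
        self.log = []


class AsyncReplication:
    """Primary acknowledges writes before the replica has received them."""

    def __init__(self):
        self.primary = Node()
        self.replica = Node()
        self.shipped = 0  # primary log entries already copied to the replica

    def write(self, record):
        """Acknowledge as soon as the primary alone has the record."""
        self.primary.log.append(record)
        return True

    def ship(self):
        """Background step: copy not-yet-shipped entries to the replica."""
        self.replica.log.extend(self.primary.log[self.shipped:])
        self.shipped = len(self.primary.log)

    def failover(self):
        """Lose the primary, promote the replica, return the writes that vanished."""
        lost = self.primary.log[self.shipped:]
        self.primary = self.replica
        self.replica = Node()  # a fresh, empty replica joins
        self.shipped = 0
        return lost
```

Any write acknowledged between the last ship() and the failover is returned
in the lost list. Synchronous replication would close that window by
withholding the acknowledgement until the replica has the record, at the
cost of extra write latency.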
        
             | benbjohnson wrote:
             | LiteFS/Litestream author here. LiteFS will scale to zero
             | with a persistent volume attached. For short-lived
             | instances (aka serverless), we still have a few more
             | features to complete on the road map (e.g. S3 replication,
             | synchronous replication) to make that work well. Pure
             | serverless (e.g. Lambda, Vercel) is also something we plan
             | to support but we want to get LiteFS working well on more
             | traditional deployments (e.g. longer running instances,
             | Kubernetes, etc) first.
        
       | snadal wrote:
       | Slightly off-topic: according to what I read, it is a lightweight
       | proxy written in golang that is capable of starting a vm when
       | receiving network traffic and 10 seconds after the last request
       | it turns off the fly machine.
       | 
       | I've been looking for something similar for some time to use in
       | my development docker instances (specifically with dokku). I have
       | many services that consume little CPU time but have a high
       | overall RAM consumption, even though they are actually used for
       | only a few minutes each day.
       | 
       | I don't want to use kubernetes for this as it adds too much
       | complexity for the benefit I would get.
       | 
       | Do you know any solution similar to this, to turn on / off docker
       | containers when network traffic comes in?
        
       | talhof8 wrote:
       | Serverless database. Loving it! Good luck!
        
       | bragr wrote:
       | >I do not recommend to expose WunderBase to the public internet.
       | The intended use case is to run it on a private network and
       | expose it to your frontend via an API Gateway, like WunderGraph!
       | 
       | My SEC team felt a disturbance in the force from me even
       | considering this on our internal network. Security should not be
       | a secondary consideration for a DB!
        
         | shabbatt wrote:
         | Seeing how many MongoDB instances were running wide open to the
         | internet despite calls to not do so, your concern is certainly
         | valid.
        
       ___________________________________________________________________
       (page generated 2022-09-15 23:00 UTC)