[HN Gopher] Show HN: WunderBase - Serverless OSS database on top... ___________________________________________________________________ Show HN: WunderBase - Serverless OSS database on top of SQLite, Firecracker Author : jensneuse Score : 135 points Date : 2022-09-15 14:22 UTC (8 hours ago) (HTM) web link (wundergraph.com) (TXT) w3m dump (wundergraph.com) | azebazenestor wrote: | Another way to do sqlite over S3 is: | | - https://github.com/uktrade/sqlite-s3vfs (Read/Write) - | https://github.com/michalc/sqlite-s3-query ( Read Only) | avl999 wrote: | SQLite, KVM, Firecracker, GraphQL, Serverless... if this was | written in Rust it would hit the holy trinity of all the HN | buzzwords that pull a post to the frontpage ;) | Sujan wrote: | The Prisma Query Engine is indeed written in Rust :D | tptacek wrote: | I'd love to say that the most interesting thing here is Fly | Machines (i am bias) but really it's SQLite, which, no matter | what platform you're using, makes it architecturally simpler to | scale up and down, since you don't have to scale a database | server up and down with your workload; the database is embedded | in the app, which already had the scaling logic. | | People have been sleeping on SQLite and are starting to wake up | and I'm kind of psyched to see what else they come up with | (another very cool example of a software tool that really plays | to SQLite's strengths is Datasette: https://datasette.io/). | nijave wrote: | As soon as you start scaling the app beyond 1 replica, you have | to handle data replication again | tptacek wrote: | Of course. | gregwebs wrote: | This just runs a request proxy that turns off after 10 seconds of | no activity and starts it up (with a half second delay) when | there is a new request. It runs SQLite with Prisma. Prisma is an | API server that puts a GraphQL API in front of a DB. | | It's a nice blog post about gluing technology and I can see how | this could be a really nice way to run some lower-cost databases | in a non-demanding development environment. However, it is not a | reliable way of operating a database. For me it isn't really | serverless since it only scales between 0 and 1 instance whereas | a serverless DB ideally would scale-out but should at least have | some ability to scale to greater load in response to demand, | along with higher reliability and availability, and backing up | data to object storage. | ithrow wrote: | Couldn't find anything in the Prisma docs about it exposing a | GraphQL API. | jensneuse wrote: | https://github.com/prisma/prisma-engines#query-engine | lbhdc wrote: | I was pretty interested to see how this worked as well. I think | you are right, this is a toy. It will be interesting to see if | they can solve scaling. | | It would be cool if you could ditch the graphql layer. It seems | like there are other alternatives that go the vfs route so you | still get to use a standard sqlite client. | tptacek wrote: | You can, of course, scale to >1 with the Fly Machines API. I | don't know enough about how they're managing the SQLite part of | this to say more about how this scales, except that I think | scaling out SQLite is about to get a lot more interesting. | | But I mostly agree that we need a better term than "serverless" | for this kind of stuff. The big things people seem to want from | "serverless" solutions are "not managing long-running server | instances" and "true usage-metered billing". | | Whether or not there are servers, like, at all has not all that | much to do with things. | shabbatt wrote: | For many fly.io is not a viable alternative especially those | that are already on the AWS train. | | I think that AWS already offers Aurora Serverless v2 which | pretty much accomplishes what a lot of these me-too- | serverless services that won't integrate as well as something | that is offered out of AWS. | | Even if you were insistent on cloud-agnostic mandate (which | is really not logical since there is at best 3 public clouds | to choose from that are also vulnerable to targeted | cyberattacks and faultlines), it would be hard to convince a | large organization to switch to using Sqlite on Fly.io | vosper wrote: | > I think that AWS already offers Aurora Serverless v2 | which pretty much accomplishes what a lot of these me-too- | serverless services that won't integrate as well as | something that is offered out of AWS. | | People should just be aware that Aurora Serverless v2 won't | scale to zero, and you'll pay for it even if you never use | it. | | https://www.lastweekinaws.com/blog/no-aws-aurora- | serverless-... | shabbatt wrote: | ah thanks for pointing that out, since this is new its | bound to change. will be surprised if this didn't | eventually scale to zero but if I had to bet I would back | AWS here going forward. This is way too critical for it's | serverless stack to get the Cognito treatment. | tptacek wrote: | Forget the Fly.io part; SQLite is what's interesting here. | I agree in advance that it's unlikely anyone's going to | convert a large app from Postgres to SQLite; if full-stack | SQLite succeeds as a trend, it'll be with new apps that | grow up using it. | shabbatt wrote: | interested to know what you see in sqlite here? why is | there so much interest in this all of a sudden lately? am | I missing something? | tptacek wrote: | It's a database that in full-stack culture has been | relegated to "unit test database mock" for about 15 years | that is (1) surprisingly capable as a SQL engine, (2) the | simplest SQL database to get your head around and manage, | and (3) can embed directly in literally every application | stack, which is especially interesting in latency- | sensitive and globally-distributed applications. | | Reason (3) is clearly our ulterior motive here, so we're | not disinterested: our model user deploys a full-stack | app (Rails, Elixir, Express, whatever) in a bunch of | regions around the world, hoping for sub-100ms responses | for users in most places around the world. Even within a | single data center, repeated queries to SQL servers can | blow that budget. Running an in-process SQL server neatly | addresses it. Conveniently, most applications are read- | heavy, and most performance-sensitive app requests are | reads. | shabbatt wrote: | hmm but how would the replication and sync be handled if | you have many sqlite instances on edge locations around | the world? If someone inserts a row with id 234 and | somebody from other side of the world does it, wouldn't | this type of logic involve reaching into a central source | of truth to compare the diff? | | tryna wrap my head around this architecture, it is quite | interesting but concerning that it is now sharding into | close-to-local sqlite instances located near the user. | tptacek wrote: | Yes: the model topology you should have in your head is | "single writer, multiple readers" --- exactly the same | way it would work with a conventional Postgres setup. | What you're getting with SQLite here is that the reads | themselves are served out of the app process rather than | round-tripping over the network; otherwise, it's the same | architecture. | | (You're not generally "reaching back to the central | source of truth to compare" things, so much as | "satisfying the write centrally and shipping out the new | database pages back to the read replicas at the edges"). | | More on this model: https://fly.io/blog/globally- | distributed-postgres/ | shabbatt wrote: | Interesting, do you have plans to support GPU as well? I | can see this is a bottom up approach: put a low load | instance close to the user for reads and have a globally | synced write that should handle race conditions etc | | Are there cold start delays? From the moment I type | domain.com is it going to spin up a fly instance closest | to me and serve the SQLite database reads? | | I'm gonna give this a go this weekend to see what it can | do | tptacek wrote: | This is getting into Fly.io stuff and not WunderBase or | SQLite stuff. GPU is a ways off for us: the programming | interface for GPUs is tricky to implement with full | isolation between VMs. The post we're commenting on talks | a bit about cold start delays (a couple hundred | milliseconds). | gregwebs wrote: | This is SQLite, so how would you scale itr to > 1? Certainly | you can put an app tier in front of this DB tier and scale | the app tier to infinity. | tptacek wrote: | By replicating the SQLite transactions to other SQLite | databases. | soamv wrote: | Hey, I like SQLite and I also like fly.io, but | "distributed DB built with sqlite as storage" is really a | very different beast from just "sqlite". | tptacek wrote: | I'm mostly interested in the SQLite part of this, and we | don't have an SQLite offering, only Postgres, just to be | clear. So you can't hurt my feelings here. | | When does SQLite become "a distributed DB built with | sqlite as storage"? Did Postgres stop being Postgres when | someone plugged log shipping into it? That's basically | what we're talking about here --- not stuff like rqlite, | which I'm also pretty interested in, but which really is | a new database built on SQLite. | gregwebs wrote: | Users generally don't plug ad-hoc log shipping solutions | into Postgres. They generally use the built-in battle- | tested Postgres replication features, and they can setup | synchronous replication to avoid data loss. Shipping a | log is trivial but synchronous replication and failover | are quite difficult to get right (see jepsen.io), and | setting up failover for Postgres is still quite | difficult. Newer DBs have been built from the groundup | (CRDB, TIDB, etc) in part because of the difficulty of | attempting to operate traditional DBs as reliable | distributed systems. | tptacek wrote: | They do now, but that wasn't always the case, and people | didn't say that you weren't running Postgres when you did | that. | | Cockroach is not the same thing we're talking about here; | it's a much more ambitious design, just like rqlite is | much more ambitious than shipping SQLite transactions. | What we're talking here is the tooling needed to generate | a single-writer multi-reader cluster the way you would | for Postgres, but for SQLite instead. I don't know if | single-writer multi-reader clusters for Postgres qualify | as "easy", but they're not science projects. | | If it's not obvious: we love Cockroach. Our commercial | bias is that we built a platform that is especially | useful for distributed services and clusters, and | Cockroach is very much that. | jensneuse wrote: | As discussed in the post, the next steps are to add read | replicas. Regarding backups, that's possible with Litestream: | https://github.com/benbjohnson/litestream | gregwebs wrote: | Are you going to use LiteFS then for replication [1]? LiteFS | replication is asynchronous, meaning failover can lose the | latest data. Will LiteFS scale down to 0? Does scaling down | to zero mean electing a leader when scaling back up to 1 and | will that have a delay? Will the read replicas scale down to | 0 along with LiteFS when the primary scales down to 0? | | [1] https://github.com/superfly/litefs | benbjohnson wrote: | LiteFS/Litestream author here. LiteFS will scale to zero | with a persistent volume attached. For short-lived | instances (aka serverless), we still have a few more | features to complete on the road map (e.g. S3 replication, | synchronous replication) to make that work well. Pure | serverless (e.g. Lambda, Vercel) is also something we plan | to support but we want to get LiteFS working well on more | traditional deployments (e.g. longer running instances, | Kubernetes, etc) first. | snadal wrote: | Slightly off-topic: according to what I read, it is a lightweight | proxy written in golang that is capable of starting a vm when | receiving network traffic and 10 seconds after the last request | it turns off the fly machine. | | I've been looking for something similar for some time to use in | my development docker instances (specifically with dokku). I have | many services that, although they consume little CPU time, they | do have a high overall consumption of RAM, but they are actually | used for a few minutes each day. | | I don't want to use kubernetes for this as it adds too much | complexity for the benefit I would get. | | Do you know any solution similar to this, to turn on / off docker | containers when network traffic comes in? | talhof8 wrote: | Serverless database. Loving it! Good-luck! | bragr wrote: | >I do not recommend to expose WunderBase to the public internet. | The intended use case is to run it on a private network and | expose it to your frontend via an API Gateway, like WunderGraph! | | My SEC team felt a disturbance in the force from me even | considering this on our internal network. Security should not be | a secondary consideration for a DB! | shabbatt wrote: | Seeing how many MongoDB instances were running wide open to the | internet despite calls to not do so, your concern is certainly | valid. ___________________________________________________________________ (page generated 2022-09-15 23:00 UTC)