[HN Gopher] D1: Improvements to performance and scalability
___________________________________________________________________

D1: Improvements to performance and scalability

Author : eallam
Score  : 94 points
Date   : 2023-05-19 13:08 UTC (9 hours ago)

(HTM) web link (blog.cloudflare.com)
(TXT) w3m dump (blog.cloudflare.com)

| davnicwil wrote:
| So this looks like basically a distributed SQLite with read
| replication for free across Cloudflare's edge. Is that right?
| m3kw9 wrote:
| So serverless meaning you don't manage the db servers, so they
| may or may not put your stuff on the edge, Cloudflare takes
| care of maybe load balancing too?
| jgrahamc wrote:
| Yes, Cloudflare handles load balancing and neat tricks like
| measuring the latency between your users, your code and
| databases etc. and moving your code around the network to
| make it run fast:
| https://blog.cloudflare.com/announcing-workers-smart-placeme...
|
| The vision of the Supercloud is that you give us your code
| and we'll figure out where and how to execute it:
| https://blog.cloudflare.com/welcome-to-the-supercloud-and-de...
| kneebonian wrote:
| So I hate Cloudflare with a burning passion. At some point
| Cloudflare decided my home IP was bad and started flagging it
| all over the place, which leads to vast swathes of the internet
| now being inaccessible to me.
|
| There is danger in centralized systems.
| v0idzer0 wrote:
| Change your IP? Most ISPs change it on every modem reboot.
| Others you can call and request a change.
| bombcar wrote:
| You may be able to force your ISP to give you a new IP by
| resetting your modem or leaving it off long enough, that may
| help.
|
| But once Cloudflare hates you, you have to bow to them.
|
| Try browsing the normal web over Tor sometime and see how bad
| it can get.
| depingus wrote:
| When I started using Firefox with the Temporary Containers
| extension I would get absolutely bombarded by Cloudflare's
| captchas.
| To the point where if I clicked on a search result
| that sent me to a captcha, I would just close the tab and click
| the next result (which often resulted in another captcha, rinse
| repeat). That's when I realized just how big Cloudflare had
| gotten.
|
| I still use Temporary Containers and lately, I've noticed a
| sharp decline in these Cloudflare captchas. I don't know if it's
| because people are moving away from them, or Cloudflare just
| found a better way to fingerprint me.
| kentonv wrote:
| Have you tried using Privacy Pass? It is a browser extension
| which helps prove that your traffic is legitimate which should
| reduce the incidence of challenges from Cloudflare, even if
| your IP has bad rep. It uses advanced cryptography to do this
| without creating any new tracking vectors.
|
| (Disclosure/disclaimer: I work for Cloudflare but in a
| different department; I'm not an expert on Privacy Pass.)
| simiones wrote:
| > It uses advanced cryptography to do this without creating
| any new tracking vectors.
|
| Sounds like something which is designed to create _hard to
| detect_ tracking vectors.
| kentonv wrote:
| Note that Privacy Pass is not a Cloudflare product, it's an
| open protocol which Cloudflare supports.
|
| https://datatracker.ietf.org/doc/charter-ietf-privacypass/
| threatofrain wrote:
| I enjoy Cloudflare and have been using Workers + D1 for a few
| months, but warn that Cloudflare's definition of beta is far more
| beta than what other companies mean. The API surface may change
| repeatedly without corresponding documentation and their Discord
| can be a bit sparse on help.
| marcopicentini wrote:
| Why and when should I use a serverless PostgreSQL instead of
| PostgreSQL hosted on a server?
| rozenmd wrote:
| this isn't postgres
| fire wrote:
| hmm, is this roughly equivalent to Neon[1], but sqlite based
| rather than postgres?
|
| 1: https://neon.tech
| nikita wrote:
| (neon CEO)
|
| This is true.
| Neon however offers bottomless storage and D1 is
| 100MB currently going to 1GB.
| colesantiago wrote:
| Can I connect directly to a D1 database with a SQLite URL? That
| would be awesome.
| ericstegemann wrote:
| Eek, 0.75 per GB of storage! AWS is 0.115. One of our DBs has
| 2.3TB in it.
| Akkuma wrote:
| The costs as a whole have me worried. I'm not sure it'll be
| better than just using https://turso.tech/pricing as that
| already has a free tier of 8GB and it might cost less overall
| on the paid tier.
| jFriedensreich wrote:
| At least at the moment D1 is for a completely different use
| case: D1 currently has a limit of 100MB, which will now be
| increased to 1GB.
| yashap wrote:
| Yeah, I'd imagine this will be cheaper than AWS RDS for low
| storage use cases (lack of fixed monthly compute costs wins
| out), but more expensive for high storage. Like quite cheap for
| a 10-100 GB DB, quite expensive for a 1-10 TB DB.
|
| Though they do say:
|
| > when we enable global read replication, you won't have to pay
| extra for it, nor will replication multiply your storage
| consumption
|
| With AWS you'll have at least one read replica for failover, so
| $0.23/GB. And if you really want global read replicas, with AWS
| you might end up with something like a primary in North America
| and read replicas in South America, Europe and Asia. That would
| work out to $0.46/GB, so closes the gap a bit.
| nemothekid wrote:
| This is Cloudflare's flavor of Fly.io?
| pier25 wrote:
| Fly doesn't really offer distributed data.
|
| Edit:
|
| It does!
| avinassh wrote:
| is this not the same - https://fly.io/docs/litefs/ ? They
| mention it as 'LiteFS - Distributed SQLite'
| pier25 wrote:
| Ah you're right!
|
| I thought this was a wrapper for Litestream but apparently
| it's a parallel project by the same author who Fly hired.
|
| https://litestream.io/
|
| https://github.com/benbjohnson/litestream/pull/411
| trollied wrote:
| Interesting, Time Travel works the same as Oracle Flashback.
| Hope there aren't any patents to trip over.
| pottertheotter wrote:
| So when would someone use something like this? I learn better by
| example if anyone has any.
|
| And for reference, here's the original D1 announcement with some
| additional info https://blog.cloudflare.com/introducing-d1/
| leetrout wrote:
| I use D1 through microfeed https://www.microfeed.org/
| kentonv wrote:
| You'd use it whenever you're building an application on
| Cloudflare Workers and you need to store data. D1 provides you
| with a SQL database. It's based on SQLite, so it's designed for
| relatively small datasets but can serve them very quickly from
| the edge.
|
| Note there are several alternatives here, too. Workers Durable
| Objects[0] provide a lower-level primitive for building
| advanced distributed systems. But D1 is easier to use for
| typical use cases. For blob storage you might use R2[1]. And
| for large databases Workers can easily integrate with several
| serverless database providers.[2]
|
| [0] https://blog.cloudflare.com/introducing-workers-durable-obje...
| [1] https://www.cloudflare.com/products/r2/
| [2] https://blog.cloudflare.com/announcing-database-integrations...
| Akkuma wrote:
| Is it a non-goal to be long term usable for larger databases?
| That would force the usage of something like Turso, your
| closest direct comparison, as a possible migration strategy,
| or relying on "Smart Placement" (which from my point of view
| reduces the benefit of global edge) for other serverless
| non-global DBs.
| jgrahamc wrote:
| I mean, our object storage is called R2, our first database
| offering is called D1, and if we were to offer a fully
| Distributed Database then D2 seems like a good name.
| Akkuma wrote:
| And you can follow up D2 with D2: Lord of Distributed.
| kentonv wrote:
| Personally, I'm a firm believer that most "web app" use
| cases are better served by many small databases (e.g.
| per-user or per-document) rather than a single monolithic
| database. This is especially true when serving users all
| around the world -- per-user databases can be located near
| each user (both for speed and to comply with data locality
| laws).
|
| What I'd like to enable here is a progression where you
| start out prototyping your app with a single D1 database,
| which is easy to use and reasonably fast. Then as you grow
| we provide tools to let you transition to many D1 databases
| sharded in a way that makes sense (e.g. per-user). Apps
| that want even more control can move to using full-on
| Durable Objects (which will soon support a SQLite database
| per-object).
|
| That said, there are certainly many use cases out there
| where simple monoliths make sense, especially
| non-interactive data crunching. I'm not sure yet if D1 will
| ever be the right choice for those, but the Workers
| platform aims to provide many options.
| Akkuma wrote:
| Thanks for the insight, I greatly appreciate it! This
| definitely is a reasonable idea for many things and I'm
| looking forward to seeing something similar to the
| sharding mechanism in the future.
|
| I've only started to think about this and I'm thinking
| the hardest part will be dealing with cross-cutting
| concerns (in a non-auto sharded world manually creating
| multiple databases) and trying to find a way to keep each
| database isolated without extra burden compared to using
| a hosted Postgres.
|
| As an aside, that LAN-optimized house was a gaming dream.
| Hope your new house is as awesome.
| Wallacy wrote:
| "Apps that want even more control can move to using
| full-on Durable Objects (which will soon support a SQLite
| database per-object)."
|
| Can you elaborate on this a little bit more? I'm using DO
| today and I have a bad time sharding my data (works, but I
| hate it).
|
| So I will have the option to use the standard store
| and/or SQLite?
|
| If so, I don't can keep with my DO (because I have control
| of everything) and use SQLite for things that are bigger
| than what the value store supports.
| kentonv wrote:
| Sorry, I don't quite understand what you're asking.
|
| In the future each DO will have a private SQLite
| database. The key/value store will actually be redirected
| to store into a special table in this database, but
| probably new apps will just use the database and not the
| KV store.
|
| Separately from that, I would like to develop tools that
| make sharding Durable Objects (and D1 databases) easier.
| Today it's a pain to do manually. This is independent
| from the underlying storage model, though.
| fyzix wrote:
| Cloudflare has 285 PoPs. Surely you don't need your DB
| replicated to all of them. A few locations per continent
| should suffice.
|
| For comparison, fly.io, turso's provider, has 34 locations
| and well-documented reliability issues.
| Akkuma wrote:
| That is a fair point. A few centralized locations will
| likely be more than sufficient for most use cases.
| Mystery-Machine wrote:
| "up to 37x faster" is NOT before: 37.81ms, now: 1.82ms
|
| You can't just round down 1.82ms to 1ms.
| elithrar wrote:
| Fixing this. We ran a few benchmarks (and some were much faster
| than 37x), but this was a more typical case. Not our goal to
| inflate numbers.
| ccorda wrote:
| Kenton Varda (tech lead) has some more notes in this twitter
| thread:
| https://twitter.com/KentonVarda/status/1659551757796515846
| rektide wrote:
| > _Our new engine is based on intercepting SQLite's disk
| writes and doing clever stuff with them. It was so easy because
| the file format is not just well designed but amazingly
| well-documented._
|
| Quite the hobby project for a lot of people too!
| Other folks doing this:
|
| Rqlite https://hn.algolia.com/?query=Rqlite&sort=byDate ,
| Dqlite https://hn.algolia.com/?query=dqlite&sort=byDate ,
| Litestream https://hn.algolia.com/?query=Litestream&sort=byDate
| / LiteFS https://hn.algolia.com/?query=LiteFS&sort=byDate,
| marmot, mvsqlite
| https://hn.algolia.com/?query=mvsqlite&sort=byDate
| otoolep wrote:
| I've been doing it for almost 10 years. :-)
|
| https://www.philipotoole.com/9-years-of-open-source-database...
| aranke wrote:
| Curious if you prototyped DuckDB before deciding to invest
| further into SQLite.
|
| DuckDB works great as an in-memory database (it's also the
| default mode).
| kentonv wrote:
| We didn't, no.
|
| I'm sure there's a lot of really cool local-first databases
| out there, but SQLite has the benefit of being incredibly
| widely battle-tested, with literally billions of
| installations worldwide. It has received thorough security
| research and fuzzing (it's part of Chrome's attack surface
| after all). And there's tons of resources online to help
| people understand how to use it. Although I'm sure there are
| alternatives that serve certain use cases better, it's hard to
| imagine anything coming close for ours.
|
| That said, the storage engine we've built is not that heavily
| dependent on SQLite specifically. Any database that uses a
| write-ahead log like SQLite does should be possible to adapt
| to it in the future. So maybe we'll eventually open it up to
| a variety of choices, or even let you bring your own as a
| Wasm module.
| kentonv wrote:
| Oh, I've been informed that DuckDB uses SQLite under the
| hood, so maybe compiling DuckDB to Wasm and running it on
| top of this will be possible, we'll see.
| aranke wrote:
| From https://news.ycombinator.com/item?id=23290512:
|
| > DuckDB is indeed a free columnar database system, but
| it is not entirely built on top of SQLite.
| It exposes the same front-end and uses components of SQLite
| (the shell and testing infrastructure), but the execution
| engine/storage code is new.
| rektide wrote:
| I expect most workloads are more OLTP/transactional than
| OLAP/analytical.
| elithrar wrote:
| Correct. DuckDB is really interesting technology, but it's
| not a direct successor to SQLite for transactional
| workloads. It's also very new: there's a LOT of new code in
| DuckDB on top of the (heavily fuzzed) SQLite parts.
|
| (I use it personally, but it's not the same thing as what
| we're building with D1)
| pier25 wrote:
| Has the DX of Workers improved?
|
| I think Deno is lightyears ahead.
| kentonv wrote:
| Hard to say without knowing what specific problem you have had
| but we are improving all the time. Several improvements
| announced just this week, take a look at the blog.
|
| Among other things, we made Wrangler (Workers CLI tool) use the
| open source workerd by default for local development, so local
| dev should produce a much more precise simulation now (since
| it's literally running the same code).
| pier25 wrote:
| And what about the DX of using Workers with Pages?
|
| I tried to use that recently and it was a disaster. I wrote
| about my experience here:
|
| https://twitter.com/pierbover/status/1641474067013271552
|
| I then opened these two issues:
|
| https://github.com/cloudflare/workers-sdk/issues/2962
|
| https://github.com/cloudflare/workers-sdk/issues/2964
|
| I ended up moving the project over to Netlify + Edge
| functions. I had it all working in like 5-10 mins as it
| should. Took me two hours to figure out why Workers weren't
| working in my Pages project, and could never get Workers
| working properly with my Astro project.
|
| I think you're working exclusively on the engine of Workers
| which is really top notch, but Cloudflare really needs to
| improve the outer layer which affects DX considerably.
| kentonv wrote:
| Sorry you experienced that.
| FWIW this announcement from Wednesday should help address
| the problems you ran into:
|
| https://blog.cloudflare.com/pages-and-workers-are-converging...
| KRAKRISMOTT wrote:
| Still no compatibility with standard ORMs like Prisma either.
|
| And the team has been aware of the issue for years now
|
| https://github.com/cloudflare/workers-sdk/issues/2701
|
| https://news.ycombinator.com/item?id=31341513
| kentonv wrote:
| Sorry that D1 has been slow out the gate. Now that we've
| solved the basic technical issues we can really focus on
| improving DX.
| Akkuma wrote:
| You can use drizzle-orm with Cloudflare and that is fully
| compatible.
| rozenmd wrote:
| Drizzle is amazing.
| Akkuma wrote:
| This is my first time using it and I've been very pleased
| with it so far. It keeps it simple, has solid typing &
| schema building, and reminds me of LINQ. I'm also a thin
| models kind of person and the fact this is just an object
| without needing to build ORM classes is even better.
| mscccc wrote:
| I've been using workers on & off since it launched. Just tried
| it again recently and the local dev experience with wrangler is
| excellent now.
___________________________________________________________________
(page generated 2023-05-19 23:00 UTC)