[HN Gopher] How the SQLite Virtual Machine Works
       ___________________________________________________________________
        
       How the SQLite Virtual Machine Works
        
       Author : danielskogly
       Score  : 281 points
       Date   : 2022-09-07 13:55 UTC (9 hours ago)
        
 (HTM) web link (fly.io)
 (TXT) w3m dump (fly.io)
        
       | bob1029 wrote:
       | The EXPLAIN keyword is really awesome. It also allows you to
       | quickly validate SQLite command text without actually executing
       | it. We have found this to be useful for real-time validation of
       | user inputs when working on SQL-based config tools.
       | 
       | Note that EXPLAIN also seamlessly incorporates any application-
       | defined functions, so you can validate _everything_ is as
       | expected. Useful when being clever with DSLs and such.
        
       | einpoklum wrote:
       | The post does not describe a virtual machine, nor does it really
       | describe how SQLite, specifically, works, as opposed to other
       | DBMSes.
        
         | ok_dad wrote:
         | The "virtual machine" is not what you would probably consider a
         | "VM" from the usual use of the term, like a "virtual x86/ARM
         | computer" that you can do anything with.
         | 
         | In this case, it's simply a set of variables used to simulate
         | CPU registers, an instruction set that's specific to the
         | underlying database's data format and desired operations, and
         | then some code that "virtually" emulates what a CPU might do if
         | it were designed with this memory architecture (the database
         | file) and instruction set. The VM supports primitive operations
         | via the instruction set; liken this to the machine code for
         | your laptop's CPU, but very specific to the use-case of a
         | SQLite database. The SQL you send into the SQLite db
         | transaction is compiled into this VM's "machine code" and then
         | run on the "virtual CPU" that implements that instruction set,
         | and then it spits out your results. The article was very clear
         | on how it worked.
        
         | d23 wrote:
         | It describes a virtual machine instruction set along with
         | specific opcodes for the example query as well as a link to the
         | full instruction set. It specifically talks about SQLite. Did
         | you click the wrong link?
        
       | avl999 wrote:
       | I wonder how an alternate timeline might have played out if
       | Richard Hipp had not named it "SQLite" and instead called it
       | "SQLightning" or "SQLExpress" or something like that. For much of
       | its lifetime, SQLite wasn't taken seriously despite being an
       | extraordinary technology and part of me is convinced it was in
       | large part due to "Lite" being in the name.
        
         | dimitrios1 wrote:
         | SQLite has been taken seriously by serious people for decades.
         | We just don't hear about it in the cool, hipster startup
         | circles because it isn't sexy. But it's steadily grown to be
         | the most deployed embedded database in the world, in millions
         | of devices, and have specialized strands that went through
         | military and medical grade formal validations to end up in
         | airplanes, fighter pilots, naval ships, medical devices.
         | 
         | You have to look where SQLite _isn 't_ rather than where it is
         | these days.
        
           | sgbeal wrote:
           | > ... in millions of devices...
           | 
           |  _cough_ _Billions_ _cough_ with a capital  "B". Very nearly
           | every non-trivial electronic device built over the past 10-15
           | years. (Non-trivial being very roughly: "anything with a UI
           | or having the potential to run one.")
        
           | tptacek wrote:
           | The shift isn't in SQLite being taken seriously behind the
           | scenes, or as a better file format. It's in SQLite being
           | taken seriously as an alternative to n-tier database in full
           | stack applications.
        
         | thro388 wrote:
         | There were other embedded databases; Firebird, Foxpro etc..
        
         | d23 wrote:
         | Interesting. I didn't realize it wasn't for "lite".
         | 
         | A friend of mine made a similar point about GIMP. I'd never
         | thought about it that way. What a shame to be hindered by such
         | a terrible name choice (in GIMP's case).
        
           | TOGoS wrote:
           | I'm always surprised at how quickly people will dismiss
           | something just because of a name. I once wrote a bug tracking
           | system/wiki for my company to replace the terrible one that
           | we had been using. It did everything we needed and was much
           | nicer to work with, but the reason people gave me for why we
           | never used it was that the name sounded funny. It was a made-
           | up word, totally an afterthought. _sigh_
        
         | tptacek wrote:
         | I don't think it has much to do with the name. I think the
         | biggest perceptual impact on SQLite came from Rails, and its
         | default for using SQLite as a test database while strongly
         | discouraging people from using it in production.
        
           | post-it wrote:
           | Django too.
           | 
           | > By default, the configuration uses SQLite. If you're new to
           | databases, or you're just interested in trying Django, this
           | is the easiest choice. SQLite is included in Python, so you
           | won't need to install anything else to support your database.
           | When starting your first real project, however, you may want
           | to use a more scalable database like PostgreSQL, to avoid
           | database-switching headaches down the road. [0]
           | 
           | [0] https://docs.djangoproject.com/en/4.1/intro/tutorial02/
        
             | tptacek wrote:
             | Sure, Django too.
        
           | randomdata wrote:
           | I think it's older than that, even predating SQLite. Back
           | when Perl/CGI was all the rage it was common to store data in
           | files. But the web was growing fast and scaling files was
           | hard. Pushing that workload onto the heavyweight RDMBS was
           | seen as the solution, reasonably so in some cases, but that
           | also brought the cargo culting along with it.
           | 
           | When SQLite arrived it got lumped in with file storage that
           | can't scale. Rails only perpetuated what everyone was already
           | thinking.
        
           | nigma wrote:
           | There are many reasons for using other databases like
           | PostgreSQL or MySQL in production - feature set, (managed)
           | hosting, backup and replication, etc.
           | 
           | SQLite is primarily embedded/local database and cannot be
           | easily separated and shared over network [1] between multiple
           | disposable backend/worker instances.
           | 
           | [1] https://12factor.net/backing-services
        
             | [deleted]
        
             | deepstack wrote:
             | yeah when it comes to embbed, if one is willing to go the
             | java road, then H2 is an excellent DB to use. It is some
             | what Postgres compliant.
        
             | tptacek wrote:
             | Yes, that's what Rails (and Django) say about it.
        
           | 0x457 wrote:
           | To be fair, serving any kind of traffic with Rails required
           | multiple replicas of Rails application running. So SQLite
           | wasn't an option for production for any kind of production
           | rails workload.
           | 
           | Then most hosting for rails were stateless, so you had no way
           | of storing SQLite on disk.
           | 
           | And finally, for serious production you need high
           | availability and SQLite couldn't offer that.
        
         | tveita wrote:
         | I've used SQLite for toy services "in production", and it was
         | really just as bad as people think it is. Sure, you could
         | handle a large amount of read-only queries, but it only took a
         | tiny bit of write traffic in the mix to make the random latency
         | spikes jarring.
         | 
         | This was pre-WAL, presumably enabling WAL would help a lot (but
         | is still not the default, so beware). But the caveats were
         | real, it's not like people just took one look at the name and
         | though "'SQ _Lite_? ' I better put a big warning in our
         | documentation to not use this in production."
        
           | LAC-Tech wrote:
           | _This was pre-WAL, presumably enabling WAL would help a lot_
           | 
           | Indeed it would!
           | 
           | "WAL provides more concurrency as readers do not block
           | writers and a writer does not block readers. Reading and
           | writing can proceed concurrently."
           | 
           | https://www.sqlite.org/wal.html
           | 
           | I think the people advocating for SQLite to be used in more
           | places are all assuming write ahead logging is enabled.
        
             | chasil wrote:
             | There are many problems with WAL, as acknowledged by Dr.
             | Hipp in the recent olap/duckdb paper.
             | 
             | The chief problem that I see with WAL is that it breaks
             | ACID with databases that are ATTACHed, as the documentation
             | shows:
             | 
             | https://sqlite.org/lang_attach.html
        
               | gigatexal wrote:
               | How often is attach really used though?
        
               | chasil wrote:
               | How often are distributed transactions used elsewhere
               | with two-phase commit?
               | 
               | All the time. I suspect this is similar.
        
               | gigatexal wrote:
               | Right but that's bending SQLite a lot no? I mean it's
               | meant to be used as a file that has a sql interface. But
               | I digress.
        
               | simonw wrote:
               | What are the other problems?
        
               | chasil wrote:
               | OK, I'll download the PDF onto my phone and get the
               | quote...
               | 
               | ...I forgot how significant these problems are. These are
               | quite serious.
               | 
               | "However, WAL mode has notable disadvantages. To
               | accelerate searching the WAL, SQLite creates a WAL index
               | in shared memory. This improves the performance of read
               | transactions, but the use of shared memory requires that
               | all readers must be on the same machine. Thus, WAL mode
               | does not work on a network filesystem. It is not possible
               | to change the page size after entering WAL mode. In
               | addition, WAL mode comes with the added complexity of
               | checkpoint operations and additional files to store the
               | WAL and the WAL index."
               | 
               | https://vldb.org/pvldb/volumes/15/paper/SQLite%3A%20Past%
               | 2C%...
               | 
               | https://news.ycombinator.com/item?id=32684424
        
               | yread wrote:
               | It also complicates backups. I have another process that
               | .backups the database but even when opening it in read-
               | only it creates a .wal file. My first version of the
               | backup script didn't delete it afterwards and the normal
               | process didn't have the right to overwrite the backup's
               | wal file so opening the db failed
        
           | simonw wrote:
           | WAL was added 2010-07-21, so pre-WAL is over 12 years ago
           | now!
        
         | _the_inflator wrote:
         | I look at it differently. In my opinion Android started
         | popularizing SQLite. For me kudos go out to Android, for
         | integrating it on a large scale.
        
         | resoluteteeth wrote:
         | I'm not sure the problem is the name.
         | 
         | I think there has been a lot of recognition in the last 10
         | years that sqlite is actually quite robust, but it still hasn't
         | been considered suitable for serious use is based on how
         | software and database servers have traditionally operated.
         | 
         | It seems like what's changing that now is the recognition that
         | other approaches may make more sense given modern software
         | architecture.
        
           | [deleted]
        
           | kevincox wrote:
           | I don't think stability is the main concern. However for the
           | longest time SQLite didn't have great support for multiple
           | writers and it is still pretty basic. So it works really well
           | for smallish numbers of clients or write-light workloads but
           | if you want to pound it with inserts and updates it still
           | isn't as good of a choice as other RDBMSes.
        
         | simlevesque wrote:
         | You may already know but it's SQL-ite (sequel-ite), like
         | graphite, dynamite, sulfite.
         | 
         | https://en.wiktionary.org/wiki/-ite
        
           | EGreg wrote:
           | I had no idea! I was sure it was "SQL Lite" ever since I
           | heard about it a decade ago. Its support for mostly text
           | strings while MySQL and Postgres supported a myriad types
           | made me think it's just a small embedded "lite" library for
           | when you wanted to bundle a mostly SQL database with your app
           | lol
        
           | avgcorrection wrote:
           | Such ambigious names remind of "Rust". It's such an um-
           | actually name (um actually it's not first and foremost about
           | oxidization it's about fungi...)
        
           | geewee wrote:
           | Huh, I had no idea!
        
           | srcreigh wrote:
           | Not even Sequel-ite. But S.Q.L.-ite. That's how Dr Hipp says
           | it, anyways.
        
             | sgbeal wrote:
             | FWIW... i've been in Richard's presence several times when
             | he was asked how to pronounce it by various people and he's
             | invariably answered, "pronounce it _however you like_! " (i
             | belong to the ess-queue-lite school of thought.)
        
         | gnrlst wrote:
         | I am a big believer in "nomen omen" -> latin for: destiny is in
         | the name. It's obviously an exaggeration, but to your point,
         | this is one of those cases where the name influenced _how_ the
         | tech was being used (and in what environments), regardless of
         | its underlying potential.
        
           | mananaysiempre wrote:
           | "Nominative determinism"[1]
           | 
           | [1] https://unsongbook.com/
        
         | ternaryoperator wrote:
         | > For much of its lifetime, SQLite wasn't taken seriously
         | 
         | As far as I can recall, it was always well regarded as an
         | embeddable database. What makes you think that it wasn't taken
         | seriously "for much of its lifetime"?
        
       | manimino wrote:
       | These blog posts have been great. I'd love to see a deep dive on
       | the query planner at some point.
       | 
       | I've done lots of benchmarking SQLite while writing ducks [1],
       | and found some interesting stuff. It seems like SQLite will only
       | use one index, even in cases where two indexes would better suit
       | the query. Or, in cases where the query is fetching most of the
       | table, it will use an index even though a full scan would
       | actually be much faster.
       | 
       | Don't get me wrong, SQLite is awesome, and it's quite possible to
       | work around these behaviors. Just interested in why they're not
       | built in.
       | 
       | [1] https://github.com/manimino/ducks
        
         | chasil wrote:
         | The author of SQLite recently published a paper on olap
         | performance and duckdb that went into greater detail on the
         | virtual machine.
         | 
         | https://news.ycombinator.com/item?id=32684424
         | 
         | Also, there is a "hint" mechanism in SQLite to force the use of
         | a specific index. A variant of it ("not indexed") can be used
         | to force a full table scan, which is appropriate when more than
         | 5% of a table's rows are involved. The docs strongly admonish
         | that these are not "hints," and the optimizer must obey these
         | clauses when they are used.
         | 
         | https://www.sqlite.org/lang_indexedby.html
        
           | manimino wrote:
           | Exactly. So the workaround looks like:
           | 
           | - Make one query with LIMIT of 5%
           | 
           | - If it returns LIMIT rows, do a query with NOT INDEXED
           | 
           | Which is surprising; why not build that into the query
           | planning / execution in the first place?
        
             | dllthomas wrote:
             | The workaround is to understand how the questions you're
             | asking the database relate to the structure of your data.
             | What you describe is a brute force way to acquire that
             | understanding on the fly on arbitrary queries. It might
             | sometimes be the best you can do, and then it might be
             | worth it (benchmark!), but in quite a few cases it'll be
             | quite a bit slower.
        
               | manimino wrote:
               | That's a great point. The LIMIT workaround takes no
               | knowledge of the data distribution, but an optimizer
               | could do much better with some statistics such as
               | estimated percentiles.
        
           | lilyball wrote:
           | The docs also say these aren't meant to be used for
           | performance tuning but instead of protecting against
           | regressions if the schema changes. I guess the idea is once
           | you're done and happy with performance you check the query
           | plan for time-sensitive queries and update the relevant
           | statements with INDEXED BY to reference the indexes they're
           | already using.
        
             | chasil wrote:
             | In Oracle, stats can change a CBO query plan in ways that
             | adversely affect performance.
             | 
             | To address this, Oracle 7 added hints, and then 8i
             | introduced the ability to "pin" an existing optimizer plan
             | to a query.
             | 
             | Stats are run here and there in SQLite, so I can see how
             | bad stats could throw the cost-based optimizer (CBO) into
             | bad performance.
             | 
             | Stats can be manually run with the ANALYZE command. This
             | should likely be done (among other things) prior to
             | distributing SQLite databases to an audience, or perhaps in
             | weekly maintenance of active databases undergoing extensive
             | DML.
             | 
             | https://www.sqlite.org/lang_analyze.html
        
         | whalesalad wrote:
         | cool library, potentially have an immediate need for it will
         | take it for a spin. currently maintaining multiple dict's in
         | order to achieve concurrent (air quotes with my fingers here)
         | writes but as more dimensions are added, more locks are needed
         | etc. memory is cheap so I'd rather just store a big list of
         | objects
        
         | samatman wrote:
         | Have you brought any of that up in the SQLite forum? Not that
         | you must, but the team loves that kind of feedback. They won't
         | take all opportunities to make queries faster, if it's too
         | complex for example, but they care a lot about speed of
         | execution.
        
       | spullara wrote:
       | Most people probably don't know this but SQLite is used under the
       | covers for most Mac OS applications like Photos and Mail. Been
       | that way for a long time and lets you do some interesting data
       | mining.
       | 
       | e.g. https://javarants.com/build-your-own-mail-analyzer-for-
       | mac-m...
        
         | pak9rabid wrote:
         | I believe it's because these applications likely use CoreData
         | as their data persistence framework, which uses SQLite as the
         | backend store. This has been nice because when I'm doing iOS
         | app development that uses CD, I can simply pop the SQLite
         | database open and query it directly to see what's going on as
         | I'm testing.
        
           | adamnemecek wrote:
           | It's one of the possible backend stores.
        
       | oxff wrote:
       | For some reason, IME, SQLite feels the hardest to use of the
       | popular databases. I did start my own database journey with
       | Postgre and MSSQL so I feel kind of lost with SQLite whenever I
       | use it.
        
         | sgbeal wrote:
         | > For some reason, IME, SQLite feels the hardest to use of the
         | popular databases.
         | 
         | sqlite is, above all else, a C API for interacting with a
         | database. Try using the C APIs for Postgres or MySQL or
         | Oracle/OCI, and i _guaranty_ you you'll find sqlite's easier to
         | work with _by leaps and bounds_.
        
           | nijave wrote:
           | I'm guessing this varies a lot by language and library. In
           | Python, it uses DB-API 2.0 which is a Python interface so
           | it's basically the same as, say, Postgres (minus the sql
           | dialect quirks).
           | 
           | I'd think something like JDBC would do the same for Java, etc
        
       | endisneigh wrote:
       | I see so many SQLite posts these days, but which companies with a
       | lot (>1M) concurrent users are using SQLite in a non embedded
       | fashion?
       | 
       | It just seems so academic. I'd like to use a web service or app
       | backed primarily by SQLite and see how it goes.
        
         | jcwayne wrote:
         | I'm curious how many companies building for >1M concurrent
         | users have even a tenth of that in reality.
        
         | tptacek wrote:
         | It works fine for read-heavy apps (ie: most apps) and, with
         | things like Litestream, is especially attractive in distributed
         | and edge settings, where you can stream out read replicas to a
         | bunch of regions and direct writes to a single central region.
         | 
         | That might sound confining, but it's what people already do
         | with Postgres: read replicas, single write master.
         | 
         | And the benefits aren't academic: having your database in-
         | process not only simplifies your database code (because N+1
         | queries are no longer perf emergencies), but can also
         | drastically speed up response times, because you're not dealing
         | with compounding intra-DC latency responding to an HTTP request
         | that incurs multiple database queries.
         | 
         | And, finally: SQLite wants a single writer _per database_. But
         | there 's no rule that says your entire app has to land in a
         | single .db file. That's what you'd naturally do in a Postgres
         | app, but it's not necessarily the natural answer for a SQLite-
         | backed app, where it's pretty trivial to pull from multiple
         | databases with a single SQL statement.
         | 
         | Don't get me wrong, I think it's still pretty niche-y as an
         | alternative to standard n-tier app databases. You'd use it
         | deliberately; it's unlikely to be a full-stack default. But
         | people have definitely been sleeping on SQLite, and I think
         | what you're seeing is some of those people beginning to wake
         | up.
        
           | endisneigh wrote:
           | > It works fine for read-heavy apps (ie: most apps) and, with
           | things like Litestream, is especially attractive in
           | distributed and edge settings, where you can stream out read
           | replicas to a bunch of regions and direct writes to a single
           | central region.
           | 
           | This is what I'm saying - I keep hearing this, which popular
           | website is using SQLite as their main database?
           | 
           | I don't have any doubt you can serve 1000 users, or even
           | 10,000 current users using SQLite. There's a power law
           | distribution with the popularity of sites. The most popular
           | sites have multiple orders of magnitude.
           | 
           | Can SQLite serve 500,000 current users with 75% reads and
           | writes with the consistency you'd expect?
           | 
           | > And, finally: SQLite wants a single writer per database.
           | But there's no rule that says your entire app has to land in
           | a single .db file. That's what you'd naturally do in a
           | Postgres app, but it's not necessarily the natural answer for
           | a SQLite-backed app, where it's pretty trivial to pull from
           | multiple databases with a single SQL statement.
           | 
           | Sounds reasonable in theory. How do you maintain consistency
           | between all of these databases? Schema migrations? Backups
           | and restores and maintaining referential consistency?
           | 
           | By the time you implement all of this, and you certainly can,
           | why not just use Postgres?
        
             | ok_dad wrote:
             | > which popular website is using SQLite as their main
             | database
             | 
             | First, I don't think this is a good metric for software
             | quality or use-case analysis. Just because people don't use
             | X doesn't mean X wouldn't be a superior tool for their use-
             | case.
             | 
             | > How do you maintain consistency between all of these
             | databases?
             | 
             | You don't, in the GP's example, you would shard the
             | database schema. Sharding is easy to figure about, because
             | many database services do this already to spread data and
             | analysis across several servers. You might use Litestream
             | to then send those thousands/millions of database files to
             | some S3 bucket somewhere. Then, you could analyze the
             | shards together in that bucket using one of the many
             | available solutions for such things, like Apache Hive or
             | whatever the BigCo clouds provide.
             | 
             | > By the time you implement all of this, and you certainly
             | can, why not just use Postgres?
             | 
             | Because you analyzed the pros and cons to SQLite vs.
             | Postgres for your specific use-case and determined SQLite
             | will be better in the long run. Perhaps your particular
             | use-case will turn up "Postgres" and you won't use SQLite,
             | or maybe you'll use both of them, but for different data
             | stores.
             | 
             | No one technology is "the chosen one", sometimes you need
             | several different options.
        
               | endisneigh wrote:
               | > You don't, in the GP's example, you would shard the
               | database schema. Sharding is easy to figure about,
               | because many database services do this already to spread
               | data and analysis across several servers. You might use
               | Litestream to then send those thousands/millions of
               | database files to some S3 bucket somewhere. Then, you
               | could analyze the shards together in that bucket using
               | one of the many available solutions for such things, like
               | Apache Hive or whatever the BigCo clouds provide.
               | 
               | I'm not sure if you're being serious. Sharding isn't easy
               | at all. Especially if you're doing it across different
               | files that need to be synced in tandem with schema
               | migrations that also need to be applied in tandem.
               | 
               | > Because you analyzed the pros and cons to SQLite vs.
               | Postgres for your specific use-case and determined SQLite
               | will be better in the long run. Perhaps your particular
               | use-case will turn up "Postgres" and you won't use
               | SQLite, or maybe you'll use both of them, but for
               | different data stores.
               | 
               | The use case is already defined in my original post. A
               | popular app with a lot of concurrent users. People claim
               | SQLite is superior to Postgres for such a scenario, I'm
               | curious to read about and use a service using SQLite for
               | an app with a lot of users (>1M 75% reads)
               | 
               | From looking at the issues for Litestream on GitHub I'm
               | not really convinced it's production ready. For some
               | small app I'm sure it's fine, I'm talking about an app
               | with a lot of users generating a lot of money and you
               | need high availability.
               | 
               | All of this makes it sound like I dislike SQLite, I
               | really don't - I've used it successfully for local apps
               | on the web, Android and iOS. However I feel this
               | popularity for SQLite on the web may be misplaced.
        
               | ok_dad wrote:
               | > People claim SQLite is superior to Postgres for such a
               | scenario
               | 
               | Only idiots would claim some technology is superior
               | without doing some use-case analysis and experimentation.
               | 
               | > I'm not sure if you're being serious...
               | 
               | I am, but still it _depends on use case_. If you have an
               | application that can support it for the use-case, then
               | you can do simple things like making one DB per user and
               | then doing aggregate functions on Apache Hive or
               | something. If you have a schema that will absolutely
               | _NOT_ support sharding, then I guess it might not work in
               | that case. I would argue I could design a SQLite solution
               | for 95% of the applications that need a SQL database, but
               | there are some which, for whatever reason, may require
               | other solutions. I am also _NOT_ saying that all of those
               | designs would be the best for that use-case, but it would
               | work.
               | 
               | Unequivocally: _Your use cases define the tools you need
               | to use._
               | 
               | > The use case is already defined in my original post. A
               | popular app with a lot of concurrent users. People claim
               | SQLite is superior to Postgres for such a scenario, I'm
               | curious to read about and use a service using SQLite for
               | an app with a lot of users (>1M 75% reads).
               | 
               | Still, you didn't _define_ it. What 's the data schema?
               | What is the application? I don't know what to say, if
               | your app is a TODO list app that's wildly popular, then I
               | guarantee I can do what you want with SQLite easily. If
               | it's Spotify-level stuff with special aggregation and
               | data analysis, maybe not?
        
               | endisneigh wrote:
               | > Still, you didn't define it. What's the data schema?
               | What is the application? I don't know what to say, if
               | your app is a TODO list app that's wildly popular, then I
               | guarantee I can do what you want with SQLite easily. If
               | it's Spotify-level stuff with special aggregation and
               | data analysis, maybe not?
               | 
               | sigh. _any_ app based on SQLite with more than 1 million
               | concurrent users, 75% reads.
        
               | tptacek wrote:
               | A million _concurrent_ , with a database-backed website?
               | Is that a useful threshold? How many monthly actives is a
               | million _concurrent_? And, at that scale, is your
               | database your gating factor, or is app architecture?
        
               | ok_dad wrote:
               | Here's one specific one, I would imagine they have a
               | pretty big user base:
               | https://tailscale.com/blog/database-for-2022/
               | 
               | Here are a few other sources I googled:
               | 
               | https://www.sqlite.org/famous.html
               | 
               | https://stackshare.io/sqlite
               | 
               | This is the best I can do, because I have no idea what
               | else you want. If I had the time, I could throw up a
               | simulated million users doing a TODO app with SQLite as
               | the backing DB engine, but I don't have time, I have a 2
               | year old. Do your own research from here, or don't and
               | just use Postgres until you retire, I don't really care
               | that much.
        
               | Scarbutt wrote:
               | _sigh. any app based on SQLite with more than 1 million
               | concurrent users, 75% reads._
               | 
               | Definitely not the use case for SQLite unless you can
               | easily do sharding but as you said, why bother?
        
           | bob1029 wrote:
           | > And, finally: SQLite wants a single writer per database.
           | But there's no rule that says your entire app has to land in
           | a single .db file.
           | 
           | I like finding hints of our architecture scattered across
           | these SQLite threads :D
           | 
           | It's _very_ sketchy trying to slam all users into a single
           | physical database. If you can slice up the users along unit
           | of work, session, customer, account, etc., you can go
           | horizontal very quickly.
        
         | adamrezich wrote:
         | you've almost certainly used software that uses sqlite today,
         | possibly even right now
         | https://www.sqlite.org/mostdeployed.html
        
           | zerr wrote:
           | That's because of its prevalence usage as a better flat file
           | or config file.
        
           | tptacek wrote:
           | They stipulated non-embedded use cases for it --- really,
           | their subtext is "show me places that use SQLite where
           | ordinarily people would use an n-tier architecture", which is
           | to say, Rails, Node, or Django. I think it's obvious to
           | everyone that SQLite gets a lot of important use behind the
           | scenes.
        
             | adamrezich wrote:
             | "embedded" can mean a few different things so that's a bit
             | confusing for me, but if the intent was "show me places
             | sqlite is used as a database backend for user-facing web
             | software", the Nim forum (https://github.com/nim-
             | lang/nimforum) uses sqlite as its database backend.
        
               | sgbeal wrote:
               | > but if the intent was "show me places sqlite is used as
               | a database backend for user-facing web software", the Nim
               | forum
               | 
               | Perhaps more appropriately, sqlite's own forum is sqlite-
               | backed: <https://sqlite.org/forum> (running the Fossil
               | SCM's forum interface, which itself exclusively uses
               | sqlite for storage: <https://fossil-scm.org> (that whole
               | site is an instance of fossil, serving all content from
               | an sqlite db))
        
               | tptacek wrote:
               | That's a good example.
        
         | pjc50 wrote:
         | That's pretty much the anti-use-case for SQLite?
        
         | TillE wrote:
         | Not everyone is doing only web development for millions of
         | concurrent users.
         | 
         | SQLite has been deployed on billions of devices, so "academic"
         | is an odd term here.
        
           | endisneigh wrote:
           | Non embedded. fly.imo for example exposes SQLite in a non
           | embedded fashion.
        
             | ok_dad wrote:
             | I don't understand your claim, with fly.io it seems like
             | you attach some storage to your application and store the
             | SQLite file there, semi-locally, much the same as any other
             | use of SQLite. The SQLite database code is still inside
             | your chosen application, it's just loading the file from
             | some "persistent volume" abstraction.
        
         | bob1029 wrote:
         | > I'd like to use a web service or app backed primarily by
         | SQLite and see how it goes.
         | 
         | We've been doing this in production for over half a decade now.
         | It goes so well it becomes boring very quickly.
        
           | endisneigh wrote:
           | Stats on your service/app and workload?
        
         | Thaxll wrote:
         | You won't find anyone because it would mean that you have a
         | single service talking to SQLite which most webservices won't
         | do for HA reasons, you have a few of them behind a
         | loadbalancer.
         | 
         | Using SQLite for any serious webservices is bad thing because
         | you can only have a single app. Then you have other tools on
         | top of SQLite for those shortcoming.
        
           | tptacek wrote:
           | Having a single database service as a write master is already
           | commonplace in HA Postgres.
        
             | Thaxll wrote:
             | Right but they do that over the network not using the
             | filesystem.
        
               | tptacek wrote:
               | I'm not sure what's complicated about this. Instead of
               | exposing a database instance as the write master, you
               | expose a service as the write master. You can even do it
               | transparently, if your load balancing fabric cooperates:
               | just arrange to have read replicas redirect write
               | requests to the write master.
        
               | avl999 wrote:
               | With that setup you have essentially lost any benefits
               | that SQLite provides, specifically any performance
               | benefits by making a DB read/write a network call instead
               | of a local call in the same process that manipulates some
               | file on the same machine. You are closer to a
               | postgres/mysql patterne except now your writes don't
               | scale as well. What are we actually gaining with this
               | setup other than perhaps easier administration of the "db
               | service" compared to a mysql/postgres deployment?
        
               | tptacek wrote:
               | Obviously, you have not lost those benefits: reads are
               | satisfied from an in-process database, and, for most
               | applications, most requests are reads.
        
         | sgbeal wrote:
         | > I'd like to use a web service or app backed primarily by
         | SQLite and see how it goes.
         | 
         | Look no further than sqlite's own source control system:
         | <https://fossil-scm.org>
         | 
         | i've used that site almost daily since the end of 2007 and have
         | encountered maybe (maybe) three "db is locked" errors.
        
       | mbaris wrote:
       | I really like this series of blog posts on SQLite, thanks if you
       | are reading this. It was really informative for me since I did
       | not know much about its internals before
        
         | benbjohnson wrote:
         | Author here. Thank you for the kind words! I'm glad to hear
         | it's been helpful.
        
           | samatman wrote:
           | I'm having a great time with it. SQLite's documentation is so
           | good that it's difficult to usefully add to it, but you've
           | managed that.
           | 
           | Some else said this, but I'd also love to see a deeper dive
           | into the query plan, if you get to it.
        
           | adamrezich wrote:
           | hey, while you're here, the links to the other articles from
           | this one are broken
        
             | benbjohnson wrote:
             | Oh shoot, thanks for letting me know. I'm getting them
             | fixed up right now!
        
           | tpetry wrote:
           | Are you still planning to write the article about cases when
           | SQLite is not a good fit? Would be really interesting because
           | we only hear stories when someone believes it is the best
           | fit.
        
             | orangepurple wrote:
             | I am under the impression that sqlite may struggle with
             | many concurrent writers
        
             | benbjohnson wrote:
             | Yes, I am still planning on writing about when SQLite is
             | not a good fit. Folks on Twitter were helpful in listing
             | use cases where it doesn't well. :)
        
       | purim wrote:
       | Any dates on supporting range of ports to my running sqlite/db
       | instance on? ex) 5999-8999
       | 
       | edit: not sure why this is being downvoted? not being able to
       | define a range of port seems like a huge oversight. the forums
       | there are not very active so I am asking here.
        
         | sgbeal wrote:
         | > edit: not sure why this is being downvoted? not being able to
         | define a range of port seems like a huge oversight.
         | 
         | It's likely been downvoted because your feature request
         | demonstrates that you're not asking about sqlite. sqlite has
         | _no server element_ , and therefore no ports which need using.
         | 
         | > the forums there are not very active so I am asking here.
         | 
         | You're apparently not looking at the correct forums. As one of
         | the sqlite forum moderators, i can say with some degree of
         | authority that its forum is well-visited.
        
           | purim wrote:
           | Do you work for fly.io? Horrible way to represent the company
           | like this!
           | 
           | Why would I be on sqlite forums?
        
             | simonw wrote:
             | Your original comment didn't clarify that you were talking
             | about ports on a Fly.io instance, which I imagine is why
             | people didn't understand the relevance of your question.
        
               | duskwuff wrote:
               | And even if the parent _is_ asking about Fly.io, I don 't
               | see how that's relevant to a blog post about SQLite.
        
             | sgbeal wrote:
             | > Do you work for fly.io? Horrible way to represent the
             | company like this!
             | 
             | Wha!?!?? The top post is very specifically about sqlite3,
             | not fly.io (whatever that is). Your comment about it was
             | taken entirely within the context of sqlite, not fly.io
             | (whatever that is).
             | 
             | > Why would I be on sqlite forums?
             | 
             | Because this whole thread is about sqlite and your post
             | said "the forums," implying sqlite's forums, not fly.io's
             | (whatever that is).
             | 
             | Edit: is see now that the article is hosted on fly.io
             | (whatever that is), but the article is still entirely about
             | sqlite, not fly.io (whatever that is).
        
               | tptacek wrote:
               | Whatever that is?! Hey now!
        
               | purim wrote:
        
         | mrkurt wrote:
         | https://news.ycombinator.com/item?id=32251586
         | 
         | This is answered in the forums, too:
         | https://community.fly.io/t/new-feature-every-public-port-now...
        
       | skadamat wrote:
       | > SQL was originally designed for non-technical users to interact
       | with the database, however, it's used almost exclusively by
       | software developers peppering it throughout their applications.
       | 
       | Interesting claim! In my experience, the # of data analysts / BI
       | analysts / etc that write and have mastered SQL vastly exceeds
       | the # of web developers (especially if we focus on developers
       | actually writing SQL instead of using ORMs)
        
         | xani_ wrote:
         | It would be probably other way around if browsers, that just
         | use sqlite underneath, exposed it as offical API for storing
         | data and not that abortion of a database called IndexedDB
        
           | sgbeal wrote:
           | Funnily enough: the WebSQL standardization effort was
           | abandoned _because_ all implementors were basing theirs on
           | sqlite (citation: https://www.w3.org/TR/webdatabase/). By
           | convention, standards efforts (except for the one Microsoft
           | bought from I$O for their office format) require two
           | competing implementations, in order to demonstrate that the
           | standard isn't entirely dependent on one provider.
        
         | benbjohnson wrote:
         | That's a good point. My claim is probably too specific to
         | software developers. Maybe I should have said "technical users"
         | instead.
         | 
         | I was paraphrasing from the Donald Chamberlin quote in the last
         | section of the linked PDF[1]:
         | 
         | "When Ray and I were designing Sequel in 1974, we thought that
         | the predominant use of the language would be for ad-hoc queries
         | by planners and other professionals whose domain of expertise
         | was not primarily database management... Over the years, I have
         | been surprised to see that SQL is more frequently used by
         | trained database specialists to implement repetitive
         | transactions such as bank deposits, credit card purchases, and
         | online auctions. I am pleased to see the language used in a
         | variety of environments, even though it has not proved to be as
         | accessible to untrained users as Ray and I originally hoped."
         | 
         | [1]
         | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6359709
        
           | jgilias wrote:
           | I wonder what was the intended way of interacting with the
           | database for technical users?
        
             | ok_dad wrote:
             | The VM described in this post? I would imagine other
             | databases have a similar VM?
        
             | benbjohnson wrote:
             | I don't have any evidence to back this up but my guess is
             | that the underlying key/value store & cursor APIs were
             | probably the intended way for programmers to interact with
             | the database. SQL came out in the 1970s so every ounce of
             | performance was important.
        
         | paulhodge wrote:
         | Yeah I think it's mostly a piece of fun trivia at this point.
         | If we're talking about designing an interface for non-technical
         | users to interact with the data (an interesting topic), then a
         | modern one would probably look more like the omni-search bars
         | that are part of Splunk/Datadog/etc. Those make it drastically
         | easy to find data (with tableless text-based search by
         | default), and easier to aggregate it too (with the left-to-
         | right pipe | syntax).
        
       ___________________________________________________________________
       (page generated 2022-09-07 23:00 UTC)