[HN Gopher] SQLite is not a toy database ___________________________________________________________________ SQLite is not a toy database Author : nalgeon Score : 730 points Date : 2021-03-25 14:20 UTC (8 hours ago) (HTM) web link (antonz.org) (TXT) w3m dump (antonz.org) | fouc wrote: | PostgreSQL is not a toy database. MySQL is not a toy database. I | feel like the "toy database" part is clickbait | dpix wrote: | I don't think the author is trying to call out SQLite as | somehow better than others database tech - just that most | people perceive SQLite to be a tool for development or | exploration when in fact it much more powerful than most | believe | guenthert wrote: | Not sure about that. I can't recall having seen PostgreSQL | being referred to as toy database. I have seen however texts | cautioning SQLite in production use, e.g. in [1], [2]. | | [1] | https://docs.bareos.org/bareos-18.2/IntroductionAndTutorial/... | | [2] | https://docs.bareos.org/bareos-18.2/DeveloperGuide/catalog.h... | unixhero wrote: | SQLite: The Database at the Edge of the Network with Dr. Richard | Hipp https://www.youtube.com/watch?v=Jib2AmRb_rk | amaccuish wrote: | It's a shame Nextcloud works terribly with SQLite | annoyingnoob wrote: | Thanks for the info. Probably would not have considered SQLite in | place of something like pandas. | bob1029 wrote: | Don't forget about User-Defined Functions. | | https://www.sqlite.org/appfunc.html | | We just started enhancing our SQL dialect with new functions | which are implemented in C# code. One of them is an aggregate and | it is really incredible to see how it simplifies projections | involving multiple rows. | | One huge benefit of SQLite's idea of UDFs is that you can | actually set breakpoints and debug them as SQL is executing. | hnrodey wrote: | Sounds very interesting. Can you elaborate on how you're | leveraging C# for this? | bob1029 wrote: | Directly through Microsoft's SQLite provider - | Microsoft.Data.Sqlite. | | See: https://docs.microsoft.com/en- | us/dotnet/standard/data/sqlite... | foepys wrote: | Application-defined functions are very useful. | | One of my company's applications is already designed to work | with different SQL systems and a new customer desperately | wanted SQLite for a very special use case. As SQLite is quite | simple and doesn't support many functions that are standard in | SQL Server, MySQL, Oracle, etc., we used application-defined | functions to implement all functions the application needs in | C#. It's not very fast but also not slow and the customer is | happy, which is what really counts. | TeMPOraL wrote: | They're awesome! | | One common use case is REGEXP. SQLite has a keyword for regular | expression matching, but it has no implementation for it. What | your application needs to do is to take whatever regex library | it's using, make (through whatever FFI method it's using) a C | function that interfaces with it, and register it with SQLite. | | A more advanced feature of this binding mechanism is that, if | you provide a bunch of specific callbacks to SQLite, you can | expose anything you like as a virtual table, that can be | queried and operated on as if it was just another SQLite table. | | See: https://www.sqlite.org/vtab.html. The plugin for full-text | search is implemented in terms of this mechanism. | eb0la wrote: | Sqlite is also an EXTREMELY well tested database. Last time I | checked test code was about 2x productionncode. This is also good | indicator to spot software that is not a toy. | eb0la wrote: | Turns out it is much, much, more tested than I remembered: | | https://www.sqlite.org/testing.html | mey wrote: | Just went looking, there doesn't seem to be a great actively | developed solution for a Java implementation or JNI bridge. The | JDBC driver that seemed to have popularity was brought into the | SQLite repo, but the documentation is lacking a bit. [1] Even | .Net Core has a integrated ADO.NET bridge to SQLite [2] | | There are quite a few mature embedded DB options for Java | actively developed, for example Apache Derby/HSQLDB/H2. The | SQLite fileformat is very portable, which is very handy and | useful, the SQLite C Library is very powerful. It'd be nice for | the Java community to tap into that power. | | 1) | https://www.sqlite.org/java/file?name=doc/overview.html&ci=t... | | 2) https://docs.microsoft.com/en- | us/dotnet/standard/data/sqlite... | doomleika wrote: | Two major gripe I had with SQlite | | 1. SQLite doesn't really enforce column types[0], the choice is | really puzzling to me. Since schema enforced type check is one of | the strong suit of SQL/RDMBS based data solution. | | 2. Whole database lock on write, this make it unsuitable to high | write usages like logging and metric recording. WAL mode will | help but it will only alleviate the issue, you will need row | based lock solution eventually. | | Just like the offical FAQ said, SQLite competes with fopen[1] | instead of RDBMS systems. | | -- | | [0]: https://sqlite.org/datatype3.html | | [1]: https://www.sqlite.org/whentouse.html | lsb wrote: | If you are in WAL mode, you can have unlimited readers as one | writer is writing. | mastre_ wrote: | Also, another issue when not in WAL mode is that a long read | will actually have a write lock and block _writes_. | haberman wrote: | > SQLite doesn't really enforce column types[0], the choice is | really puzzling to me. | | This is acknowledged as a likely mistake, but one that will | never be fixed due to backward compatibility: | | > Flexible typing is considered a feature of SQLite, not a bug. | Nevertheless, we recognize that this feature does sometimes | cause confusion and pain for developers who are acustomed to | working with other databases that are more judgmental with | regard to data types. In retrospect, perhaps it would have been | better if SQLite had merely implemented an ANY datatype so that | developers could explicitly state when they wanted to use | flexible typing, rather than making flexible typing the | default. But that is not something that can be changed now | without breaking the millions of applications and trillions of | database files that already use SQLite's flexible typing | feature. | | https://sqlite.org/quirks.html | doomleika wrote: | Yeah, still I would really wish SQLite is a feather weight | RDBM system than almost-but-not-quite-your-typical-RDBMS the | better `fopen`. | | Having a tool like this would made my life whole lot easier, | well, one can dream. | setr wrote: | SQLite already has different operating modes, right? e.g. WAL | is turned on and stays on, I think; It seems like you could | at least make type-checking an opt-in mode | mastre_ wrote: | > but one that will never be fixed due to backward | compatibility | | I wonder if a fork/"new version" could address this. Like, | sqlite2 (v1.0, etc.). | isoprophlex wrote: | Considering that we're on sqlite3 already, it'd probably be | something for the v4 ;) | [deleted] | mbreese wrote: | This would be something on the order of the python2 -> | python3 transition. Meaning, it would likely take a decade. | After going though that, I'm not sure it would be worth it | to just change from (default) flexible data types. | [deleted] | petters wrote: | The expensify blog linked from a comment here claims: | | > But lesser known is that there is a branch of SQLite that has | page locking, which enables for fantastic concurrent write | performance. Reach out to the SQLite folks and I'm sure they'll | tell you more | dnautics wrote: | > SQLite doesn't really enforce column types[0], the choice is | really puzzling to me. | | It's not the worst thing in the world; you're validating on | data ingest anyways to prevent sqli, for example, right? | Sohcahtoa82 wrote: | Data validation is a dangerous method to try to prevent SQL | injection. The only surefire method is to used parameterized | queries, which you should be doing anyways. | dnautics wrote: | huh? By data validation I mean a validation library in your | surrounding PL, it's flowing through the types of that | language, and at no point is unprepared SQL entering your | system. | HelloNurse wrote: | > unsuitable to high write usages like logging and metric | recording. | | If the speed of your write-only workload is limited by whole | file locks rather than by raw I/O speed, you can probably | consolidate your writes into fewer transactions (i.e. fewer | disk accesses, amortizing lock cost over more data) and write | to several databases in parallel according to any suitable | sharding criteria. Which is what any RDBMS would have to to | anyway. | zie wrote: | 1: Actually this is a feature, It's awesome and easy to map | typing to your language types. In python see: | https://docs.python.org/3/library/sqlite3.html#using-adapter... | specifically the DECLTYPES option. | | Other language bindings do things like this also, and makes it | pretty idiot proof. you `create table test (mydict dict);` so | your tables know their types, and then at bind time you say a | sqlite column type of dict == a python dictionary. | | Obviously python is sort of a terrible example, because python | typing is somewhat non-existent in many ways, but you see the | point here. | | 2: There are definitely cases where it won't work out well, | high-concurrent write load is definitely it's big weak spot, | but those are usually fairly rare use cases. | petercooper wrote: | With no sense of overstatement here, SQLite is one of my favorite | creations in the entire world, so I have a bunch of links some of | you might find interesting if you want to dig further: | | https://github.com/sql-js/sql.js - SQL.js lets you run SQLite | within a Web page as it's just SQLite compiled to JS with | Emscripten. | | https://litestream.io/blog/why-i-built-litestream/ - Litestream | is a SQLite-powered streaming replication system. | | https://sqlite.org/lang_with.html#rcex3 - you can do graph-style | queries against SQLite too (briefly mentioned in the article). | | https://github.com/aergoio/aergolite - AergoLite is replicated | SQLite but secured by a blockchain. | | https://github.com/simonw/datasette - Datasette (mentioned at the | very end of the OP article) is a tool for offering up an SQLite | database as a Web accessible service - you can do queries, data | analysis, etc. on top of it. I believe Simon, the creator, | frequents HN too and is a true SQLite power user :-) | | https://dogsheep.github.io/ - Dogsheep is a whole roster of tools | for doing personal analytics (e.g. analyzing your GitHub or | Twitter use, say) using SQLite and Datasette. | jakoblorz wrote: | sqlite can actually be scaled quite high: 4mio qps (queries per | second) https://blog.expensify.com/2018/01/08/scaling-sqlite- | to-4m-q... | api wrote: | SQLite is one of those things that also shows by contrast how | bloated and inefficient much of modern software is. It | accomplishes so much so efficiently with such a small | footprint. | rhombocombus wrote: | I love it. SQLite is light, easy, and useful in a variety of | situations. | petercooper wrote: | Agreed. There's a real beauty to it, to me, like looking at a | Swiss watch or something. Redis gives me similar vibes though | is a little more complex given its networked nature (although | the C is nice to read). | dheera wrote: | Which if any of these is efficient enough to run on mobile and | can cerealize an entire database to localStorage? | WJW wrote: | SQLite underpins large parts of both iOS and Android, so... | all of them? Whether you can serialize "an entire database" | just depends on the size of the database and whether it will | fit in the local storage. SQLite scales up to a few TB at | least, so that is unlikely to be the bottleneck. | yrrr wrote: | localStorage is a web API, the storage limit is ~5MB. | | However it can only store strings so it's pretty taxing. | | Chrome and Safari actually embed SQLite directly as WebSQL, | however it's not becoming standardized because Mozilla | doesnt consider "add SQLite" as a sensible web standard. | dheera wrote: | I don't fully get it -- this is an SQlite implementation in | JavaScript, which is different from what you'd use in a | native app. Unless you store it to localStorage or cookies, | all variables disappear when you navigate away from the | page. | | Can you write an offline HTML5 webapp with any of these | libraries such that it can cerealize the entire database | into a string and then store that to localStorage and | reload the next time? | | It's an honest question and not sure why people are | downvoting without giving reasons. | tehbeard wrote: | Really you'd want to do this in to an indexeddb store | rather than local storage. Less issues with size limits | and string serialisation as you can chuck Blob instances | straight into it. | | You'd have to handle loading/saving though | haolez wrote: | Wow! The sql.js bundle is only 8kb. This is a hidden gem for | sure. Thanks for pointing it out! | | EDIT: It's actually 1.2MB. Thanks for pointing it out :) | tln wrote: | There's a 1.2 Mb WASM file as well | fabiospampinato wrote: | Nope, try using the demo website, it loads a 1.2mb wasm file | (https://sql.js.org/dist/sql-wasm.wasm). SQLite might be | impressively small but it's not _that_ small. | bogwog wrote: | I was actually looking into options for storing some data | for a game I'm working on a while back, and decided against | using SQLite because it was too big of a dependency | (project has a specific size limit/target). | [deleted] | faizshah wrote: | There's also a pretty nice built-in sqlite extension for | transitive closures that helps in searching hierarchical/tree | structures in sqlite: | https://web.archive.org/web/20141127001741/https://charlesle... | napsterbr wrote: | Clicking the above link takes me to | https://imgur.com/32R3qLv. | | I fail to understand why have a blog at all if its author | don't like people linking to it. | setr wrote: | well, he doesn't like one particular group of people | linking to it :-) | faizshah wrote: | How strange, sorry, I didn't know that it did that. Just | updated it to an archive link. This is the only source I | know of that gives an overview of closures.c so it's worth | the read. | napsterbr wrote: | Definitely not your fault :) | fiddlerwoaroof wrote: | I believe jwz did this first | kroggen wrote: | we can also have a SQLite database with branches, like in Git: | | https://github.com/aergoio/litetree | neerajdotname2 wrote: | Using sql.js we have built online SQL course where the code is | executed in the browser itself. | https://academy.bigbinary.com/learn-sql | gary_bernhardt wrote: | Seconded; it's a fantastic tool and ecosystem. We use sql.js in | production (https://www.executeprogram.com). The SQL course's | live code examples all run directly in the browser via sql.js. | | I initially had low expectations because it's such a weird use | case, but it's been totally reliable. We did have to ignore a | few types of errors from old browsers that don't support wasm | properly, but we've never had a bug in current browsers caused | by sql.js. | no-s wrote: | >so I have a bunch of links some of you might find interesting | if you want to dig further | | oooh, thank you! I'm starting to see SQLite the same way! | D_Guidi wrote: | spatialite rulez! | drenvuk wrote: | sqlite is my caching layer. It's crazy fast on ssds and keeps my | memory empty. There's no reason to read blobs from disk or keep | them stored in memory if i'm using it. | sigjuice wrote: | What is the benefit of keeping memory empty? | drenvuk wrote: | Running lots of other things on the same machine. Most of the | stuff programs store is just sitting idle doing nothing but | taking up space. Move it to sqlite or disk if it's an ssd. | The latency hit negligible unless you're running a gaming | server. | maxbaines wrote: | Thanks, always agreed. And a good read. | incanus77 wrote: | I had seen Dr. Hipp speak at a conference[1] around 2009 and it | really put SQLite on my radar as a (then) Mac developer. It ended | up surfacing from the back of my mind around then at Mapbox when | I put together the prototype of the MBTiles file format[2] which | is still in use today. The idea of an on-disk relational database | was new to me and just a perfect fit for where mobile devices | were at that point in time, and for our use case of offline maps. | | [1] https://en.wikipedia.org/wiki/C4_(conference)#C4[2] | | [2] https://medium.com/devseed/portable-map-tiles-format- | release... | durkie wrote: | thanks for mbtiles! I spend lots of time using and abusing that | format every day :) | cute_boi wrote: | Nobody seems to mention that SQlite is one of the secure | database. Just look at its rigorous testing, fuzzing and | everything. | juancampa wrote: | nit: security is like the speed of light for a massive object, | you can never reach it, you can only get closer and closer. I'd | say instead "SQLite is one of the _most_ secure databases" | hackeraccount wrote: | Is it the unix version of MS Access? It's sort of interesting to | compare the two. | SigmundA wrote: | No it would be closer to the Jet db engine that Access uses, | they are both file based in process relational db engines. Back | in the day for a Windows app if you wanted a in process file | based db you would use Jet since the engine was included with | windows even though Access was a separate product. | happyweasel wrote: | Yeah, and I think sqlite is really comparable to this setup. | Use Access to design the database (optional), and access (no | pun intended) the jet database engine via ODBC,OLE-DB or DAO | ..Jay! Code like it's 1999 :-) ... I remember the jet db | engine offered at least transaction level READ_COMMITTED (in | comparison to mysql's MyISAM) and "it was quite fast" until | reaching about 1mio rows or so (of course, db design, index | etc determines "fast", that was just a basic rule of thumb I | remember)... | II2II wrote: | SQLite is a database that is intended to be embedded in | applications rather than run as a separate server. You could | make an Access-like application from it, but it cannot replace | Access on its own. | | It's commonly used in applications that use it as a convenient | container to store data (the "file format" scenario) or as a | domain specific database. One of the better examples of the | latter is Calibre, an e-book library manager, since it exposes | some of the database functionality to the end user. For | example: the end user can add columns to store custom data for | each book. | stjohnswarts wrote: | Well the database part maybe, but it's not going to help you | generate forms and reports or build small guis around it. | You'll have to find those tools for yourself or build a web app | around your sqlite db. I haven't done a lot with Access though, | mostly help build small database interfaces/reports for their | inventory systems (whether a lab or comic book collection). | buescher wrote: | You can do all that with tcl/tk. Sqlite was in fact | originally a tcl extension so the tcl interface for sqlite is | first-rate. With tcl/tk it's sort of a Unix analog to VB6. | | I keep thinking it would be sort of fun to do a tk app again; | is there a good model CRUD app for tcl/tk/sqlite out there? | Something like the Northwind thing for Access? | eleog wrote: | Sqlite is not a toy, but it's so sloppy that it's not really a | tool either. Like a saw with a loose blade, it can help you cut | but it can also hurt you. My pet peave is the way sqlite accepts | non-aggregated columns in a GROUP BY query. You'll get a result | in the column, but it's not clear what it means. | | More of sqlite's sloppyness is detailed here | https://sqlite.org/src/wiki?name=StrictMode | outworlder wrote: | MySQL is also very "sloppy". People have learned to work around | its quirks. | | Honestly, SQLite is a wonderful tool if properly used. Calling | it "sloppy" is a disservice. | ransom1538 wrote: | SQLite powers iMessage. Legit. You can use sqlite commands to | look through your messages on your macbook. | myth2018 wrote: | I'd also like to add the possibility of using SQLite databases as | an application file format: | https://news.ycombinator.com/item?id=23508923 | | I had to work on a data import/export tool some time ago and | SQLite has simplified the design a lot. | indigo945 wrote: | While this makes things easy, you should not do this for any | application file format where you except your users to share | files. This is because opening a SQLite database file makes the | SQLite library execute any arbitrary code that may be stored in | that file. [0] Therefore, SQLite is really only suitable for | local-only file formats, such as configuration files, and not | for files that users will e-mail to each other. | | [0] | https://media.ccc.de/v/36c3-10701-select_code_execution_from... | myth2018 wrote: | Thanks for the heads-up! | | In my use case, files are shared between different instances | of the application, usually without user intervention, but | there's an attack vector to be addressed here. | adamrezich wrote: | Weren't all of these issues quickly fixed by the developers | after being reported? | | E: oh whoops missed the dev post above mine | SQLite wrote: | This is misinformation. SQLite does _not_ execute arbitrary | code found in the data file. There was a bug, long since | fixed, that could be used by an attacker to cause arbitrary | code execution upon opening the database file. The referenced | video talks about it. It was a very clever attack. But the | bug that enabled the attack was fixed even before the talk | shown in the video was given. | | Let me say that again: SQLite does _NOT_ execute arbitrary | code that it finds in the database file. To suggestion that | it does is nonsense. | | See https://www.sqlite.org/security.html for additional | discussion of security precautions you can take when using | SQLite with potentially hostile files. The latest SQLite's | should be safe right out of the box, without having to do | anything mentioned on that page. But defense in depth never | hurts. | tompazourek wrote: | Anyone knows what is the current situation regarding parallelism | in SQLite (utilizing multiple cores when executing a complex | query)? | | Last time I heard about this, it wasn't looking very good. | people_not_bots wrote: | If coding can't be a toy then I dont know what is | francoisp wrote: | SQLite is a really neat thing. I was looking at extensions and | how to augment it; you could even add a pg_notify -like feature: | https://sqlite.org/c3ref/update_hook.html and have worker | processes doing what would amount to out of process stored | procedures in postgres (or UDF in SQLite) -- in any language | you'd like. | | You can only register one callback per table tho, although you | could from this callback fire other functions... All in all it's | an awesome tool for a project like tailscape, but I think the | hackers there went for a flat file. | | Personally I'd would love to see in process postgres; a build of | postgres that is geared for integrating a set of your threads, | and builds the whole of postgres with your app on all major OSs, | only listening to the inside by default. For the same reason I'm | using nodejs; to be able to run the same code anywhere. I think | bundle size would be a minor issue, really, I downloaded Sage9.2 | yesterday, it's 2GB! VSCode is 100MB download, and they refer to | it as a small download... | | cheers! happy coding, | fiddlerwoaroof wrote: | The Firebird database can run both as a separate server and in | embedded mode. I wish more databases works design for this sort | of flexibility. | tgb wrote: | Does anyone use SQLite as their daily-driver in lieu of R or | pandas for data analysis? I don't think I can use the sqlite | command-line since I'd want a fully-developed plotting utility, | and it seems less convenient to do the actual analysis part | through a sqlite connection in python, say. | abhgh wrote: | Not "in lieu", but I have found SQLite to be a good _companion_ | to pandas. pandas is good when you want to lend some easily | navigable structure to your data without leaving python, but | becomes cumbersome (it _feels_ cumbersome esp if you know SQL) | when you want to do some serious processing. At such times, I | just dump my data into a SQLite DB, and from then on my code | has SQL queries to it, which I find are far more readable than | corresponding pandas statements (also easier for the reviewer). | | Another advantage you get is if you wanted to look at these | intermediate data, you don't need to run your code in debug | mode and view the dataframe at a breakpoint - you can use | something like datasette[1] or a standard SQLite DB viewer. | | So a function that does complex data processing has | approximately this structure my code: (1) <non- | pandas python code to fetch data, other "normal" stuff> | (2) <pandas code to do basic navigation, filtering, grouping | etc> (3) <dump to SQLite, followed by SQL queries for the | really heavy stuff> (4) <back to python, possibly pandas, | to return results, write to a file, plot etc> | | Step (3) used to be pandas for me before, but depending on how | complex your operations are this can become hard to read and/or | review. | | [1] https://github.com/simonw/datasette - datasette doesnt | replace a standard DB IDE, but is a very good lightweight | alternative to one if don't intend to perform updates/inserts | directly on a table. | steve_gh wrote: | I have exactly the opposite workflow. Import data into | salute, perform as much heavy lifting as possible in salute, | and export into a python data science workflow via led to if | necessary. | hokkos wrote: | I use extensively DbBrowser for sqlite to plot simple graph | from sql queries of imported csv files, it is easier than to | start a jupyter environment with python and pandas or r studio. | nalgeon wrote: | I do. Apache Superset, Metabase, Redash - all support SQLite. | Other BI tools also do. | simonw wrote: | I do. I load data into SQLite using various tools I've written | (geojson-to-sqlite, db-to-sqlite etc) and run the analysis in | https://datasette.io | iagovar wrote: | You can perform sql queries with many of the DB clients | available, such as Heidi, or Deaver, and also through python. | | Sqlite is OLTP though, for analytical purposes I'd use an OLAP | DB like DuckDB. | | IMO dumping workload in DB is nice, specially when your dataset | doesn't fit in RAM. | 6gvONxR4sf7o wrote: | You can always execute SQL in pandas by passing it a | connection, getting the best of both worlds. | qbasic_forever wrote: | Yes, setup jupyter magics to execute SQL and it becomes MUCH | more useful for data analysis in a notebook. There's a great | extension that lets you just write and run SQL queries right in | cells: https://blog.jupyter.org/a-jupyter-kernel-for- | sqlite-9549c5d... Hook it up to an in memory or on disk SQLite | instance and you can kick pandas to the curb. | kwhitefoot wrote: | Take a look at DB Browser for SQLite: | https://sqlitebrowser.org/ | | It has some plotting built in. Not sure if it is sophisticated | enough for what you need but it might be worth a try. | tfehring wrote: | It's complementary, since unlike R/Pandas it can fit too-big- | for-memory data - I've used it ad-hoc to store big-ish datasets | to pull into R. Though if you're using Python I think that use | case is mostly superseded by Dask. | jarym wrote: | Oh it's totally a toy. A great toy though! Love it to bits. | SQLite + Postgres here and maybe will add DuckDB to the mix soon | [deleted] | polyrand wrote: | You can also have stored procedures in SQLite | | https://cgsql.dev/ | fmajid wrote: | Surprised no one has mentioned how Expensify use SQLite as their | main data store: | | https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q... | ajsnigrutin wrote: | For me, the main killer feature of sqlite is portability. (Like | moving it around, not porting it to new systems). | | Just write the code to create an empty database if the db file | does not exist, move your code elsewhere, and the database will | be created on the first run. No usernames, passwords, firewall | rules, IP addresses, no nothing... just a single file, with all | the data inside. | | Miration? Copy the whole folder, code and the database. Clean | install.. copy the folder, delete the database. Testing in | production? Just backup the file, do whatever, then overwrite the | file. | beached_whale wrote: | If I have a choice of making a file format and doing the I/O or | using SQLite, I'll choose SQLite. It's hardened to a degree that | I won't be for filesystem shinanigans. If I need to export a | file, that's still easy enough. | diehunde wrote: | From the SQLite website: | | Situations Where A Client/Server RDBMS May Work Better | Client/Server Applications | | If there are many client programs sending SQL to the same | database over a network, then use a client/server database engine | instead of SQLite. SQLite will work over a network filesystem, | but because of the latency associated with most network | filesystems, performance will not be great. Also, file locking | logic is buggy in many network filesystem implementations (on | both Unix and Windows). If file locking does not work correctly, | two or more clients might try to modify the same part of the same | database at the same time, resulting in corruption. Because this | problem results from bugs in the underlying filesystem | implementation, there is nothing SQLite can do to prevent it. | | A good rule of thumb is to avoid using SQLite in situations where | the same database will be accessed directly (without an | intervening application server) and simultaneously from many | computers over a network. | | High-volume Websites | | SQLite will normally work fine as the database backend to a | website. But if the website is write-intensive or is so busy that | it requires multiple servers, then consider using an enterprise- | class client/server database engine instead of SQLite. | | Very large datasets | | An SQLite database is limited in size to 281 terabytes (248 | bytes, 256 tibibytes). And even if it could handle larger | databases, SQLite stores the entire database in a single disk | file and many filesystems limit the maximum size of files to | something less than this. So if you are contemplating databases | of this magnitude, you would do well to consider using a | client/server database engine that spreads its content across | multiple disk files, and perhaps across multiple volumes. | | High Concurrency | | SQLite supports an unlimited number of simultaneous readers, but | it will only allow one writer at any instant in time. For many | situations, this is not a problem. Writers queue up. Each | application does its database work quickly and moves on, and no | lock lasts for more than a few dozen milliseconds. But there are | some applications that require more concurrency, and those | applications may need to seek a different solution. | candiddevmike wrote: | I wish SQLite had a PostgreSQL compatibility layer. Right now, to | add SQLite support to a Go project (or anything without an ORM), | you have to rework all your queries and migrations. It's probably | an impossible ask, but having a compatibility flag within SQLite | so it would accept PostgreSQL formatted queries would be | extremely helpful. | smilliken wrote: | Abstracting over two complex and evolving systems is a | bottomless well of bugs. Contributers would shy away from this | sort of compatibility because it would encourage users to | ignore the differences at first, and then complain loudly when | they are discovered. | michaelmior wrote: | > There is nothing more convenient than SQLite for analyzing and | transforming JSON. You can select data directly from a file as if | it were a regular table. | | Personally I love jq[0] for this purpose. I haven't really used | SQLite for working with JSON, but the examples given are very | verbose. | | [0] https://stedolan.github.io/jq/ | simonw wrote: | I wrote a plugin for Datasette that adds a jq() custom SQLite | function, it's pretty fun: | https://datasette.io/plugins/datasette-jq | hobo_mark wrote: | I never managed to wrap my head around jq syntax however | busterarm wrote: | It takes a while. jq really needs a repl. | bionhoward wrote: | Check out ijq...used it this past week to deal with some | ugly nested FHIR data. Not perfect, a bit laggy, sometimes | crashes your terminal, but really cool in a lot of cases | https://sr.ht/~gpanders/ijq/ | qbasic_forever wrote: | I'm with you too. I've never found a query language except | SQL that has really stuck in my head. Pandas, jq, XSL, etc... | it all has me running screaming back to SQL. | michaelmior wrote: | I agree it can be a little obtuse at times and I still do | occasionally find myself having to consult the documentation | despite using it pretty regularly for some time. I do also | sometimes make use of gron[0] since it integrates well with | standard Unix tools. The pipelines I write with gron are | probably pretty brittle in the sense that I expect they might | be incorrect for some edge cases. But for some quick | exploratory analysis, it's a tool I like to have handy. | | [0] https://github.com/tomnomnom/gron/ | andrewmcwatters wrote: | SQLite is so robust, that I bet most websites could use it | without really needing to move onto a client/server RDBMS.[1] I | use MySQL, and I know PostgreSQL has a large marketshare now, but | I wonder how much of either is really necessary when you think | about traffic usage alone. I know at least in my use cases, | neither seem necessary. | | [1]: https://sqlite.org/whentouse.html | smoe wrote: | I'm sure most website could traffic wise. But for me it is not | a question whether I could, but if I should. | | If my goal is building a website, I don't necessarily want to | experiment with different technologies if I already know | Postgres will work perfectly fine without adding much | operational overhead and covering use cases I don't have yet, | vs the unknown unknowns of using SQLite and maintaining it over | time. Again, this is not about some problem with SQLite, but | just me not having experience using it this way. Same reason | why I wouldn't just add any database system I haven't used | before, even if on paper it would be the "better tool for the | job" for a particular use case, and sure would be an | interesting learning experience. | | In my opinion practicality and prior experience often beats | what is strictly necessary or "best". | Thaxll wrote: | SQLite is very limited because of its threading model, imo it's | not usable outside of the single app model where you have a | single user. | | https://sqlite.org/threadsafe.html | | https://sqlite.org/lockingv3.html | andrewmcwatters wrote: | I've used SQLite a bit, but not enough to say I know where | you run into performance issues. I would anticipate though, | if it's similar to fread/fwrite with some marginal overhead | for dealing with SQL, after considering most queries aren't | very complex, I think most people are going to have a hard | time hitting those limits. | | And that's assuming you're making queries every time an event | occurs versus persisting data at particular points in time. | hsnewman wrote: | I disagree. I wrote a multiplayer web based game using sqlite | in go: http://decwars.com/ | gwd wrote: | The article addresses this. Basically, you can have any | number of concurrent _readers_ , but only a single _writer_. | Writing and reading can happen concurrently just fine. So the | question is -- how many users does a website need before | having only a single concurrent writer becomes a bottleneck? | | That number will obviously depend on the read/write ratio of | any given website; but it's hard to imagine _any_ website | where [EDIT the number maximum number of concurrent users] is | actually "1". And for many, that will be in the thousands or | hundreds of thousands. | | FWIW the webapp I use to help organize my community's | conference has almost 0 cpu utilization with 50 users. Using | sqlite rather than a separate database _greatly_ simplifies | administration and deployment. | lacker wrote: | When I was working on database services there were actually | many applications with more writes than reads. A common | example is applications where you want to keep the user's | state saved frequently, but you only need to reload it when | the application is restarted. A lot of games work this way. | So there can be a write every few seconds or every minute | for every active user, but you only need to do a read a few | times at the beginning of a new session. | | Of course applications like a blog will have far more reads | than writes. It just really varies depending on the type of | application. | eingaeKaiy8ujie wrote: | >That number will obviously depend on the read/write ratio | of any given website; but it's hard to imagine any website | where that number is actually "1". | | I can imagine a static website where the content is read- | only for users and is only editable by | admins/developers/content managers through some CMS. | duskwuff wrote: | That'd be a near-infinite ratio. The parent is discussing | a site where the ratio is near 1 (i.e. roughly as many | reads as writes). | gwd wrote: | It's not even the _percentage_ of reads vs writes; it 's | about the _total time spent writing_. Suppose that while | a single user was actively using the website, the time | spent writing was 70%. That 's a totally mad write load; | but even then, if you had two concurrent users, things | would probably still be quite useable -- you'd have to go | up to 4 or 5 users before things really started to slow | down noticeably. | | Suppose, on the other hand, that a single user generated | around a 1% write utilization when they were actively | using the website (which still seems pretty high to me). | You could probably go up to 120 concurrent users quite | easily. And given that not all of your users are going to | be online at exactly the same time, you could probably | handle 500 or 1000 _total_ users. | tyingq wrote: | A nice fix for this might be to extract just the wire | protocol from dqlite to make a network connected, but not | Raft replicated, sqlite. | | https://github.com/canonical/dqlite | bob1029 wrote: | With a single thread & SQLite connection instance, I have | been able to insert hundreds of thousands of rows per second | when using NVMe drives. | | Note that WAL and synchronous flags must be set | appropriately. Out of the box and using the standard "one | connection per query" meme will handicap you to <10k inserts | per second even on the fastest hardware. | | The trick for extracting performance from SQLite is to use a | single connection object for all operations, and to serialize | transactions using your application's logic rather than | depending on the database to do this for you. | | The whole point of an embedded database is that the | application should have exclusive control over it, so you | don't have to worry about the kinds of things that SQL Server | needs to worry about. | | SQLite is not a _direct_ replacement for SQL Server, but with | enough effort it can theoretically handle even more traffic | in your traditional one-database-per-app setup, because it 's | not worrying about multiple users, replication, et. al. | Thaxll wrote: | "The trick for extracting performance from SQLite is to use | a single connection object for all operations, and to | serialize transactions using your application's logic | rather than depending on the database to do this for you." | | It's not the right approach because it's hard to get right, | you want to offload that to the DB. | bob1029 wrote: | It's actually not that hard to get right. You can put a | simple lock statement around a SQLiteConnection instance | to achieve the same effect with 100% reliability, but | with dramatically lower latency than what hosted SQL | offerings can provide. | | Also, the only reason we ever want to lock a | SQLiteConnection is to obtain a consistent | LastInsertRowId. With the latest changes to SQLite, we | don't even have to do this anymore as we can return the | value as part of a single invocation. | busterarm wrote: | Most websites/frameworks access their database through a | singleton pattern/single-connection anyway. | | Edit: Sometimes you have to lie and lead people down the | wrong path to enlightenment... ;) | Thaxll wrote: | That's not the case no, you usually access your database | with thread pool. Otherwise everyone would wait until the | single connection is free. Once you have a bit more users | that tries to write everything will fall appart. | OldHand2018 wrote: | Well, let me throw out a crazy idea: | | Nobody said you have to use a single database/file. | Obviously, you are going to want to spend a couple | minutes thinking about referential integrity. But how | often do you _delete_ records in your web app? | HelloNurse wrote: | A big database id not an issue. Contention requires a | large number of writers, not a large amount of cold data. | kuschku wrote: | As an IRC client with an SQLite backend: several hundred | times per second. | | Per user. | | While reads are more like once every day per user. | quesera wrote: | UPDATE is also a write, and references crossing files | sounds like a nightmare. | | If your deployment environment has serious constraints, | I'm sure you could make it work. The product you deliver | would be SQLite + custom DBI layer to hide SQLite's | limitations. | | It would be a lot more work, and not be as robust or | scalable, compared to a more traditional selection. But I | can imagine cases where it would be appropriate. | busterarm wrote: | Yes. | | ;) | mywittyname wrote: | One could implement a write queue and caching for the | front end. That might sound like a lot of work to go | through to avoid moving to a proper multi-user database. | But given that writes are probably done through a handful | of API endpoints, it's probably not really that much work | to implement. | quesera wrote: | Any site with real volume will have a connection pool. | | All of those connections might need to write, and this is | where SQLite gets tricky to implement at scale. | | I love SQLite! It's perfect for many use cases, but not | all. Fortunately, Postgres is also excellent. | trisiak wrote: | Can you provide some examples of frameworks with such | pattern? I actually have never seen it in any of the Python | or Go web applications that I had a chance to work with. | dwohnitmok wrote: | As far as I'm aware that threading information is on a per- | connection basis. | | SQLite is perfectly capable of supporting multiple, parallel | reads. | | SQLite must serialize writes, which makes a highly parallel | write-heavy workload not good for it. However, with WAL | enabled writes do not block reads. | | Basically highly-parallel read loads with low write counts | (low-enough that serializing them doesn't lead to | unacceptable slow down of writes) or with loads where latency | is acceptable in writes (but not reads) is a perfect use case | for SQLite. And it turns out that a lot of web services are | heavily asymmetrically biased towards reads. | petercooper wrote: | I ran a niche community social bookmarking site (around | 100-200k pageviews per month) on SQLite for several years and | it was no problem at all. If a write was occurring, having a | simultaneous request wait 100 milliseconds was no big deal. It | only became a problem when I got tired of ops and wanted to put | it on Heroku at which time I had to migrate to Postgres. | | I've always been surprised WordPress didn't go with SQLite | though - it'd have made deployment so much easier for 99% of | users running a small, simple blog. | unnouinceput wrote: | At the time when WordPress rolled out SQLite was still in its | infancy while MySQL was already a mature DB, hence the | choice. These days however they could, at least, do a nice | wizard to ask "do you want to run a blog only?" and deploy | SQLite instead. | fpoling wrote: | I looked at using SQLite with Wordpress for sites I | maintain. In theory Wordpress core is quite DB-agnostic and | according to random blogs MySQL could work. But plugins use | so much MySQL-specific features that it was hopeless. | jacurtis wrote: | I don't have any data on this, but I do know that a | significant (if not majority) of Wordpress sites now aren't | even blogs. They are small websites for things like | restaurants, hair salons, construction companies, | photographers, and personal portfolios. Especially since | most of these sites are also running on $3/mo shared hosts, | switching to SQLite would be a substantial improvement. | | Most these sites are running a homepage, an about page, a | contact page, and maybe one or two misc pages. They might | have a blog that has two blog posts on it from nine years | ago. But that is really it. MySQL is really overkill | considering the scenario. SQLite's big "limitation" is non- | concurrent writes. But this is rarely a problem with most | Wordpress sites because they are single-author and they | aren't updated very often. SQLite can handle plenty of | reads to support even heavily trafficked websites. | | Not to mention, SQLite's greatest advantage is portability. | A single file contains your entire database. A non- | technical user could transfer hosts or backup their data by | copying their database file like it was a photo or an excel | document. That's pretty incredible when you think about it. | iagovar wrote: | You don't even need SQLite for that. Something like | PicoCMS with markdown files is more than enough. If you | want a fronted for the owner, that's another story, but | if you're the one mantaining the sites, WordPress is an | overkill 99% of the time. | godot wrote: | Pleasantly surprised to hear PicoCMS mentioned here, | since it's not typically the most common PHP markdown CMS | of choice folks like to mention. | | I was a casual developer (i.e. not for work, just for | personal) deep into the PicoCMS ecosystem for a couple | years, a few years ago. I both started a site and helped | a family convert an old static site to PicoCMS and really | had no complaints. Re: frontend for the owner, I started | with Pico Admin and made a bunch of modifications to it | (including an image uploader) and the non-tech owner has | no complaints and it's been working well since. | | Nowadays for my own blog I'm into the whole SSG/JAM | trend, but I'd still run a PicoCMS site any time, if the | use case is right. | tyingq wrote: | Someone did write a plugin to have wordpress use SQLite as | the backend: https://wordpress.org/plugins/sqlite- | integration/ | | Perhaps not great for production since Wordpress | automatically updates itself, and you would have to keep up | with any changes. And not just for wordpress, but for any | other plugins that use the database. | | Edit: A single file fork (albeit 5k lines of PHP) of the | plugin that looks interesting: | https://github.com/aaemnnosttv/wp-sqlite-db | blacksmith_tb wrote: | I maintain a corp Ghost blog that's backed by SQLite, it's | been solid for years. | nalgeon wrote: | Same thing here. | [deleted] | jstrong wrote: | > There is nothing more convenient than SQLite for analyzing and | transforming JSON. | | the example query given: select | json_extract(value, '$.iso.code') as code, | json_extract(value, '$.iso.number') as num, | json_extract(value, '$.name') as name, | json_extract(value, '$.units.major.name') as unit from | json_each(readfile('currency.sample.json')) ; | | that sure looks fun to type into a repl! | | nothing against sqlite, which I like and use, just found the idea | of that query being convenient for one-off analysis to be off. | lamelydiminish wrote: | I love SQLite. Am building a webapp + mobile app that needs to | sync user created content. Is there an easy way to sync SQLite | DBs? | simonw wrote: | SQLite has a "session" extension which is designed to help | handle the sync problem: https://sqlite.org/sessionintro.html | lamelydiminish wrote: | TIL. Can't find anything on sessions for sql.js though... | Interested in using sql.js since I'm building a web PWA | first, and eventually native desktop&mobile app. Perhaps this | is overkill for SQLite at this point? | Toutouxc wrote: | > not bothering with optimization ([?]200 requests per page) | | What amount of SQL queries per page render is considered | sensible? | | When I run more than 20 queries per request in my Rails apps | (smallish internal tools for different companies) I get uneasy. I | usually deploy the app on the same machine where the DB (not | SQLite) runs, but I imagine if that weren't the case the app-DB | roundtrips could soon dominate the whole thing. | renewiltord wrote: | The sqlite position on this is articulated here | https://sqlite.org/np1queryprob.html | | > _N+1 Queries Are Not A Problem With SQLite | | > | | > The SQLite database runs in the same process address space as | the application. Queries do not involve message round-trips, | only a function call. The latency of a single SQL query is far | less in SQLite. Hence, using a large number of queries with | SQLite is not the problem._ | deckard1 wrote: | > app-DB roundtrips could soon dominate the whole thing | | and they do. We have a GraphQL API backed by multiple | microservices at a place I work. You become increasingly | paranoid at every call you need to make. | | It's also where ORMs completely fall over. ORMs are designed | for, generally, one record to one query mapping. Which is why | they are _always_ leaky abstractions which need heroic acts of | clever hacking to overcome. You know, rather than doing the | sensible thing and just writing a single query that fetches | data from multiple tables. | | SQLite sort of encourages many queries, though, due to their | design. I recall their docs even mention this point. But I've | seen a few people in various threads that get bitten by this | when they switch to a client/server DB. | | I think it's perfectly fine to design your app around the | characteristics of your database. Otherwise, you're missing out | on optimizations and features you could be using and what would | even be the point in favoring one DB over another if it's all | the same to your app? You should pick the DB that matches the | characteristics of your app anyway. | vbsteven wrote: | When using an external db every query needs to cross the | connection boundary and thus perform network IO and deal with | the threading implications for that IO. With SQLite running in- | process a round trip just means a function call, and when data | is available in the cache it won't even hit the disk. | megous wrote: | I'd still rather use PostgreSQL for my data needs. psql is quite | nice too as an interface, and "setup and forget" replication for | everything in the cluster and remote access makes up for slightly | harder deployment and upgrades. | | Knowing that I can do anything and the change hits 3 hard drives | in a span of few seconds on various machnies is nice. Tolerance | for arbitrary down time of master/slave clusters is also nice. | | But yeah, for quick one-off data munging, sqlite looks quite | nice. | benbjohnson wrote: | I agree that PostgreSQL has some great tooling around it. For | SQLite replication, you might want to checkout Litestream: | https://litestream.io/ | megous wrote: | I know about litestream, it's not the kind of "setup and | forget" I have in mind. But it's nice that it exists. | benbjohnson wrote: | That's good to know. I'm always trying to improve the | developer ergonomics. What would you want to see added to | make it "setup and forget"? | megous wrote: | I don't think there's much to be done. I just like the | fact that I don't have to think about replicating | individual databases, regardless if I rename, delete, re- | create them. | | It's being done at cluster level, it's one time setup and | I can forget about it. | kragen wrote: | This is a great introductory guide! | | I like SQLite a lot, but I found SQLite's recursive CTE | implementation to be somewhat limited: | https://dercuano.github.io/notes/why-html-is-not-a-programmi... | | Has that improved? | | (sorry about the embarrassing arrogant pedant attitude in that | note, but it's too late to fix it now) | ben509 wrote: | > There is a popular opinion among developers that SQLite is not | suitable for the web, because it doesn't support concurrent | access. | | No, the issue is it doesn't have high availability features: | failover, snapshots, concurrent backups, etc. (Edit: oops, | comment pointed out it does have concurrent backups.) | | SQLite isn't a toy DBMS, it's an extremely capable embedded DBMS. | An embedded DBMS is geared towards serving a single purpose-built | client, which is great for a desktop application that wants a | reliable way to store user data. | | Once you have multiple clients being developed and running | concurrently, and you have production data (customer accounts | that are effectively legal documents that _must_ be preserved at | all times) you want that DBMS to be an independent component. It | 's not principally about the concurrent performance, rather it's | the administrative tasks. | | That requires a level of configuration and control that is | contrary to the mission of SQLite to be embedded. They don't, and | shouldn't, add that kind of functionality. | benbjohnson wrote: | I agree that high availability features are outside of the | goals of an embedded database. There's an ecosystem of tools | SQLite to provide these benefits though. There's dqlite & | rqlite for providing HA over SQLite. I'm the author of | Litestream[1] which provides streaming replication to S3 for | SQLite databases so folks can safely run a single node | instance. | | [1]: https://litestream.io/ | hinkley wrote: | One could, for instance, run "static" sites with the data | stored in a SQLite database. Update on write apps often can | do okay with 90% uptime. It's the "generate everything in | every request" crowd that needs an HA solution and _why do we | keep doing this to ourselves_? | benbjohnson wrote: | Yes, static generation is a great way to go if that works | for your use case. Throw it up on a CDN and you'll likely | have five 9s of uptime. I don't think that's a viable | solution for a lot of applications though. | hinkley wrote: | Half of arguments on the Internet are someone trying to | win an argument they lost elsewhere. If I could get some | people I work with to actually run the numbers on how | much it costs us to make everything dynamic, versus how | much we earn versus static, I think we'd quickly arrive | at an agreement that most of it should be static with a | little AJAX thrown in for the 10% that needs to be | special. | kroggen wrote: | For replication of SQLite there are some options: | | http://litereplica.io - single-master replication | | http://litesync.io - multi-master replication | | https://aergolite.aergo.io - highest security replication, | using a small footprint blockchain | pbowyer wrote: | Have you used litereplica or litesync? It doesn't inspire | confidence that their websites haven't been updated since | 2016 and 2017 respectively. | kroggen wrote: | Most things happen in the forums: | | http://litesync.io/forum/ | | http://litereplica.io/forum/ | jes5199 wrote: | no, those things have never stopped me. What has stopped me is | that the default configuration for a Rails app throws | intermittent exceptions when I'm trying to save, because | concurrent writes fail and Rails (at least as of a couple years | ago) does the wrong thing by default | jyrkesh wrote: | I learned this the hard way when I stuck some app containers | down on a few RPis and mapped the folder of stateful stuff | (including a SQLite database) to an NFS share on my NAS. | | It's....not great. | pdimitar wrote: | Can't remember where did I read this now but about a year ago | I have found material saying that sqlite3 over NFS is a very | bad idea and even though it's supported it's also strongly | advised against as a use-case. | wenc wrote: | It also doesn't have strong/static typing (it's dynamically | typed) so you have to typecheck your inputs or do type coercion | upon read. | | And it doesn't have a native date type. Date handling has to be | handled at the application layer. It can be tricky to do | massive time-series calculations or date-based aggregations. | | You can use integers or text types to represent dates, but this | open-endedness means you can't share your db because everyone | implements their own datetime representation. | samatman wrote: | Datetimes really are the achilles heel of SQLite. | | The JSON extension library is amazing and works well. If | SQLite were to grow a first-rate RFC 3339 library, one which | could read from tz when available and do the things which | strftime cant, acting on your choice of Unix timestamp and | valid RFC 3339 date string, this would be a real boon to the | ecosystem. | | I haven't found typechecking inputs to be a real barrier. | Sure, `val INTEGER CHECK (val = 0 or val = 1)` is a long way | to spell `val BOOLEAN` but `CHECK json(metadata)` is a | reasonable way to spell `metadata JSON`, and a similar | function would surely exist for a SQLite datetime extension. | You can do it now by coercing the string through an expected | format, but that doesn't generalize well. | names_are_hard wrote: | This. I'm working on a hobby project that is essentially a | web app for personal use. I started with sqlite because it | was the simplest to start with, but I'm about to reluctantly | migrate to postgres. Why? Constraints, data types, and | programmability. | | In terms of scale, sqlite is just fine. But I am tired of | fiddling with the dates, it's too easy for bugs to sneak into | my code, and I want to use table valued functions to | essentially parameterize views instead of having to build | complex queries in the app layer. | | If your web app is mostly reading and writing single rows, | yeah, sqlite is just fine. But if there's substantial and | complex logic involved, it has its limits. | smulc wrote: | Check out rqlite (https://github.com/rqlite/rqlite) for that | functionality - "rqlite is a lightweight, distributed | relational database, which uses SQLite as its storage engine. | Forming a cluster is very straightforward, it gracefully | handles leader elections, and tolerates failures of machines, | including the leader. rqlite is available for Linux, macOS, and | Microsoft Windows." | emodendroket wrote: | To me this is obviating a lot of the advantage of SQLite over | any other RDBMS. | dmurray wrote: | I thought the same, but maybe it's useful if you start with | SQLite and decide to scale up to a distributed RDBMS | without having to rewrite too much? | msla wrote: | Two thoughts: | | SQLAlchemy, at least, lets you switch from SQLite to | Postgres in a configuration file. | | Isn't there a useful SQL subset which allows you to | switch from one database to another without rewriting? | There seems to be such a subset for C, for example, which | multiple compilers all interpret the same way, and SQL is | a standardized language, too. | the_duke wrote: | rqlite is a separate daemon written in Go, negating most of | the reasons to choose Sqlite in the first place. | | It absolutely has good use cases, but those are rather niche. | You'll mostly be better off with the traditional postgres | etc. | hexmiles wrote: | > "No, the issue is it doesn't have high availability features: | failover, snapshots, concurrent backups, etc." | | I think it has concurrent backups via the backup api: | https://www.sqlite.org/backup.html | formerly_proven wrote: | I'd expect this to be unnecessary in WAL mode (which you | should use, when possible), since WAL allows concurrent | readers while permitting up to one writer. In the old undo- | log mode (which remains the default for compatibility) | writing excluded readers. | nayuki wrote: | I don't think you can do naive file copy backups even in | WAL mode. WAL will checkpoint the log into the main | database file from time to time. You need to use the SQLite | backup API to let you make a backup while blocking any | potential checkpointing. | formerly_proven wrote: | Sorry, I was unclear. I meant the "copy a few pages then | sleep for some time to release the database lock" part. | With WAL mode you should just be able to copy as fast as | you can without disturbing other readers or the writer. | SQLite also has a similar SQL built in, VACUUM INTO. | That's nice because the backup API just copies all pages | (iirc). | jksmith wrote: | From the dbadmin POV and understandable. From the dev side, if | you want to take a stab at your own failover, shards, | concurrents and snaps, there may be no better platform to learn | on. Writing a custom backend to do tricks with sqlite dlls is | extremely satisfying. | nijave wrote: | I think the biggest issue can be distilled to "lack of | concurrent network access". You start getting into a lot hairy | management problems with distributed file systems/copying files | around. | | Imo part of that is just web applications tend to be not very | efficient when compared to something like unix cli utilities so | to get performance you end up with potentially massive amounts | of horizontal scaling | | On the other hand, the lack of performance /usually/ buys you | higher productivity so you can make product changes faster (you | let a GC manage the memory to save time coding but introduce GC | overhead) | therealdrag0 wrote: | Right, and Expensify plausibly solved this by wrapping SQLite | to produce a different DB, BedrockDb. But that shouldn't | imply at all that SQLite alone could satisfy main stream web | applications. | habibur wrote: | Not to misunderstand. It fully supports concurrent reads. And | also concurrent reading while some other process is writing the | same database. | | What it doesn't support is concurrent writes only. | bravura wrote: | A kludge---I forget from where---for low traffic sites is one | SQLite DB per user. | nalgeon wrote: | I think that the absence of 'high availability' is not an issue | for small websites or web apps. Transactions are ACID, | concurrent readers are fully supported. Backups and | administrative tasks are super-easy. | zokier wrote: | I see availability as orthogonal issue to scale/size. You | might be processing one transaction per day but it can still | be super important that the service is available for that one | transaction. | mikesabbagh wrote: | I ran it once on an NFS disk. There must be some file locking | happening I think. But I could read from multiple nodes, but | for some reason the delete was failing (could not delete a | database). Did not test the write as my app at that time was | writing from one location only. Anyone tested such HA | scenario? | gwd wrote: | I just looked, and the entire database for my webapp (which | typically serves around ~50 people) is 139k. As you say, you | can do a data dump quickly and safely; at that size, you | could afford to just dump the entire DB every hour and keep | the last year's worth of snapshots if you wanted. | jakoblorz wrote: | https://litestream.io/ That's exactly what litestream | accomplishes. | benbjohnson wrote: | That's close. Litestream takes a snapshot of the database | and then continuously replicates out WAL frames. On | restore, it grabs the snapshot and replays all those WAL | frames (or all the frames up to a certain time if you | want point-in-time recovery). | tmountain wrote: | How do backups work? Is there a locking mechanism to prevent | file corruption if the file is copied during writing? | nijave wrote: | You can also combine file system snapshots to get | versioning which would allow you to pull the db file out of | the latest snapshot (right after taking it) and send it | somewhere | sangnoir wrote: | It supports multiple readers, and writes are atomic - no | reader should ever get corrupt data just because there is a | write happening in parallel. You probably should not use | the OS to copy the live db file, rather, have an in-process | thread that reads the db and writes to a backup location | periodically. | mrweasel wrote: | Exactly, many forget the little detail that they should | use the backup feature, rather than just copy the | database file. | crazygringo wrote: | So if you're saying everything _does_ support a web | database... then what 's the reason people aren't using it | for websites? | | Why do you say it works for "small" websites but presumably | not large ones? If it's not transactions, concurrent reading, | backups, or administrative tasks... then what's the issue you | run into? | | Genuinely curious... I'm wondering if everything I've heard | about "don't use SQLite for websites" is wrong, or when it's | right? | dbattaglia wrote: | I think one valid fear is that a web application might, for | multiple reasons, scale beyond a single process on a single | server. This is common enough that using an embedded | database can be problematic, compared to a separate RDBMS | process that can be shared between processes. Some web | applications are just never going to scale out for any | reason, and for those SQLite might be appropriate. | danenania wrote: | Agreed, but it's not only a question of scaling. Any app | with high uptime requirements will need more than a | single process from day 1. | benbjohnson wrote: | It depends on what you consider high uptime. You can | achieve 99.95% uptime with 4h of downtime a year. A lot | of downtime occurs because of overly complicated systems | so running a single process on a single server can give | you relatively high uptime. | 0xCMP wrote: | The primary reason for me is that Postgres has stronger | default constraints. If you care about keeping your data | logically consistent then Postgres has more of that out of | the box. | | SQLite just makes the tradeoff to be simpler since often it | doesn't matter. But don't make the mistake that it doesn't | matter. Since PG helps avoid data problems and you might | need to scale out web servers that is why Django for | instance recommends switching to Postgres (or whatever | you're actually going to use) ASAP cause there are | differences. You may end up relying on PG to reject things | SQLite doesn't care about by default. SQLite might let you | get away with inserting data which PG refuses to handle. | | Not to mention the DB specific features can differ. Like | PG's JSON field types or etc. | pdimitar wrote: | Not sure why you were downvoted because that is a | legitimate concern and it's my only beef with sqlite3. I | will kill for an embedded PostgreSQL. If sqlite3 becomes | that I'll absolutely pay for a license if they require | it. | yellowapple wrote: | > then what's the reason people aren't using it for | websites? | | I'd guess the reason to be that people keep hearing things | like "don't use SQLite for websites" and thus don't even | try. | | > Why do you say it works for "small" websites but | presumably not large ones? | | Not the GP, but the main reason I _wouldn 't_ use SQLite | for a large website is that SQLite itself doesn't offer | much re: failover/replication (i.e. multiple servers, one | database), and I haven't used RQLite enough (or at all; I | should fix that) to be comfortable with it in production. | Because of that, I'm more likely to reach for / recommend | PostgreSQL instead. | | That being said, if your website has crazy "web scale" | FAANGesque needs and you're at the point where you need to | write your own replicated datastore, using SQLite as a base | and building your own replication layer on top of it (or | using RQLite and maybe adjusting it for your needs) seems | like a reasonable way to go. | benbjohnson wrote: | I'm in the process of adding read replication to | Litestream[1] so folks can scale out the read-side of | their SQLite applications to multiple nodes (or replicate | to edge nodes for low-latency). | | [1]: https://litestream.io/ | pdimitar wrote: | Amazing. This was missing for my use-case and I gave up | on LiteStream because of it but now it seems I have to | revisit! | | Wonderful job. | gm3dmo wrote: | Crazy web scale folks flip from Mongo (NoSQL) to | Postgres: | https://www.theguardian.com/info/2018/nov/30/bye-bye- | mongo-h... | pbowyer wrote: | > That being said, if your website has crazy "web scale" | FAANGesque needs and you're at the point where you need | to write your own replicated datastore, using SQLite as a | base and building your own replication layer on top of it | (or using RQLite and maybe adjusting it for your needs) | seems like a reasonable way to go. | | Exactly what Bloomberg did with Comdb2: | https://github.com/bloomberg/comdb2 | danielheath wrote: | I would use SQLite on a single-machine web server. However, | I prefer to build my web stuff as stateless machine images | that connect to a separate database instance, because it | lets me scale them independently. | | I should revisit this policy now that you can run a truly | huge site off a 1U slot (I work for an alexa top10k, and | our compute would fit comfortably in 1U); computers are so | fast that vertical scaling is probably a viable option. | pdimitar wrote: | That's exactly what I am leaning to in my current job. | Looking at the load of our servers I can comfortably put | all our software on my gaming machine (which is mid- | range!) -- with the DB included -- and I bet no request | ever will be above 100ms. | | IMO a lot of organizations should start re-investing in | on-premise as well. Having a mid-range AMD EPYC server | can serve most businesses out there without ever having | more than 40% CPU usage. | | That, plus scaling down Kubernetes clusters. Most | companies absolutely didn't need them in the first place. | emodendroket wrote: | If writes are infrequent I think it might be fine. | Otherwise I'd be wary. | jakoblorz wrote: | Depending on the use case (SaaS offerings with per customer | shards) you can actually scale sqlite quite high. Expensify | scaled to 4 mio qps (queries per second) [1], so if one of | your customers is exceeding that, you better be looking at | another DBMS but below that - well sqlite is an option. | | [1] https://blog.expensify.com/2018/01/08/scaling-sqlite- | to-4m-q... | voxic11 wrote: | Also if you really need HA there are nice projects like | https://dqlite.io/ | SahAssar wrote: | Looking at https://dqlite.io/docs/protocol it seems like it | does not allow for embedded usage though. Doesn't this turn | it into more of a traditional client-server RDBMS? | yawaramin wrote: | From the home page: | | > Dqlite is a fast, embedded, persistent SQL database | with Raft consensus that is perfect for fault-tolerant | IoT and Edge devices. | rstat1 wrote: | Looking at the archtecture page [0] on that same website | the second heading is "Embedding". | | [0] https://dqlite.io/docs/architecture | SahAssar wrote: | Ah, right, I should have looked deeper! | TheRealPomax wrote: | Not sure I understand: sqlite is file based, so snapshots and | concurrent backups are literally just file copies/backups. | | I'd much rather SQLite not waste its time on implementing | features they're not good at, leaving that to the tools we | already have available for rolling file backups, instead | spending their time and effort on offering the best file-based | database system they can. | | (Heck, even failover is just a file copy initiated "when your | health check sees there's a problem") | simonw wrote: | Unfortunately backing up with a simply file copy operation | isn't guaranteed to work if you have write traffic at the | same time. | | Instead you need to use the .backup mechanism or the VACCUM | INTO command, both of which safely create a backup copy of | your database in another file - which you can then move | anywhere you like. | majkinetor wrote: | Depends on the file system. On Windows you can use shadow | copy for example. | mceachen wrote: | Oh man, I didn't know about VACCUM INTO, thanks! | emodendroket wrote: | Every database is ultimately on the file system but there's a | reason that method of backup is rarely used. | dnautics wrote: | Postgres lets you write directly to block. In practice, | nobody does this though. | gm3dmo wrote: | >Once you have multiple clients being developed and running | concurrently, and you have production data (customer accounts | that are effectively legal documents that must be preserved at | all times) you want that DBMS to be an independent component. | It's not principally about the concurrent performance, rather | it's the administrative tasks. | | For my reporting/read-only use case https://datasette.io/ | solves the above beautifully. | D13Fd wrote: | My only "problem" with SQLite was that it was so fast, running | locally, that it hid just how much my app was needlessly hitting | the database. It was a surprise when I transitioned to a | networked Postgres server and performance completely tanked due | to my crappy code... | re wrote: | I know you put "problem" in quotes, but in case you haven't | seen it, this document was posted here a little while ago: | https://sqlite.org/np1queryprob.html | | It describes how access patterns that would be bad practices | with networked databases are actually appropriate for SQLite as | an in-memory DB. | D13Fd wrote: | That's interesting! Now I have to resist the urge to move | back... | [deleted] | agumonkey wrote: | Someone told me they build a world class DNS mostly around | sqlite. I guess you left the toy area by that point. | michaelmcmillan wrote: | I've built a complex CRM that handles 2.1 million USD in | transactions every year. It is running sqlite with a simple in- | memory lru cache (just a dict) that gets purged when a mutating | query (INSERT, UPDATE or DELETE) is executed. It is very simple | and more than fast enough. | | Friendly reminder that you shouldn't spend time fine tuning your | horizontal autoscaler in k8s before making money. | outworlder wrote: | > Friendly reminder that you shouldn't spend time fine tuning | your horizontal autoscaler in k8s | | Oh, but most companies need this! | | The biggest feat of microservices (which require ways to manage | them, like k8s) was to provide the ability for companies to | ship their organizational chart to production. | | If you don't need to ship your org chart and you can focus on | designing a product, then you can go a long way without overly | complicating your architecture. | midrus wrote: | Do you mean I don't need Go microservices talking gRPC deployed | in multiple kubernetes clusters with bash script based | migrations via GitOps with my hand made multi cloud automation | (in case we move clouds) following all the SCRUM practices to | ship working software? | | Mindblowing. | kwertyoowiyop wrote: | You will for your blog series that you mention prominently on | your resume that gets you your next Senior Architect gig. | midrus wrote: | Agh!... that's the catch... what I'm gonna give talks about | and what do I write on medium then.... now I get it. | Thanks!. | anticristi wrote: | > my hand made multi cloud automation (in case we move | clouds) | | Aren't these symptoms of a deeper problem? Many Product | Manager I talk to wanted me to build something that is as | flexible as possible and solves all problems for everyone, | everywhere. Go microservices with gRPC and Kubernetes feels | like the only high-level technical decisions I can take in | light of such information. :) | juancampa wrote: | How do you ensure data is not lost to oblivion if a | catastrophic system failure occurs? | pupdogg wrote: | You put in-place a loss mitigation strategy. This strategy | will vary by application. In my case, I have a similar setup | where we write 25-30k records to SQLite daily. We start each | day fresh with a new SQLite db file (named yyyy-mm-dd.db) and | back it up to AWS S3 daily under the scheme | /app_name/data/year/month/file. You could say that's 9 | million records a year or 365 mini-sqlite dbs containing | 25-30k records. Portability is another awesome trait of | SQLite. Then, at the end of the week (after 7 days that is), | we use AWS Glue (PySpark specifically) to process these | weekly database files and create a Parquet (snappy | compression) file which is then imported into Clickhouse for | analytics and reporting. | | At any given point in time, we retain 7 years worth of files | in S3. That's approx. 2275 files for under $10/month. | Anything older, is archived into AWS Glacier...all while the | data is still accessible within Clickhouse. As of right now, | we have 12 years worth of data. Hope it helps! | hodgesrm wrote: | This sounds interesting. Have you thought of doing a talk | or blog article about it? | | p.s., I run the SF Bay Area ClickHouse meetup. Sounds like | an interesting topic for a future meeting. | https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse- | Mee... | michaelmcmillan wrote: | Backups. | antoinealb wrote: | Does that mean it's okay for your application to loose | transactions (which occured between the backup point and | the failure point) or do you have other mitigations ? | Spivak wrote: | Isn't that how all backups work? If you _need_ to prevent | data loss then backups probably aren 't your tool of | choice. And if you're paranoid about data loss then any | replication lag is also unacceptable. | | * I'm worried about my server blowing up: Transactions | have to be committed to more than one DB on separate | physical hosts before returning. | | * I'm worried about my datacenter blowing up: | Transactions have to be committed to more than one DB in | more than one DC before returning. | benbjohnson wrote: | I'm the author of Litestream, which is an open-source | tool for streaming replication for SQLite. That could be | a good option if you need to limit your window for data | loss. We have a pretty active Slack if you need help | getting up and running. https://litestream.io/ | [deleted] | habibur wrote: | Guess this daily 2 seconds of downtime is worth it, when | that reduces cost say from $2000/month to $20/month. | mrweasel wrote: | Many banks still "shutdown" for hours every night to do | backups. | finiteseries wrote: | I'm not anywhere _near_ the banking industry but from HN | alone I've been led to believe dailyish huge file | transfers are also the norm in a variety of situations | (aka SQLite's backup strategy). | ericlewis wrote: | ftp or sftp if you're lucky - upload a giant CSV or | receive one. It is _crazy to me it still works this way_. | Derpdiherp wrote: | ---Was talking about a previous company who made steps to shift | from it for pricing reasons. Seems they where misinformed. | eurleif wrote: | Huh? SQLite is public domain.[0] There are paid extensions for | encrypted and compressed databases[1], but those cost a flat | fee for unlimited devices:[2] "Your license is perpetual. You | have paid a one-time fee that allows you to use and modify the | software forever. You can ship as many copied of the software | to your customers as you want so long as you ensure that only | compiled binaries are shipped (you cannot distribute source | code) and that your customers cannot make additional copies of | the software to use for other purposes." | | [0] https://www.sqlite.org/copyright.html | | [1] https://www.sqlite.org/prosupport.html | | [2] https://sqlite.org/see/doc/release/www/readme.wiki | Derpdiherp wrote: | Interesting - it's been a long time since I used / looked at | it. There was a big uproar in the company though and a drive | to move away from it due to licensing and cost from there. | Maybe they where misinformed. | masterofmisc wrote: | I would say terribly so. | tomcam wrote: | How is it not free for commercial use? It's public domain code. | I just rechecked the license after reading your comment. | qmmmur wrote: | Did anyone say it was? | qwerty456127 wrote: | SQLite is beautiful. Almost a perfect database for all single- | user and simple multi-user projects. I just wish it had some | built-in (not as an extension) encryption, native (non-DIY) date- | time and UUID types - it would actually be perfect then. | unnouinceput wrote: | SQLite is definitely not a toy DB when it comes to its features, | none will deny that. However, when you start to deal with | hundreds of millions of records in a table, you kinda consider it | a toy. | _wldu wrote: | It is the most widely deployed DB in the world. That fact, | alone, shows that it is not a toy DB. | nijave wrote: | Sqlite can handle that | https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q... | | I've been toying around with it locally for playing with big | data sets (50-100s GB) uncompressed and it works pretty well. | It's much easier than postgres and mysql which have a lot more | knobs and tuning required) | RedComet wrote: | It seems as though there is a new article about this daily. | JimmaDaRustla wrote: | Weird. They never mentioned the amazing power of inserting text | and numbers into a boolean column. | justin_oaks wrote: | I assume you're being sarcastic because I haven't yet thought | of a reason why that'd be a good thing. Although I've never | been bitten by SQLite's dynamic types, I wish they weren't a | thing. | | I'm genuinely curious if there are any real use-cases for this | behavior. | flyflyFenix wrote: | I was a fan of SQLite until I needed a full outer join of two | tables. It was such a pain to create a workaround when I found | out it isn't supported, and the result was still sub-optimal and | needlessly complex. Is running a full outer joins such a | unreasonable assumption within a DBMS? | justin_oaks wrote: | I hear you. MySQL still doesn't have full outer joins either. | | I would think this would be one of the easier features to add. | | Of course, I'm sure the people working on MySQL or SQLite would | say "Patches welcome!" | snicker7 wrote: | > SQLite is serverless. | | Maybe if you use SQLite as a file format. But if you use it like | an actual database (e.g. in a web application), I find that one | is best off setting up a daemon thread to queue/batch | transactions. | II2II wrote: | One of the disappointing developments of the past quarter | century is the near demise of general purpose databases as an | end user application. Yes, client/server models are useful when | dealing with a large number of transactions. On the other hand, | it is usually too complex to justify for personal or small | office use. | | I miss the days when databases were included in office suites | or could be purchased as relatively inexpensive standalone | applications since it was easy to create a database and | associated forms, rather than depending upon domain specific | applications that use a database as a file format but may be | poorly suited for a particular scenario. | | (Yes, the phrasing "file format" verses "actual database" | rubbed me the wrong way.) | | Edit: I realized that my post comes off as dismissive of | SQLite, which is not entirely true. I use it as a database and | appreciate it's role, but embedding it in an application is | frequently more than a given application requires. | fpoling wrote: | Access DB is still included in Microsoft Office | OldHand2018 wrote: | Only some versions of Office, mainly the non-cheapest | business versions. | zokier wrote: | And LibreOffice has Base, a sadly often neglected | component. | II2II wrote: | Another issue with Base is the required JRE. Installing | the JRE may be trivial, but it means that an error | message is the first experience most people will have | with Base. | | As for Access, it's access is limited due to the much | higher price point of Business/Professional versions of | Office. In terms of office suites, that version is about | two to three times more expensive than an equivalent | office suite from the mid-1990's (adjusted to 2021 | dollars). In other words, people will only have Access if | they feel that the additional cost is justified. | | That additional cost may be fine for business use, yet it | also means that people have less exposure to databases to | start with. With respect to exposure, there also appears | to be an absence of general purpose databases for home | users these days. (By that I mean in terms of cost and | ease of use.) | zokier wrote: | > Another issue with Base is the required JRE | | I thought that they dropped JRE requirement when the | engine was switched to Firebird? Googling around | apparently the transition wasn't quite successful :( | | https://ask.libreoffice.org/en/question/279711/firebird- | dead... | II2II wrote: | The only "embedded database" option (HSQLDB) requires the | JRE. I don't know if this holds true for the "connect to | an existing database" options, but I would rather throw | together a Python script using SQLite when it reaches | that level of complication. (It uses my existing | knowledge.) | zokier wrote: | > The only "embedded database" option (HSQLDB) requires | the JRE | | Firebase is still available as an _experimental_ option | as another embedded database, just confirmed on my | installation. Also while I was at it, tested it with Java | disabled and HSQLDB predictably did not work, but | Firebird did continue to work. Apparently it also opens | an pre-existing Firebase odb file even without | experimental flag being on. | | https://ibb.co/jM6k07w | code-scope wrote: | I use a different pattern. A lot sqlite files for diff purposes | (UserSession, User Files, etc each store in separate files) | This way diff threads of webserver can open/query/read/write a | lot of files concurrently without any issue. | snicker7 wrote: | > without any issue | | It's easier than you think to corrupt SQLite if you access | from multiple threads and especially from multiple processes | (yup, I've done this before). | | Also, there are no concurrent transactions in SQLite. The | entire db file gets locked (using POSIX locking, which is | known to be broken [0]). Better to queue/batch transactions | on a single connection. If your web server consists of | multiple processes, then this requires a separate daemon. | | [0]: http://0pointer.de/blog/projects/locking.html | saalweachter wrote: | I, too, enjoy creating deadlocks. | | [I assume if you are using this pattern successfully in | production you are already aware of and taking proper steps | to avoid them, but the pedant in me takes issue with "without | any issue", since once you have multiple resources being | locked in multiple threads, you need to be careful to acquire | and release locks in such a way that deadlocks do not occur, | either never acquiring more than one lock at a time, | acquiring locks in specific orders, or acquiring batches of | locks at a time.] | codykochmann wrote: | The thought that devs believe javascript would ever be fast | enough to make sqlite actually need to run in parallel makes me | giggle. ___________________________________________________________________ (page generated 2021-03-25 23:00 UTC)