hngopher.com

       [HN Gopher] SQLite is not a toy database
       ___________________________________________________________________
        
       SQLite is not a toy database
        
       Author : nalgeon
       Score  : 730 points
       Date   : 2021-03-25 14:20 UTC (8 hours ago)
        
 (HTM) web link (antonz.org)
 (TXT) w3m dump (antonz.org)
        
       | fouc wrote:
       | PostgreSQL is not a toy database. MySQL is not a toy database. I
       | feel like the "toy database" part is clickbait
        
         | dpix wrote:
         | I don't think the author is trying to call out SQLite as
         | somehow better than others database tech - just that most
         | people perceive SQLite to be a tool for development or
         | exploration when in fact it much more powerful than most
         | believe
        
         | guenthert wrote:
         | Not sure about that. I can't recall having seen PostgreSQL
         | being referred to as toy database. I have seen however texts
         | cautioning SQLite in production use, e.g. in [1], [2].
         | 
         | [1]
         | https://docs.bareos.org/bareos-18.2/IntroductionAndTutorial/...
         | 
         | [2]
         | https://docs.bareos.org/bareos-18.2/DeveloperGuide/catalog.h...
        
       | unixhero wrote:
       | SQLite: The Database at the Edge of the Network with Dr. Richard
       | Hipp https://www.youtube.com/watch?v=Jib2AmRb_rk
        
       | amaccuish wrote:
       | It's a shame Nextcloud works terribly with SQLite
        
       | annoyingnoob wrote:
       | Thanks for the info. Probably would not have considered SQLite in
       | place of something like pandas.
        
       | bob1029 wrote:
       | Don't forget about User-Defined Functions.
       | 
       | https://www.sqlite.org/appfunc.html
       | 
       | We just started enhancing our SQL dialect with new functions
       | which are implemented in C# code. One of them is an aggregate and
       | it is really incredible to see how it simplifies projections
       | involving multiple rows.
       | 
       | One huge benefit of SQLite's idea of UDFs is that you can
       | actually set breakpoints and debug them as SQL is executing.
        
         | hnrodey wrote:
         | Sounds very interesting. Can you elaborate on how you're
         | leveraging C# for this?
        
           | bob1029 wrote:
           | Directly through Microsoft's SQLite provider -
           | Microsoft.Data.Sqlite.
           | 
           | See: https://docs.microsoft.com/en-
           | us/dotnet/standard/data/sqlite...
        
         | foepys wrote:
         | Application-defined functions are very useful.
         | 
         | One of my company's applications is already designed to work
         | with different SQL systems and a new customer desperately
         | wanted SQLite for a very special use case. As SQLite is quite
         | simple and doesn't support many functions that are standard in
         | SQL Server, MySQL, Oracle, etc., we used application-defined
         | functions to implement all functions the application needs in
         | C#. It's not very fast but also not slow and the customer is
         | happy, which is what really counts.
        
         | TeMPOraL wrote:
         | They're awesome!
         | 
         | One common use case is REGEXP. SQLite has a keyword for regular
         | expression matching, but it has no implementation for it. What
         | your application needs to do is to take whatever regex library
         | it's using, make (through whatever FFI method it's using) a C
         | function that interfaces with it, and register it with SQLite.
         | 
         | A more advanced feature of this binding mechanism is that, if
         | you provide a bunch of specific callbacks to SQLite, you can
         | expose anything you like as a virtual table, that can be
         | queried and operated on as if it was just another SQLite table.
         | 
         | See: https://www.sqlite.org/vtab.html. The plugin for full-text
         | search is implemented in terms of this mechanism.
        
       | eb0la wrote:
       | Sqlite is also an EXTREMELY well tested database. Last time I
       | checked test code was about 2x productionncode. This is also good
       | indicator to spot software that is not a toy.
        
         | eb0la wrote:
         | Turns out it is much, much, more tested than I remembered:
         | 
         | https://www.sqlite.org/testing.html
        
       | mey wrote:
       | Just went looking, there doesn't seem to be a great actively
       | developed solution for a Java implementation or JNI bridge. The
       | JDBC driver that seemed to have popularity was brought into the
       | SQLite repo, but the documentation is lacking a bit. [1] Even
       | .Net Core has a integrated ADO.NET bridge to SQLite [2]
       | 
       | There are quite a few mature embedded DB options for Java
       | actively developed, for example Apache Derby/HSQLDB/H2. The
       | SQLite fileformat is very portable, which is very handy and
       | useful, the SQLite C Library is very powerful. It'd be nice for
       | the Java community to tap into that power.
       | 
       | 1)
       | https://www.sqlite.org/java/file?name=doc/overview.html&ci=t...
       | 
       | 2) https://docs.microsoft.com/en-
       | us/dotnet/standard/data/sqlite...
        
       | doomleika wrote:
       | Two major gripe I had with SQlite
       | 
       | 1. SQLite doesn't really enforce column types[0], the choice is
       | really puzzling to me. Since schema enforced type check is one of
       | the strong suit of SQL/RDMBS based data solution.
       | 
       | 2. Whole database lock on write, this make it unsuitable to high
       | write usages like logging and metric recording. WAL mode will
       | help but it will only alleviate the issue, you will need row
       | based lock solution eventually.
       | 
       | Just like the offical FAQ said, SQLite competes with fopen[1]
       | instead of RDBMS systems.
       | 
       | --
       | 
       | [0]: https://sqlite.org/datatype3.html
       | 
       | [1]: https://www.sqlite.org/whentouse.html
        
         | lsb wrote:
         | If you are in WAL mode, you can have unlimited readers as one
         | writer is writing.
        
           | mastre_ wrote:
           | Also, another issue when not in WAL mode is that a long read
           | will actually have a write lock and block _writes_.
        
         | haberman wrote:
         | > SQLite doesn't really enforce column types[0], the choice is
         | really puzzling to me.
         | 
         | This is acknowledged as a likely mistake, but one that will
         | never be fixed due to backward compatibility:
         | 
         | > Flexible typing is considered a feature of SQLite, not a bug.
         | Nevertheless, we recognize that this feature does sometimes
         | cause confusion and pain for developers who are acustomed to
         | working with other databases that are more judgmental with
         | regard to data types. In retrospect, perhaps it would have been
         | better if SQLite had merely implemented an ANY datatype so that
         | developers could explicitly state when they wanted to use
         | flexible typing, rather than making flexible typing the
         | default. But that is not something that can be changed now
         | without breaking the millions of applications and trillions of
         | database files that already use SQLite's flexible typing
         | feature.
         | 
         | https://sqlite.org/quirks.html
        
           | doomleika wrote:
           | Yeah, still I would really wish SQLite is a feather weight
           | RDBM system than almost-but-not-quite-your-typical-RDBMS the
           | better `fopen`.
           | 
           | Having a tool like this would made my life whole lot easier,
           | well, one can dream.
        
           | setr wrote:
           | SQLite already has different operating modes, right? e.g. WAL
           | is turned on and stays on, I think; It seems like you could
           | at least make type-checking an opt-in mode
        
           | mastre_ wrote:
           | > but one that will never be fixed due to backward
           | compatibility
           | 
           | I wonder if a fork/"new version" could address this. Like,
           | sqlite2 (v1.0, etc.).
        
             | isoprophlex wrote:
             | Considering that we're on sqlite3 already, it'd probably be
             | something for the v4 ;)
        
               | [deleted]
        
             | mbreese wrote:
             | This would be something on the order of the python2 ->
             | python3 transition. Meaning, it would likely take a decade.
             | After going though that, I'm not sure it would be worth it
             | to just change from (default) flexible data types.
        
               | [deleted]
        
         | petters wrote:
         | The expensify blog linked from a comment here claims:
         | 
         | > But lesser known is that there is a branch of SQLite that has
         | page locking, which enables for fantastic concurrent write
         | performance. Reach out to the SQLite folks and I'm sure they'll
         | tell you more
        
         | dnautics wrote:
         | > SQLite doesn't really enforce column types[0], the choice is
         | really puzzling to me.
         | 
         | It's not the worst thing in the world; you're validating on
         | data ingest anyways to prevent sqli, for example, right?
        
           | Sohcahtoa82 wrote:
           | Data validation is a dangerous method to try to prevent SQL
           | injection. The only surefire method is to used parameterized
           | queries, which you should be doing anyways.
        
             | dnautics wrote:
             | huh? By data validation I mean a validation library in your
             | surrounding PL, it's flowing through the types of that
             | language, and at no point is unprepared SQL entering your
             | system.
        
         | HelloNurse wrote:
         | > unsuitable to high write usages like logging and metric
         | recording.
         | 
         | If the speed of your write-only workload is limited by whole
         | file locks rather than by raw I/O speed, you can probably
         | consolidate your writes into fewer transactions (i.e. fewer
         | disk accesses, amortizing lock cost over more data) and write
         | to several databases in parallel according to any suitable
         | sharding criteria. Which is what any RDBMS would have to to
         | anyway.
        
         | zie wrote:
         | 1: Actually this is a feature, It's awesome and easy to map
         | typing to your language types. In python see:
         | https://docs.python.org/3/library/sqlite3.html#using-adapter...
         | specifically the DECLTYPES option.
         | 
         | Other language bindings do things like this also, and makes it
         | pretty idiot proof. you `create table test (mydict dict);` so
         | your tables know their types, and then at bind time you say a
         | sqlite column type of dict == a python dictionary.
         | 
         | Obviously python is sort of a terrible example, because python
         | typing is somewhat non-existent in many ways, but you see the
         | point here.
         | 
         | 2: There are definitely cases where it won't work out well,
         | high-concurrent write load is definitely it's big weak spot,
         | but those are usually fairly rare use cases.
        
       | petercooper wrote:
       | With no sense of overstatement here, SQLite is one of my favorite
       | creations in the entire world, so I have a bunch of links some of
       | you might find interesting if you want to dig further:
       | 
       | https://github.com/sql-js/sql.js - SQL.js lets you run SQLite
       | within a Web page as it's just SQLite compiled to JS with
       | Emscripten.
       | 
       | https://litestream.io/blog/why-i-built-litestream/ - Litestream
       | is a SQLite-powered streaming replication system.
       | 
       | https://sqlite.org/lang_with.html#rcex3 - you can do graph-style
       | queries against SQLite too (briefly mentioned in the article).
       | 
       | https://github.com/aergoio/aergolite - AergoLite is replicated
       | SQLite but secured by a blockchain.
       | 
       | https://github.com/simonw/datasette - Datasette (mentioned at the
       | very end of the OP article) is a tool for offering up an SQLite
       | database as a Web accessible service - you can do queries, data
       | analysis, etc. on top of it. I believe Simon, the creator,
       | frequents HN too and is a true SQLite power user :-)
       | 
       | https://dogsheep.github.io/ - Dogsheep is a whole roster of tools
       | for doing personal analytics (e.g. analyzing your GitHub or
       | Twitter use, say) using SQLite and Datasette.
        
         | jakoblorz wrote:
         | sqlite can actually be scaled quite high: 4mio qps (queries per
         | second) https://blog.expensify.com/2018/01/08/scaling-sqlite-
         | to-4m-q...
        
         | api wrote:
         | SQLite is one of those things that also shows by contrast how
         | bloated and inefficient much of modern software is. It
         | accomplishes so much so efficiently with such a small
         | footprint.
        
           | rhombocombus wrote:
           | I love it. SQLite is light, easy, and useful in a variety of
           | situations.
        
           | petercooper wrote:
           | Agreed. There's a real beauty to it, to me, like looking at a
           | Swiss watch or something. Redis gives me similar vibes though
           | is a little more complex given its networked nature (although
           | the C is nice to read).
        
         | dheera wrote:
         | Which if any of these is efficient enough to run on mobile and
         | can cerealize an entire database to localStorage?
        
           | WJW wrote:
           | SQLite underpins large parts of both iOS and Android, so...
           | all of them? Whether you can serialize "an entire database"
           | just depends on the size of the database and whether it will
           | fit in the local storage. SQLite scales up to a few TB at
           | least, so that is unlikely to be the bottleneck.
        
             | yrrr wrote:
             | localStorage is a web API, the storage limit is ~5MB.
             | 
             | However it can only store strings so it's pretty taxing.
             | 
             | Chrome and Safari actually embed SQLite directly as WebSQL,
             | however it's not becoming standardized because Mozilla
             | doesnt consider "add SQLite" as a sensible web standard.
        
             | dheera wrote:
             | I don't fully get it -- this is an SQlite implementation in
             | JavaScript, which is different from what you'd use in a
             | native app. Unless you store it to localStorage or cookies,
             | all variables disappear when you navigate away from the
             | page.
             | 
             | Can you write an offline HTML5 webapp with any of these
             | libraries such that it can cerealize the entire database
             | into a string and then store that to localStorage and
             | reload the next time?
             | 
             | It's an honest question and not sure why people are
             | downvoting without giving reasons.
        
               | tehbeard wrote:
               | Really you'd want to do this in to an indexeddb store
               | rather than local storage. Less issues with size limits
               | and string serialisation as you can chuck Blob instances
               | straight into it.
               | 
               | You'd have to handle loading/saving though
        
         | haolez wrote:
         | Wow! The sql.js bundle is only 8kb. This is a hidden gem for
         | sure. Thanks for pointing it out!
         | 
         | EDIT: It's actually 1.2MB. Thanks for pointing it out :)
        
           | tln wrote:
           | There's a 1.2 Mb WASM file as well
        
           | fabiospampinato wrote:
           | Nope, try using the demo website, it loads a 1.2mb wasm file
           | (https://sql.js.org/dist/sql-wasm.wasm). SQLite might be
           | impressively small but it's not _that_ small.
        
             | bogwog wrote:
             | I was actually looking into options for storing some data
             | for a game I'm working on a while back, and decided against
             | using SQLite because it was too big of a dependency
             | (project has a specific size limit/target).
        
           | [deleted]
        
         | faizshah wrote:
         | There's also a pretty nice built-in sqlite extension for
         | transitive closures that helps in searching hierarchical/tree
         | structures in sqlite:
         | https://web.archive.org/web/20141127001741/https://charlesle...
        
           | napsterbr wrote:
           | Clicking the above link takes me to
           | https://imgur.com/32R3qLv.
           | 
           | I fail to understand why have a blog at all if its author
           | don't like people linking to it.
        
             | setr wrote:
             | well, he doesn't like one particular group of people
             | linking to it :-)
        
             | faizshah wrote:
             | How strange, sorry, I didn't know that it did that. Just
             | updated it to an archive link. This is the only source I
             | know of that gives an overview of closures.c so it's worth
             | the read.
        
               | napsterbr wrote:
               | Definitely not your fault :)
        
             | fiddlerwoaroof wrote:
             | I believe jwz did this first
        
         | kroggen wrote:
         | we can also have a SQLite database with branches, like in Git:
         | 
         | https://github.com/aergoio/litetree
        
         | neerajdotname2 wrote:
         | Using sql.js we have built online SQL course where the code is
         | executed in the browser itself.
         | https://academy.bigbinary.com/learn-sql
        
         | gary_bernhardt wrote:
         | Seconded; it's a fantastic tool and ecosystem. We use sql.js in
         | production (https://www.executeprogram.com). The SQL course's
         | live code examples all run directly in the browser via sql.js.
         | 
         | I initially had low expectations because it's such a weird use
         | case, but it's been totally reliable. We did have to ignore a
         | few types of errors from old browsers that don't support wasm
         | properly, but we've never had a bug in current browsers caused
         | by sql.js.
        
         | no-s wrote:
         | >so I have a bunch of links some of you might find interesting
         | if you want to dig further
         | 
         | oooh, thank you! I'm starting to see SQLite the same way!
        
       | D_Guidi wrote:
       | spatialite rulez!
        
       | drenvuk wrote:
       | sqlite is my caching layer. It's crazy fast on ssds and keeps my
       | memory empty. There's no reason to read blobs from disk or keep
       | them stored in memory if i'm using it.
        
         | sigjuice wrote:
         | What is the benefit of keeping memory empty?
        
           | drenvuk wrote:
           | Running lots of other things on the same machine. Most of the
           | stuff programs store is just sitting idle doing nothing but
           | taking up space. Move it to sqlite or disk if it's an ssd.
           | The latency hit negligible unless you're running a gaming
           | server.
        
       | maxbaines wrote:
       | Thanks, always agreed. And a good read.
        
       | incanus77 wrote:
       | I had seen Dr. Hipp speak at a conference[1] around 2009 and it
       | really put SQLite on my radar as a (then) Mac developer. It ended
       | up surfacing from the back of my mind around then at Mapbox when
       | I put together the prototype of the MBTiles file format[2] which
       | is still in use today. The idea of an on-disk relational database
       | was new to me and just a perfect fit for where mobile devices
       | were at that point in time, and for our use case of offline maps.
       | 
       | [1] https://en.wikipedia.org/wiki/C4_(conference)#C4[2]
       | 
       | [2] https://medium.com/devseed/portable-map-tiles-format-
       | release...
        
         | durkie wrote:
         | thanks for mbtiles! I spend lots of time using and abusing that
         | format every day :)
        
       | cute_boi wrote:
       | Nobody seems to mention that SQlite is one of the secure
       | database. Just look at its rigorous testing, fuzzing and
       | everything.
        
         | juancampa wrote:
         | nit: security is like the speed of light for a massive object,
         | you can never reach it, you can only get closer and closer. I'd
         | say instead "SQLite is one of the _most_ secure databases"
        
       | hackeraccount wrote:
       | Is it the unix version of MS Access? It's sort of interesting to
       | compare the two.
        
         | SigmundA wrote:
         | No it would be closer to the Jet db engine that Access uses,
         | they are both file based in process relational db engines. Back
         | in the day for a Windows app if you wanted a in process file
         | based db you would use Jet since the engine was included with
         | windows even though Access was a separate product.
        
           | happyweasel wrote:
           | Yeah, and I think sqlite is really comparable to this setup.
           | Use Access to design the database (optional), and access (no
           | pun intended) the jet database engine via ODBC,OLE-DB or DAO
           | ..Jay! Code like it's 1999 :-) ... I remember the jet db
           | engine offered at least transaction level READ_COMMITTED (in
           | comparison to mysql's MyISAM) and "it was quite fast" until
           | reaching about 1mio rows or so (of course, db design, index
           | etc determines "fast", that was just a basic rule of thumb I
           | remember)...
        
         | II2II wrote:
         | SQLite is a database that is intended to be embedded in
         | applications rather than run as a separate server. You could
         | make an Access-like application from it, but it cannot replace
         | Access on its own.
         | 
         | It's commonly used in applications that use it as a convenient
         | container to store data (the "file format" scenario) or as a
         | domain specific database. One of the better examples of the
         | latter is Calibre, an e-book library manager, since it exposes
         | some of the database functionality to the end user. For
         | example: the end user can add columns to store custom data for
         | each book.
        
         | stjohnswarts wrote:
         | Well the database part maybe, but it's not going to help you
         | generate forms and reports or build small guis around it.
         | You'll have to find those tools for yourself or build a web app
         | around your sqlite db. I haven't done a lot with Access though,
         | mostly help build small database interfaces/reports for their
         | inventory systems (whether a lab or comic book collection).
        
           | buescher wrote:
           | You can do all that with tcl/tk. Sqlite was in fact
           | originally a tcl extension so the tcl interface for sqlite is
           | first-rate. With tcl/tk it's sort of a Unix analog to VB6.
           | 
           | I keep thinking it would be sort of fun to do a tk app again;
           | is there a good model CRUD app for tcl/tk/sqlite out there?
           | Something like the Northwind thing for Access?
        
       | eleog wrote:
       | Sqlite is not a toy, but it's so sloppy that it's not really a
       | tool either. Like a saw with a loose blade, it can help you cut
       | but it can also hurt you. My pet peave is the way sqlite accepts
       | non-aggregated columns in a GROUP BY query. You'll get a result
       | in the column, but it's not clear what it means.
       | 
       | More of sqlite's sloppyness is detailed here
       | https://sqlite.org/src/wiki?name=StrictMode
        
         | outworlder wrote:
         | MySQL is also very "sloppy". People have learned to work around
         | its quirks.
         | 
         | Honestly, SQLite is a wonderful tool if properly used. Calling
         | it "sloppy" is a disservice.
        
       | ransom1538 wrote:
       | SQLite powers iMessage. Legit. You can use sqlite commands to
       | look through your messages on your macbook.
        
       | myth2018 wrote:
       | I'd also like to add the possibility of using SQLite databases as
       | an application file format:
       | https://news.ycombinator.com/item?id=23508923
       | 
       | I had to work on a data import/export tool some time ago and
       | SQLite has simplified the design a lot.
        
         | indigo945 wrote:
         | While this makes things easy, you should not do this for any
         | application file format where you except your users to share
         | files. This is because opening a SQLite database file makes the
         | SQLite library execute any arbitrary code that may be stored in
         | that file. [0] Therefore, SQLite is really only suitable for
         | local-only file formats, such as configuration files, and not
         | for files that users will e-mail to each other.
         | 
         | [0]
         | https://media.ccc.de/v/36c3-10701-select_code_execution_from...
        
           | myth2018 wrote:
           | Thanks for the heads-up!
           | 
           | In my use case, files are shared between different instances
           | of the application, usually without user intervention, but
           | there's an attack vector to be addressed here.
        
           | adamrezich wrote:
           | Weren't all of these issues quickly fixed by the developers
           | after being reported?
           | 
           | E: oh whoops missed the dev post above mine
        
           | SQLite wrote:
           | This is misinformation. SQLite does _not_ execute arbitrary
           | code found in the data file. There was a bug, long since
           | fixed, that could be used by an attacker to cause arbitrary
           | code execution upon opening the database file. The referenced
           | video talks about it. It was a very clever attack. But the
           | bug that enabled the attack was fixed even before the talk
           | shown in the video was given.
           | 
           | Let me say that again: SQLite does _NOT_ execute arbitrary
           | code that it finds in the database file. To suggestion that
           | it does is nonsense.
           | 
           | See https://www.sqlite.org/security.html for additional
           | discussion of security precautions you can take when using
           | SQLite with potentially hostile files. The latest SQLite's
           | should be safe right out of the box, without having to do
           | anything mentioned on that page. But defense in depth never
           | hurts.
        
       | tompazourek wrote:
       | Anyone knows what is the current situation regarding parallelism
       | in SQLite (utilizing multiple cores when executing a complex
       | query)?
       | 
       | Last time I heard about this, it wasn't looking very good.
        
       | people_not_bots wrote:
       | If coding can't be a toy then I dont know what is
        
       | francoisp wrote:
       | SQLite is a really neat thing. I was looking at extensions and
       | how to augment it; you could even add a pg_notify -like feature:
       | https://sqlite.org/c3ref/update_hook.html and have worker
       | processes doing what would amount to out of process stored
       | procedures in postgres (or UDF in SQLite) -- in any language
       | you'd like.
       | 
       | You can only register one callback per table tho, although you
       | could from this callback fire other functions... All in all it's
       | an awesome tool for a project like tailscape, but I think the
       | hackers there went for a flat file.
       | 
       | Personally I'd would love to see in process postgres; a build of
       | postgres that is geared for integrating a set of your threads,
       | and builds the whole of postgres with your app on all major OSs,
       | only listening to the inside by default. For the same reason I'm
       | using nodejs; to be able to run the same code anywhere. I think
       | bundle size would be a minor issue, really, I downloaded Sage9.2
       | yesterday, it's 2GB! VSCode is 100MB download, and they refer to
       | it as a small download...
       | 
       | cheers! happy coding,
        
         | fiddlerwoaroof wrote:
         | The Firebird database can run both as a separate server and in
         | embedded mode. I wish more databases works design for this sort
         | of flexibility.
        
       | tgb wrote:
       | Does anyone use SQLite as their daily-driver in lieu of R or
       | pandas for data analysis? I don't think I can use the sqlite
       | command-line since I'd want a fully-developed plotting utility,
       | and it seems less convenient to do the actual analysis part
       | through a sqlite connection in python, say.
        
         | abhgh wrote:
         | Not "in lieu", but I have found SQLite to be a good _companion_
         | to pandas. pandas is good when you want to lend some easily
         | navigable structure to your data without leaving python, but
         | becomes cumbersome (it _feels_ cumbersome esp if you know SQL)
         | when you want to do some serious processing. At such times, I
         | just dump my data into a SQLite DB, and from then on my code
         | has SQL queries to it, which I find are far more readable than
         | corresponding pandas statements (also easier for the reviewer).
         | 
         | Another advantage you get is if you wanted to look at these
         | intermediate data, you don't need to run your code in debug
         | mode and view the dataframe at a breakpoint - you can use
         | something like datasette[1] or a standard SQLite DB viewer.
         | 
         | So a function that does complex data processing has
         | approximately this structure my code:                 (1) <non-
         | pandas python code to fetch data, other "normal" stuff>
         | (2) <pandas code to do basic navigation, filtering, grouping
         | etc>       (3) <dump to SQLite, followed by SQL queries for the
         | really heavy stuff>       (4) <back to python, possibly pandas,
         | to return results, write to a file, plot etc>
         | 
         | Step (3) used to be pandas for me before, but depending on how
         | complex your operations are this can become hard to read and/or
         | review.
         | 
         | [1] https://github.com/simonw/datasette - datasette doesnt
         | replace a standard DB IDE, but is a very good lightweight
         | alternative to one if don't intend to perform updates/inserts
         | directly on a table.
        
           | steve_gh wrote:
           | I have exactly the opposite workflow. Import data into
           | salute, perform as much heavy lifting as possible in salute,
           | and export into a python data science workflow via led to if
           | necessary.
        
         | hokkos wrote:
         | I use extensively DbBrowser for sqlite to plot simple graph
         | from sql queries of imported csv files, it is easier than to
         | start a jupyter environment with python and pandas or r studio.
        
         | nalgeon wrote:
         | I do. Apache Superset, Metabase, Redash - all support SQLite.
         | Other BI tools also do.
        
         | simonw wrote:
         | I do. I load data into SQLite using various tools I've written
         | (geojson-to-sqlite, db-to-sqlite etc) and run the analysis in
         | https://datasette.io
        
         | iagovar wrote:
         | You can perform sql queries with many of the DB clients
         | available, such as Heidi, or Deaver, and also through python.
         | 
         | Sqlite is OLTP though, for analytical purposes I'd use an OLAP
         | DB like DuckDB.
         | 
         | IMO dumping workload in DB is nice, specially when your dataset
         | doesn't fit in RAM.
        
         | 6gvONxR4sf7o wrote:
         | You can always execute SQL in pandas by passing it a
         | connection, getting the best of both worlds.
        
         | qbasic_forever wrote:
         | Yes, setup jupyter magics to execute SQL and it becomes MUCH
         | more useful for data analysis in a notebook. There's a great
         | extension that lets you just write and run SQL queries right in
         | cells: https://blog.jupyter.org/a-jupyter-kernel-for-
         | sqlite-9549c5d... Hook it up to an in memory or on disk SQLite
         | instance and you can kick pandas to the curb.
        
         | kwhitefoot wrote:
         | Take a look at DB Browser for SQLite:
         | https://sqlitebrowser.org/
         | 
         | It has some plotting built in. Not sure if it is sophisticated
         | enough for what you need but it might be worth a try.
        
         | tfehring wrote:
         | It's complementary, since unlike R/Pandas it can fit too-big-
         | for-memory data - I've used it ad-hoc to store big-ish datasets
         | to pull into R. Though if you're using Python I think that use
         | case is mostly superseded by Dask.
        
       | jarym wrote:
       | Oh it's totally a toy. A great toy though! Love it to bits.
       | SQLite + Postgres here and maybe will add DuckDB to the mix soon
        
       | [deleted]
        
       | polyrand wrote:
       | You can also have stored procedures in SQLite
       | 
       | https://cgsql.dev/
        
       | fmajid wrote:
       | Surprised no one has mentioned how Expensify use SQLite as their
       | main data store:
       | 
       | https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...
        
       | ajsnigrutin wrote:
       | For me, the main killer feature of sqlite is portability. (Like
       | moving it around, not porting it to new systems).
       | 
       | Just write the code to create an empty database if the db file
       | does not exist, move your code elsewhere, and the database will
       | be created on the first run. No usernames, passwords, firewall
       | rules, IP addresses, no nothing... just a single file, with all
       | the data inside.
       | 
       | Miration? Copy the whole folder, code and the database. Clean
       | install.. copy the folder, delete the database. Testing in
       | production? Just backup the file, do whatever, then overwrite the
       | file.
        
       | beached_whale wrote:
       | If I have a choice of making a file format and doing the I/O or
       | using SQLite, I'll choose SQLite. It's hardened to a degree that
       | I won't be for filesystem shinanigans. If I need to export a
       | file, that's still easy enough.
        
       | diehunde wrote:
       | From the SQLite website:
       | 
       | Situations Where A Client/Server RDBMS May Work Better
       | Client/Server Applications
       | 
       | If there are many client programs sending SQL to the same
       | database over a network, then use a client/server database engine
       | instead of SQLite. SQLite will work over a network filesystem,
       | but because of the latency associated with most network
       | filesystems, performance will not be great. Also, file locking
       | logic is buggy in many network filesystem implementations (on
       | both Unix and Windows). If file locking does not work correctly,
       | two or more clients might try to modify the same part of the same
       | database at the same time, resulting in corruption. Because this
       | problem results from bugs in the underlying filesystem
       | implementation, there is nothing SQLite can do to prevent it.
       | 
       | A good rule of thumb is to avoid using SQLite in situations where
       | the same database will be accessed directly (without an
       | intervening application server) and simultaneously from many
       | computers over a network.
       | 
       | High-volume Websites
       | 
       | SQLite will normally work fine as the database backend to a
       | website. But if the website is write-intensive or is so busy that
       | it requires multiple servers, then consider using an enterprise-
       | class client/server database engine instead of SQLite.
       | 
       | Very large datasets
       | 
       | An SQLite database is limited in size to 281 terabytes (248
       | bytes, 256 tibibytes). And even if it could handle larger
       | databases, SQLite stores the entire database in a single disk
       | file and many filesystems limit the maximum size of files to
       | something less than this. So if you are contemplating databases
       | of this magnitude, you would do well to consider using a
       | client/server database engine that spreads its content across
       | multiple disk files, and perhaps across multiple volumes.
       | 
       | High Concurrency
       | 
       | SQLite supports an unlimited number of simultaneous readers, but
       | it will only allow one writer at any instant in time. For many
       | situations, this is not a problem. Writers queue up. Each
       | application does its database work quickly and moves on, and no
       | lock lasts for more than a few dozen milliseconds. But there are
       | some applications that require more concurrency, and those
       | applications may need to seek a different solution.
        
       | candiddevmike wrote:
       | I wish SQLite had a PostgreSQL compatibility layer. Right now, to
       | add SQLite support to a Go project (or anything without an ORM),
       | you have to rework all your queries and migrations. It's probably
       | an impossible ask, but having a compatibility flag within SQLite
       | so it would accept PostgreSQL formatted queries would be
       | extremely helpful.
        
         | smilliken wrote:
         | Abstracting over two complex and evolving systems is a
         | bottomless well of bugs. Contributers would shy away from this
         | sort of compatibility because it would encourage users to
         | ignore the differences at first, and then complain loudly when
         | they are discovered.
        
       | michaelmior wrote:
       | > There is nothing more convenient than SQLite for analyzing and
       | transforming JSON. You can select data directly from a file as if
       | it were a regular table.
       | 
       | Personally I love jq[0] for this purpose. I haven't really used
       | SQLite for working with JSON, but the examples given are very
       | verbose.
       | 
       | [0] https://stedolan.github.io/jq/
        
         | simonw wrote:
         | I wrote a plugin for Datasette that adds a jq() custom SQLite
         | function, it's pretty fun:
         | https://datasette.io/plugins/datasette-jq
        
         | hobo_mark wrote:
         | I never managed to wrap my head around jq syntax however
        
           | busterarm wrote:
           | It takes a while. jq really needs a repl.
        
             | bionhoward wrote:
             | Check out ijq...used it this past week to deal with some
             | ugly nested FHIR data. Not perfect, a bit laggy, sometimes
             | crashes your terminal, but really cool in a lot of cases
             | https://sr.ht/~gpanders/ijq/
        
           | qbasic_forever wrote:
           | I'm with you too. I've never found a query language except
           | SQL that has really stuck in my head. Pandas, jq, XSL, etc...
           | it all has me running screaming back to SQL.
        
           | michaelmior wrote:
           | I agree it can be a little obtuse at times and I still do
           | occasionally find myself having to consult the documentation
           | despite using it pretty regularly for some time. I do also
           | sometimes make use of gron[0] since it integrates well with
           | standard Unix tools. The pipelines I write with gron are
           | probably pretty brittle in the sense that I expect they might
           | be incorrect for some edge cases. But for some quick
           | exploratory analysis, it's a tool I like to have handy.
           | 
           | [0] https://github.com/tomnomnom/gron/
        
       | andrewmcwatters wrote:
       | SQLite is so robust, that I bet most websites could use it
       | without really needing to move onto a client/server RDBMS.[1] I
       | use MySQL, and I know PostgreSQL has a large marketshare now, but
       | I wonder how much of either is really necessary when you think
       | about traffic usage alone. I know at least in my use cases,
       | neither seem necessary.
       | 
       | [1]: https://sqlite.org/whentouse.html
        
         | smoe wrote:
         | I'm sure most website could traffic wise. But for me it is not
         | a question whether I could, but if I should.
         | 
         | If my goal is building a website, I don't necessarily want to
         | experiment with different technologies if I already know
         | Postgres will work perfectly fine without adding much
         | operational overhead and covering use cases I don't have yet,
         | vs the unknown unknowns of using SQLite and maintaining it over
         | time. Again, this is not about some problem with SQLite, but
         | just me not having experience using it this way. Same reason
         | why I wouldn't just add any database system I haven't used
         | before, even if on paper it would be the "better tool for the
         | job" for a particular use case, and sure would be an
         | interesting learning experience.
         | 
         | In my opinion practicality and prior experience often beats
         | what is strictly necessary or "best".
        
         | Thaxll wrote:
         | SQLite is very limited because of its threading model, imo it's
         | not usable outside of the single app model where you have a
         | single user.
         | 
         | https://sqlite.org/threadsafe.html
         | 
         | https://sqlite.org/lockingv3.html
        
           | andrewmcwatters wrote:
           | I've used SQLite a bit, but not enough to say I know where
           | you run into performance issues. I would anticipate though,
           | if it's similar to fread/fwrite with some marginal overhead
           | for dealing with SQL, after considering most queries aren't
           | very complex, I think most people are going to have a hard
           | time hitting those limits.
           | 
           | And that's assuming you're making queries every time an event
           | occurs versus persisting data at particular points in time.
        
           | hsnewman wrote:
           | I disagree. I wrote a multiplayer web based game using sqlite
           | in go: http://decwars.com/
        
           | gwd wrote:
           | The article addresses this. Basically, you can have any
           | number of concurrent _readers_ , but only a single _writer_.
           | Writing and reading can happen concurrently just fine. So the
           | question is -- how many users does a website need before
           | having only a single concurrent writer becomes a bottleneck?
           | 
           | That number will obviously depend on the read/write ratio of
           | any given website; but it's hard to imagine _any_ website
           | where [EDIT the number maximum number of concurrent users] is
           | actually  "1". And for many, that will be in the thousands or
           | hundreds of thousands.
           | 
           | FWIW the webapp I use to help organize my community's
           | conference has almost 0 cpu utilization with 50 users. Using
           | sqlite rather than a separate database _greatly_ simplifies
           | administration and deployment.
        
             | lacker wrote:
             | When I was working on database services there were actually
             | many applications with more writes than reads. A common
             | example is applications where you want to keep the user's
             | state saved frequently, but you only need to reload it when
             | the application is restarted. A lot of games work this way.
             | So there can be a write every few seconds or every minute
             | for every active user, but you only need to do a read a few
             | times at the beginning of a new session.
             | 
             | Of course applications like a blog will have far more reads
             | than writes. It just really varies depending on the type of
             | application.
        
             | eingaeKaiy8ujie wrote:
             | >That number will obviously depend on the read/write ratio
             | of any given website; but it's hard to imagine any website
             | where that number is actually "1".
             | 
             | I can imagine a static website where the content is read-
             | only for users and is only editable by
             | admins/developers/content managers through some CMS.
        
               | duskwuff wrote:
               | That'd be a near-infinite ratio. The parent is discussing
               | a site where the ratio is near 1 (i.e. roughly as many
               | reads as writes).
        
               | gwd wrote:
               | It's not even the _percentage_ of reads vs writes; it 's
               | about the _total time spent writing_. Suppose that while
               | a single user was actively using the website, the time
               | spent writing was 70%. That 's a totally mad write load;
               | but even then, if you had two concurrent users, things
               | would probably still be quite useable -- you'd have to go
               | up to 4 or 5 users before things really started to slow
               | down noticeably.
               | 
               | Suppose, on the other hand, that a single user generated
               | around a 1% write utilization when they were actively
               | using the website (which still seems pretty high to me).
               | You could probably go up to 120 concurrent users quite
               | easily. And given that not all of your users are going to
               | be online at exactly the same time, you could probably
               | handle 500 or 1000 _total_ users.
        
           | tyingq wrote:
           | A nice fix for this might be to extract just the wire
           | protocol from dqlite to make a network connected, but not
           | Raft replicated, sqlite.
           | 
           | https://github.com/canonical/dqlite
        
           | bob1029 wrote:
           | With a single thread & SQLite connection instance, I have
           | been able to insert hundreds of thousands of rows per second
           | when using NVMe drives.
           | 
           | Note that WAL and synchronous flags must be set
           | appropriately. Out of the box and using the standard "one
           | connection per query" meme will handicap you to <10k inserts
           | per second even on the fastest hardware.
           | 
           | The trick for extracting performance from SQLite is to use a
           | single connection object for all operations, and to serialize
           | transactions using your application's logic rather than
           | depending on the database to do this for you.
           | 
           | The whole point of an embedded database is that the
           | application should have exclusive control over it, so you
           | don't have to worry about the kinds of things that SQL Server
           | needs to worry about.
           | 
           | SQLite is not a _direct_ replacement for SQL Server, but with
           | enough effort it can theoretically handle even more traffic
           | in your traditional one-database-per-app setup, because it 's
           | not worrying about multiple users, replication, et. al.
        
             | Thaxll wrote:
             | "The trick for extracting performance from SQLite is to use
             | a single connection object for all operations, and to
             | serialize transactions using your application's logic
             | rather than depending on the database to do this for you."
             | 
             | It's not the right approach because it's hard to get right,
             | you want to offload that to the DB.
        
               | bob1029 wrote:
               | It's actually not that hard to get right. You can put a
               | simple lock statement around a SQLiteConnection instance
               | to achieve the same effect with 100% reliability, but
               | with dramatically lower latency than what hosted SQL
               | offerings can provide.
               | 
               | Also, the only reason we ever want to lock a
               | SQLiteConnection is to obtain a consistent
               | LastInsertRowId. With the latest changes to SQLite, we
               | don't even have to do this anymore as we can return the
               | value as part of a single invocation.
        
           | busterarm wrote:
           | Most websites/frameworks access their database through a
           | singleton pattern/single-connection anyway.
           | 
           | Edit: Sometimes you have to lie and lead people down the
           | wrong path to enlightenment... ;)
        
             | Thaxll wrote:
             | That's not the case no, you usually access your database
             | with thread pool. Otherwise everyone would wait until the
             | single connection is free. Once you have a bit more users
             | that tries to write everything will fall appart.
        
               | OldHand2018 wrote:
               | Well, let me throw out a crazy idea:
               | 
               | Nobody said you have to use a single database/file.
               | Obviously, you are going to want to spend a couple
               | minutes thinking about referential integrity. But how
               | often do you _delete_ records in your web app?
        
               | HelloNurse wrote:
               | A big database id not an issue. Contention requires a
               | large number of writers, not a large amount of cold data.
        
               | kuschku wrote:
               | As an IRC client with an SQLite backend: several hundred
               | times per second.
               | 
               | Per user.
               | 
               | While reads are more like once every day per user.
        
               | quesera wrote:
               | UPDATE is also a write, and references crossing files
               | sounds like a nightmare.
               | 
               | If your deployment environment has serious constraints,
               | I'm sure you could make it work. The product you deliver
               | would be SQLite + custom DBI layer to hide SQLite's
               | limitations.
               | 
               | It would be a lot more work, and not be as robust or
               | scalable, compared to a more traditional selection. But I
               | can imagine cases where it would be appropriate.
        
               | busterarm wrote:
               | Yes.
               | 
               | ;)
        
               | mywittyname wrote:
               | One could implement a write queue and caching for the
               | front end. That might sound like a lot of work to go
               | through to avoid moving to a proper multi-user database.
               | But given that writes are probably done through a handful
               | of API endpoints, it's probably not really that much work
               | to implement.
        
             | quesera wrote:
             | Any site with real volume will have a connection pool.
             | 
             | All of those connections might need to write, and this is
             | where SQLite gets tricky to implement at scale.
             | 
             | I love SQLite! It's perfect for many use cases, but not
             | all. Fortunately, Postgres is also excellent.
        
             | trisiak wrote:
             | Can you provide some examples of frameworks with such
             | pattern? I actually have never seen it in any of the Python
             | or Go web applications that I had a chance to work with.
        
           | dwohnitmok wrote:
           | As far as I'm aware that threading information is on a per-
           | connection basis.
           | 
           | SQLite is perfectly capable of supporting multiple, parallel
           | reads.
           | 
           | SQLite must serialize writes, which makes a highly parallel
           | write-heavy workload not good for it. However, with WAL
           | enabled writes do not block reads.
           | 
           | Basically highly-parallel read loads with low write counts
           | (low-enough that serializing them doesn't lead to
           | unacceptable slow down of writes) or with loads where latency
           | is acceptable in writes (but not reads) is a perfect use case
           | for SQLite. And it turns out that a lot of web services are
           | heavily asymmetrically biased towards reads.
        
         | petercooper wrote:
         | I ran a niche community social bookmarking site (around
         | 100-200k pageviews per month) on SQLite for several years and
         | it was no problem at all. If a write was occurring, having a
         | simultaneous request wait 100 milliseconds was no big deal. It
         | only became a problem when I got tired of ops and wanted to put
         | it on Heroku at which time I had to migrate to Postgres.
         | 
         | I've always been surprised WordPress didn't go with SQLite
         | though - it'd have made deployment so much easier for 99% of
         | users running a small, simple blog.
        
           | unnouinceput wrote:
           | At the time when WordPress rolled out SQLite was still in its
           | infancy while MySQL was already a mature DB, hence the
           | choice. These days however they could, at least, do a nice
           | wizard to ask "do you want to run a blog only?" and deploy
           | SQLite instead.
        
             | fpoling wrote:
             | I looked at using SQLite with Wordpress for sites I
             | maintain. In theory Wordpress core is quite DB-agnostic and
             | according to random blogs MySQL could work. But plugins use
             | so much MySQL-specific features that it was hopeless.
        
             | jacurtis wrote:
             | I don't have any data on this, but I do know that a
             | significant (if not majority) of Wordpress sites now aren't
             | even blogs. They are small websites for things like
             | restaurants, hair salons, construction companies,
             | photographers, and personal portfolios. Especially since
             | most of these sites are also running on $3/mo shared hosts,
             | switching to SQLite would be a substantial improvement.
             | 
             | Most these sites are running a homepage, an about page, a
             | contact page, and maybe one or two misc pages. They might
             | have a blog that has two blog posts on it from nine years
             | ago. But that is really it. MySQL is really overkill
             | considering the scenario. SQLite's big "limitation" is non-
             | concurrent writes. But this is rarely a problem with most
             | Wordpress sites because they are single-author and they
             | aren't updated very often. SQLite can handle plenty of
             | reads to support even heavily trafficked websites.
             | 
             | Not to mention, SQLite's greatest advantage is portability.
             | A single file contains your entire database. A non-
             | technical user could transfer hosts or backup their data by
             | copying their database file like it was a photo or an excel
             | document. That's pretty incredible when you think about it.
        
               | iagovar wrote:
               | You don't even need SQLite for that. Something like
               | PicoCMS with markdown files is more than enough. If you
               | want a fronted for the owner, that's another story, but
               | if you're the one mantaining the sites, WordPress is an
               | overkill 99% of the time.
        
               | godot wrote:
               | Pleasantly surprised to hear PicoCMS mentioned here,
               | since it's not typically the most common PHP markdown CMS
               | of choice folks like to mention.
               | 
               | I was a casual developer (i.e. not for work, just for
               | personal) deep into the PicoCMS ecosystem for a couple
               | years, a few years ago. I both started a site and helped
               | a family convert an old static site to PicoCMS and really
               | had no complaints. Re: frontend for the owner, I started
               | with Pico Admin and made a bunch of modifications to it
               | (including an image uploader) and the non-tech owner has
               | no complaints and it's been working well since.
               | 
               | Nowadays for my own blog I'm into the whole SSG/JAM
               | trend, but I'd still run a PicoCMS site any time, if the
               | use case is right.
        
           | tyingq wrote:
           | Someone did write a plugin to have wordpress use SQLite as
           | the backend: https://wordpress.org/plugins/sqlite-
           | integration/
           | 
           | Perhaps not great for production since Wordpress
           | automatically updates itself, and you would have to keep up
           | with any changes. And not just for wordpress, but for any
           | other plugins that use the database.
           | 
           | Edit: A single file fork (albeit 5k lines of PHP) of the
           | plugin that looks interesting:
           | https://github.com/aaemnnosttv/wp-sqlite-db
        
             | blacksmith_tb wrote:
             | I maintain a corp Ghost blog that's backed by SQLite, it's
             | been solid for years.
        
               | nalgeon wrote:
               | Same thing here.
        
         | [deleted]
        
       | jstrong wrote:
       | > There is nothing more convenient than SQLite for analyzing and
       | transforming JSON.
       | 
       | the example query given:                   select
       | json_extract(value, '$.iso.code') as code,
       | json_extract(value, '$.iso.number') as num,
       | json_extract(value, '$.name') as name,
       | json_extract(value, '$.units.major.name') as unit         from
       | json_each(readfile('currency.sample.json'))         ;
       | 
       | that sure looks fun to type into a repl!
       | 
       | nothing against sqlite, which I like and use, just found the idea
       | of that query being convenient for one-off analysis to be off.
        
       | lamelydiminish wrote:
       | I love SQLite. Am building a webapp + mobile app that needs to
       | sync user created content. Is there an easy way to sync SQLite
       | DBs?
        
         | simonw wrote:
         | SQLite has a "session" extension which is designed to help
         | handle the sync problem: https://sqlite.org/sessionintro.html
        
           | lamelydiminish wrote:
           | TIL. Can't find anything on sessions for sql.js though...
           | Interested in using sql.js since I'm building a web PWA
           | first, and eventually native desktop&mobile app. Perhaps this
           | is overkill for SQLite at this point?
        
       | Toutouxc wrote:
       | > not bothering with optimization ([?]200 requests per page)
       | 
       | What amount of SQL queries per page render is considered
       | sensible?
       | 
       | When I run more than 20 queries per request in my Rails apps
       | (smallish internal tools for different companies) I get uneasy. I
       | usually deploy the app on the same machine where the DB (not
       | SQLite) runs, but I imagine if that weren't the case the app-DB
       | roundtrips could soon dominate the whole thing.
        
         | renewiltord wrote:
         | The sqlite position on this is articulated here
         | https://sqlite.org/np1queryprob.html
         | 
         | > _N+1 Queries Are Not A Problem With SQLite
         | 
         | >
         | 
         | > The SQLite database runs in the same process address space as
         | the application. Queries do not involve message round-trips,
         | only a function call. The latency of a single SQL query is far
         | less in SQLite. Hence, using a large number of queries with
         | SQLite is not the problem._
        
         | deckard1 wrote:
         | > app-DB roundtrips could soon dominate the whole thing
         | 
         | and they do. We have a GraphQL API backed by multiple
         | microservices at a place I work. You become increasingly
         | paranoid at every call you need to make.
         | 
         | It's also where ORMs completely fall over. ORMs are designed
         | for, generally, one record to one query mapping. Which is why
         | they are _always_ leaky abstractions which need heroic acts of
         | clever hacking to overcome. You know, rather than doing the
         | sensible thing and just writing a single query that fetches
         | data from multiple tables.
         | 
         | SQLite sort of encourages many queries, though, due to their
         | design. I recall their docs even mention this point. But I've
         | seen a few people in various threads that get bitten by this
         | when they switch to a client/server DB.
         | 
         | I think it's perfectly fine to design your app around the
         | characteristics of your database. Otherwise, you're missing out
         | on optimizations and features you could be using and what would
         | even be the point in favoring one DB over another if it's all
         | the same to your app? You should pick the DB that matches the
         | characteristics of your app anyway.
        
         | vbsteven wrote:
         | When using an external db every query needs to cross the
         | connection boundary and thus perform network IO and deal with
         | the threading implications for that IO. With SQLite running in-
         | process a round trip just means a function call, and when data
         | is available in the cache it won't even hit the disk.
        
       | megous wrote:
       | I'd still rather use PostgreSQL for my data needs. psql is quite
       | nice too as an interface, and "setup and forget" replication for
       | everything in the cluster and remote access makes up for slightly
       | harder deployment and upgrades.
       | 
       | Knowing that I can do anything and the change hits 3 hard drives
       | in a span of few seconds on various machnies is nice. Tolerance
       | for arbitrary down time of master/slave clusters is also nice.
       | 
       | But yeah, for quick one-off data munging, sqlite looks quite
       | nice.
        
         | benbjohnson wrote:
         | I agree that PostgreSQL has some great tooling around it. For
         | SQLite replication, you might want to checkout Litestream:
         | https://litestream.io/
        
           | megous wrote:
           | I know about litestream, it's not the kind of "setup and
           | forget" I have in mind. But it's nice that it exists.
        
             | benbjohnson wrote:
             | That's good to know. I'm always trying to improve the
             | developer ergonomics. What would you want to see added to
             | make it "setup and forget"?
        
               | megous wrote:
               | I don't think there's much to be done. I just like the
               | fact that I don't have to think about replicating
               | individual databases, regardless if I rename, delete, re-
               | create them.
               | 
               | It's being done at cluster level, it's one time setup and
               | I can forget about it.
        
       | kragen wrote:
       | This is a great introductory guide!
       | 
       | I like SQLite a lot, but I found SQLite's recursive CTE
       | implementation to be somewhat limited:
       | https://dercuano.github.io/notes/why-html-is-not-a-programmi...
       | 
       | Has that improved?
       | 
       | (sorry about the embarrassing arrogant pedant attitude in that
       | note, but it's too late to fix it now)
        
       | ben509 wrote:
       | > There is a popular opinion among developers that SQLite is not
       | suitable for the web, because it doesn't support concurrent
       | access.
       | 
       | No, the issue is it doesn't have high availability features:
       | failover, snapshots, concurrent backups, etc. (Edit: oops,
       | comment pointed out it does have concurrent backups.)
       | 
       | SQLite isn't a toy DBMS, it's an extremely capable embedded DBMS.
       | An embedded DBMS is geared towards serving a single purpose-built
       | client, which is great for a desktop application that wants a
       | reliable way to store user data.
       | 
       | Once you have multiple clients being developed and running
       | concurrently, and you have production data (customer accounts
       | that are effectively legal documents that _must_ be preserved at
       | all times) you want that DBMS to be an independent component. It
       | 's not principally about the concurrent performance, rather it's
       | the administrative tasks.
       | 
       | That requires a level of configuration and control that is
       | contrary to the mission of SQLite to be embedded. They don't, and
       | shouldn't, add that kind of functionality.
        
         | benbjohnson wrote:
         | I agree that high availability features are outside of the
         | goals of an embedded database. There's an ecosystem of tools
         | SQLite to provide these benefits though. There's dqlite &
         | rqlite for providing HA over SQLite. I'm the author of
         | Litestream[1] which provides streaming replication to S3 for
         | SQLite databases so folks can safely run a single node
         | instance.
         | 
         | [1]: https://litestream.io/
        
           | hinkley wrote:
           | One could, for instance, run "static" sites with the data
           | stored in a SQLite database. Update on write apps often can
           | do okay with 90% uptime. It's the "generate everything in
           | every request" crowd that needs an HA solution and _why do we
           | keep doing this to ourselves_?
        
             | benbjohnson wrote:
             | Yes, static generation is a great way to go if that works
             | for your use case. Throw it up on a CDN and you'll likely
             | have five 9s of uptime. I don't think that's a viable
             | solution for a lot of applications though.
        
               | hinkley wrote:
               | Half of arguments on the Internet are someone trying to
               | win an argument they lost elsewhere. If I could get some
               | people I work with to actually run the numbers on how
               | much it costs us to make everything dynamic, versus how
               | much we earn versus static, I think we'd quickly arrive
               | at an agreement that most of it should be static with a
               | little AJAX thrown in for the 10% that needs to be
               | special.
        
         | kroggen wrote:
         | For replication of SQLite there are some options:
         | 
         | http://litereplica.io - single-master replication
         | 
         | http://litesync.io - multi-master replication
         | 
         | https://aergolite.aergo.io - highest security replication,
         | using a small footprint blockchain
        
           | pbowyer wrote:
           | Have you used litereplica or litesync? It doesn't inspire
           | confidence that their websites haven't been updated since
           | 2016 and 2017 respectively.
        
             | kroggen wrote:
             | Most things happen in the forums:
             | 
             | http://litesync.io/forum/
             | 
             | http://litereplica.io/forum/
        
         | jes5199 wrote:
         | no, those things have never stopped me. What has stopped me is
         | that the default configuration for a Rails app throws
         | intermittent exceptions when I'm trying to save, because
         | concurrent writes fail and Rails (at least as of a couple years
         | ago) does the wrong thing by default
        
         | jyrkesh wrote:
         | I learned this the hard way when I stuck some app containers
         | down on a few RPis and mapped the folder of stateful stuff
         | (including a SQLite database) to an NFS share on my NAS.
         | 
         | It's....not great.
        
           | pdimitar wrote:
           | Can't remember where did I read this now but about a year ago
           | I have found material saying that sqlite3 over NFS is a very
           | bad idea and even though it's supported it's also strongly
           | advised against as a use-case.
        
         | wenc wrote:
         | It also doesn't have strong/static typing (it's dynamically
         | typed) so you have to typecheck your inputs or do type coercion
         | upon read.
         | 
         | And it doesn't have a native date type. Date handling has to be
         | handled at the application layer. It can be tricky to do
         | massive time-series calculations or date-based aggregations.
         | 
         | You can use integers or text types to represent dates, but this
         | open-endedness means you can't share your db because everyone
         | implements their own datetime representation.
        
           | samatman wrote:
           | Datetimes really are the achilles heel of SQLite.
           | 
           | The JSON extension library is amazing and works well. If
           | SQLite were to grow a first-rate RFC 3339 library, one which
           | could read from tz when available and do the things which
           | strftime cant, acting on your choice of Unix timestamp and
           | valid RFC 3339 date string, this would be a real boon to the
           | ecosystem.
           | 
           | I haven't found typechecking inputs to be a real barrier.
           | Sure, `val INTEGER CHECK (val = 0 or val = 1)` is a long way
           | to spell `val BOOLEAN` but `CHECK json(metadata)` is a
           | reasonable way to spell `metadata JSON`, and a similar
           | function would surely exist for a SQLite datetime extension.
           | You can do it now by coercing the string through an expected
           | format, but that doesn't generalize well.
        
           | names_are_hard wrote:
           | This. I'm working on a hobby project that is essentially a
           | web app for personal use. I started with sqlite because it
           | was the simplest to start with, but I'm about to reluctantly
           | migrate to postgres. Why? Constraints, data types, and
           | programmability.
           | 
           | In terms of scale, sqlite is just fine. But I am tired of
           | fiddling with the dates, it's too easy for bugs to sneak into
           | my code, and I want to use table valued functions to
           | essentially parameterize views instead of having to build
           | complex queries in the app layer.
           | 
           | If your web app is mostly reading and writing single rows,
           | yeah, sqlite is just fine. But if there's substantial and
           | complex logic involved, it has its limits.
        
         | smulc wrote:
         | Check out rqlite (https://github.com/rqlite/rqlite) for that
         | functionality - "rqlite is a lightweight, distributed
         | relational database, which uses SQLite as its storage engine.
         | Forming a cluster is very straightforward, it gracefully
         | handles leader elections, and tolerates failures of machines,
         | including the leader. rqlite is available for Linux, macOS, and
         | Microsoft Windows."
        
           | emodendroket wrote:
           | To me this is obviating a lot of the advantage of SQLite over
           | any other RDBMS.
        
             | dmurray wrote:
             | I thought the same, but maybe it's useful if you start with
             | SQLite and decide to scale up to a distributed RDBMS
             | without having to rewrite too much?
        
               | msla wrote:
               | Two thoughts:
               | 
               | SQLAlchemy, at least, lets you switch from SQLite to
               | Postgres in a configuration file.
               | 
               | Isn't there a useful SQL subset which allows you to
               | switch from one database to another without rewriting?
               | There seems to be such a subset for C, for example, which
               | multiple compilers all interpret the same way, and SQL is
               | a standardized language, too.
        
           | the_duke wrote:
           | rqlite is a separate daemon written in Go, negating most of
           | the reasons to choose Sqlite in the first place.
           | 
           | It absolutely has good use cases, but those are rather niche.
           | You'll mostly be better off with the traditional postgres
           | etc.
        
         | hexmiles wrote:
         | > "No, the issue is it doesn't have high availability features:
         | failover, snapshots, concurrent backups, etc."
         | 
         | I think it has concurrent backups via the backup api:
         | https://www.sqlite.org/backup.html
        
           | formerly_proven wrote:
           | I'd expect this to be unnecessary in WAL mode (which you
           | should use, when possible), since WAL allows concurrent
           | readers while permitting up to one writer. In the old undo-
           | log mode (which remains the default for compatibility)
           | writing excluded readers.
        
             | nayuki wrote:
             | I don't think you can do naive file copy backups even in
             | WAL mode. WAL will checkpoint the log into the main
             | database file from time to time. You need to use the SQLite
             | backup API to let you make a backup while blocking any
             | potential checkpointing.
        
               | formerly_proven wrote:
               | Sorry, I was unclear. I meant the "copy a few pages then
               | sleep for some time to release the database lock" part.
               | With WAL mode you should just be able to copy as fast as
               | you can without disturbing other readers or the writer.
               | SQLite also has a similar SQL built in, VACUUM INTO.
               | That's nice because the backup API just copies all pages
               | (iirc).
        
         | jksmith wrote:
         | From the dbadmin POV and understandable. From the dev side, if
         | you want to take a stab at your own failover, shards,
         | concurrents and snaps, there may be no better platform to learn
         | on. Writing a custom backend to do tricks with sqlite dlls is
         | extremely satisfying.
        
         | nijave wrote:
         | I think the biggest issue can be distilled to "lack of
         | concurrent network access". You start getting into a lot hairy
         | management problems with distributed file systems/copying files
         | around.
         | 
         | Imo part of that is just web applications tend to be not very
         | efficient when compared to something like unix cli utilities so
         | to get performance you end up with potentially massive amounts
         | of horizontal scaling
         | 
         | On the other hand, the lack of performance /usually/ buys you
         | higher productivity so you can make product changes faster (you
         | let a GC manage the memory to save time coding but introduce GC
         | overhead)
        
           | therealdrag0 wrote:
           | Right, and Expensify plausibly solved this by wrapping SQLite
           | to produce a different DB, BedrockDb. But that shouldn't
           | imply at all that SQLite alone could satisfy main stream web
           | applications.
        
         | habibur wrote:
         | Not to misunderstand. It fully supports concurrent reads. And
         | also concurrent reading while some other process is writing the
         | same database.
         | 
         | What it doesn't support is concurrent writes only.
        
         | bravura wrote:
         | A kludge---I forget from where---for low traffic sites is one
         | SQLite DB per user.
        
         | nalgeon wrote:
         | I think that the absence of 'high availability' is not an issue
         | for small websites or web apps. Transactions are ACID,
         | concurrent readers are fully supported. Backups and
         | administrative tasks are super-easy.
        
           | zokier wrote:
           | I see availability as orthogonal issue to scale/size. You
           | might be processing one transaction per day but it can still
           | be super important that the service is available for that one
           | transaction.
        
           | mikesabbagh wrote:
           | I ran it once on an NFS disk. There must be some file locking
           | happening I think. But I could read from multiple nodes, but
           | for some reason the delete was failing (could not delete a
           | database). Did not test the write as my app at that time was
           | writing from one location only. Anyone tested such HA
           | scenario?
        
           | gwd wrote:
           | I just looked, and the entire database for my webapp (which
           | typically serves around ~50 people) is 139k. As you say, you
           | can do a data dump quickly and safely; at that size, you
           | could afford to just dump the entire DB every hour and keep
           | the last year's worth of snapshots if you wanted.
        
             | jakoblorz wrote:
             | https://litestream.io/ That's exactly what litestream
             | accomplishes.
        
               | benbjohnson wrote:
               | That's close. Litestream takes a snapshot of the database
               | and then continuously replicates out WAL frames. On
               | restore, it grabs the snapshot and replays all those WAL
               | frames (or all the frames up to a certain time if you
               | want point-in-time recovery).
        
           | tmountain wrote:
           | How do backups work? Is there a locking mechanism to prevent
           | file corruption if the file is copied during writing?
        
             | nijave wrote:
             | You can also combine file system snapshots to get
             | versioning which would allow you to pull the db file out of
             | the latest snapshot (right after taking it) and send it
             | somewhere
        
             | sangnoir wrote:
             | It supports multiple readers, and writes are atomic - no
             | reader should ever get corrupt data just because there is a
             | write happening in parallel. You probably should not use
             | the OS to copy the live db file, rather, have an in-process
             | thread that reads the db and writes to a backup location
             | periodically.
        
               | mrweasel wrote:
               | Exactly, many forget the little detail that they should
               | use the backup feature, rather than just copy the
               | database file.
        
           | crazygringo wrote:
           | So if you're saying everything _does_ support a web
           | database... then what 's the reason people aren't using it
           | for websites?
           | 
           | Why do you say it works for "small" websites but presumably
           | not large ones? If it's not transactions, concurrent reading,
           | backups, or administrative tasks... then what's the issue you
           | run into?
           | 
           | Genuinely curious... I'm wondering if everything I've heard
           | about "don't use SQLite for websites" is wrong, or when it's
           | right?
        
             | dbattaglia wrote:
             | I think one valid fear is that a web application might, for
             | multiple reasons, scale beyond a single process on a single
             | server. This is common enough that using an embedded
             | database can be problematic, compared to a separate RDBMS
             | process that can be shared between processes. Some web
             | applications are just never going to scale out for any
             | reason, and for those SQLite might be appropriate.
        
               | danenania wrote:
               | Agreed, but it's not only a question of scaling. Any app
               | with high uptime requirements will need more than a
               | single process from day 1.
        
               | benbjohnson wrote:
               | It depends on what you consider high uptime. You can
               | achieve 99.95% uptime with 4h of downtime a year. A lot
               | of downtime occurs because of overly complicated systems
               | so running a single process on a single server can give
               | you relatively high uptime.
        
             | 0xCMP wrote:
             | The primary reason for me is that Postgres has stronger
             | default constraints. If you care about keeping your data
             | logically consistent then Postgres has more of that out of
             | the box.
             | 
             | SQLite just makes the tradeoff to be simpler since often it
             | doesn't matter. But don't make the mistake that it doesn't
             | matter. Since PG helps avoid data problems and you might
             | need to scale out web servers that is why Django for
             | instance recommends switching to Postgres (or whatever
             | you're actually going to use) ASAP cause there are
             | differences. You may end up relying on PG to reject things
             | SQLite doesn't care about by default. SQLite might let you
             | get away with inserting data which PG refuses to handle.
             | 
             | Not to mention the DB specific features can differ. Like
             | PG's JSON field types or etc.
        
               | pdimitar wrote:
               | Not sure why you were downvoted because that is a
               | legitimate concern and it's my only beef with sqlite3. I
               | will kill for an embedded PostgreSQL. If sqlite3 becomes
               | that I'll absolutely pay for a license if they require
               | it.
        
             | yellowapple wrote:
             | > then what's the reason people aren't using it for
             | websites?
             | 
             | I'd guess the reason to be that people keep hearing things
             | like "don't use SQLite for websites" and thus don't even
             | try.
             | 
             | > Why do you say it works for "small" websites but
             | presumably not large ones?
             | 
             | Not the GP, but the main reason I _wouldn 't_ use SQLite
             | for a large website is that SQLite itself doesn't offer
             | much re: failover/replication (i.e. multiple servers, one
             | database), and I haven't used RQLite enough (or at all; I
             | should fix that) to be comfortable with it in production.
             | Because of that, I'm more likely to reach for / recommend
             | PostgreSQL instead.
             | 
             | That being said, if your website has crazy "web scale"
             | FAANGesque needs and you're at the point where you need to
             | write your own replicated datastore, using SQLite as a base
             | and building your own replication layer on top of it (or
             | using RQLite and maybe adjusting it for your needs) seems
             | like a reasonable way to go.
        
               | benbjohnson wrote:
               | I'm in the process of adding read replication to
               | Litestream[1] so folks can scale out the read-side of
               | their SQLite applications to multiple nodes (or replicate
               | to edge nodes for low-latency).
               | 
               | [1]: https://litestream.io/
        
               | pdimitar wrote:
               | Amazing. This was missing for my use-case and I gave up
               | on LiteStream because of it but now it seems I have to
               | revisit!
               | 
               | Wonderful job.
        
               | gm3dmo wrote:
               | Crazy web scale folks flip from Mongo (NoSQL) to
               | Postgres:
               | https://www.theguardian.com/info/2018/nov/30/bye-bye-
               | mongo-h...
        
               | pbowyer wrote:
               | > That being said, if your website has crazy "web scale"
               | FAANGesque needs and you're at the point where you need
               | to write your own replicated datastore, using SQLite as a
               | base and building your own replication layer on top of it
               | (or using RQLite and maybe adjusting it for your needs)
               | seems like a reasonable way to go.
               | 
               | Exactly what Bloomberg did with Comdb2:
               | https://github.com/bloomberg/comdb2
        
             | danielheath wrote:
             | I would use SQLite on a single-machine web server. However,
             | I prefer to build my web stuff as stateless machine images
             | that connect to a separate database instance, because it
             | lets me scale them independently.
             | 
             | I should revisit this policy now that you can run a truly
             | huge site off a 1U slot (I work for an alexa top10k, and
             | our compute would fit comfortably in 1U); computers are so
             | fast that vertical scaling is probably a viable option.
        
               | pdimitar wrote:
               | That's exactly what I am leaning to in my current job.
               | Looking at the load of our servers I can comfortably put
               | all our software on my gaming machine (which is mid-
               | range!) -- with the DB included -- and I bet no request
               | ever will be above 100ms.
               | 
               | IMO a lot of organizations should start re-investing in
               | on-premise as well. Having a mid-range AMD EPYC server
               | can serve most businesses out there without ever having
               | more than 40% CPU usage.
               | 
               | That, plus scaling down Kubernetes clusters. Most
               | companies absolutely didn't need them in the first place.
        
             | emodendroket wrote:
             | If writes are infrequent I think it might be fine.
             | Otherwise I'd be wary.
        
             | jakoblorz wrote:
             | Depending on the use case (SaaS offerings with per customer
             | shards) you can actually scale sqlite quite high. Expensify
             | scaled to 4 mio qps (queries per second) [1], so if one of
             | your customers is exceeding that, you better be looking at
             | another DBMS but below that - well sqlite is an option.
             | 
             | [1] https://blog.expensify.com/2018/01/08/scaling-sqlite-
             | to-4m-q...
        
           | voxic11 wrote:
           | Also if you really need HA there are nice projects like
           | https://dqlite.io/
        
             | SahAssar wrote:
             | Looking at https://dqlite.io/docs/protocol it seems like it
             | does not allow for embedded usage though. Doesn't this turn
             | it into more of a traditional client-server RDBMS?
        
               | yawaramin wrote:
               | From the home page:
               | 
               | > Dqlite is a fast, embedded, persistent SQL database
               | with Raft consensus that is perfect for fault-tolerant
               | IoT and Edge devices.
        
               | rstat1 wrote:
               | Looking at the archtecture page [0] on that same website
               | the second heading is "Embedding".
               | 
               | [0] https://dqlite.io/docs/architecture
        
               | SahAssar wrote:
               | Ah, right, I should have looked deeper!
        
         | TheRealPomax wrote:
         | Not sure I understand: sqlite is file based, so snapshots and
         | concurrent backups are literally just file copies/backups.
         | 
         | I'd much rather SQLite not waste its time on implementing
         | features they're not good at, leaving that to the tools we
         | already have available for rolling file backups, instead
         | spending their time and effort on offering the best file-based
         | database system they can.
         | 
         | (Heck, even failover is just a file copy initiated "when your
         | health check sees there's a problem")
        
           | simonw wrote:
           | Unfortunately backing up with a simply file copy operation
           | isn't guaranteed to work if you have write traffic at the
           | same time.
           | 
           | Instead you need to use the .backup mechanism or the VACCUM
           | INTO command, both of which safely create a backup copy of
           | your database in another file - which you can then move
           | anywhere you like.
        
             | majkinetor wrote:
             | Depends on the file system. On Windows you can use shadow
             | copy for example.
        
             | mceachen wrote:
             | Oh man, I didn't know about VACCUM INTO, thanks!
        
           | emodendroket wrote:
           | Every database is ultimately on the file system but there's a
           | reason that method of backup is rarely used.
        
             | dnautics wrote:
             | Postgres lets you write directly to block. In practice,
             | nobody does this though.
        
         | gm3dmo wrote:
         | >Once you have multiple clients being developed and running
         | concurrently, and you have production data (customer accounts
         | that are effectively legal documents that must be preserved at
         | all times) you want that DBMS to be an independent component.
         | It's not principally about the concurrent performance, rather
         | it's the administrative tasks.
         | 
         | For my reporting/read-only use case https://datasette.io/
         | solves the above beautifully.
        
       | D13Fd wrote:
       | My only "problem" with SQLite was that it was so fast, running
       | locally, that it hid just how much my app was needlessly hitting
       | the database. It was a surprise when I transitioned to a
       | networked Postgres server and performance completely tanked due
       | to my crappy code...
        
         | re wrote:
         | I know you put "problem" in quotes, but in case you haven't
         | seen it, this document was posted here a little while ago:
         | https://sqlite.org/np1queryprob.html
         | 
         | It describes how access patterns that would be bad practices
         | with networked databases are actually appropriate for SQLite as
         | an in-memory DB.
        
           | D13Fd wrote:
           | That's interesting! Now I have to resist the urge to move
           | back...
        
       | [deleted]
        
       | agumonkey wrote:
       | Someone told me they build a world class DNS mostly around
       | sqlite. I guess you left the toy area by that point.
        
       | michaelmcmillan wrote:
       | I've built a complex CRM that handles 2.1 million USD in
       | transactions every year. It is running sqlite with a simple in-
       | memory lru cache (just a dict) that gets purged when a mutating
       | query (INSERT, UPDATE or DELETE) is executed. It is very simple
       | and more than fast enough.
       | 
       | Friendly reminder that you shouldn't spend time fine tuning your
       | horizontal autoscaler in k8s before making money.
        
         | outworlder wrote:
         | > Friendly reminder that you shouldn't spend time fine tuning
         | your horizontal autoscaler in k8s
         | 
         | Oh, but most companies need this!
         | 
         | The biggest feat of microservices (which require ways to manage
         | them, like k8s) was to provide the ability for companies to
         | ship their organizational chart to production.
         | 
         | If you don't need to ship your org chart and you can focus on
         | designing a product, then you can go a long way without overly
         | complicating your architecture.
        
         | midrus wrote:
         | Do you mean I don't need Go microservices talking gRPC deployed
         | in multiple kubernetes clusters with bash script based
         | migrations via GitOps with my hand made multi cloud automation
         | (in case we move clouds) following all the SCRUM practices to
         | ship working software?
         | 
         | Mindblowing.
        
           | kwertyoowiyop wrote:
           | You will for your blog series that you mention prominently on
           | your resume that gets you your next Senior Architect gig.
        
             | midrus wrote:
             | Agh!... that's the catch... what I'm gonna give talks about
             | and what do I write on medium then.... now I get it.
             | Thanks!.
        
           | anticristi wrote:
           | > my hand made multi cloud automation (in case we move
           | clouds)
           | 
           | Aren't these symptoms of a deeper problem? Many Product
           | Manager I talk to wanted me to build something that is as
           | flexible as possible and solves all problems for everyone,
           | everywhere. Go microservices with gRPC and Kubernetes feels
           | like the only high-level technical decisions I can take in
           | light of such information. :)
        
         | juancampa wrote:
         | How do you ensure data is not lost to oblivion if a
         | catastrophic system failure occurs?
        
           | pupdogg wrote:
           | You put in-place a loss mitigation strategy. This strategy
           | will vary by application. In my case, I have a similar setup
           | where we write 25-30k records to SQLite daily. We start each
           | day fresh with a new SQLite db file (named yyyy-mm-dd.db) and
           | back it up to AWS S3 daily under the scheme
           | /app_name/data/year/month/file. You could say that's 9
           | million records a year or 365 mini-sqlite dbs containing
           | 25-30k records. Portability is another awesome trait of
           | SQLite. Then, at the end of the week (after 7 days that is),
           | we use AWS Glue (PySpark specifically) to process these
           | weekly database files and create a Parquet (snappy
           | compression) file which is then imported into Clickhouse for
           | analytics and reporting.
           | 
           | At any given point in time, we retain 7 years worth of files
           | in S3. That's approx. 2275 files for under $10/month.
           | Anything older, is archived into AWS Glacier...all while the
           | data is still accessible within Clickhouse. As of right now,
           | we have 12 years worth of data. Hope it helps!
        
             | hodgesrm wrote:
             | This sounds interesting. Have you thought of doing a talk
             | or blog article about it?
             | 
             | p.s., I run the SF Bay Area ClickHouse meetup. Sounds like
             | an interesting topic for a future meeting.
             | https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-
             | Mee...
        
           | michaelmcmillan wrote:
           | Backups.
        
             | antoinealb wrote:
             | Does that mean it's okay for your application to loose
             | transactions (which occured between the backup point and
             | the failure point) or do you have other mitigations ?
        
               | Spivak wrote:
               | Isn't that how all backups work? If you _need_ to prevent
               | data loss then backups probably aren 't your tool of
               | choice. And if you're paranoid about data loss then any
               | replication lag is also unacceptable.
               | 
               | * I'm worried about my server blowing up: Transactions
               | have to be committed to more than one DB on separate
               | physical hosts before returning.
               | 
               | * I'm worried about my datacenter blowing up:
               | Transactions have to be committed to more than one DB in
               | more than one DC before returning.
        
               | benbjohnson wrote:
               | I'm the author of Litestream, which is an open-source
               | tool for streaming replication for SQLite. That could be
               | a good option if you need to limit your window for data
               | loss. We have a pretty active Slack if you need help
               | getting up and running. https://litestream.io/
        
               | [deleted]
        
               | habibur wrote:
               | Guess this daily 2 seconds of downtime is worth it, when
               | that reduces cost say from $2000/month to $20/month.
        
               | mrweasel wrote:
               | Many banks still "shutdown" for hours every night to do
               | backups.
        
               | finiteseries wrote:
               | I'm not anywhere _near_ the banking industry but from HN
               | alone I've been led to believe dailyish huge file
               | transfers are also the norm in a variety of situations
               | (aka SQLite's backup strategy).
        
               | ericlewis wrote:
               | ftp or sftp if you're lucky - upload a giant CSV or
               | receive one. It is _crazy to me it still works this way_.
        
       | Derpdiherp wrote:
       | ---Was talking about a previous company who made steps to shift
       | from it for pricing reasons. Seems they where misinformed.
        
         | eurleif wrote:
         | Huh? SQLite is public domain.[0] There are paid extensions for
         | encrypted and compressed databases[1], but those cost a flat
         | fee for unlimited devices:[2] "Your license is perpetual. You
         | have paid a one-time fee that allows you to use and modify the
         | software forever. You can ship as many copied of the software
         | to your customers as you want so long as you ensure that only
         | compiled binaries are shipped (you cannot distribute source
         | code) and that your customers cannot make additional copies of
         | the software to use for other purposes."
         | 
         | [0] https://www.sqlite.org/copyright.html
         | 
         | [1] https://www.sqlite.org/prosupport.html
         | 
         | [2] https://sqlite.org/see/doc/release/www/readme.wiki
        
           | Derpdiherp wrote:
           | Interesting - it's been a long time since I used / looked at
           | it. There was a big uproar in the company though and a drive
           | to move away from it due to licensing and cost from there.
           | Maybe they where misinformed.
        
             | masterofmisc wrote:
             | I would say terribly so.
        
         | tomcam wrote:
         | How is it not free for commercial use? It's public domain code.
         | I just rechecked the license after reading your comment.
        
       | qmmmur wrote:
       | Did anyone say it was?
        
       | qwerty456127 wrote:
       | SQLite is beautiful. Almost a perfect database for all single-
       | user and simple multi-user projects. I just wish it had some
       | built-in (not as an extension) encryption, native (non-DIY) date-
       | time and UUID types - it would actually be perfect then.
        
       | unnouinceput wrote:
       | SQLite is definitely not a toy DB when it comes to its features,
       | none will deny that. However, when you start to deal with
       | hundreds of millions of records in a table, you kinda consider it
       | a toy.
        
         | _wldu wrote:
         | It is the most widely deployed DB in the world. That fact,
         | alone, shows that it is not a toy DB.
        
         | nijave wrote:
         | Sqlite can handle that
         | https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...
         | 
         | I've been toying around with it locally for playing with big
         | data sets (50-100s GB) uncompressed and it works pretty well.
         | It's much easier than postgres and mysql which have a lot more
         | knobs and tuning required)
        
       | RedComet wrote:
       | It seems as though there is a new article about this daily.
        
       | JimmaDaRustla wrote:
       | Weird. They never mentioned the amazing power of inserting text
       | and numbers into a boolean column.
        
         | justin_oaks wrote:
         | I assume you're being sarcastic because I haven't yet thought
         | of a reason why that'd be a good thing. Although I've never
         | been bitten by SQLite's dynamic types, I wish they weren't a
         | thing.
         | 
         | I'm genuinely curious if there are any real use-cases for this
         | behavior.
        
       | flyflyFenix wrote:
       | I was a fan of SQLite until I needed a full outer join of two
       | tables. It was such a pain to create a workaround when I found
       | out it isn't supported, and the result was still sub-optimal and
       | needlessly complex. Is running a full outer joins such a
       | unreasonable assumption within a DBMS?
        
         | justin_oaks wrote:
         | I hear you. MySQL still doesn't have full outer joins either.
         | 
         | I would think this would be one of the easier features to add.
         | 
         | Of course, I'm sure the people working on MySQL or SQLite would
         | say "Patches welcome!"
        
       | snicker7 wrote:
       | > SQLite is serverless.
       | 
       | Maybe if you use SQLite as a file format. But if you use it like
       | an actual database (e.g. in a web application), I find that one
       | is best off setting up a daemon thread to queue/batch
       | transactions.
        
         | II2II wrote:
         | One of the disappointing developments of the past quarter
         | century is the near demise of general purpose databases as an
         | end user application. Yes, client/server models are useful when
         | dealing with a large number of transactions. On the other hand,
         | it is usually too complex to justify for personal or small
         | office use.
         | 
         | I miss the days when databases were included in office suites
         | or could be purchased as relatively inexpensive standalone
         | applications since it was easy to create a database and
         | associated forms, rather than depending upon domain specific
         | applications that use a database as a file format but may be
         | poorly suited for a particular scenario.
         | 
         | (Yes, the phrasing "file format" verses "actual database"
         | rubbed me the wrong way.)
         | 
         | Edit: I realized that my post comes off as dismissive of
         | SQLite, which is not entirely true. I use it as a database and
         | appreciate it's role, but embedding it in an application is
         | frequently more than a given application requires.
        
           | fpoling wrote:
           | Access DB is still included in Microsoft Office
        
             | OldHand2018 wrote:
             | Only some versions of Office, mainly the non-cheapest
             | business versions.
        
             | zokier wrote:
             | And LibreOffice has Base, a sadly often neglected
             | component.
        
               | II2II wrote:
               | Another issue with Base is the required JRE. Installing
               | the JRE may be trivial, but it means that an error
               | message is the first experience most people will have
               | with Base.
               | 
               | As for Access, it's access is limited due to the much
               | higher price point of Business/Professional versions of
               | Office. In terms of office suites, that version is about
               | two to three times more expensive than an equivalent
               | office suite from the mid-1990's (adjusted to 2021
               | dollars). In other words, people will only have Access if
               | they feel that the additional cost is justified.
               | 
               | That additional cost may be fine for business use, yet it
               | also means that people have less exposure to databases to
               | start with. With respect to exposure, there also appears
               | to be an absence of general purpose databases for home
               | users these days. (By that I mean in terms of cost and
               | ease of use.)
        
               | zokier wrote:
               | > Another issue with Base is the required JRE
               | 
               | I thought that they dropped JRE requirement when the
               | engine was switched to Firebird? Googling around
               | apparently the transition wasn't quite successful :(
               | 
               | https://ask.libreoffice.org/en/question/279711/firebird-
               | dead...
        
               | II2II wrote:
               | The only "embedded database" option (HSQLDB) requires the
               | JRE. I don't know if this holds true for the "connect to
               | an existing database" options, but I would rather throw
               | together a Python script using SQLite when it reaches
               | that level of complication. (It uses my existing
               | knowledge.)
        
               | zokier wrote:
               | > The only "embedded database" option (HSQLDB) requires
               | the JRE
               | 
               | Firebase is still available as an _experimental_ option
               | as another embedded database, just confirmed on my
               | installation. Also while I was at it, tested it with Java
               | disabled and HSQLDB predictably did not work, but
               | Firebird did continue to work. Apparently it also opens
               | an pre-existing Firebase odb file even without
               | experimental flag being on.
               | 
               | https://ibb.co/jM6k07w
        
         | code-scope wrote:
         | I use a different pattern. A lot sqlite files for diff purposes
         | (UserSession, User Files, etc each store in separate files)
         | This way diff threads of webserver can open/query/read/write a
         | lot of files concurrently without any issue.
        
           | snicker7 wrote:
           | > without any issue
           | 
           | It's easier than you think to corrupt SQLite if you access
           | from multiple threads and especially from multiple processes
           | (yup, I've done this before).
           | 
           | Also, there are no concurrent transactions in SQLite. The
           | entire db file gets locked (using POSIX locking, which is
           | known to be broken [0]). Better to queue/batch transactions
           | on a single connection. If your web server consists of
           | multiple processes, then this requires a separate daemon.
           | 
           | [0]: http://0pointer.de/blog/projects/locking.html
        
           | saalweachter wrote:
           | I, too, enjoy creating deadlocks.
           | 
           | [I assume if you are using this pattern successfully in
           | production you are already aware of and taking proper steps
           | to avoid them, but the pedant in me takes issue with "without
           | any issue", since once you have multiple resources being
           | locked in multiple threads, you need to be careful to acquire
           | and release locks in such a way that deadlocks do not occur,
           | either never acquiring more than one lock at a time,
           | acquiring locks in specific orders, or acquiring batches of
           | locks at a time.]
        
       | codykochmann wrote:
       | The thought that devs believe javascript would ever be fast
       | enough to make sqlite actually need to run in parallel makes me
       | giggle.
        
       ___________________________________________________________________
       (page generated 2021-03-25 23:00 UTC)