[HN Gopher] How Does Sqlite Work? (2014) ___________________________________________________________________ How Does Sqlite Work? (2014) Author : searchableguy Score : 267 points Date : 2020-06-27 17:17 UTC (5 hours ago) (HTM) web link (jvns.ca) (TXT) w3m dump (jvns.ca) | dang wrote: | Discussed at the time: | https://news.ycombinator.com/item?id=8378894 | gtrubetskoy wrote: | The details of how the data is stored are described in this 1979 | paper by Douglas Comer: "The Ubiquitous B-Tree" [1]. Since it was | invented, much hasn't really changed: practically all relational | databases use some variation of this structure. I wrote about it | in [2], which might help explain some aspects of how it works. | | [1] https://dl.acm.org/doi/10.1145/356770.356776 | | [2] https://grisha.org/blog/2013/05/11/relational-database-on- | to... | | Edit: also [3] is about how to store SQLite in a Redis hash | (rather than on-disk file). | | [3] https://grisha.org/blog/2013/05/29/sqlite-db-stored-in-a- | red... | oscargrouch wrote: | > I'm still not super clear on how exactly the interior pages are | structured and how the pointers to their child pages work. It's | time to sleep now, but perhaps that will happen another day! | | Luckily. The SQLite source code is very well documented, and the | 'btreeInt.h' has something about how the btree pointers are laid | out | | I happen to use the SQLite btree directly without the SQL layer | (the VDBE) as a key-value store on top of a userspace filesystem | (chrome cachefs) and even with this configuration it works | wonderfully. ** The basic idea is that each page | of the file contains N database ** entries and N+1 | pointers to subpages. ** ** | --------------------------------------------------- | ------------- ** | Ptr(0) | Key(0) | Ptr(1) | Key(1) | | ... | Key(N-1) | Ptr(N) | ** | ---------------------------------------------------------------- | ** ** All of the keys on the page that Ptr(0) points to | have values less ** than Key(0). All of the keys on page | Ptr(1) and its subpages have ** values greater than Key(0) | and less than Key(1). All of the keys ** on Ptr(N) and | its subpages have values greater than Key(N-1). And ** so | forth. ** ** Finding a particular key requires | reading O(log(M)) pages from the ** disk where M is the | number of entries in the tree. ** ** In this | implementation, a single file can hold one or more separate | ** BTrees. Each BTree is identified by the index of its root | page. The ** key and data for any entry are combined to | form the "payload". A ** fixed amount of payload can be | carried directly on the database ** page. If the payload | is larger than the preset amount then surplus ** bytes are | stored on overflow pages. The payload for an entry ** and | the preceding pointer are combined to form a "Cell". Each | ** page has a small header which contains the Ptr(N) pointer and | other ** information such as the size of key and data. | | I guess if she go straight to use the btree layer, bypassing the | SQL layer, it will be easier to figure it out how the pages and | pointers works and relate to each other. | | After this, try it out with the SQL and see how it works. | | --- | | Edit: just to add to this, this line here | | > Each BTree is identified by the index of its root page | | When you call 'sqliteBtreeCreateTable()' it will return to you | the root page of the table it creates. In my experience it starts | from 2, and my guess is that 1 is probably some master root page | (i bet Julia will dig into this later). | | This is how i've manage to have "keyspaces" here. i use the first | created table as a meta-table, keeping the index of keyspace | names with their root page number counterparts, and on opening | the db, is just a matter of recovering this info and put it in a | hash table (keyspace => page number) for later use. | dunham wrote: | Yeah, if I remember right the first block is the root of the | master table, listing all the other tables. (Name, schema, root | block id.) | zoomablemind wrote: | The SQLite code is nicely split into modules, and builds together | reatively quickly. So the easier way to explore the codebase is | to load into editor the individual modules at src/ rather than | the aggregated sqlite3.c file which is too big and unmanageable | in the editor. | | IIRC, the heart of the db engine appears to be some sort of vm | that processes the parsed out SQL. Stepping through in debugger | will inevitably capture you in the loop which executes the | statements. | | Of course, there're plentiful design docs, all on | https://sqlite.org. | bntyhntr wrote: | My c is pretty limited and I was able to hack in support for | table prefixed columns in result sets over a night or two from | never seeing the code base before. It was fun to work with! | nurettin wrote: | they also provide an amalgamation header which is generated as | part of their build process, so that you can simply add a .h | and .c file to your project and have the whole sqlite goodness | linked directly into your project. | ed25519FUUU wrote: | > _May you do good and not evil._ | | > _May you find forgiveness for yourself and forgive others._ | | > _May you share freely, never taking more than you give._ | | If only more software engineers started out their projects with | this mindset, the world would be a very different place. | helloiloveyou wrote: | This part impacted me a lot really, almost teared up. This is a | truly important part of how SQLite works | ed25519FUUU wrote: | Me too! I think there's a place for spiritual/meditate | mindset in our work. | stormdennis wrote: | Sounds like an extract from the Our Father. | mosselman wrote: | Why is this down voted? | | > If only more software engineers started out their projects | with this mindset, the world would be a very different place. | | Apparently this is a horrible thing to say? | edoceo wrote: | Not sure about the comment you replied to (is that called | GP?) but I think you've missed this guidelines: | | "Please don't comment about the voting on comments. It never | does any good, and it makes boring reading." | [deleted] | tekstar wrote: | Years ago I spent an evening at Recurse Center (at the time | Hacker School) in NYC. My employer sent me with some recruiters | to chat with participants about hiring possibilities. It was just | so cool. | | That night I met Julia (she and a friend were, if I remember | correctly, trying to boot a handwritten kernel in a VM and | figured it out around 11pm that only the first 300 bytes were | getting loaded into memory). | | I met Andrew Kelley, who had recently written an NES emulator | that translated the code to LLVM (if I remember correctly). | | I briefly met Filippo Valsorda. I think he was the person who, | after I said I wanted to make a bitcoin arbitrage bot, | immediately came up with putting the prices into a matrix and | then computing optimal paths. | | The place just seemed.. magical. I have a family and kids in | Canada but i thought a long time about trying to reorient my life | to NYC for a bit. It was just so cool. | | It reminded me of the joy of learning for the sake of learning, | hacking stuff because it feels like a magic superpower to bring | ideas to life, and the joy of building things you have no | intention on trying to make money off of. | | I'm certain none of those people remember me but I remember the | evening fondly whenever I see any of them show up on HN. Cheers | :) | forgotmypw17 wrote: | I have a lot of respect for Recurse Center, and am a fan of the | work that comes out of it. However, I feel that it is a very | exclusive place, and if you are not good enough, you're just | not welcome. | | I applied for an internship there a couple years ago. After | writing an essay on why I want to do RC, and an interview | where, IIRC, I shared my camera but my interviewers did not, I | was rejected with a boilerplate "not for us" explanation, and | that was that. | | I would've felt much better about it if they had at least | invited me to visit, but this experience gave it a very | corporate and impersonal feel. | epoch_100 wrote: | > I was rejected with a boilerplate "not for us" explanation, | and that was that. | | For what it's worth, RC has struggled with the question of | whether to give personalized feedback to applicants who don't | get in. They did it for awhile, but stopped for a number of | reasons. They explain it here: | https://www.recurse.com/feedback | gentryb wrote: | While I may question their reasoning, I respect them for | putting it out there. Ultimately totally turned off to this | based on the elitism crap. I did enjoy the article quite a | bit though. | 4gotunameagain wrote: | To be honest, elitism in such places keeps them awesome. Not | everything is for everyone, despite how much modern movements | want to believe so. | [deleted] | forgotmypw17 wrote: | I think the strategy you describe works to an extent, but | eventually leads to echo chamber effect. | | I think, perhaps arrogantly, that RC would have benefited | from my pollination. | | We'll both be fine on our own, too. | akavel wrote: | _[citation needed]_ | daseiner1 wrote: | tragedy of the commons | ponker wrote: | There are places where passion isn't enough, only performance | is what counts. The NBA, Recurse Center, Navy Seals, etc. | mi_lk wrote: | I get why you're downvoted, but it's not inaccurate | peruvian wrote: | I can't speak for the NBA or Navy, but it's not true for | RC. I was a pretty meh developer when I got in 2016. | Technical questions in the interviews are a magnitude | simpler than any job interview I've ever had. | | Most people don't have the skills Julia and the other | people the original post mentioned. They're honestly the | 1% of RC's over 1k (I think!) students over the years. | Most of us have little projects that we work on for 12 | weeks in a great environment or learn new things. Most of | us are normal people who are lucky enough to be able to | study programming for 12 weeks in NYC. | mattgreenrocks wrote: | I think the danger here is thinking that RC is the only place | that cool people reside. | | RC has excellent marketing. There needs to be more places | like it rather than everyone thinking that there can only be | one place that everyone tries to get into. | quadrifoliate wrote: | This. Why are there not more places (or for that matter, | workplaces) like RC? | peruvian wrote: | > However, I feel that it is a very exclusive place, and if | you are not good enough, you're just not welcome. | | What do you mean by this? | | RC is a company (a YC company actually) that leases expensive | NYC real estate and bears all the other costs of running a | company while providing its service for free. Unfortunately, | it's not a hangout where anyone can join and thus they have | to limit space. | | As for "if you are not good enough"... all the tech interview | questions (at least in 2016) are easier or simpler than real | job interview questions. The important parts are the soft | skills questions. I "studied" with PhDs, bootcamp grads, and | college students. And personally, I'm a pretty meh developer. | But just like any company, they don't always give the best | feedback on rejections. | seemslegit wrote: | Quite well actually | Animats wrote: | Lookup is easy. Update is hard. Sqlite does quite well at update, | given the limitations it works under. Multiple instances can | update the same database while maintaining strong consistency, | yet there's no common server process that can see all the | transactions. This costs you some speed in updating (don't do | your web service this way) but it's pretty good for most low- | write-traffic use cases. | justinclift wrote: | Part 2: https://jvns.ca/blog/2014/10/02/how-does-sqlite-work- | part-2-... | ramoz wrote: | I encourage anyone interested in this to also explore embedded | dbs (leveldb, badger, lmdb, rocks, etc) | refset wrote: | Embedded KV stores are as comparable to SQLite as Cassandra is | to Postgres (i.e. lacking joins or other data modelling | essentials), but I agree that embedding a database directly | inside your app is interesting and opens up a world of | possibilities for efficiently blending N+1 code and queries. It | gets even more interesting to take it a step further and embed | a networked-replica of a database a la Datomic Peers. | dfox wrote: | It was probably meant as a learning exercise. And for me it | is that implementation details of the KV/heap layer with some | meaningful transactional properties seem more interesting | than how to build SQL/whatever database on top of that, which | in comparison looks like mostly obvious but tedious SMOP. | searchableguy wrote: | Check out sled too which is written in rust. - | https://github.com/spacejam/sled | twic wrote: | Anyone looked at DuckDB [1]? Like SQLite, it's an embedded | relational database with a C API. In fact, it has a wrapper which | presents the same API as SQLite [2]. But it's optimised for | analytical rather than transactional work - a few huge queries | rather than lots of little reads and writes. | | [1] https://duckdb.org/ | | [2] | https://github.com/cwida/duckdb/tree/master/tools/sqlite3_ap... | iagovar wrote: | So I guess I can use SQLite connectors or GUI to play with it? | I was just looking for this. | | Not a dev so I'm a bit dumb with this stuff, but I was | definitely looking for pivoting stuff and such, which requires | a lot of extra steps with sqlite. | mytherin wrote: | DuckDB offers a SQLite compatible interface; we use it | extensively ourselves as it allows us to use the SQLite shell | directly. | | The internal design of DuckDB, however, is different and not | directly related to SQLite. As you mentioned, it is entirely | optimized for analytical workloads whereas SQLite is optimized | for point-queries and point-updates. The main things we share | is that we also have a single-file storage format and offer an | amalgamation (i.e. single-source file compilation, duckdb.cpp | and duckdb.hpp). ___________________________________________________________________ (page generated 2020-06-27 23:00 UTC)