[HN Gopher] SQLite Internals: Pages and B-trees
___________________________________________________________________
 
SQLite Internals: Pages and B-trees
 
Author : eatonphil
Score  : 433 points
Date   : 2022-07-27 13:55 UTC (9 hours ago)
 
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
 
| nathell wrote:
| I dream of a SQLite-like embeddable database engine based on
| Datomic's data model and queryable with Datalog. Written in
| something like C, Rust, or Zig. I'm toying around with the idea
| of hacking something up, but it'll likely stay in the dream
| realm until I have Heaps of Free Time on My Hands (tm).
|
| Looking into SQLite's innards is a great source of inspiration.
| Thanks for this post.
| klysm wrote:
| I'm surprised something like this doesn't exist yet - I wonder
| if it's possible to build it on top of SQLite somehow?
| packetlost wrote:
| I tried. It's not easy because of how limiting SQLite's
| indexes are. You have to build your own indexes using
| `TRIGGER`s or in a software wrapper and tables.
|
| You can see my prototype here:
| https://git.sr.ht/~chiefnoah/quark
| infogulch wrote:
| Commendable attempt! I've considered writing a datalog storage
| backend on sqlite just like your prototype. Thank you for
| sharing; now I can lazily study your prototype instead of doing
| the hard work myself. :) I'm curious, what kinds of limitations
| of SQLite indexes are you referring to?
| packetlost wrote:
| Sparse indexes are pretty limited and it only supports B-tree
| indexes, which makes implementing AVET and VAET difficult.
| Further, efficiently finding the _current_ value for an E + A
| is difficult to do in SQL in a way that doesn't require
| maintaining a whole copy of the data. I actually bumped up
| against what I believe are weird edge-case bugs in the SQLite
| query planner when dealing with sparse indexes as well.
|
| I think I gave up when trying to implement one-many
| relationships because the SQL was getting too gnarly.
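packetlost's point about rolling your own indexes with `TRIGGER`s can be made concrete. The sketch below is a minimal illustration, not taken from the quark prototype, and all schema and names are hypothetical: an append-only EAV assertion log with a transaction instant, plus a trigger that maintains a separate current-value table so that finding the latest value for an E + A never requires scanning the log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Append-only assertion log: entity, attribute, value, tx instant.
    CREATE TABLE eav (
        e  INTEGER NOT NULL,
        a  TEXT    NOT NULL,
        v  NOT NULL,              -- SQLite permits a typeless column
        tx INTEGER NOT NULL
    );

    -- Hand-maintained "current value per (e, a)" index.
    CREATE TABLE eav_current (
        e INTEGER NOT NULL,
        a TEXT    NOT NULL,
        v NOT NULL,
        PRIMARY KEY (e, a)
    );

    -- Keep eav_current up to date on every assertion, so reads
    -- never have to hunt for the latest tx in the log.
    CREATE TRIGGER eav_assert AFTER INSERT ON eav BEGIN
        INSERT OR REPLACE INTO eav_current (e, a, v)
        VALUES (NEW.e, NEW.a, NEW.v);
    END;
""")

conn.executemany("INSERT INTO eav VALUES (?, ?, ?, ?)", [
    (1, "name", "alice", 100),
    (1, "name", "alicia", 101),   # later assertion supersedes
    (2, "name", "bob", 101),
])
print(conn.execute(
    "SELECT v FROM eav_current WHERE e = 1 AND a = 'name'"
).fetchone()[0])                  # prints the latest value, "alicia"
```

Skipping compaction of `eav` (as packetlost suggests further down) keeps the full history queryable by `tx`, while `eav_current` plays the role of the covering index SQLite won't maintain for you.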
| packetlost wrote: | I've been doing something very similar, but based on extendible | hashing algorithms and LSM trees. I've come to the conclusion | that a DAG of constraints, triggers and validations on a flat | KV + entity model is probably the ideal data structure for 99% | of projects I've worked on. You can get the benefits of the | Datomic-like history by... skipping compaction and including a | TX instant next to all record/assertions. I've found SQLite, | Postgres, the LSM papers, Bitcask, and many other papers to be | very helpful in terms of inspiration. | | Edit: I'm prototyping in Python and implementing in Rust with | intent to create a C API and embedding a Scheme runtime for | "server"-side query and constraint parsing. | ftdyuio9p0iugy wrote: | I've had some append-only tables in Postgres and only | recently realised that Postgres' system columns | (https://www.postgresql.org/docs/14/ddl-system-columns.html) | already effectively enabled some Datomic-like structure for | such append-only tables! Specifically the xmin column allows | me to identify the rows to treat atomically as a unit, and to | ignore if I'm querying for a historical view. | packetlost wrote: | You could probably do it with BRIN indexes similar to how | TimescaleDB handles their time-series hypertables | LoriP wrote: | TimescaleDB is packaged as a postgres extension, there's | a GitHub project here if anyone is interested to check in | on that https://github.com/timescale/timescaledb | sauruk wrote: | Do you have a public repo for that yet? (Assuming you're | planning to open-source) | packetlost wrote: | It's got a LICENSE but not publicly listed yet. It's very | rough, and has a ton of work left. Once I get it to a | workable state I'll open it up under AGPL, though probably | with a CLA because I'd like to turn it into a marketable | product in the long-run. 
If I make significant progress on | it, I'll reply to this thread with updates :) | sherbondy wrote: | Obligatory link to Project Mentat: | https://github.com/mozilla/mentat | | No longer actively maintained, but maybe a nice starting point | for hacking on your dream! | infogulch wrote: | mentat was archived by mozilla back in 2017, but there are a | bunch of forks. Because github is dumb and has a terrible | interface for exploring forks [0], I used the Active GitHub | Forks tool [1] that helped to find: | | qpdb/mentat [2] seems to be the largest (+131 commits) and | most recently modified (May this year) fork of | mozilla/mentat. | | [0]: https://github.com/mozilla/mentat/network/members - | Seriously, how am I supposed to use this? Hundreds of | entries, but no counts for stars, contributors, or commits, | no details about recent commits. Just click every one? | | [1]: https://techgaun.github.io/active-forks/index.html | | [2]: https://github.com/qpdb/mentat | huahaiy wrote: | Sounds like Datalevin https://github.com/juji-io/datalevin | | Embeddable, check. Datomic data model and Datalog query, check. | Storage written in C, check. | packetlost wrote: | Ooh, this looks good. LMDB backend though, meh. | | Edit: it's written in Clojure, so JVM. Extra bleh | artemisart wrote: | Embeddable... in the java ecosystem. I often see comments | about datalog/datomic on HN and it seems interesting but I | never see it anywhere else, is it because it's mostly known | and used in the java and clojure ecosystem? Do you know of | any free database with similar models usable from e.g. | python? | krn wrote: | > I dream of a SQLite-like embeddable database engine based on | Datomic's data model and queryable with Datalog. | | You can have this today by running XTDB[1] on top of SQLite via | JDBC. | | > Written in something like C, Rust, or Zig. | | And then compiling your application into native executables | with GraalVM Native Image[2]. 
|
| [1] https://xtdb.com/
|
| [2] https://www.graalvm.org/native-image/
| Kinrany wrote:
| I hope using an embeddable database will also free us from
| delegating the choice of query language to the database
| library. It should be possible to have a general-purpose
| low-level persistence API and many different query engines
| built on top of it.
| packetlost wrote:
| I think the problem with querying is that efficient query
| planning requires understanding the indexes on the dataset, so
| you at least need to be able to expose indexes and their
| properties in an API.
| Kinrany wrote:
| The API should definitely either allow directly managing
| indexes or provide even lower-level primitives that let the
| query engine create its own indexes.
| packetlost wrote:
| I mean, if you have a KV-like store that supports
| enumeration, you can pretty much always index the data
| yourself.
| kjeetgill wrote:
| Imho, 90% of query planning is not that hard at all in
| practice. If it's your data and your query, you'll probably
| have a pretty good idea which table you'll want to filter
| first and what to join, the same as you would with any data
| structures.
|
| The hard part is getting all of that consistent with
| concurrent writes. Can rows change while you scan? Can
| indexes? How do you check that your write is valid
| immediately before committing, etc. Things like that.
|
| I think SQL makes that pretty hard already, but in a
| "database-as-a-bag-of-data-structures" mode I think that's
| going to get even harder.
| thesz wrote:
| IMNSHO query planning is pretty hard. I recently found
| exponential behavior in SELECT query processing that
| depends on the depth of subselects. This happened with a
| pretty seasoned database system, let me say.
|
| To have good query optimization, you need to implement, at
| the very least, some form of dynamic programming; otherwise
| you will not be able to optimize queries that have more than
| half a dozen tables in selects.
Then you have to implement selection of the best plan or an
| approximation to it, which would make you implement beam search
| through the space of all solutions you generated, and that's
| the simplest case. For guaranteed optimization, you need to
| implement or utilize a pseudo-boolean optimization engine.
|
| I am a database engine developer right now. ;)
| j-pb wrote:
| If you just pick a worst case optimal join algorithm with a
| limited table cardinality, i.e. triples like in the case of
| OP, so that you can materialise every possible index, you can
| perform a dynamic _instance optimal_ search (i.e. you can
| ignore skew), if you choose the best possible variable every
| time you have to pick a variable. This can be done by
| estimating the cardinality and jaccard index of the variables,
| which is pretty straightforward with the right datastructure.
| If you don't want to limit yourself to just three columns, you
| can also go for cutting-edge succinct data-structure magic.
| https://aidanhogan.com/docs/wco-ring.pdf
|
| Either way, using a WCO join combined with a data-structure
| that allows for efficient range estimates and dynamic variable
| ordering completely obviates the need for query planning.
| thesz wrote:
| "Estimate the cardinality", "jaccard index", "succinct data
| structure" and, finally, some unknown abbreviation "WCO" (I
| guess it stands for "worst case optimal").
|
| Yes, of course, having an implementation of all that
| completely obviates the need for query planning. ;)
|
| I can't help being sarcastic for a moment, sorry.
|
| In my opinion, in your comment above you clearly demonstrated
| that even avoidance of query planning is hard, using as an
| example a (multi)set of triples for which it is possible to
| realize all indices.
|
| If what you described is simpler than query planning, then
| query planning is hard.
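For readers to whom the jargon in this subthread is unfamiliar, the core move of a worst-case optimal (WCO) join can be shown in a few lines: instead of joining patterns pairwise, bind one variable at a time, intersecting the candidate values from every pattern that mentions it. The sketch below is a deliberately naive toy (linear scans instead of trie indexes, and a fixed variable order rather than the dynamic, cardinality-guided ordering j-pb describes):

```python
def wco_join(triples, patterns):
    """Toy worst-case optimal join over (s, p, o) triples.
    Variables are strings starting with '?'."""
    def is_var(t):
        return isinstance(t, str) and t.startswith("?")

    def matches(pat, trip, binding, var):
        # Does this triple fit the pattern under current bindings,
        # treating `var` (the variable being solved) as free?
        for t, x in zip(pat, trip):
            if t == var:
                continue
            if is_var(t):
                if t in binding and binding[t] != x:
                    return False
            elif t != x:
                return False
        return True

    def candidates(var, binding):
        # The WCO step: intersect the values `var` may take across
        # every pattern that mentions it, all at once.
        cands = None
        for pat in patterns:
            if var not in pat:
                continue
            i = pat.index(var)
            vals = {trip[i] for trip in triples
                    if matches(pat, trip, binding, var)}
            cands = vals if cands is None else cands & vals
        return cands or set()

    def solve(binding, unbound):
        if not unbound:
            yield dict(binding)
            return
        var, rest = unbound[0], unbound[1:]
        for val in candidates(var, binding):
            binding[var] = val
            yield from solve(binding, rest)
            del binding[var]

    order = dict.fromkeys(t for pat in patterns for t in pat if is_var(t))
    return list(solve({}, list(order)))

# Two-hop paths: ?x knows ?y, ?y knows ?z.
triples = [("a", "knows", "b"), ("b", "knows", "c"),
           ("a", "knows", "c"), ("c", "knows", "d")]
paths = wco_join(triples, [("?x", "knows", "?y"), ("?y", "knows", "?z")])
print(paths)
```

Note how binding `?y` already intersects candidates from both patterns, so dead-end subjects like "c" (which "knows" someone nobody else reaches) are pruned before `?z` is ever considered; that early pruning is the point of joining "everything at once" rather than pairwise.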
| j-pb wrote:
| While the technical terms may be unfamiliar, it's all pretty
| straightforward, and requires between 1-4kloc depending on how
| fast and fancy you want things to be. (This includes everything
| from the basic copy-on-write data structures to the query
| language.)
|
| Building an immutable path-compressed radix tree is pretty
| straightforward and requires around 1-2kloc, and it's easy to
| keep track of the n-smallest hashes of leaf values, as well as
| the total count of leaves in the nodes. The sampling done by
| the min-hashes gives you a good indication of two nodes'
| overlap, which the jaccard index is just a different name for.
|
| The query engine itself is like 0.5kloc, and is just walking
| the different radix-trie indices simultaneously. The basic
| insight of worst case optimal (WCO) joins is that it's a lot
| cheaper to join everything at once and treat it as a constraint
| propagation problem.
|
| A LINQ-style query parser takes up another 1-2kloc and is just
| a bunch of ol' boilerplate.
|
| In total that's about as much code as your average large C++
| codebase's CMake file.
|
| You could sketch the entire thing on a napkin and build it in
| a week if you've built something like it before.
| ComodoHacker wrote:
| > you'll probably have a pretty good idea which table you'll
| want to filter first, and what to join
|
| Until the day you load a bunch of new data and it gets skewed,
| or you delete a bunch of data without shrinking and oops, your
| join order and method are not efficient anymore.
| awild wrote:
| Have you had a look at arrow? It has those capabilities.
| EastLondonCoder wrote:
| What about https://xtdb.com/
| michael_j_ward wrote:
| If litestream ever enables logical replication, I think you
| could do `SQLite logical replication --> embedded
| materialize-db --> back to SQLite`
| kebman wrote:
| Your dream sounds very nice.
| chakkepolja wrote:
| There's a bunch of resources on r/databasedesign.
| HyperMassive wrote:
| I can't seem to find this subreddit. Do you have a link?
| benbjohnson wrote:
| I think it's this one:
| https://www.reddit.com/r/databasedevelopment/
| Rochus wrote:
| Interesting article. When I implemented a no-sql database using
| the sqlite backend for https://github.com/rochus-keller/DoorScope
| and https://github.com/rochus-keller/CrossLine 15 years ago,
| there was little literature about it (actually, the referenced
| article is also rather light on the pages and b-tree
| implementation, but e.g. section 9 of
| https://www.amazon.com/Definitive-Guide-SQLite-Experts-Sourc...
| was helpful). There was no lmdb or leveldb yet; there was
| Berkeley DB, though, and I did prototypes with it, but the
| sqlite backend turned out to be much leaner and faster for the
| task at hand.
| alberth wrote:
| Off topic: does it bother anyone else when the author byline
| and date published are displayed at the very end of the
| blog/article as opposed to the start/top?
|
| I'm not even a journalist / communication person but I do
| immediately look at those fields to get a quick sense of
| relevance (is this an old article, possibly out-of-date?) and
| of the author's authority on said topic.
| zasdffaa wrote:
| A date published anywhere at all is fortunate.
| guessmyname wrote:
| I love this.
|
| I have been a fan of SQLite for years, and it makes me
| genuinely happy to see other people have the same enthusiasm
| for the software.
|
| A couple of years ago, during a job search, I came across a
| couple of companies that I thought were a good match. I went
| through the interview process with all of them and, due to
| similarities in the interview process, we ended up talking
| about the same thing: databases. Oh, the horror in the
| interviewers' eyes whenever I mentioned SQLite.
Experienced software engineers | from The New York Times, for example, went as far as to mock my | engineering choices in 3 out of 5 interviews, despite the success | of the products my team built on top of SQLite. | | The experience made me feel awful and I stopped talking about | SQLite for a couple of years. Instead, whenever someone asked a | question about databases, I would answer with the PostgreSQL | equivalent, knowing that the basic features are all the same. | | That is why, whenever I see projects like Litestream [1][2] used | by excellent products like Tailscale [3], it brings me joy. | | A friend of SQLite is a friend of mine :-) | | [1] https://litestream.io | | [2] https://fly.io/blog/all-in-on-sqlite-litestream/ | | [3] https://tailscale.com/blog/database-for-2022/ | jjoonathan wrote: | Mockery is never OK. | | That said, their experience probably reflects my own: using | SQLite on a personal project for config or something, loving | the simplicity, loving the "ditch all that complexity" hype, | then trying it on a more demanding project that according to | their mental model it should have excelled at (a central job | tracking process), watching it completely faceplant under load | (20TPS? Surely it just needs an index or something? Maybe a | ramdisk? Err.....), panicking when they realize that perf and | monitoring and diagnostics and optimized backends were some of | those "bells and whistles that nobody needs," deciding "fine | I'll do it myself," throwing far too much time and effort at | co-debugging python and C, recompiling SQLite with debug | symbols, sprinkling printfs, running profilers, reading the | source, and tracking down the issue... 
only to find that it was a bad default which people had been
| unsuccessfully begging the maintainers to change for years,
| which was guaranteed to torpedo perf under a wide variety of
| common workloads, and which the maintainers had been resisting
| based entirely on inertia and the misguided idea that this
| setting was easily discoverable. Yikes.
|
| _That's_ how someone picks up rules of thumb like "SQLite is
| for preferences, not perf, and definitely not prod."
|
| Now, this experience is a decade stale. SQLite bit me in 2012
| and the particular bug was fixed in 2015 IIRC. The people
| begging to fix the bad default finally won. Time passed, SQLite
| got better, and now, based on these new stories of SQLite doing
| well under load, I think it's time to lift my "once bit, twice
| shy" sanctions regarding perf. I had them in place for a
| reason, though, and if you had YOLO'd when I had YOLO'd, you
| would have faceplanted when I faceplanted. I fully defend the
| use of sanctions like this in general. They are the only tool
| we have against overenthusiastic marketing.
|
| To be fair, the SQLite site is pretty good; it's the fans that
| tend to take things too far.
| jjoonathan wrote:
| Speaking of potentially obsolete sanctions, conferences have
| been on hold so I might be hanging on to a few more of them
| past their due date. Does anyone have fresh experiences with
| these?
|
| ------------------
|
| AMD GPUs (2014): Waiting for GPGPU support from Adobe and
| Blender that was supposed to be operational (Adobe) or
| getting there (Blender) but never landed.
|
| AMD GPUs (2016): I burned an entire semester trying to port
| CUDA code to OpenCL only to run into bug after bug after bug.
| Lockups, black screens (no framebuffers were supposed to be
| involved, but bugs find a way), leaks, stalls which were
| difficult to attribute, inscrutable errors leading to
| abandoned threads with pleading users from years past with no
| reply.
I finally cracked when, after a long day of "I swear this
| should work, I have no idea how the changes I make correspond
| to the bugs I see", I ran my ailing OpenCL code on a weaker
| NVidia card, and it Just Worked. Not only did it just work, it
| was 10x faster -- even though the card was like 2 generations
| older and I was using OpenCL, which NVidia had every reason to
| sandbag but was still winning at. Oh, and the debugging tools
| actually worked, too. I was persuaded to sell my red cards,
| eat the ebay tax, eat the nvidia tax, and buy green from then
| on. Time has passed since 2016, though. ROCm looks good and
| AMD has money now, so maybe they can afford to pay people to
| fix bugs. I want to see someone running Blender and a modern
| ML framework or two before I retire the sanctions, though.
| Anyone have fresh intuition here?
|
| Numba CUDA (2018): I spent two weeks working through a
| succession of problems that I eventually resolved in two days
| by just using CUDA C++. Numba CUDA only worked with year-stale
| CUDA, it copied float8s over a float4 array despite having
| access to triply redundant type specs, the mechanism to pass
| buffers between kernels always forced a copy off the GPU, and
| one more that I forget. I made patches for the first three and
| gave up on the last one before fixing. Has anyone used it
| recently? Is it in better shape these days?
| hardwaregeek wrote:
| Never underestimate programmer machismo. So many people seem to
| rule out SQLite because of programmer lore that it's not fast
| enough or doesn't scale. But ofc Postgres is a _real_ database
| and therefore is automatically the correct choice.
| outworlder wrote:
| > went as far as to mock my engineering choices
|
| That's not acceptable.
|
| > Experienced software engineers
|
| Measured in years, probably, not by ability.
|
| I've heard supposedly 'experienced' engineers saying things
| like "No-one uses Lua in production!".
| ok_dad wrote:
| It's amazing how otherwise competent engineers will look down
| on technology that so many use all around the world when
| generally their contribution to society is a few buggy Python
| classes in some internal codebase.
| atonse wrote:
| Crazy to hear that.
|
| Those conversations are my favorite when interviewing
| candidates because it gives you a view into how their brains
| work.
|
| I would've nerded out about how so many macOS apps use SQLite
| under the hood because it's so solid.
|
| SQLite is even used in the Airbus A350, and they signed
| (approx something like) a 35-year support contract with Airbus
| for it, I remember hearing in a podcast a long time ago.
| (Please correct if I'm way wrong here)
| LAC-Tech wrote:
| I sometimes wonder how much of this objection to SQLite is
| purely technical, and how much is due to the fact that Dr.
| Richard Hipp would not exactly be a great "cultural fit" at
| places like the New York Times or Mozilla.
| mypalmike wrote:
| SQLite is great. But the simple fact that it runs locally
| makes it a poor technical fit in many scenarios.
| bob1029 wrote:
| > Simplicity leads to reliability and I don't know of a more
| reliable database than SQLite.
|
| JSON file on disk might be a reasonable competitor here. This
| scales poorly, but sometimes you don't need to scale up to
| something as crazy as full-blown SQLite.
| benbjohnson wrote:
| Yeah, I like JSON on disk too. There are some "gotchas" to
| avoid to make it safe, though, like atomic renames or fsyncing
| the file and its parent directory. But as long as you avoid
| those and don't have a lot of data, then JSON-on-disk works
| great.
| yndoendo wrote:
| One of the projects I worked on was an embedded system running
| at 200MHz with 8MB RAM and a micro SD card for storage, single
| data channel.
SQLite was chosen because of the power to perform report
| calculations in SQL, and it is perfect for configuration
| storage in a simple key/value relation, like Firefox
| about:config.
|
| SQLite is my go-to database for application configuration and
| even non-enterprise web applications.
| vlovich123 wrote:
| Implied is "as simple as possible but no simpler". We used JSON
| on disk at one job BUT.
|
| 1. We knew what we were doing with respect to fsync+rename to
| guarantee transactions. I bet you the vast majority of people
| who go to do that won't (SQLite abstracts you from needing to
| understand that)
|
| 2. We ended up switching to SQLite anyway and that was a
| painful migration.
|
| Just pick SQLite to manage data and bypass headaches. There's a
| huge ecosystem of SQL-based tools that can help you manage your
| growth too (eg if you need to migrate from SQLite to Postgres
| or something)
| anonymouse008 wrote:
| At what point does one "need" to move to Postgres / MySQL? It
| feels like much of DBA is all style points - not necessarily
| objective.
| stonemetal12 wrote:
| SQLite is not good at OLTP workloads. If that is what you
| are doing then SQLite runs out of steam pretty quickly.
| infogulch wrote:
| > _multiple-user_ OLTP
|
| It works fine for single-user applications.
| dymk wrote:
| If at least one of the following is true (non-exhaustive
| list):
|
| - More than one machine needs to read from the database
|
| - More than one machine needs to write to the database
|
| - The database is too large to hold on one machine
| [deleted]
| jjoonathan wrote:
| Does SQLite have perf tools these days? I got bit badly by
| SQLite perf a decade ago (a bad default tanked performance
| and took a week of autotools, profiling, and mucking around
| in source to fix). Now I just use Postgres whenever I need
| anything resembling performance or scale. A decade is a long
| time, though -- do the tools exist now?
|
| If not, some jobs will benefit from a better perf insurance
| policy than "I could probably get this into a profiler if I
| needed to."
| yread wrote:
| If you see "database locked" more often than you'd like.
| zdkl wrote:
| It becomes necessary at some (low) value of concurrent write
| usage basically. Or if you want to lean on logic built into
| one of the "proper" RDBMS (GIS, granular access management,
| ...).
| masklinn wrote:
| > It becomes necessary at some (low) value of concurrent
| write usage basically.
|
| Much less so since the introduction of WAL mode 12 years
| ago, tho.
| justinclift wrote:
| If your database is write heavy, and receives a bunch of
| traffic, then PostgreSQL is probably a better fit.
|
| Also, if the SQL that's needed in an application is more
| "enterprise" style, then SQLite isn't there (yet). ;)
| masklinn wrote:
| Also if you want things like advanced datatypes or indexes,
| or features sqlite doesn't have (of which there are many,
| though in the last few years the support for "advanced" SQL
| has really taken a leap forward, making it much more
| enjoyable).
| schwartzworld wrote:
| JSON is a pretty poor storage format though, isn't it? I use it
| for the wordle clone I made, but the code would have been much
| simpler with SQL statements. And that was without any kind of
| complicated relational data:
| https://github.com/schwartzworld/schwardle
| conradev wrote:
| The funny thing is that SQLite is faster for storing JSON blobs
| than the file system is:
| https://www.sqlite.org/fasterthanfs.html
|
| I would also say that the second you need any sort of writing
| or mutating in any sort of production environment, SQLite is
| also simpler than the file system!
|
| Race conditions in file system operations are easy to hit, and
| files have a lot more moving pieces (directory entries,
| filenames, file descriptors, advisory locks, permissions, the
| read/write API, etc.) than a simple SQL schema.
|
| Whenever I see "full-blown" next to "SQLite" there is usually
| some sort of misconception. It's often simpler and more
| reliable than the next best option!
| jck wrote:
| It looks like sqlite is not much faster than fs on Linux.
|
| I've also found that ripgrep performs surprisingly well with
| folders containing large numbers of text files - this makes
| raw JSON on disk more convenient for many simple use cases
| which don't require complex queries.
| samatman wrote:
| It's on write that using SQLite really shines.
|
| Imagine a moderately large JSON object (a lot of them,
| actually): not hard to read or write exactly, but they do
| carry some common references.
|
| SQLite can: only rewrite the changed fields with a query, and
| transact the update across the objects. The filesystem can do
| atomic swaps of each file, usually, if nothing goes wrong.
| SQLite would also be faster in this case, but I'm more
| motivated by the ACID guarantees.
| outworlder wrote:
| > something as crazy as full-blown SQLite.
|
| The sqlite library is tiny. Often smaller than some JSON
| parsers!
| CJefferson wrote:
| We did have a big system that used a JSON file on disk. The
| problem is that over time we ended up with files with half a
| JSON, and needed a "cleanup" pass. Except of course the
| cleanup might find a file in the process of being written, and
| life slowly got more and more complicated.
| zimpenfish wrote:
| Worked at a place that used on-disk XML as their database with
| multiple (10s of) writers. Started off fragile, ended up
| hideously complex and fragile. Unfortunately it was originally
| written by one of the powers that be and his word held sway
| over every technical decision, good or disastrous.
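The half-a-JSON failure mode CJefferson describes is exactly what the write-to-temp-then-rename dance (with the fsyncs benbjohnson and vlovich123 allude to upthread) prevents, and what SQLite's journaling does for you automatically. A minimal sketch, assuming POSIX rename semantics; the function name and file name are illustrative:

```python
import json
import os
import tempfile

def write_json_atomic(path, obj):
    """Write obj as JSON so readers see either the old file or the
    new one, never a half-written mix."""
    dirname = os.path.dirname(os.path.abspath(path))
    # 1. Write to a temp file in the *same directory*, since a
    #    rename must not cross filesystems to stay atomic.
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())   # 2. Flush the file data to disk.
        os.replace(tmp, path)      # 3. Atomically swap in the new file.
        # 4. Fsync the directory so the rename itself is durable.
        dfd = os.open(dirname, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
    except BaseException:
        try:
            os.unlink(tmp)         # Clean up the temp file on failure.
        except FileNotFoundError:
            pass
        raise

write_json_atomic("config.json", {"theme": "dark", "port": 8080})
```

Crash at any step leaves either the complete old file or the complete new one; the "cleanup pass finds a file mid-write" problem never arises because readers can only ever open a fully written file.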
| eatonphil wrote:
| See also:
| https://tailscale.com/blog/an-unlikely-database-migration/,
| and https://tailscale.com/blog/database-for-2022/.
| didgetmaster wrote:
| I wrote a new, general-purpose data management system that was
| originally designed to compete with file systems for storing
| large amounts of unstructured data (my background is in disk
| utilities like PartitionMagic and Drive Image). It can be used
| to convert 100M+ files (photos, documents, videos, etc.) into
| data objects and attach searchable, contextual meta-data tags
| to them.
|
| The tags I invented were essentially highly-optimized key-value
| stores that worked really well. I then discovered that these
| same KV stores could also be used to form relational tables (a
| columnar store). It could be used to store both highly
| structured data and the more semi-structured data found in
| things like Json documents where every value could be an array.
| Queries were lightning fast and it could perform analytic
| operations while still performing transactions at a high speed.
|
| My hobby turned into a massive project as I figured out how to
| import/export CSV, Json, Json lines, XML, etc. files to/from
| tables easily and quickly. It still has a lot of work to go
| before it is a 'full-blown' database, but it is now a great
| tool for cleaning and analyzing some pretty big tables (tested
| to 2500 columns and 200M rows).
|
| It is in open beta at https://didgets.com/download and there
| are a bunch of short videos that show some of the things it can
| do on my youtube channel:
| https://www.youtube.com/channel/UC-L1oTcH0ocMXShifCt4JQQ
| dymk wrote:
| JSON file (or any file you directly write to) is simple, but
| has few of the guarantees SQLite gives you with regards to
| persistence. SQLite abstracts away all of that for you by
| handling the fancy platform-specific I/O stuff under the hood.
|
| SQLite does increase complexity in the sense that it's a new
| dependency.
But as far as dependencies go, SQLite has bindings in almost all
| languages. IMO even for simple use cases, it's worth using it
| from the start. Refactors are easier down the line than with a
| bespoke JSON/yaml/flat encoding.
|
| Read-only configuration, sure, a JSON (etc.) file makes sense.
| The second you start writing to it, probably not.
| fikama wrote:
| Could you elaborate more on those things abstracted by SQLite?
| Or point to further reading - you got me interested.
| Lex-2008 wrote:
| Not OP, but I remember reading an interesting article about
| issues that a simple write() might face:
| https://danluu.com/deconstruct-files/
|
| And here's what SQLite does to ensure that in case of an
| application or computer crash the database contains either
| new or old data, but not their mix or some other garbage:
| https://sqlite.org/atomiccommit.html
| under-Peter wrote:
| For people who enjoy this article I recommend reading the
| two-parter "How does SQLite work?" [1] [2] by Julia Evans. It
| too is an exploration into the inner workings but dives deeper
| into the code. And for those interested in more details on
| b-trees, there's "The Ubiquitous B-Tree" by Douglas Comer [3],
| which I enjoyed a lot.
|
| [1] https://jvns.ca/blog/2014/09/27/how-does-sqlite-work-part-1-...
| [2] https://jvns.ca/blog/2014/10/02/how-does-sqlite-work-part-2-...
| [3] http://carlosproal.com/ir/papers/p121-comer.pdf
| upupandup wrote:
| A bit off topic but fly related:
|
| - How can I forward a range of ports instead of generating
| 1000s of lines? This is a major blocker for me.
|
| edit: not sure why this is being downvoted. I've received no
| response from fly.io about this and this is the only way to get
| their attention on this shortcoming. You can't really expect
| people to write ports 10,000 to 50,000 individually; that's
| like 40,000 lines. Fly.io really needs to support port ranges
| like 10,000-50,000.
| mrkurt wrote:
| community.fly.io is the best place to ask about this.
|
| We don't support tcp port ranges yet. We will someday:
| https://community.fly.io/t/new-feature-every-public-port-now...
| freecodyx wrote:
| It's always good for this kind of articles, to refer to source
| code. Provide links to the source code responsible of some of
| the implementation details
| eatonphil wrote:
| Yeah that would be awesome! I'd rephrase your comment as a
| suggestion rather than a demand though personally. :)
| benbjohnson wrote:
| That's a good idea. There's a careful balance though since
| once you start talking about specific functions in SQLite
| then the post becomes very low-level and starts to become
| less accessible. I'll try to integrate that more in the
| future though.
| mandeepj wrote:
| > https://github.com/sqlite/sqlite/blob/master/src/btreeInt.h
|
| Woah, Ben! I'm so glad the above link was shared; thanks to
| the poster as well. There's a very good amount of in-depth
| knowledge shared via comments and a reference to Donald
| Knuth's book.
| benbjohnson wrote:
| There's a lot of goodies in the comments. I like this one
| about how POSIX locks are broken:
| https://github.com/sqlite/sqlite/blob/master/src/os_unix.c#L...
| mandeepj wrote:
| Thanks! Enough catalyst for me to read through all of
| SQLite's source files
| eatonphil wrote:
| Yeah maybe it would force you to get too close to the code, I
| don't know SQLite well enough.
|
| When I do surveys of software I try to provide links to
| source (mostly just the relevant file or directory) so folks
| can verify for themselves. For example, when looking into the
| parsers behind Postgres-compatible DBs [0] and parser
| techniques in a few major language implementations [1]. But
| both those posts are at a higher level than what you're doing
| here.
|
| I'm sure you'll have good reason either way for this series.
|
| [0] https://datastation.multiprocess.io/blog/2022-02-08-the-worl...
|
| [1] https://notes.eatonphil.com/parser-generators-vs-handwritten...
| samatman wrote:
| GP's account page says From Morocco With Love, and I'm sure
| their Arabic and French are far beyond mine.
|
| This reads to me like an accidental slip into the imperative,
| breaking a sentence in two parts, and not adding a phrase such
| as "For example,".
|
| In fact, changing the period to a colon would defuse the
| imperative. It's subtle!
| photochemsyn wrote:
| Here's a header file that basically mirrors some of what the
| article is talking about: the layout of pages and the b-tree
| and so on (~lines 100-200):
|
| https://github.com/sqlite/sqlite/blob/master/src/btreeInt.h
|
| The code for the btree functions is here, and it's a bit over
| my head TBH with all the locks and permissions and so on, but
| it's a nice example of how to comment code, I think:
|
| https://github.com/sqlite/sqlite/blob/master/src/btree.c
| jvdvegt wrote:
| The SQLite documentation is pretty good as well:
| https://sqlite.org/fileformat.html
| dunham wrote:
| I'll second that. I was able to use that documentation to
| write javascript code that reads and queries a sqlite
| database. (Just as a learning exercise; I don't have a
| practical use for this.)
| CoastalCoder wrote:
| A bit of a tangent, but the Corecursive podcast has a really
| interesting interview with the creator of SQLite: [0]
|
| [0] https://corecursive.com/066-sqlite-with-richard-hipp/
| anewpersonality wrote:
| How much of SQLite is D. Richard Hipp versus companies that are
| contributing to the code?
| azornathogron wrote:
| I can't directly answer your question because I don't know, but
| I will note that the SQLite Copyright page states
|
| > Contributed Code
|
| > In order to keep SQLite completely free and unencumbered by
| copyright, the project does not accept patches.
|
| https://www.sqlite.org/copyright.html
|
| I think the number of people who have worked directly on the
| code for the canonical SQLite distribution is quite small.
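dunham's exercise above is approachable because fileformat.html is exhaustive; even the fixed 100-byte header on the first page tells you a lot. A small sketch (the file name is arbitrary) that creates a database and then reads three header fields by hand: the 16-byte magic string, the big-endian page size at offset 16 (where a stored value of 1 means 65536), and the database size in pages at offset 28:

```python
import sqlite3
import struct

# Create a small database to inspect.
conn = sqlite3.connect("example.db")
conn.execute("CREATE TABLE IF NOT EXISTS t (x)")
conn.commit()
conn.close()

with open("example.db", "rb") as f:
    header = f.read(100)          # the header is the first 100 bytes

magic = header[:16]               # b"SQLite format 3\x00"
(page_size,) = struct.unpack(">H", header[16:18])
if page_size == 1:                # 1 is how the format encodes 65536
    page_size = 65536
(db_pages,) = struct.unpack(">I", header[28:32])  # size in pages

print(magic, page_size, db_pages)
```

From here, `page_size * page_number` gives the byte offset of any page, which is all you need to start walking the b-tree pages the article describes.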
| benbjohnson wrote: | I talked with the SQLite team about a year or two ago and | IIRC there were four people working on it (including Dr. | Hipp). I believe they were all full time but I didn't ask. | LAC-Tech wrote: | SQLite is 21 years old, and for me it just keeps getting better | and better. More performant, more relevant features, all while | being just as easy to embed and administer. ___________________________________________________________________ (page generated 2022-07-27 23:00 UTC)