[HN Gopher] I found a bug in SQLite
       ___________________________________________________________________
        
       I found a bug in SQLite
        
       Author : otoolep
       Score  : 274 points
       Date   : 2022-12-11 17:18 UTC (5 hours ago)
        
 (HTM) web link (www.philipotoole.com)
 (TXT) w3m dump (www.philipotoole.com)
        
       | nodesocket wrote:
       | My first time coming across rqlite. Looks awesome. I just
       | finished writing a poor man's centralized store of SQLite writing
       | the DB file to Minio (s3 compliant) storage but of course has
       | race conditions all over the place. Gonna take a look at
       | replacing with rqlite.
        
       | MrWiffles wrote:
       | Props to the author, this was well written. Clear and concise, it
       | was easy to follow. Not like my ratings and ravings! ;-)
        
       | password4321 wrote:
       | Long long ago I lost my SMS database on my first Android phone to
       | corruption.
       | 
       | The worst part was if the app encountered an error opening the
       | database, it just deleted it and started over -- no chance of
       | repair to rescue any of the data. I don't think this is done this
       | way anymore.
       | 
       | After that I have installed SMS Backup+ first thing on every new
       | phone.
        
         | password4321 wrote:
         | https://stackoverflow.com/questions/7764943/what-can-be-done...
        
       | mort96 wrote:
       | I'd be interested to read some discussion on why this wasn't
       | caught earlier. When you have some thing with supports read +
       | write and supports access from multiple threads/processes, I'd
       | expect "read as fast as possible from one thread, write as fast
       | as possible from another" to be one of the most obvious tests to
       | write.
        
         | loeg wrote:
         | Seems like it was particularly the lock upgrade path in the
         | memdb vfs -- so a little weird case that must have not been
         | covered adequately.
         | 
         | https://www.sqlite.org/src/info/15f0be8a640e7bfa
        
           | [deleted]
        
           | morelisp wrote:
           | I vaguely recall some variant of MC/DC coverage that also
           | treats each bit in any bit flag checks as independent boolean
           | conditions. It seems like there could be a similar variant
           | that requires checking each value of such a "leveled" enum
           | independently; if your enum says you can take on values 0-5
           | and the logic checks <= 2 and <= 5, nonetheless there should
           | be a test case for 0, 1, 3, and 4.
        
       | jeroen79 wrote:
       | still just submit a bugreport instead of making a fuss of it.
        
         | sigjuice wrote:
         | This is all rather tasteful. Making a fuss would be registering
         | a domain name and making a whole freaking website just for the
         | bug.
        
         | bawolff wrote:
         | It is impressive for very popular software to find bugs.
         | 
         | But even if it wasnt, its still a blog post. The entire point
         | is to talk about what you have been doing. Personally, My blog
         | is super inane.
        
         | 0cf8612b2e1e wrote:
         | People are allowed to be proud of their accomplishments. SQLite
         | is extremely high quality software. Identifying a mistake is
         | practically equivalent to receiving a Knuth check.
         | 
         | Maybe this post will inspire others on how to locate other
         | bugs, improving the world for the rest of us.
        
         | wistlo wrote:
         | I found a bug in mySQL and after the mySQL team fixed it, I
         | included a summary of it in my annual employee review. Bug
         | fixing widely-used software is not part of my role, but the
         | company recognized it as an accomplishment.
        
       | einpoklum wrote:
       | > I found a bug in SQLite
       | 
       | Well, good thing it wasn't a bug in the C compiler you were
       | building sqlite with... even those can come up occasionally.
        
       | jacquesm wrote:
       | No such thing as software without bugs, but given the incredibly
       | widespread use that SQLite sees the quality as evidenced by the
       | fact that finding a bug is news by itself is extremely high.
       | Something to strive for.
        
       | de6u99er wrote:
       | Did you have to get baptized first to be allowed reporting a bug?
        
       | fastaguy88 wrote:
       | I have seen similar error messages of the "database is corrupted
       | ..." type with MariaDB when I simply typed in my SQL incorrectly.
        
       | lifeisstillgood wrote:
       | I understand the comment about "intellectual honesty" about those
       | pesky transient bugs.
       | 
       | It's really hard to know where a hard to reproduce bug is on the
       | cost benefit spectrum - and that is the crux - not knowing enough
       | about the bug to determine it's negative weight means you are
       | essentially guessing both sides of the equation.
       | 
       | It's probably not the best idea, it waiting till users find it at
       | east gives a good idea of the prior
        
         | quickthrower2 wrote:
         | Yeah the old "boss, I have spent 2 days investigating the bug,
         | and I think it is best overall if we don't fix it. maybe show a
         | better message".
        
       | habibur wrote:
       | Even though SQLite bugs are rare, those can be found
       | occasionally.
       | 
       | I remember a bug finder took the sqlite documentation off their
       | website. Collected all their keywords, made up millions of
       | jumbled up queries of random combination between keywords and
       | then ran those overnight to find 10 bugs where the engine
       | crashed. And yes those were also reported and fixed quickly.
        
         | markerz wrote:
         | Sounds a lot like fuzzing, which is a really good way to find
         | really esoteric bugs against a spec! I did something like this
         | in college when I was implementing a compiler and it caught a
         | lot of really weird bugs in a lot of peoples codes. The
         | beautiful part of some fuzzers though is they can automatically
         | simplify their complicated test cases to something you can
         | actually look at and reason about.
        
           | carlmr wrote:
           | >Sounds a lot like fuzzing, which is a really good way to
           | find really esoteric bugs against a spec!
           | 
           | QuickCheck style testing is maybe also worth a mention here.
           | Instead of using any possible inputs, like in fuzzing, you
           | restrict yourself to legal inputs, like the keywords here, to
           | get maybe less random crashes, but more likely to find useful
           | corner cases because of the restriction on the search space.
        
         | return_to_monke wrote:
         | so basically, fuzzing.
         | 
         | https://en.m.wikipedia.org/wiki/Fuzzing
        
           | rsaxvc wrote:
           | Sounds more like smart Monkey testing, much older, less
           | feedback.
           | 
           | https://en.m.wikipedia.org/wiki/Monkey_testing
           | 
           | Edit: looks like some consider these the same nowadays.
        
             | robocat wrote:
             | I wonder if ChatGPT would be good at generating various
             | fuzz testing queries. Maybe too slow? Maybe too repetitive?
             | Maybe too unvarying? Sorry for mentioning ChatGPT (to those
             | who are sick of the infection).
        
               | mattygee wrote:
               | Actually, with one of the problems being faced in that
               | space at present, especially for applications involving
               | factual answers, being "hallucinations" ( ie. essentially
               | as I understand it the level of "creativity" in
               | responses) such "creativity" may well be quite suited to
               | finding those various unusual edge cases.
        
       | orf wrote:
       | Lots of people saying that SQLite is super high quality and
       | finding a bug is so rare. It's not. I found one on a simple query
       | utilising WHERE EXISTS [1]. Reporting it to a weird forum was
       | also a horrible experience.
       | 
       | It's high quality software, don't get me wrong, but the infamous
       | 100% test coverage doesn't make it somehow immune to issues, or
       | imply that the issues you do find are of a certain level of
       | complexity. Nothing is back and white like that.
       | 
       | 1. https://sqlite.org/forum/forumpost/452888d3b1?t=c&unf
        
         | qbasic_forever wrote:
         | What exactly was a 'horrible experience' with the report you
         | filed? Richard himself promptly replied in less than 24 hours
         | that the issue was resolved and previously seen by Firefox too.
         | I am at a genuine loss, what was horrible here??
        
           | orf wrote:
           | It was a while ago, but if I recall I ended up having to sign
           | up several times, the post failed to be created for some
           | reason and something else. There was a captcha involved I
           | think.
           | 
           | I don't remember the specifics, but I do remember coming away
           | from it with a feeling of "wow, that was an atrocious
           | experience. I wonder what the drop off rate is"
        
             | devwastaken wrote:
             | Sounds about right, most forum software is buggy and fails,
             | but those failures don't get sent to the maintainers. It
             | just waits until someone whom knows the maintainers says
             | "uhhh it's broke.".
        
             | asah wrote:
             | in situations like this, I typically report bugs directly
             | to members of the core team individually, with gory
             | details, and explina that I tried the forum approach and it
             | failed for me.
        
             | zoomablemind wrote:
             | Perhaps a while ago... For quite some time SQLite official
             | forum has been running its own version of Forum software
             | (part of Fossil). It allows Anonymous login, with captcha
             | (again, Fossil handles), yeah, but this is nothing
             | unexpected. So in general, the path to Forum presence is
             | fairly unimpeded, well, it is moderated. Reasonably stated
             | issues seem to get attention.
             | 
             | Though, some issues indeed need a push to be recognized as
             | such, as it's a public forum, so other users may express
             | their "other" opinions...
             | 
             | All in all it's Freedom of Reasonable speech in action.
             | 
             | I believe there's a different channel for reporting
             | security-related issues. Again, it's through the Forum, but
             | there's a private message feature for signed-in users.
        
         | bobleeswagger wrote:
         | > Lots of people saying that SQLite is super high quality
         | 
         | I think the point most of those folks are making, is that
         | SQLite is good enough where most developers think "Psh, I will
         | use [HEAVIER DB SYSTEM THAT SLOWS OVERALL DEVELOPMENT TIME]"
         | even if it is a better long term solution.
         | 
         | It's about bikeshedding, SQLite really is good enough for most
         | projects and its a shame it still has such negative
         | connotations.
        
           | maxbond wrote:
           | For what it's worth I think it's largely overcome the
           | reputation of being a "toy" database.
        
             | robertlagrant wrote:
             | It never was a toy, so that's good.
        
               | maxbond wrote:
               | I'll plead guilty to having this impression until fairly
               | recently, but it's an incredible database. While we were
               | all not taking it seriously, SQLite was quietly getting
               | better and better, year after year. You do that for a few
               | decades and you've got an incredible piece of software.
        
           | eduction wrote:
           | "Heavier" and "lightweight" are such abstract terms in
           | software.
           | 
           | If "heavier" just means more LoC -- sure, there's more
           | complexity in more LoC but also more problems solved. There's
           | a reason people tend to use the latest Linux/macos/Windows as
           | opposed to the very lightweight Apple II OS from 1978.
           | 
           | Defaulting to, say, Postgres doesn't seem so bad to me. It
           | solves more problems than SQLite and "lightweight" is not
           | really a concrete benefit for SQLite. It's at least one level
           | removed from speaking to a real problem.
        
             | robertlagrant wrote:
             | It also causes problems. Now I have to worry about multiple
             | processes instead of just one, and I need to upgrade it
             | separately and test them both together.
        
               | [deleted]
        
             | Quarrelsome wrote:
             | Lightweight means I don't need to install some shit on a
             | server somewhere and hope that its reachable at the time I
             | need to access it. If I'm a process running then I loaded
             | from a disk somewhere which means I can in-process sqlite
             | to write to that disk.
             | 
             | Less stuff == less to go wrong == lightweight.
        
             | throwawaymaths wrote:
             | I remember switching from postgres to MySQL and there was
             | definitely a learning curve, so even without deployment
             | costs (admittedly the same for mysql -- which were a real
             | thing back then before RDS or Aurora or whatever hosted
             | postgres)... It probably would have been better to use
             | sqlite
        
             | maxbond wrote:
             | I read them as "more/less complexity" not "more/less code".
             | Postgres is heavier weight because it has more complexity;
             | it has more features, it's networked, it has a more
             | involved configuration process. They both have different
             | advantages and can with in problem spaces the other can't.
             | 
             | This is quite unlike the Apple II, which is outmoded and
             | requires a dedicated hobbyist to get working.
             | 
             | Postgres is an excellent default, but preferring lighter
             | solutions does solve problems. It eliminates failure modes
             | and cognitive load. As engineers we seek to eliminate the
             | irrelevant to focus on the interesting. If you can use
             | SQLite and avoid shipping a series of containers, and
             | instead ship a single binary, you've eliminated things to
             | think about.
             | 
             | Neither of them is a silver bullet and you'll be a better
             | engineer if you can do both.
        
               | eduction wrote:
               | This is an enlightening answer. Part of my issue with
               | "lightweight" is that is is vague. I see it used all the
               | time. In this case, if people mean "way less
               | configuration than Postgres," and "does not need its own
               | service and process(es) to manage," that makes total
               | sense. And it also gives a hint as to the tradeoffs.
        
               | maxbond wrote:
               | This made me smile, thanks for keeping an open mind,
               | stranger.
        
             | gjulianm wrote:
             | Not really. Heavier and lightweight usually refer to the
             | amount of features and requirements of a piece of software.
             | 
             | In that regard, it's easier to see which of PostgreSQL and
             | SQLite is lighter. PostgreSQL requires a separate process
             | running with its own config, plus the library to
             | communicate with it, plus all the things Postgres does...
             | On the other hand, SQLite is just a library that reads
             | files in a certain format.
             | 
             | > It solves more problems than SQLite and "lightweight" is
             | not really a concrete benefit for SQLite.
             | 
             | But it is a concrete benefit. Sometimes you'll have
             | restricted environments because either by power or by
             | permissions, you can't install Postgres or any other
             | database server (e.g., mobile phones or embedded software).
             | Or sometimes you just don't want the user to configure
             | their postgres instance and your software for just a few
             | tables (e.g., system utilities/small services that just
             | need a simple database).
        
         | LVB wrote:
         | > but the infamous 100% test coverage doesn't make it somehow
         | immune to issues
         | 
         | Infamous in what way? While I totally get that 100% coverage
         | may be impractical for many projects, I'm also not seeing how
         | less coverage would have improved things. And I highly doubt
         | the SQLite team ever claimed they were immune to bugs!
        
           | lmm wrote:
           | > While I totally get that 100% coverage may be impractical
           | for many projects, I'm also not seeing how less coverage
           | would have improved things.
           | 
           | The argument is generally that language-level correctness
           | would achieve more than emphasising test coverage so heavily.
        
         | tiffanyh wrote:
         | > Reporting it to a weird forum was also a horrible experience.
         | 
         | What was so "horrible"?
         | 
         | After you posted the bug, the second comment (and only 6-hours
         | later) had a new release and fix.
        
           | robertlagrant wrote:
           | Evidence of a wonderfully comfortable life. We should all
           | hope to have horrible experiences that are that
           | inconsequential!
        
           | setr wrote:
           | He's talking about the forum software, not the forum
           | community.
        
         | eduction wrote:
         | It's high quality software that is being pushed well past its
         | intended use case. Maybe it will work out fine but rqlite is
         | taking something designed as an on disk file format for one
         | program and trying to use it as a network distributed
         | concurrent database system. It would be surprising if they
         | didn't expose bugs in SQLite.
         | 
         | There are database systems that have been around for many years
         | built from the ground up for this use case.
        
           | otoolep wrote:
           | rqlite[1] author here. To be clear rqlite is using SQLite in
           | a completely conventional manner. Nothing about the
           | distributed nature of rqlite impacts on SQLite, since each
           | rqlite node runs its own complete copy of SQLite.
           | 
           | This bug can affect anybody using an in-memory version of a
           | SQLite database. That was the point of writing the C unit
           | test.
           | 
           | [1] https://github.com/rqlite/rqlite
        
           | infamia wrote:
           | > It's high quality software that is being pushed well past
           | its intended use case. Maybe it will work out fine but rqlite
           | is taking something designed as an on disk file format for
           | one program and trying to use it as a network distributed
           | concurrent database system. It would be surprising if they
           | didn't expose bugs in SQLite.
           | 
           | Expensify is pushing millions of queries/sec by layering
           | Bedrockdb over top of SQLite. You can go a long way and do
           | amazing, unexpected things with a very solid foundation.
           | 
           | https://blog.expensify.com/2018/01/08/scaling-sqlite-
           | to-4m-q...
        
         | playingalong wrote:
         | Not sure of SQLite's interpretation, but in general 100% test
         | coverage "only" means all lines are executed in some piece of
         | test. It doesn't necessarily indicate correctness of each line
         | being checked.
         | 
         | Not to mention one can have multiple logic branches in a line.
         | Or bugs relevant to only some subset of inputs (e.g. works fine
         | for positive numbers but fails for negative is a classic
         | example)
        
           | strictfp wrote:
           | Yup. If you consider that a simple int can have billions of
           | states, and that you can have 100% test coverage while
           | testing just one of those, 100% really doesn't mean much. And
           | with several variables, the total state space quickly becomes
           | almost infinite.
        
           | CodesInChaos wrote:
           | SQLite uses a variant of branch coverage. So "multiple logic
           | branches in a line" isn't a problem. The rest of your comment
           | still applies though.
           | 
           | https://www.sqlite.org/testing.html#test_coverage
        
       | techn00 wrote:
       | It's impressive to find a bug in something like sqlite, which is
       | famous for its larger than the codebase test suite
        
         | capableweb wrote:
         | > larger than the codebase test suite
         | 
         | This tend to be true for most serious projects, that the amount
         | of test code is greater than that of the code that is being
         | exercised.
         | 
         | I think what they are famous for is the quality of the testing
         | suite, rather than the amount.
        
           | loeg wrote:
           | This is a definition of "serious projects" that excludes
           | almost all serious projects.
        
           | valray wrote:
           | > This tend to be true for most serious projects, that the
           | amount of test code is greater than that of the code that is
           | being exercised.
           | 
           | Reading this comment, I was thinking "Oh that must mean the
           | test code is 2x or maybe even 3x the amount of source code"
           | 
           | Going to the SQLite web site, I was surprised to find that
           | the test code is 600x larger than the source code.
           | Impressive.
           | 
           | Is this 600:1 ratio typical for other projects? The ones that
           | I have seen are more like 1x or 2x, but I have not worked
           | with many open source systems.
        
             | baq wrote:
             | They count post-codegen test LOC, the checked in ratio is
             | much smaller - used to be something like 5:1. Still super
             | impressive!
        
           | civopsec wrote:
           | Test to regular code ratio shouldn't be impressive by itself
           | anyway, considering how repetitive, duplicated, and setup-
           | heavy test code can be.
        
           | markerz wrote:
           | I think a good way of looking at the quality of their tests
           | is how they categorize their tests, which focuses on the
           | different types of problems they want to solve: power loss,
           | concurrency, API issues, logic, etc.
           | https://www.sqlite.org/testing.html
        
         | zoomablemind wrote:
         | >... larger than the codebase test suite
         | 
         | I wonder if they directly test the concurrent use?
         | 
         | It appears that the fix [1] of the OP bug did not lead to any
         | addition/changes in resp. tests.
         | 
         | [1]:https://www.sqlite.org/src/info/15f0be8a640e7bfa
         | 
         | P.S. looks like Fossil still has issues with content scrolling
         | and wrapping to screen size (mobile).
        
           | otoolep wrote:
           | Actually, they added a unit test in a follow-on change:
           | 
           | https://www.sqlite.org/src/info/dc7dd2d3e50e7cc4
        
             | zoomablemind wrote:
             | Indeed, thanks for pointing this one out.
             | 
             | The test seems to test a shared access in rather a serial
             | order. I wonder if underneath this is actually running as
             | concurrent processes?
        
               | otoolep wrote:
               | I may not be following your point, but I don't believe
               | the root cause of this issue was a race condition, or
               | anything that might be related to concurrency.
               | 
               | The fundamental cause AFAIK was a SQLite connection was
               | attempting to make a state transition (from one type of
               | locking state to another) which shouldn't be allowed
               | under certain circumstances, but the implementation
               | didn't actually enforce this rule. So the added test
               | really does test the root cause.
        
       | ChuckMcM wrote:
       | The lesson here is never ignore your own "Huh, that's weird."
       | experience when using software. :-)
        
         | nonrandomstring wrote:
         | Tis good advice, but in most cases I'd question my own code
         | first and then be confused by general complexity - is it the
         | threading library, or the API, or the database? Rarely do we
         | get a chance to pare back to an isolated behaviour. Also
         | there's a voice that says "surely someone much smarter than me
         | already knows about this".
        
         | otoolep wrote:
         | https://quoteinvestigator.com/2015/03/02/eureka-funny/
        
         | jicea wrote:
         | A French proverb is << Il n'y a pas de fumee sans feu >>, means
         | there is no smoke without fire. I use it a lot in my day work
         | when I try not to watch a suspicious behavior...
        
           | bewaretheirs wrote:
           | The usual form of that proverb in English is "When there's
           | smoke, there's fire".
        
             | aidos wrote:
             | Is it? Interestingly I've only heard it as "no smoke
             | without fire"
        
               | thedrexster wrote:
               | "Where there's smoke, there's fire" is how I've heard
               | that proverb
        
               | rgbrgb wrote:
               | "Where there's smoke, there's fire" is what I've heard
               | (from fire heavy Southern California).
        
               | berny222 wrote:
               | This is how we phrase it in Australia.
        
         | quickthrower2 wrote:
         | Hire me a full time bug reporting assistant then please :-)
        
       ___________________________________________________________________
       (page generated 2022-12-11 23:00 UTC)