[HN Gopher] Strict Tables - Column type constraints in SQLite - ...
       ___________________________________________________________________
        
       Strict Tables - Column type constraints in SQLite - Draft
        
       Author : thunderbong
       Score  : 165 points
       Date   : 2021-08-21 18:03 UTC (4 hours ago)
        
 (HTM) web link (www.sqlite.org)
 (TXT) w3m dump (www.sqlite.org)
        
       | bob1029 wrote:
       | This looks like a reasonable proposal to me.
       | 
       | What sort of performance impact could we expect to see when
       | accessing strict tables?
       | 
       | Also, is there going to be an ALTER TABLE variant?
        
       | mingodad wrote:
       | I also proposed once something like "PRAGMA pedantic" where any
       | type conversion would be reported to stderr see the changes and a
       | sample output:
       | 
       | https://sqlite.org/forum/forumpost/692ae69237f2537b?t=h
       | 
       | https://sqlite.org/forum/forumpost/103d7294464fcde8?t=h
        
       | simonw wrote:
       | It looks like this check-in from yesterday contains the
       | implementation and tests:
       | https://sqlite.org/src/info/f9c1d3441b87ee29
       | 
       | As always the TCL tests are worth admiring:                   #
       | STRICT tables have on a limited number of allowed datatypes.
       | #         do_catchsql_test strict1-1.1 {         CREATE TABLE
       | t1(a) STRICT;         } {1 {unknown datatype for t1.a: ""}}
       | do_catchsql_test strict1-1.2 {         CREATE TABLE t1(a PRIMARY
       | KEY) STRICT, WITHOUT ROWID;         } {1 {unknown datatype for
       | t1.a: ""}}         do_catchsql_test strict1-1.3 {         CREATE
       | TABLE t1(a PRIMARY KEY) WITHOUT ROWID, STRICT;         } {1
       | {unknown datatype for t1.a: ""}}         do_catchsql_test
       | strict1-1.4 {         CREATE TABLE t1(a BANJO PRIMARY KEY)
       | WITHOUT ROWID, STRICT;         } {1 {unknown datatype for t1.a:
       | "BANJO"}}         do_catchsql_test strict1-1.5 {         CREATE
       | TABLE t1(a TEXT PRIMARY KEY, b INT, c INTEGER, d REAL, e BLOB, f
       | DATE) strict;         } {1 {unknown datatype for t1.f: "DATE"}}
       | do_catchsql_test strict1-1.6 {         CREATE TABLE t1(a TEXT
       | PRIMARY KEY, b INT, c INTEGER, d REAL, e BLOB, f TEXT(50))
       | WITHOUT ROWID, STRICT;         } {1 {unknown datatype for t1.f:
       | "TEXT(50)"}}              do_execsql_test strict1-2.0 {
       | CREATE TABLE t1(             a INT,             b INTEGER,
       | c BLOB,             d TEXT,             e REAL         ) STRICT;
       | } {}         do_catchsql_test strict1-2.1 {         INSERT INTO
       | t1(a) VALUES('xyz');         } {1 {cannot store TEXT value in INT
       | column t1.a}}
        
       | cryptonector wrote:
       | YES YES YES PLEASE. I would also like a mode where no type
       | coercion happens.
        
       | pdimitar wrote:
       | I'm very happy to see this! I'll still keep my CHECK constraints
       | but having the table opt-in STRICT mode is unloading a boulder
       | off of my shoulders.
       | 
       | Very grateful to the SQLite team! Kudos for not digging your
       | heels in and trying to accommodate your diverse community.
       | 
       | Big respect.
        
       | justinclift wrote:
       | This is really welcome. One idea for an approach I had a while
       | back (but haven't attempted writing code yet) was to see if
       | "personalities" could be implemented.
       | 
       | For example a "PostgreSQL" personality (set via PRAGMA maybe?)
       | which dynamically imposes the same type names and "limitations"
       | on SQLite as PostgreSQL.
       | 
       | Ditto for other SQL dialects (MySQL, etc).
       | 
       | SQLite already has the flexibility to allow the data types, so
       | this idea is a way of putting constraints in place that are
       | really beneficial for a _lot_ of common use cases.
       | 
       | eg "You're using SQLite as a local test for a
       | [PostgreSQL/MySQL/etc] prod database"
        
         | electrum wrote:
         | The H2 embedded database has compatibility modes for other
         | databases, which seems helpful for testing, but ends up being
         | incompatible in various ways for anything non-trivial. You end
         | up fighting the compatibility and skipping useful native
         | features of the target database, while still not having
         | confidence that the code works on the target database.
         | 
         | We now use the awesome Testcontainers library (available for
         | multiple languages) that makes it easy to launch the real
         | database in a Docker container. Testing applications against
         | multiple databases is now trivial.
        
         | zarzavat wrote:
         | Honestly in that case you should just use Postgres locally.
         | SQLite was never meant to be a stand-in for a true RDBMS. As
         | they say, it's intended to be a replacement for fopen(). The
         | fact that it works for local development is a fluke.
        
           | nicoburns wrote:
           | Yeah, I've never understood people's aversion to running
           | Postgres or MySQL locally. They're super easy to install, and
           | they'll run in the background using minimal resources ready
           | for you when you need them.
        
             | xupybd wrote:
             | Yeah for development Postgres is easy to install. For
             | deployment it comes at a complexity cost.
             | 
             | I have small scripts that do things like send an email on a
             | given business event. These scripts pull data from the main
             | ERP. To keep the system upgradable we avoid changes to the
             | ERP's database. So these scripts need their own DB to store
             | state, such as when the email was sent. At first this
             | involved one Postgres instance with databases for each
             | script or application. If that Postgres server moves IP or
             | has a version upgrade often every script or app needs a
             | config change or library upgrade. Everything gets coupled
             | together. By using an SQLite DB per script a range of
             | problems go away. Of course anything that has multiple
             | users or needs to be reported on has to go into a central
             | DB but a surprising amount can happly run on it's own
             | independent SQLITE DB.
             | 
             | To move hosts or backup the application and DB you take a
             | copy of the folder. It's so easy.
        
           | da_chicken wrote:
           | You say that like a networked RDBMS isn't meant to be a
           | replacement for fopen() at the end of the day.
           | 
           | Yeah it does a lot more than fopen(). So does SQLite. It's
           | still, bottom line, just secondary storage with a fancy API.
        
             | zarzavat wrote:
             | Not many people are willing to make PostgreSQL a dependency
             | of their program.
        
               | da_chicken wrote:
               | Not a dependency in the sense of not a statically linked
               | binary? Sure.
               | 
               | Not a dependency in the sense of not requiring a specific
               | version of a Postgres db provider, or a specific version
               | of a Postgres itself, or able to do any work without
               | Postgres existing? _Laughably no_.
               | 
               | DB agnostic software is very uncommon, _even though ODBC
               | is well-established and mature_.
        
               | twic wrote:
               | How about DuckDB:
               | 
               | https://duckdb.org/
               | 
               | It keeps databases in a single local file like SQLite,
               | and it supports PostgreSQL syntax (not sure which
               | version). It does a load of funky stuff to be really fast
               | at analytics workloads, but I would guess it's fast
               | enough at transactional workloads for small-scale use.
        
       | spicybright wrote:
       | Huh, thought columns already had strict datatypes. Only used
       | postgres before.
       | 
       | Very scary sqlite doesn't do that already...
        
       | janvdberg wrote:
       | Side note, Richard Hipp was on this weeks Changelog podcast.
       | Super interesting as always: https://player.fm/series/the-
       | changelog-software-development-...
        
       | simonw wrote:
       | I'm so happy to see this.
       | 
       | Personally I don't particularly care - SQLite's loose typing has
       | never caused any problems for me - but I've seen SO many
       | programmers dismiss SQLite as an option because they find its
       | loose typing distasteful.
       | 
       | Eliminating one of the most common reasons that people reject
       | SQLite feels like a big win to me.
        
         | laurent123456 wrote:
         | Same, I think as long as you are strict with types in your
         | business logic and test units, SQLite loose typing is not an
         | issue.
        
         | pmontra wrote:
         | I inherited a PHP/SQLite project years ago. I found timestamps
         | with three different formats in a column because different PHP
         | pages had different ideas of what a timestamp is. OK, the
         | problem was the original developer that wrote all of that code
         | but the database made it too easy to write that horror.
         | 
         | I wish they add a real strict type for dates and times too.
        
           | simonw wrote:
           | Ouch, that sounds nasty!
           | 
           | There's a pattern for enforcing datetime formats using a
           | CHECK constraint which I've not tried myself yet but which
           | looks like it could work:
           | https://sqlite.org/forum/forumpost/4f4c96938f4bef32?t=h
        
       | scottlamb wrote:
       | Awesome. The weird column type behavior is possibly the only
       | thing I dislike about SQLite3. Having to add the word "strict" at
       | the end of each "create table" statement is a slightly annoying
       | thing I might forget, but it's a reasonable concession to
       | compatibility, and I can write a schema test to catch omissions.
        
       | epilys wrote:
       | Interesting that values are still coerced if possible. I'd expect
       | STRICT to be absolute in this case.
        
         | em500 wrote:
         | I hope you realize that most of the common databases
         | (PostgreSQL, MySQL, SQL Server) do implicit type conversion on
         | insertion.
        
       | krylon wrote:
       | This is nice. Very nice. But to be honest, having used SQLite in
       | many hobby/toy projects for ~15 years, I have never ever
       | encountered a type error with it. Stricter handling of types is a
       | good idea, IMHO, but if the alternative had been, say, a
       | reasonable type to represent date/time values, I would have
       | chosen the latter.
        
         | MrBuddyCasino wrote:
         | Is your usual language strictly typed?
        
           | krylon wrote:
           | These days, I am a Go person, so mostly yes, but I used to be
           | a Python, Perl, and Ruby person, and I did not encounter any
           | problems there, either.
           | 
           | I think if you expect your database to use certain types, you
           | write your code accordingly. Why would you ever want to
           | insert a BLOB into an INTEGER column? _Some_ kind of data
           | validation /sanitizing is necessary anyway, the database
           | engine being strict about it is just an additional layer of
           | protection, but it shouldn't be one's first line of defense
           | against mistakes. IMHO.
        
       | tehbeard wrote:
       | Ignoring the side show of "wait it doesn't have types" and "lol
       | it's bad because it's lax on types"....
       | 
       | Is having a "STRICT" keyword appended to the end of the CREATE
       | TABLE syntax the best option? I'd have thought this would be a
       | PRAGMA option if ever implemented.
        
         | chacham15 wrote:
         | PRAGMA doesnt enable you to incrementally adopt the feature.
        
       | gigatexal wrote:
       | This makes SQLite even better than it already was. I am 100%
       | onboard with this effort.
        
       | temp8964 wrote:
       | I am not surprised that many comments show surprised about the
       | column types are flexible by default. I guess most people don't
       | read the original documentation before starting using it.
       | 
       | This happens because most tutorials do not mention this either. I
       | will not be surprised if most authors of the tutorials don't know
       | this either.
       | 
       | I did read some webpages on the official website, but I don't
       | remember seeing this either. It's possible that I read pass this
       | information but did not pay attention to it.
        
         | simonw wrote:
         | This STRICT feature is a brand new (currently marked as DRAFT)
         | and hopefully scheduled for the next release of SQLite. I
         | believe this document only showed up on the website today.
        
           | temp8964 wrote:
           | Flexible is the default. This is my comment about, flexible
           | not strict.
        
             | simonw wrote:
             | https://www.sqlite.org/datatype3.html and the FAQ:
             | https://www.sqlite.org/faq.html#q3
        
         | em500 wrote:
         | SQLite's flexible datatype approach is mentioned in
         | "Distinctive Features Of SQLite"[1], "Quirks, Caveats, and
         | Gotchas In SQLite"[2] and detailed in "Datatypes In SQLite"[3].
         | 
         | I find it surprising that apparently many devs use SQLite
         | without knowing any of this. Pretty much the first thing I try
         | to find out about languages or databases new to me that I might
         | want use is the datatype support.
         | 
         | [1] https://www.sqlite.org/draft/different.html
         | 
         | [2] https://www.sqlite.org/draft/quirks.html
         | 
         | [3] https://www.sqlite.org/draft/datatype3.html
        
           | temp8964 wrote:
           | I think many people use SQLite come from analysis/statistics
           | background, not programming background. For them, restrict is
           | the obvious default, so obvious that they never think about
           | the existence of the alternative.
        
       | coldacid wrote:
       | It would be nicer if this were a per-column constraint rather
       | than a per-table one, but it's very welcome all the same.
        
       | goodells wrote:
       | I had no idea SQLite was so fluid with its column types - that's
       | kind of alarming. I guess it's fine if you still have an ORM in
       | between you and it, and this STRICT syntax is probably the most
       | reasonable workaround, but it just seems like a band-aid on top
       | of not following the sensible pattern of every other database
       | engine.
        
         | kevin_thibedeau wrote:
         | It's designed to integrate with Tcl with minimal fuss over
         | typing. The world latched on to it because of its dependability
         | and relative light weight.
        
         | rastapasta42 wrote:
         | I find in practice this limitation is not an issue and column
         | flexibility does not get in the way.
        
           | temp8964 wrote:
           | The problem is not it does not get in the way. The problem is
           | that some expect it to get in the way when the data is not
           | right and assume it will do when in fact it won't .
        
           | chousuke wrote:
           | That's kind of the problem. The database absolutely should
           | get in your way if you try to put nonsense data in it.
           | 
           | For the kinds of uses SQLite sees, the loose approach to data
           | integrity may not have caused much grief, but I can't imagine
           | many situations where it would have helped either.
           | 
           | Having support for enabling sane behaviour is welcome even if
           | it has to be opt-in for compatibility.
        
       | Slix wrote:
       | One disadvantage is that inspecting the column datatype will no
       | longer hint to the application whether the column contains a
       | date.
        
       | JNRowe wrote:
       | I wouldn't go as far as to say I'm aghast at the lax typing1, but
       | I look forward to this change hitting a release. Adding some
       | sugary support for a strict DATETIME type in a future release
       | would tick every box I want from sqlite.
       | 
       | In a couple of projects I work on the _main_ reason for plopping
       | an ORM between sqlite and the application is to implement type
       | strictness, and removing that need would be excellent.
       | 
       | It would also neatly shut down a discussion that comes up far too
       | often when someone wants to use sqlite as data store, which is
       | surely a good thing as it is often just a stop motion argument in
       | my experience.
       | 
       | 1 I am closer to that camp _iff_ I have to choose an option from
       | the article.
        
         | gwd wrote:
         | I think "other developers are aghast" is a bit strong for
         | technical documentation. I was certainly never aghast: knowing
         | that SQLite started its life aimed at TCL, the decision made
         | sense, and the commitment to backwards compatibility is
         | admirable. Nonetheless I certainly welcome this change.
         | 
         | The "doesn't enforce foreign key constraints by default" is
         | much more surprising / annoying to me.
        
           | SQLite wrote:
           | Scan this HN thread to see comments from devs who are aghast.
           | :-)
        
             | crazygringo wrote:
             | And even more, scan the HN thread from 16 days ago:
             | 
             | https://news.ycombinator.com/item?id=28050198
             | 
             | "Aghast" feels like an appropriately accurate
             | characterization to me. ;)
             | 
             | I'm actually curious if this change was a direct (and
             | incredibly fast) reaction to that thread, or if the timing
             | was just coincidental?
        
               | SQLite wrote:
               | coincidence
        
           | JNRowe wrote:
           | I suspect if I had to answer as many questions/rants about
           | type strictness as the authors do I'd probably choose much
           | stronger language. It definitely wasn't intended, but I guess
           | my earlier comment could be read as the ten-thousandth
           | entitled complaint of the day too.
        
       | sradman wrote:
       | Great news, domain constraint enforcement is a welcome new
       | feature. However, let's not throw the baby out with the
       | bathwater; being able to specify an ANYTYPE/VARIANT column on a
       | STRICT table would make this feature more useful. The canonical
       | use case for ANYTYPE is a BIGTABLE or Entity-Attribute-Value
       | (EAV) model.
       | 
       | I haven't checked recently, but I was unable to determine the
       | type of ? parameters in SQLite prepared statements. Maybe this is
       | something that can also be accommodated with STRICT tables.
        
         | SQLite wrote:
         | In something like this:
         | 
         | CREATE TABLE t1(a INT, b TEXT); INSERT INTO t1(a,b)
         | VALUES(1,'2'); SELECT * FROM t1 WHERE a=?;
         | 
         | The type of the ? is ambiguous. You can say that it "prefers"
         | an integer, but most RDBMSes will also accept a string literal
         | in place of the ?:
         | 
         | SELECT * FROM t1 WHERE a='1'; -- works in PG, MySQL, SQLServer,
         | and Oracle
        
       | ysleepy wrote:
       | Nice to see. I almost never want mixed values in a column, so
       | being able to turn on a strict mode preventing inserting a string
       | into an int column is very welcome.
       | 
       | Also the per-table opt-in is reasonable.
       | 
       | sqlite is really becoming this bedrock thing in the computing
       | landscape, it's silently everywhere and all agree that it is kind
       | of a good thing.
        
       | aidanhs wrote:
       | I've created ~five new projects over the past year using SQLite
       | and I've got into the habit of creating tables like this:
       | CREATE TABLE tIssue (         id   INTEGER PRIMARY KEY NOT NULL
       | CHECK (typeof(id) = 'integer'),         col1 BLOB NOT NULL
       | CHECK (typeof(col1) = 'blob'),         col2 TEXT
       | CHECK (typeof(col2) = 'text' OR col2 IS NULL)       );
       | 
       | This has saved me from a _lot_ of errors and has been useful, but
       | it 's just so easy to let things drift, e.g. removing `NOT NULL`
       | constraint and forgetting to also update the check constraints.
       | I'm also horrified at the auto-conversions that I've just learned
       | about from this link that I'm not protected from.
       | 
       | I'm very much looking forward to using strict tables.
        
         | em500 wrote:
         | > I'm also horrified at the auto-conversions that I've just
         | learned about from this link that I'm not protected from.
         | 
         | Most of the common databases that I'm aware of (PostgreSQL,
         | MySQL, SQL Server) do implicit type conversion on insertion (as
         | well as in in many other places). I haven't checked this, but
         | it wouldn't surprise me if that's actually ANSI SQL standard
         | behavior.
        
           | derefr wrote:
           | ANSI SQL provides "CREATE CAST [...] AS ASSIGNMENT", which
           | will define a cast that gets used implicitly to get from type
           | X to type Y when you have a tuple-field (e.g. table-row
           | column, composite-value member field, stored-proc parameter)
           | of type Y, and an expression-value being passed into it of
           | type X.
           | 
           | Quoting Postgres docs (because ain't nobody paying for the
           | ANSI SQL standard just to quote it):
           | 
           | > If the cast is marked AS ASSIGNMENT then it can be invoked
           | implicitly when assigning a value to a column of the target
           | data type. For example, supposing that foo.f1 is a column of
           | type text, then:                   INSERT INTO foo (f1)
           | VALUES (42);
           | 
           | > will be allowed if the cast from type integer to type text
           | is marked AS ASSIGNMENT, otherwise not. (We generally use the
           | term _assignment cast_ to describe this kind of cast.)
           | 
           | Presumably, in any ANSI SQL-compliant DBMS, you could
           | redefine whatever assignment casts are annoying you to be
           | non-assignment casts. (AFAIK, there's no ANSI SQL
           | standardization on what casts _must_ be assignment casts; so
           | lowest-common-denominator SQL-standard-compliant queries
           | shouldn't be relying on the presence of any defined
           | assignment casts.)
        
       ___________________________________________________________________
       (page generated 2021-08-21 23:00 UTC)