hngopher.com

       [HN Gopher] Null Values in SQL Queries
       ___________________________________________________________________
        
       Null Values in SQL Queries
        
       Author : maynman
       Score  : 65 points
       Date   : 2020-02-03 13:24 UTC (9 hours ago)
        
 (HTM) web link (mitchum.blog)
 (TXT) w3m dump (mitchum.blog)
        
       | mwexler wrote:
       | Null values are so important in representing data. But they cause
       | so much confusion in a) unexpected behaviors in queries and b)
       | inconsistent handling across various engines... I sometimes wish
       | <whisper> that they hadn't been included in the spec at all
       | </whisper>. But then I come to my senses again, and go fix yet
       | another bugged query for an analyst who didn't account for nulls
       | in the data.
        
       | wefarrell wrote:
       | Null values and inequality are extremely counterintuitive (in
       | postgres at least). If you run the query:                 SELECT
       | * FROM my_table WHERE my_column != 5
       | 
       | You would expect it to return rows that have a null value for
       | my_column, since null is not 5. However that is not the case.
        
         | [deleted]
        
         | goatlover wrote:
         | I wouldn't expect null row values since you're doing a numeric
         | comparison for my_column, and null isn't a number.
        
         | gfody wrote:
         | the idea is you don't know if the null != 5 because null isn't
         | a value it just marks the absence of a value
        
         | RmDen wrote:
         | Same in SQL Server.. this is documented behavior
         | 
         | Also a null is no equal to anything.. not even another null
         | 
         | This will print false in SQL Server
         | 
         | if null = null print 'true' else print 'false'
        
           | magicalhippo wrote:
           | > Also a null is no equal to anything.
           | 
           | Wrong. It is equal to UNKNOWN:
           | 
           | https://docs.microsoft.com/en-us/sql/t-sql/queries/is-
           | null-t...
        
             | arh68 wrote:
             | I don't think that's what it says.
             | 
             | If I'm reading it right, (null = null) is unknown, which is
             | falsy (except with ansi_nulls off, then it'll be true).
             | (null is null) is true.
             | 
             | I don't think you can test null = (null = null), i.e. null
             | = unknown. Let me know if that's possible somehow, I can't
             | get it working.
        
               | magicalhippo wrote:
               | Sorry brainfart, was responding to the second part, ie
               | comparison to another null.
        
             | RmDen wrote:
             | so?.. still not equal to anything, two unknowns are not
             | equal
             | 
             | if null = null print 'true' else print 'false'
        
               | magicalhippo wrote:
               | It's equal to something: the value UNKNOWN. This
               | influences for example how the comparison result is used
               | in compound expressions:
               | 
               | https://docs.microsoft.com/en-us/sql/t-sql/language-
               | elements...
        
         | magicalhippo wrote:
         | I think the issue here is that SQL should have more "NULL
         | variants" to express why there is no concrete value.
         | 
         | A NULL value technically means it's unknown. An unknown value
         | might be 5, hence why it's not in the result set. Some abuse
         | NULL to mean "value doesn't exist". But a value that doesn't
         | exist can't be 3, or 42, or any other value that's different
         | from 5, so in that regard shouldn't be part of the result set
         | either.
         | 
         | Others again abuse NULL to mean "doesn't apply". And in that
         | case I think it makes sense to include the row in the result
         | set. For example, if I write a query to get all people who's
         | middle name is not "William", I'd most likely want people
         | without middle names included.
         | 
         | Maybe we should have introduced NEX (non-existing) and NAP
         | (non-applicable) as possible values in addition to NULL?
        
           | at_a_remove wrote:
           | Agreed. I have, off and on, labored on a still-incomplete and
           | largely incoherent essay on this topic. NULL is overloaded to
           | the point of some confusion.
        
           | tabtab wrote:
           | Re: _I think the issue here is that SQL should have more
           | "NULL variants" to express why there is no concrete value._
           | 
           | No, that would muddy things in my opinion, like it did to
           | JavaScript. Instead, have more operations/functions for
           | dealing with them in a more "normal" way, so that we can say
           | "WHERE x <> 5" and get results one expects. I'm not sure the
           | syntax, and my drafts would take a lot of time to explain. To
           | give a taste, maybe have something like "WHERE ~x <> 5" in
           | which the tilde converts x's value to the type's default,
           | such as a blank in the case of strings.
           | 
           | If the different reasons for "emptiness" matter, then usually
           | it suggests the need for a "status" column of some kind so
           | that queries can be done on the reasons. I'd need to study
           | domain specifics to recommend something specific.
        
             | magicalhippo wrote:
             | But that would mean you need to be aware that the column
             | can have this property, no?
             | 
             | Continuing with my middle name example. Say I and everyone
             | I knew had middle names, so I write a database including a
             | required middle name column. Later I discover not everyone
             | has middle names, and so I need to relax the restriction.
             | 
             | In your case, I would change the column to accept NULLs,
             | and I'd have to remember to go over my query to add the ~
             | operator.
             | 
             | In my case, I'd change the column to accept NAPs (or
             | whatever) and since a NAP value would behave differently to
             | a NULL for <> (and other operators), I wouldn't need to
             | change my query.
        
               | tabtab wrote:
               | Re: _I 'd have to remember to go over my query to add the
               | ~ operator._
               | 
               | I'd almost always use it no matter what. In fact if
               | starting SQL over via a DeLorean, I'd reverse it to
               | require "~" to make it null-sensitive.
               | 
               | It's analogous to case-sensitive comparing. The vast
               | majority of the time you don't want case-sensitive
               | comparisons such that case-insensitive should be the
               | DEFAULT, and you only add extra specifiers/functions if
               | and when you need to do case-based comparing.
               | 
               | If not, then you either end up with SQL full of
               | "toUpperCase(x)", or problems keep popping up if somebody
               | typed stuff wrong or the UI coders messed up.
               | 
               | Similarly, if your SQL is full of NVL() or DENULL() or
               | whatnot, it means the language was poorly factored for
               | your domain, or even most domains. It flunked D.R.Y. in
               | design, such as comparisons having the wrong default
               | behavior.
        
               | magicalhippo wrote:
               | Yeah that'd be interesting.
        
           | scottlamb wrote:
           | > Maybe we should have introduced NEX (non-existing) and NAP
           | (non-applicable) as possible values in addition to NULL?
           | 
           | Codd (the inventor of relational algebra) actually suggested
           | this. I think the primary source is a book that may not be on
           | the web. There's some discussion here (mostly saying why they
           | think it didn't happen and wouldn't work out):
           | https://arxiv.org/html/1606.00740v1/
        
         | grzm wrote:
         | NULL in SQL is often interpreted in many different ways. The
         | most helpful I've found is to think of it as _unknown_.
         | Postgres has the IS DISTINCT FROM operator to capture what
         | you've intended above:                   ... WHERE my_column IS
         | DISTINCT FROM 5
        
           | wefarrell wrote:
           | I wasn't aware, thanks for the tip. Is there any equivalent
           | for sets of values? For example:                 SELECT *
           | FROM my_table WHERE my_column NOT IN (5, 6)
        
             | unnouinceput wrote:
             | ...and is not null
        
               | wefarrell wrote:
               | That will have no effect on the query.
        
               | unnouinceput wrote:
               | fine, here is the full query since extrapolating from
               | incomplete data is hard for you:
               | 
               | SELECT * FROM my_table WHERE my_column NOT IN (5, 6) AND
               | my_column IS NOT NULL
               | 
               | happy now?
        
               | oarabbus_ wrote:
               | Pretty sure most DBs do not treat that any differently
               | than
               | 
               | `select * from table where column not in (5,6)`
               | 
               | Can you name one that does?
        
               | wefarrell wrote:
               | These two queries will always return the same results:
               | SELECT * FROM my_table WHERE my_column NOT IN (5, 6)
               | SELECT * FROM my_table WHERE my_column NOT IN (5, 6) AND
               | my_column IS NOT NULL
               | 
               | Because "my_column NOT IN (5, 6)" will exclude NULL
               | values.
        
               | unnouinceput wrote:
               | Right, my bad. Upper parent was about using IN and having
               | return everything "but" 5 and 6. So here the select that
               | will do that:
               | 
               | select mumu from kaka where mumu not in (5, 6) or mumu is
               | null.
               | 
               | That will return everything, null included, except for 5
               | and 6.
               | 
               | But wait, there is more. If, for example, performance is
               | the main issue here above query is quite slow. Even on an
               | indexed table on column mumu, it will still do a full
               | scan of the table before returning. How to improve
               | performance in this case? Well, you use LEFT join on
               | itself. Implementation is left as exercise for reader :D.
        
               | Groxx wrote:
               | does postgres really not index nulls in a useful way?
               | mysql does, though it may only work efficiently on a
               | single val-or-null comparison at a time.
        
             | AdrianoKF wrote:
             | Something along the lines of
             | COALESCE(my_column, -1)
             | 
             | might work for you in this case. See [1] for documentation
             | on it in Postgres (it is in ANSI SQL though).
             | 
             | [1]: https://www.postgresql.org/docs/current/functions-
             | conditiona...
        
         | [deleted]
        
       | tabtab wrote:
       | A pet peeve of mine is concatenating null strings. It's like a
       | poison pill that nulls the whole result. 99.99% of the time
       | that's NOT what one wants domain-wise. Maybe the standard should
       | make another concatenation operator that treats null strings as
       | zero length strings. Sure, one can de-null each string in the
       | expression, but that's ugly anti-DRY code. Please fix it, I
       | haaate that.
        
       | salzig wrote:
       | there is something "missing". The SQL spec specifies `null =
       | null` to be "unknown", where i sometimes expect "true". For MSSQL
       | this can be configured using `SET ANSI_NULLS { ON | OFF }`. AFAIK
       | MySQL can't be configured. Don't know about Postgres.
        
         | himinlomax wrote:
         | The standard makes sense if you go back to the theoretical
         | basis of SQL. It seems somewhat counter-intuitive only when you
         | think of NULL as a value you set in a cell.
         | 
         | When it's the result of a relational operation (such as a LEFT
         | JOIN) however, the default makes sense while considering NULLs
         | as equal to each other is typically not useful.
        
         | quietbritishjim wrote:
         | For postgres you can just use the separate operator IS NOT
         | DISTINCT FROM to explicitly request this behaviour. In SQLite I
         | think it's just IS. I assume most SQL databases have something
         | similar, and that's a far better solution than applying a
         | global config.
        
         | hobs wrote:
         | For what its worth, don't do this - pretty much all db code and
         | practitioners expect three valued logic, not two.
        
           | salzig wrote:
           | until you have to work with a database created by an insane
           | guy. Never needed it outside of that one project. (edit:
           | small hint: composite primary key where parts can be null)
        
       | michannne wrote:
       | Another one is MIN and MAX ignore NULL values, which make for
       | some interesting rollback scenarios.
       | 
       | I also swear I have seen a gotcha involving UPDATE WHERE IN and
       | not throwing an error where it should have, which is why I always
       | quadruple check my update statements, but I wasn't able to
       | reproduce it and couldn't find any information online. I haven't
       | seen the issue in so long I forgot what it was, but it would
       | update all rows in your table even if your WHERE clause was
       | proper.
        
         | oarabbus_ wrote:
         | What is a scenario where min or max should consider NULL
         | values?
        
       | juped wrote:
       | >For example, Oracle database won't allow you to have an empty
       | string. Anytime Oracle database sees an empty string, it
       | automatically converts the empty string into a NULL value.
       | 
       | Damn. This is how you do enterprise.
       | 
       | I might be the only person who likes SQL nulls. If you learn how
       | they work up front, they're useful and not really that confusing.
       | But if I ran into weird behaviors like this, I might hate them
       | too.
        
         | ak39 wrote:
         | AFAIK, this Oracle "feature" is only true for columns marked as
         | "not nullable". So if you attempt to insert an empty sting ("")
         | into a not-nullable column, it will fail.
         | 
         | All other relational databases differentiate between empty
         | strings and NULL.
        
           | shawnz wrote:
           | It is also true for nullable columns.
        
           | Sean1708 wrote:
           | Nah, it's true for all columns:                 SQL*Plus:
           | Release 11.2.0.4.0 Production on Mon Feb 3 15:56:30 2020
           | Copyright (c) 1982, 2013, Oracle.  All rights reserved.
           | Connected to:       Oracle Database 11g Express Edition
           | Release 11.2.0.2.0 - 64bit Production              SQL>
           | create table foo(fook varchar2(10) not null);
           | Table created.              SQL> insert into foo values ('');
           | insert into foo values ('')                               *
           | ERROR at line 1:       ORA-01400: cannot insert NULL into
           | ("SPORTSBOOK_DOCK"."FOO"."FOOK")                     SQL>
           | create table bar(bark varchar2(10));              Table
           | created.              SQL> insert into bar values ('');
           | 1 row created.              SQL> insert into bar values
           | ('a');              1 row created.              SQL> insert
           | into bar values (null);              1 row created.
           | SQL> select * from bar;              BARK       ----------
           | a                     SQL> select * from bar where bark = '';
           | no rows selected
        
         | beefield wrote:
         | You are not the only one. I get that there are some exotic and
         | theoretical corner cases where nulls actually are problematic,
         | but for the vast, vast majority of practical cases, nulls and
         | three state logic are very useful. In my humble opinion.
        
         | SigmundA wrote:
         | No I am tired of this WHERE (a = b) or (a IS NULL AND b IS
         | NULL)
         | 
         | Null should equal null like in every other programming language
         | even SQL group by do null equals null which is even more
         | inconsistent.
        
           | trts wrote:
           | where coalesce(a, '') = coalesce(b, '')
        
             | SigmundA wrote:
             | Normally can't use an index as every value has to be
             | coalesced before comparison.
        
           | WorldMaker wrote:
           | Something to petition ANSI about.
           | 
           | (Microsoft's SQL Server still defaults to the non-ANSI NULL
           | behavior where a = b when both are NULL, and that's something
           | that still pings on checklists of SQL Server if it follows
           | ANSI standards. SQL Server is kind enough to let you
           | enable/disable the behavior, and likely that would persist
           | even after the default switches to meet the standard as the
           | docs assure will happen "in some future version".)
        
           | mumblemumble wrote:
           | Letting null equal null might work okay for a WHERE clause,
           | but it turn JOINing into a terrible muddle. And I don't want
           | null to behave different ways in different clauses.
        
             | SigmundA wrote:
             | Already behaves differently in GROUP BY, UNION, DISTINCT
             | and PARTITION.
        
           | bloomer wrote:
           | NULL is similar to floating point NaN (aka not a number)
           | which also has the same comparison operation NaN != NaN.
        
           | Tactic wrote:
           | My preference would be a flag that you can set per
           | connection.
           | 
           | NULL = Undefined or No Data. Whereas a blank field can, in
           | and of itself, be data. It may indicated something is
           | intentionally left blank.
           | 
           | But for those times where you want to consider them the same,
           | it would be nice to have a setting.
           | 
           | (Note that I admit the possibility that this may exist
           | already, like most my great ideas.)
        
             | SigmundA wrote:
             | In MSSQL you have SET ANSI_NULLS OFF to do this.
        
           | stetrain wrote:
           | Sounds like a disaster when trying to correlate Ids/keys
           | between tables.
        
           | progval wrote:
           | > Null should equal null like in every other programming
           | language
           | 
           | Think of SQL's NULL like it's a NaN.
        
         | danso wrote:
         | There's a lot of real-world input situations in which I want
         | NULL to indicate that a record field hasn't yet been collected,
         | versus a record in which the answer to that field was "" - i.e.
         | left blank by the respondent/data source. Sure I could have the
         | front-facing data app auto-convert blank values to "N/A" or
         | "(blank)", but that's unnecessary complexity.
        
           | SigmundA wrote:
           | Empty string isn't enough for this situation since it doesn't
           | work for data types other than string, I actually prefer null
           | vs undefined in javascript / json here, xml also has the
           | concept of null vs undefined.
           | 
           | Emtpy string vs null is not that useful if you have null and
           | undefined.
        
             | vbezhenar wrote:
             | Undefined is very underappreciated invention. I wish I had
             | undefined in Java, throwing exception on read. That would
             | make a lot of software much more reliable. I'm going to
             | write my own AutoValue library just to generate java beans
             | which will allow uninitialized values.
             | 
             | PS undefined in JavaScript is not that useful IMO. But it
             | could be useful with more strict language.
        
               | fnord123 wrote:
               | Optional was added in java 8.
        
             | giornogiovanna wrote:
             | Just to be clear, undefined is _not_ valid in JSON. The
             | only types in JSON are nulls, strings, numbers, arrays,
             | booleans and objects.
        
               | cyphar wrote:
               | "undefined" in this context means that the JSON object
               | has had the relevant key omitted rather than explicitly
               | set to null. This is valid JSON (it just so happens that
               | JavaScript handles this scenario by returning "undefined"
               | as with all undefined variables).
        
               | giornogiovanna wrote:
               | Ah, right, my bad.
               | 
               | In my defense, JavaScript treats a value of `undefined`
               | differently from an unset field / variable (e.g. when
               | testing `object.hasOwnProperty('field')`).
        
               | tabtab wrote:
               | JavaScript has too many "non-visible" value types in my
               | opinion. This makes for both messy code and messy
               | debugging.
        
               | SigmundA wrote:
               | Yes omitted value vs provided with null which is easy to
               | work with in JS because of null and undefined and them
               | being falsey.
               | 
               | Working with xml or json in more typed language like C#
               | is less fun because there is not distinction in the
               | language instead you end up using dom like checks to see
               | if the key was provided or was set to null.
               | 
               | .Net code generator use to add an extra property called
               | [property name]Specified for xml serialization so you
               | could do null values or not send the node. Definitely
               | prefer more direct language support for no value provided
               | vs a null value.
        
           | juped wrote:
           | Yeah. That's what null is for (more or less).
        
           | tabtab wrote:
           | I've never needed such in 3 decades of systems design. I'd
           | like to hear the details. An explicit flag or time-stamp
           | should be used to indicate when or if a record as been
           | updated. To be frank, heavily reliance on Null strings
           | usually means somebody is doing something wrong or awkward in
           | my opinion. Null strings cause 10 problems for every 1 they
           | solve. I stand by that and will and have defended it for
           | hours in debates. Bring it on! (Granted, most RDBMS don't
           | offer enough tools to easily do it correctly.)
        
             | SigmundA wrote:
             | Did the user leave the field blank in the form or did they
             | never even see it (multi step processes etc.)
             | 
             | Did you want to update the field to null or not update it
             | at all is another one.
        
               | tabtab wrote:
               | What exactly does "not see" mean? Some use tab, others
               | use the mouse. That doesn't tell us much and there are
               | better ways to track user hand/mouse movements if you
               | need such telemetry. UI api's often handle nulls/blanks
               | different such that you don't want to over-rely on how
               | one of them does it. Multi-screen tasks should track when
               | each sub-screen is finished, not just the final
               | transaction. I'd like to see a more specific use-case.
               | 
               | Re: _Did you want to update the field to null or not
               | update it at all is another one._
               | 
               | Who, the user or DBA? And why? Users don't know a Null
               | from a horse; you shouldn't make them think about that,
               | that's an internal guts thing.
        
               | SigmundA wrote:
               | Was the user prompted to enter the value but left it
               | blank or where they never prompted. Yes you could always
               | have more data, but this extra two value distinction is
               | common and practical in my experience.
               | 
               | On updating its related to prompting typically, again
               | communicating user intent, did the user update a field
               | from having a value to not having a value (they purposely
               | blanked it out) or they didn't touch the field and maybe
               | it wasn't even shown so don't modify it. Basically am I
               | going to generate a update statement with set field =
               | null or no set at all for that field. This is trivial in
               | json to send through application due to null vs
               | undefined.
        
               | tabtab wrote:
               | Re: _but this extra two value distinction is common and
               | practical in my experience._
               | 
               | I'm still not following. What exact hand or
               | keyboard/mouse movements constitute "left it blank"?
               | "Leave" means to "move away from". What is moving away
               | from what?
               | 
               | Usually one stores a time-stamp for input prompts
               | (screens). If the prompt never appears, you then have no
               | time-stamp and/or record for it. Why would a broken
               | prompt create a record to begin with? I'm missing some
               | context you seem to be assuming. You may be assuming
               | behavior of a specific UI product or stack.
        
               | SigmundA wrote:
               | Not sure I can explain more simply, just think multi part
               | wizard interfaces or forms with button that open up sub-
               | forms.
               | 
               | I think you get it since you talk about time stamps. Just
               | as you could record all mouse movements and keyboards to
               | get higher fidelity you could break a single record into
               | multiple with times stamps to record each step in a form,
               | then the lack of record would be distinct from a record
               | with all nulls along with time stamps. You could also do
               | a record per field (which I have seen more than once)
               | with metadata per field as other columns.
               | 
               | But without all that a system that supports null and some
               | empty value gives you more fidelity than just null that
               | again in my experience is practical.
               | 
               | Empty string and null work fine with strings for this
               | purpose but for other datatypes you start needing to pick
               | special values like the minimum value for a 32 bit int or
               | 0 date etc.
        
               | tabtab wrote:
               | Usually you don't get the entire sub-screen's data such
               | that a blank versus null distinction wouldn't help. You
               | can't "half submit" a dialog or sub-screen on the vast
               | majority of systems I ever worked with it, and one adds
               | time-stamps to critical sub-screens if they can.
               | 
               | Maybe you encountered a specific product's edge-case
               | where it helps, but I'd rather trade away 9 other
               | headaches to get your edge case headache. It may be the
               | case that Product X doesn't allow time-stamps, but should
               | we gum up all strings to make just Product X happy? I
               | don't see how the tradeoff math favors that. Plus, it
               | encourages bad designs by plugging problems with funny
               | non-printable codes. Let's not enable slop-heads.
               | 
               | Re: _but for other datatypes_
               | 
               | For the purpose of this discussion I'm limiting it to
               | strings, where the majority of null-related headaches are
               | found in my experience.
        
             | gregmac wrote:
             | A common situation I've run into is with "default" values
             | and overrides, especially for configuration-type settings.
             | NULL indicates use parent record value, while non-NULL,
             | including empty string, means to use that value. By
             | allowing empty string, you explicitly allow a user to
             | basically say "don't inherit". Think along the lines of
             | `COALESCE(userValue, tenantDefaultValue, "Default Value")`.
             | 
             | One way of implementing the UI for this is to have a
             | checkbox labelled "Inherit" or "Use default", and if it's
             | checked, disable the actual textbox (and put NULL in the
             | field).
             | 
             | I've also run into similar patterns with data telemetry. I
             | want to accurately represent the data I received, and so if
             | some older version doesn't include a particular value, I
             | want to store that as NULL, because empty string is a
             | potentially valid value. If you "normalize" NULLs to empty
             | string, and then it makes it impossible to answer a
             | question like "What % of systems have a value for this data
             | item?" since it includes versions that can't get it at all.
        
             | stronglikedan wrote:
             | I just want to know if a boolean has been explicitly set,
             | without having to check the value of another more complex
             | data type that would never be used for anything else.
        
               | tabtab wrote:
               | To clarify, my context is strings. If that was not clear,
               | my apologies.
        
               | pushpop wrote:
               | I'd go further than you and say they should be removed
               | _by default_ on all fields.
               | 
               | Want to know if a Boolean field is unset? Well it's no
               | longer Boolean because you now have 3 states for that
               | field. So why not use a char, U/Y/N with the default
               | being U?
               | 
               | NULL might have made more sense 30+ years ago when
               | systems were more resource constrained but it doesn't
               | make sense now for most of the problems people are trying
               | to solve day to day. If anything, it creates more
               | problems.
               | 
               | Just to be clear, I'm not saying they should be removed
               | entirely but rather that they shouldn't have to be
               | explicitly disabled on every CREATE.
               | 
               | I will say one use case for NULL that is hugely helpful
               | is outer joins and nested queries. However these don't
               | generate high performance queries so if you're having to
               | rely on them then you might need to rethink your database
               | schema anyway.
               | 
               | So essentially I don't disagree with you, I just think
               | you're being too nice limiting your complaint to string
               | fields.
        
               | williamdclt wrote:
               | > Well it's no longer Boolean because you now have 3
               | states for that field. So why not use a char, U/Y/N with
               | the default being U?
               | 
               | Well because instead of using a type that exactly encodes
               | the concept of "yes/no/unset" (nullable boolean), you'd
               | be using a type that encodes "any 1-character text, with
               | arbitrary meaning and most of the values being
               | nonsensical"
        
         | chias wrote:
         | I've recently gone and removed the "nullable" attribute from a
         | bunch of SQL columns that previously had them (and arguably
         | _should_ have them), and the result has actually been rather
         | pleasant.
         | 
         | One interesting problem that arises when you use nulls is that
         | it can be difficult to ensure people actually use them when
         | it's appropriate to do so. Case in point I have a field that is
         | essentially an optional positive integer, so obviously I made
         | it a nullable unsigned int. A few years later there's a pretty
         | even spread of nulls and 0s in there to indicate the same thing
         | -- to a lot of consumers, they behave the same because their
         | business logic basically says "if (foo->field) do stuff" which
         | works either way. In the end I changed it to a non-nullable
         | field using 0 as the null stand-in, which is semantically
         | _worse_ , but ended up making interesting searches over this
         | data set a lot easier.
         | 
         | On the one hand, perhaps the more correct answer would have
         | been to yell at people putting 0s in when they should have been
         | putting nulls in. On the other hand, we put constraints on
         | fields for a reason...
        
         | johannes1234321 wrote:
         | Right, I am a fan of SQL NULl as well. It is nicely consistent
         | - anything NULL in - you get NULL out. Clearly telling that you
         | get undefined data. Silently converting to empty string, zero,
         | or therelike would eventually return garbage for harder to
         | debug reasons.
         | 
         | That Oracle behavior annoys me each time, though.
        
           | tabtab wrote:
           | Re: _Silently converting to empty string, zero, or therelike
           | would eventually return garbage for harder to debug reasons._
           | 
           | It's never been a problem with _strings_ in my many decades
           | of experience, unless somebody does something which I
           | consider poor system engineering. Nearby I invited a solid
           | use-case illustrating a real string need.
        
         | JMTQp8lwXL wrote:
         | >For example, Oracle database won't allow you to have an empty
         | string.
         | 
         | It's also true for AWS' DynamoDB offering.
        
         | Starwatcher2001 wrote:
         | I like nulls too and think they make perfect sense, especially
         | with numeric fields.
         | 
         | Suppose we have an "age" field, but don't actually know the age
         | of the person, null makes perfect sense.
         | 
         | Otherwise we'd have do do something like using a "magic number"
         | like 0, -1, or a separate field altogether to indicate an
         | unknown value.
         | 
         | Granted, they do need some handling in queries.
        
           | ilogik wrote:
           | except that age = 0 is valid if your database has toddlers.
           | Hell, even -1 makes sense if you're running a query to get
           | the age of someone at a specific time. this is why you should
           | try to use NULLs whenever possible
        
         | shotashota wrote:
         | >I might be the only person who likes SQL nulls From my
         | understanding, null values are bad because its a sign that the
         | database design is flawed (See database normalizations) Perhaps
         | you have a more practical experience?
        
           | RHSeeger wrote:
           | The fact that a database is not fully normalized is not a
           | sign that it's flawed. In fact, there are cases where some
           | tables being fully denormalized makes sense (although that's
           | less common in my experience).
        
           | kstrauser wrote:
           | That's not right. "Null" means that the value is not known.
           | Suppose you have a table for employees, and you want to
           | record the last time they were paid. What do you put in that
           | column for people who just started this morning? The
           | alternatives are to use null, indicating that they haven't
           | been, or to formulate a codebase-wide sentinel value like
           | "0000-01-01" and then accounting for that in every single
           | database operation everywhere.
           | 
           | Further suppose that you have an external function in your
           | codebase to estimate how many paychecks you've paid to
           | someone, but the author doesn't know about any "0000-01-01"
           | conventions your office uses. Without that, you'd see that
           | Joe New Guy has worked here about 2,020 years, so we've
           | probably issued him about 48,000 checks. If only you'd used
           | null, then that function would have calculated "today() -
           | null", which in any sane language would raise a type
           | exception and alert you to the problem.
           | 
           | Nulls are beautiful. They have meaning. Lots of people misuse
           | them, but that doesn't mean they're not valid and useful.
        
             | oarabbus_ wrote:
             | this is 100% how I feel. It's mindboggling that some people
             | here think NULL is "wrong" or should be avoided. Nulls are,
             | as you said, beautiful.
        
           | giornogiovanna wrote:
           | 1. NOT NULL is very frequently used in schemas, because NULL
           | is often an undesirable value (e.g. for a mandatory field).
           | That doesn't make NULLs bad, and NULLs are still frequently
           | used when you have optional fields or fields where NULL has
           | some other special meaning.
           | 
           | 2. NULLs are used by SQL functions and operators as an
           | "unknown" value. So, for example, "NULL AND TRUE" is NULL,
           | because we could substitute NULL with TRUE or with FALSE to
           | get different results, but "NULL AND FALSE" is FALSE, because
           | no matter what we substitute NULL with, the result will
           | always be FALSE.
           | 
           | 3. Clearly all these valid uses of NULL do not indicate
           | "flaws" in the database design.
           | 
           | 4. Database normalization isn't always a good thing, and
           | beyond a certain level it's almost always a bad thing, so
           | using normalization methods as a standard for whether
           | something is "flawed" is probablyn ot the best idea.
           | 
           | 5. No database normalization method, as far as I know,
           | actually tries to eliminate NULLs, so I don't know what "(See
           | database normalizations)" refers to. Can you clarify?
        
             | tabtab wrote:
             | Re: _NOT NULL is very frequently used in schemas, because
             | NULL is often an undesirable value_
             | 
             | Talking strings, usually if you don't want a null string,
             | you _also_ don 't want blanks (white space) either. I'd
             | like to see auto-trim in the standard, and also a minimum
             | length specifier. We then wouldn't need to deal with nulls.
             | A single min-length-when-trimmed-and-denulled integer value
             | would replace a lot of repetitious hubbub in typical CRUD
             | apps. D.R.Y. it! You'd have one attribute "cell" that would
             | _replace_ the equivalent of:                    IF
             | length(trim(denull(inputValue,'')))              <
             | this_fields_min_length THEN
             | raise_data_too_short_error(...);
             | 
             | That's the way you'd want it done in the vast vast majority
             | of CRUD systems (if not doing weird things).
        
       | [deleted]
        
       | kords wrote:
       | DynamoDB, which is NoSql, also doesn't accept empty strings. But
       | at least, Oracle automatically converts the empty string into
       | NULL, comparing with DynamoDB which would actually fail the
       | query.
        
       | Andromeda88 wrote:
       | I was dealing with NULLs whole day on MySQL workbench. It wasn't
       | considering int as NULL value. Needed to make all empty cells 0
       | to be able to import data properly.
        
       | irrational wrote:
       | We recently moved from Oracle to Postgres. We had thousands of
       | queries written based on the way Oracle handles NULLs and empty
       | strings. It took us the better part of a year to rewrite all of
       | them to the Postgres way. I am so glad to be off of Oracle.
        
       ___________________________________________________________________
       (page generated 2020-02-03 23:00 UTC)