[HN Gopher] Old, Good Database Design
       ___________________________________________________________________
        
       Old, Good Database Design
        
       Author : jelnur
       Score  : 99 points
       Date   : 2020-09-14 05:40 UTC (17 hours ago)
        
 (HTM) web link (relinx.io)
 (TXT) w3m dump (relinx.io)
        
       | commandlinefan wrote:
       | Some people choose nosql alternatives because they've spent time
       | analyzing the performance of a proper relational model and have
       | determined that an RDBMS will generate too much overhead for
       | their data load and consciously accept the tradeoffs involved in
       | giving up automated referential integrity.
       | 
       | Most people, though, choose nosql alternatives because they're
       | too lazy to learn how to model data.
        
         | Cieplak wrote:
         | > _have determined that an RDBMS will generate too much
         | overhead_
         | 
         | I think the read/write overhead is mostly a function of schema
         | design, rather than an intrinsic property of an RDBMS.
         | Denormalized schemas have similar performance profiles to
         | document-oriented storage.
         | 
         | Moreover, mainstream SQL databases like Postgres are getting
         | better and better at indexing jsonb fields, indexing time
         | series data with BRIN, rudimentary full-text search, offering a
         | one-size-fits-all storage system that may not be the best at
         | everything, but will be good enough to support a growing
         | business to its next funding round.
        
           | johnthescott wrote:
           | for text search the rum index is quite robust. my guess is
           | that rum (or something like it) will be introduced into the
           | pg core soon. we index many terabytes of pdf files excellent
           | performance.
           | https://github.com/postgrespro/rum
           | 
           | also, postgrespro are behind the json/b indexing.
        
         | SamuelAdams wrote:
         | I am forever grateful that I took a full semester of database
         | design in my undergrad. This single skill has stood with me for
         | my entire career so far and has enabled me to figure out the
         | root cause of many production issues. Plus people really like
         | it when you can answer ad-hoc questions like "what inspections
         | are still open and when were they first opened".
         | 
         | If y'all can understand Angular / React / Vue there's no reason
         | to not learn databases.
        
           | throw345hn wrote:
           | Could you suggest resources (books, articles, videos, moocs
           | or others) to learn good database design. I am picking up
           | skills about sql but want to better understand and learn
           | about databases. As someone who doesnt have that background,
           | a lot of the times I am just googling for stuff and just
           | trying out bits and pieces.
        
             | EvanAnderson wrote:
             | Apparently there's a 4th edition coming, but if you like
             | the form-factor of a book-length text I'd throw a
             | recommendation to "Database Design for Mere Mortals"[1]. I
             | read the 1st edition from the late 90's, but I'd imagine
             | it's still just as good. It approaches database modeling
             | from a practical non-technical perspective, and I found it
             | helped me learn data modeling in a software-agnostic
             | manner, and later to influence how I talked to non-
             | technical audiences about data modeling. I'm really glad to
             | have found it early in my career.
             | 
             | I will echo what others have said. Data modeling is a
             | force-multiplier type of skill. Combine it with a
             | reasonable understanding of SQL and you can return a lot of
             | value very quickly.
             | 
             | [1] https://www.amazon.com/Database-Design-Mere-Mortals-
             | Hands/dp...
        
               | ticmasta wrote:
               | Not required, but knowing a little relational algebra
               | helps queries make a lot more sense. A lot of the
               | features like constraints and indices are very thin
               | veneer over the underlying concepts.
               | 
               | At the other end of the spectrum, once you can write a
               | few basic queries check out something like SQL Murder
               | Mystery https://mystery.knightlab.com/
               | 
               | This one is super-fun and lets you practice some basic to
               | low-intermediate skills.
        
             | mcny wrote:
             | I'd like to know this as well. I think you'll just have to
             | build things (potentially horribly) and fail.
             | 
             | I took three semesters of database (granted, baby database
             | classes) and I still have no idea how you can do something
             | pretty straightforward like creating a room reservation
             | system.
             | 
             | If there is a reservation beginning at 10:15 AM and ending
             | at 12:30 PM and someone tries to book a reservation from
             | 10:00 AM to 10:30 AM, the transaction should fail.
             | 
             | and before someone screams db2! yes, db2 can. but then
             | you'd have to use db2 https://www.ibm.com/support/knowledge
             | center/SSEPGG_11.1.0/co...
             | 
             | Why is this so difficult...
        
               | ticmasta wrote:
               | hold start & end time in normal DB columns, use business
               | logic for your rule enforcement, have a query to identify
               | rule violation. Maybe add constraints on your warehouse
               | data but I probably wouldn't on the operational DB. I
               | could be convinced to do so for the PoC because it might
               | make sense, but what if you're managing thousands or
               | millions of scheduled "resources" like a chain of
               | libraries, or hotel rooms? What if you need to coordinate
               | "leases"? DB-level is an OK place to start, but also one
               | of the hardest to tease apart when you need to scale.
               | 
               | This is kind of the point of data modeling; you start
               | with an idea, make it all pretty 3rd-normal form, define
               | your projected indices and checks and constraints, then
               | you mess it up a bit where it makes sense or you've had
               | experience in the past.
               | 
               | YMMV
        
               | IvanVergiliev wrote:
               | Exclude constraint on a GiST index?
               | 
               | https://stackoverflow.com/a/51247705
        
               | sk5t wrote:
               | The documentation for Postgresql range types describes
               | how to do exactly this.
               | 
               | https://www.postgresql.org/docs/11/rangetypes.html#RANGET
               | YPE...
               | 
               | Edit: and if you didn't want to use postgres, you could
               | have "starttime" and "endtime" columns and reject any bad
               | bookings with a before insert / before update trigger.
        
           | cgh wrote:
           | Absolutely agree. Learning data normalisation was up there
           | with Typing 9 (touch-typing class in grade 9, not Hindley-
           | Milner whatever) in terms of long-term usefulness.
        
           | shoguning wrote:
           | Pardon my ignorance--is inspections some concept that relates
           | to database management, or are you referring to a query like
           | "select * from inspections where status = 'open';". Honestly
           | asking.
        
             | crgwbr wrote:
             | I believe OP is just referring to the ability to run adhoc
             | queries. Typically nosql solutions are built to be performt
             | for common access patterns, at the expense of being
             | difficult or impossible to query in unplanned ways. SQL DBs
             | are very good in that regard.
        
         | rdiddly wrote:
         | All depends what you're doing, right? Is it a small stupid app
         | with a short lifespan? Is it "just" a proof-of-concept? Is the
         | company likely to survive long enough that technical debt
         | matters? (oblig. HN/SV/startup comment) Is there a ridiculous
         | imbalance between data ingestion volume vs. reads/retrieval?
         | All good arguments for alternative/noSQL approaches.
        
       | slifin wrote:
       | Most relational databases aren't relational enough
       | 
       | For example in Drupal you can have a node table which is a
       | foreign key relationship to many other custom tables
       | 
       | In SQL I can't say get me all the nodes that have a start date
       | without explicitly left joining to a potentially dynamic number
       | of other tables using a higher level language to modify the query
       | 
       | In Datomic or Datascript or Datahike or Datalevin or Crux this
       | query would be trivial these are not nosql databases they're EAVT
       | datalog engines, that in some cases also support SQL for
       | compatibility
        
       | rapfaria wrote:
       | Which software is used to make those diagrams? The ones I use
       | don't have good arrows like that
        
       | geebee wrote:
       | Nice link. Nothing controversial, but sometimes simplicity is
       | controversial in our field.
       | 
       | I've slowly come around to seeing proper database design as the
       | most essential foundation of an IT system. I remember reading
       | "your data will outlast your application", and I've been around
       | as a developer long enough to have lived it.
       | 
       | One big anti-pattern I've seen with ORMs is that developers who
       | don't really think in terms of data and relationships use the ORM
       | as a kind of object serialization usable only from the
       | application. Rather than thinking of the database as something
       | useful that could be queried and accessed outside the context of
       | the application, they write objects out to various tables and
       | then re-import and re-construct them once they're needed again in
       | the app, often with dependencies that are in code or even yaml
       | configuration files. The upshot is that you simply can't use the
       | database as anything other than a persistence tier for an app. It
       | really would be no different if they had simply given the objects
       | and id and pickled them to disk. The resulting data store really
       | is that inaccessible and meaningless outside the context of the
       | app.
       | 
       | As a result, if an analyst wants a report, they can't write SQL
       | to get those reports, even though the persistence tier is, if
       | perhaps in name only, a "relational database". And when the app
       | goes away (as it inevitably does), they'll still want to know,
       | say, how many beakers and test tubes were ordered by a lab tech
       | who participated in 100 or more experiments per year with at
       | least one faculty member from radiology between June and December
       | in 2007.
       | 
       | But because they don't really know SQL, they see no value to it,
       | and they're honestly just kinda irritated that they don't have an
       | object database, which is what they understand a database to be -
       | a way to pickle and reimport objects. The application outlasts
       | the developer, and the data outlasts the application.
       | 
       | Yeah, if you're writing an app that will hold potentially useful
       | data, definitely think about how this will be accessed outside
       | the context of the app that perhaps inspired its creation, and
       | think about how you'd get at the data if the application went
       | away. That'll probably lead you to old, good database design.
       | 
       | Now, I actually do think that you can create a decent relational
       | database through an ORM. I saw this go sideways with Rails a
       | bunch of times, but I think that's probably because Rails made it
       | so easy to start developing that a lot of people new to
       | application development skipped the design stage of the data
       | backend. You can construct a pretty robust database with
       | migrations rather than CREATE TABLE statements, as long as you
       | stick to the basic principles - but the technology makes it
       | relatively easy to do the opposite and never really get into the
       | mindset of data.
        
         | Supermancho wrote:
         | > Nothing controversial
         | 
         |  _ahem_
         | 
         | > Foreign Key constraint is the king of the relational database
         | design
         | 
         | Amazon does not use FK constraints and I have rarely run into
         | systems that do (since 1996ish). Most people with big enough
         | datasets learn not to use them. The overhead for orphaned data
         | is far less than the consequences of using them.
        
           | geebee wrote:
           | Ah. Fair enough, yes, I agree. That statement goes a little
           | too far, and is a bit at odds with the otherwise overall
           | light tone of the article and the YMMV-ish disclaimer at the
           | end.
        
       | mooneater wrote:
       | Is the diagram a joke? That's total ERD spaghetti.
        
       | 60secz wrote:
       | Haven't you seen ghostbusters? "Don't cross the streams. It would
       | be bad." If your data model has lines crossing over, it's the
       | first smell of a bad design. Seriously tho, 5 minutes of
       | untangling would make that data model diagram 100x better.
        
       | gopalv wrote:
       | > A well-thought design can save us many hours of coding,
       | testing, and troubleshooting.
       | 
       | That is the very definition of a waterfall design model.
       | 
       | I've turned into a fluid-design advocate over the years, where
       | every design principle follows a next question - "okay, this is
       | good but how would I change it?".
       | 
       | So you start with a unique constraint and four months later, you
       | find out that it is not actually unique (like "two patients with
       | the same email, because the 2nd one is a newborn on day of
       | birth").
       | 
       | Or you normalize a data-set only to find out that your 1:N
       | relationship turns into a 2:N relationship from before/after
       | dates (like "UK" goes from "EU" to "UK").
       | 
       | The lost-time work of a design is usually the "okay, we did it in
       | a world where UK was in EU - but we undo it & here's how" notes.
       | 
       | Having a plan to decommission the nuclear plant you're building
       | is super useful and often more relevant than designing it for
       | efficiency alone.
        
         | hinkley wrote:
         | > "okay, this is good but how would I change it?"
         | 
         | Otherwise known as Reversible Decisions. Any decision that can
         | be undone easily does not require the level of scrutiny, the
         | level of investment. Save that investment for the things you
         | won't be able to change.
         | 
         | When faced with an irreversible decision, it's helpful to
         | develop stalling tactics. Everything from distracting people
         | with other issues to finding a way to get a 'taste' of the
         | change without committing to it. It also helps if you stay on
         | top of release notes for tools you use, and competitors of
         | those tools. New opportunities might arise to use someone
         | else's work to solve your problem better.
        
         | nightski wrote:
         | Good design as advocated by the author is generally conducive
         | to change and is saying pretty much the same thing as you are.
         | 
         | It's a lot easier to take away unique constraints later on
         | instead of adding them in. It's easier to de-normalize some
         | data for performance than to normalize it later on. The list
         | goes on.
         | 
         | The reason the waterfall method received so much bad press is
         | because of requirements gathering, not the software or data
         | design phases. Requirements are hard to get right the first
         | time and they also change over time. But I'd be surprised to
         | find someone argue that good architecture and design is a bad
         | thing (being defined as the ability to adapt to changes in
         | requirements).
        
           | dragonwriter wrote:
           | > The reason the waterfall method received so much bad press
           | is because of requirements gathering, not the software or
           | data design phases.
           | 
           | No, it was because all three were done wrong.
           | 
           | Requirements gathering is the biggest problem, true. But even
           | if requirements were both knowable and fixed, for most
           | projects, big up front requirements gathering, design, and
           | then implementation would have lots of waste in the lean
           | sense of effort expended that spends time not delivering
           | customer value.
           | 
           | Now, that gets made worse with the rework created by the fact
           | that requirements gathering without validation by use gets
           | lots of stuff wrong and that the context is often evolving
           | such that requirements will drift between gathering, design,
           | and implementation in a waterfall project, so that lots of
           | work is done which never delivers value and needs reworked
           | before it can do so, but the problem exists even without that
           | exacerbation.
        
         | pmontra wrote:
         | Well-thought design saves time even in agile projects. Maybe
         | it's a one day activity instead of three months of it and yet
         | it makes a difference. Basically each activity (or whatever we
         | call it) is a micro waterfall.
         | 
         | Example from today: a developer came back from a week of
         | vacation, listened to the stand up meeting this morning and
         | pointed out that we misunderstood the purpose of a table he
         | worked on time ago. Result: a few hours of last week's coding
         | were useless and we spent a couple of hours together at
         | redesigning the activity. The total impact should be of about
         | one day.
        
         | silvestrov wrote:
         | When you need to add info to your database, do you refactor
         | existing tables or do you add a key-value table.
         | 
         | I've seen so many key-value tables that really needed to be
         | refactored into proper normalized tables.
         | 
         | It's just so tempting to stick extra info into KV when you
         | don't know where the project will end up.
        
         | pjmorris wrote:
         | Are you presuming that a 'well-thought [out] design' means that
         | all the thoughts, and the design, came only at first? I can see
         | a well-thought out design either as the waterfall you presume,
         | or as an accretion of design decisions made over time as in
         | your 'fluid design.' Either way can save many hours of coding,
         | testing, troubleshooting.
        
         | jacques_chester wrote:
         | It's easier to safely loosen constraints than to add them post
         | facto.
        
       | jrms wrote:
       | Hi, my apologies if it's a bit off topic, but I wonder if you
       | have any advise or can point me in some direction on the way to
       | becoming a DBA, or something similar enough? I've been working as
       | and old school sysadmin for the last 17 years and counting. Even
       | if I can do nowadays devops stuff, I always liked the DBA work,
       | based on my experience as sysadmin for DB servers (mysql, oracle,
       | and friends...), and I'm considering moving my career in that
       | direction. So is there any "DBA certification" for which I could
       | take a course and all that? Thanks!!
        
         | hobs wrote:
         | You've hit on the major problem with DBA work, which is that
         | you are touching some of the most expensive prod stuff and even
         | if you have "that cert" many places just want to see that
         | somewhere else trusted you as a DBA for that type of product.
         | 
         | I managed to transition in a company in dire straits that had
         | no other options, and then getting new DBA jobs was fairly
         | easy.
        
           | jrms wrote:
           | Thanks for your feedback, and I pretty understand what you
           | mean, I've worked with some DBA know as the "million dollars
           | error guy". But I think that's a second step, and anyway we
           | were maintaining the servers where those dbs were running...
           | So we deserve some trust maybe?
        
       | whirlwin wrote:
       | This is useful for most small apps or truly monolithic apps.
       | ...But normalization in a distributed environment with 50+ apps?
       | Really? You need to have some sort of duplication unless you want
       | big bottlenecks, performance penalties, and hot headaches
       | 
       | From the linked MS article: "Redundant data wastes disk space and
       | creates maintenance problems"
       | 
       | Made me laugh well
        
         | simonw wrote:
         | Duplication is OK provided there are strict, documented rules
         | about which copy is the "point of truth". If the duplicates
         | diverge you need to know how to fix them.
        
         | flowerlad wrote:
         | > _You need to have some sort of duplication unless you want
         | big bottlenecks, performance penalties_
         | 
         | I think that's called a cache. You can still have solid
         | database design at the core. Then add redis on top of it.
        
           | Raidion wrote:
           | It can be non cached data as well. Imagine a data that needs
           | to be sorted by data that's owned by another another service.
           | You need to replicate the data as you can't do a join+sort
           | between the data coming from two apis effectively. You don't
           | treat that data as a source of truth, but do use it some UI
           | purposes.
        
       | janvdberg wrote:
       | Reminds me of this great Derek Sivers post: https://sive.rs/pg
       | 
       | If your design is good, you need less code.
        
         | Raidion wrote:
         | Problem is that any changes to the databases need to be done on
         | a single point of failure for your application. Any change that
         | goes sideways, you risk downtime.
         | 
         | Counter this with code, where I can deploy code along side my
         | existing code and make sure it functions as I want it. If
         | something fails, I just remove that instance from the LB. You
         | don't want to touch something that valuable during regular
         | feature cycles. Databases should be altered rarely and with
         | much apprehension and a well established backup/rollback plan.
         | 
         | I'm not saying you can't use constraints and stuff, but they
         | should be really really static concepts that aren't subject to
         | change. Unlike code, you can't share validation conditions
         | across tables, so where you could update all the validation
         | logic in a service with a change in one place, you have to
         | update many tables in the database.
         | 
         | I would stay away from triggers entirely, and use a
         | queue/stream system to process data async. This can be better
         | prioritized when the DB is under heavy load.
        
         | nemothekid wrote:
         | > _If your design is good,_
         | 
         | "Draw the rest of the fucking owl"
         | 
         | Good design is _hard_. There are arguments to be made for both,
         | but the problem with  "Old, Good Database Design" is when the
         | design changes it either devolves into
         | 
         | 1. Downtime trying to move X billion rows
         | 
         | 2. Some ad-hoc K/V store on top of your RDBMS
         | 
         | And most companies tend to opt for (2) rather than (1). It's no
         | surprise that some systems just decide to choose (2) from the
         | onset.
        
       | throwaway894345 wrote:
       | If one of the purposes of relational databases is data modeling,
       | I've always wondered why there aren't good semantics for sum
       | types. The real world is full of them, but databases can't
       | express them. When I bring this up, some people respond that this
       | is the purpose of ORMs; however, this implies that we have an
       | arbitrary bifurcation in which some of the processing happens
       | efficiently in SQL and anything that depends on sum types has to
       | get hoisted up and over a network to application code. Further,
       | it allows for different clients to behave differently, possibly
       | having different notions of what any given sum type's variants
       | are (which leads invariably to data corruption). I really wish
       | databases did better here, but maybe I'm missing something.
        
         | alextheparrot wrote:
         | Can you give an example of real world data modeling where you
         | want more expressive sum types over just using enums? Enums are
         | technically a subclass of sum types, but even those are non-
         | trivial to use at a data format level (Try evolving them in an
         | on-the-wire message format like Avro or Protobuf).
        
           | talaketu wrote:
           | How do you model "postal address"? Some postal addresses are
           | PO Boxes, some are street addresses, etc. There are canonical
           | representations of these different cases. Do we just shove it
           | all in a string, and let the application perform domain
           | validation?
        
           | jimktrains2 wrote:
           | I think they may mean that the result set has elements of
           | different types. For instance if you stored restaurants by
           | genre but wanted a list of all restaurants, but retaining all
           | of the unique fields, you currently need to generate the
           | product type of the genres.
        
         | geophile wrote:
         | > If one of the purposes of relational databases is data
         | modeling
         | 
         | Huh?
         | 
         | Creating a data model before creating a database, is like
         | writing an outline before writing an essay. It organizes your
         | thoughts and gives structure to what you are about to do. Once
         | you have a data model, you can then implement it using whatever
         | database technology you choose.
         | 
         | If you don't start with a data model, you literally don't know
         | what you are doing.
        
           | throwaway894345 wrote:
           | I think you misunderstood my comment. I'm arguing that
           | relational databases would be more useful if they had sum-
           | type semantics. Data models often have OR-shaped data, and
           | pretending like this class of data doesn't exist and making
           | it the purview off application code makes relational
           | databases much less useful than they might otherwise be.
        
         | kccqzy wrote:
         | What's difficult about them? I typically use nullable columns
         | and then a check constraint to specify a custom condition for
         | nullability. Columns belonging to the same alternative in the
         | sum type must be all null or all not null. And then there's
         | check only one active alternative.
        
         | wisnesky wrote:
         | Sum types are harder to express as data integrity constraints
         | that product types - for example, Horn clause constraints can
         | be used to express that a table is a product (or limit, more
         | generally) of other tables, but to express that a table is a
         | co-product (or colimit, more generally) requires a stronger
         | logic (handling disjunction, for example). This has
         | implications for data processing algorithms such as the chase
         | that depend on the strength underling logic.
        
         | loopz wrote:
         | The traditional relational model is very focused on mutable
         | data and normalization. Different types would be categorized in
         | separate columns. So this idea would run counter to "best
         | practice" and need something foundational behind it, which
         | would be just enough out of scope for a traditionally typed
         | relational datastore. Maybe this is just another way of saying
         | the underlying theories are different, or covering different
         | areas of computation and storage. So the prevailing wisdom is
         | to not expose type complexity explicitly to clients, but just
         | export the inherent problem implicitly.
        
           | throwaway894345 wrote:
           | It seems that the relational model plainly enough wants to be
           | the gate keeper for your data model--it gives extensive tools
           | for modeling and enforcing data schema, but it just kind of
           | throws its arms up at data that is "OR" shaped. Some people
           | argue that it's because there's not an obvious way to lay out
           | sum type data in memory or on disk or on the wire, but these
           | problems are all solved by traditional programming languages
           | (data is data, at the end of the day). If you want to take
           | the "different philosophies" tack, then it seems like a
           | philosophy that only addresses AND-shaped data leaves a lot
           | to be desired.
        
       | roywiggins wrote:
       | > we should keep numeric data in "integer", "numeric" fields
       | 
       | I end up keeping numeric data as text when I'm ingesting an
       | external data source that I don't trust not to change ID format
       | on me. They're all numeric _now_ , but the format could change,
       | and the actual numeric value of the ID is not important at all.
        
         | bcrosby95 wrote:
         | Yes, be careful conflating numeric data with data that looks
         | numerical. IDs fall into the latter for sure. Unless you have
         | complete control over it, then it might always be numeric.
         | 
         | E.g. back when I thought I was super smart, on one project I
         | made the credit card cvv a number. Except they can start with
         | 0. Whoops.
        
       | TheRealSteve wrote:
       | This seems contrary to what I have learned in my career as an
       | application developer on data heavy platforms. Namely the first
       | section that concludes: > Having stressed the importance of good
       | database design...
       | 
       | I'm not in agreement with the author's concept of good design. I
       | don't want other "doors" to edit the data that bypass the
       | application logic. That's the mess SQL enables for DBAs and
       | scripts that think it's okay to change data without executing
       | business logic.
       | 
       | I think it's better to accept your data and application layers
       | are coupled and plan to evolve and refactor them together rather
       | than teach your database to defend itself from the evil business
       | logic in applications.
       | 
       | I also don't think this dictates relational DBs vs nosql. Your
       | data model is probably relational but how you choose to store and
       | access the data depends more on what use cases you are trying to
       | enable.
        
       | tarkin2 wrote:
       | JSONB objects with SQL relations in Postgresql is my happy-medium
       | between the joy of schema-less JSON and the reassurance of SQL
       | relations.
        
       | simonw wrote:
       | My least favorite part of database design is the bit where you
       | have to pick lengths for your char columns.
       | 
       | Twenty years in and I'm still picking these pretty much by
       | guessing. And when I guess wrong it causes really annoying
       | problems further down the line.
       | 
       | I love how SQLite doesn't make me do this - it just has a TEXT
       | type which is always unlimited in length.
        
         | Zippogriff wrote:
         | There's always a limit. You either define and manage it
         | yourself or it'll be done for you when some part of your system
         | breaks. In the end if you actually need performance and
         | reliability _everything_ will have a bound (if not fixed) size
         | and larger data will be processed as a stream, anyway.
        
           | colanderman wrote:
           | Postgres tip: define columns as TEXT, but with a CHECK
           | constraint that the length is what you expect. This avoids
           | the problem with varchar(...) that views inherit the
           | underlying column types (including the length specifier), and
           | prevent you from changing the maximum length in the table
           | unless you drop and recreate the view. (There are ways around
           | this through system tables but they are wholly unwieldy.)
        
             | simonw wrote:
             | I really like that - especially since changing CHECK
             | constraints can be done on large tables without having to
             | rebuild the entire table.
        
         | nickpeterson wrote:
         | Until some user inserts 100MB in a single column of a single
         | row.
        
           | simonw wrote:
           | I'd rather guard against that in my input validation
           | application logic than pick the wrong value when I'm
           | designing my table and have to deal with fixing that later
           | on.
        
         | downvoteme1 wrote:
         | Why don't you use varchar(max) as the range always. The varchar
         | data type specified that the length of this attribute is
         | variable in each record and the memory allocated depends only
         | on the number of actual characters stored in the column.
        
         | pizza234 wrote:
         | I suppose that in every RDBMs that makes the distinction, it's
         | an optimization matter - VARCHAR being stored on-page, while
         | TEXT off-page (although there can be optimizations for short
         | TEXT values); the latter will cause an extra page seek on
         | access.
         | 
         | Some database [versions] may also be unable to apply certain
         | optimizations in certain cases, when TEXT is used (eg. temp
         | tables on MySQL <= 5.7).
         | 
         | That doesn't prevent one from always using TEXT, and possibly,
         | for most of the use cases (surely, if one uses SQLite, that's
         | the case), the performance impact is not meaningful.
        
           | nemothekid wrote:
           | For what I understand in the docs, in postgres at least,
           | VARCHAR and TEXT are the same thing, and CHAR actually has
           | the performance hit.
        
             | colanderman wrote:
             | They are implemented the same, but they aren't the same
             | type. Notably, each parameterization of varchar (or char)
             | is its own type. This can cause issues when trying to
             | change the length parameter. For this reason I prefer to
             | use TEXT with a CHECK constraint.
        
         | true_religion wrote:
         | With postgress it doesn't matter. Chars are just varcars under
         | the hood, so you should always use the latter.
        
         | mpolun wrote:
         | Same with postgres. I never use specific lengths for text on
         | postgres.
        
         | brobinson wrote:
         | Postgres also has a TEXT type like this.
         | 
         | Edit: I see mpolun left a similar comment here, but it looks
         | like he has been mostly shadowbanned for about 8 years.
        
           | alextheparrot wrote:
           | Just for general interest, how can you see their comment or
           | assert they have been shadowbanned?
        
             | codetrotter wrote:
             | > how can you see their comment
             | 
             | Go to your profile and ensure you have showdead set to
             | _yes_.
             | 
             | > assert they have been shadowbanned
             | 
             | See https://news.ycombinator.com/threads?id=mpolun and
             | check out how many of their comments are showing as dead in
             | combination with the contents of said comments. (The step
             | above about setting showdead to yes might be required
             | before you follow said link in order to actually see the
             | dead comments). Almost every single one of the dead
             | comments is contributing to the conversation. This is
             | indicative of a shadow ban. HN users would not be
             | downvoting the vast majority of these comments I think.
        
           | codetrotter wrote:
           | > Edit: I see mpolun left a similar comment here, but it
           | looks like he has been mostly shadowbanned for about 8 years.
           | 
           | Yeah same. I don't have enough karma to vouch for their
           | comment but I see no big reason that said comment should be
           | dead. Their account is from 2012 and the vast majority of
           | their few comments seem to be contributing to the
           | conversation. Their first dead comment is also from 2012 but
           | at a glance said comment is the only one that stands out as
           | not contributing much to the conversation. And that's being
           | harsh even - I've probably made less substantial comments in
           | the past myself. Yet like 30% of their 3 pages of comments
           | are dead. And looking at their submissions they have ever
           | only submitted 3 stories, 2 of which appear to be from a
           | domain that they themselves control. Hardly enough to be
           | subjected to having so many of their comments killed I think.
           | Though of course there might be other factors at play, but
           | from what I see on their profile page I see nothing bad
           | enough to warrant this.
        
             | brobinson wrote:
             | I have 1500 karma, but I didn't see an option to vouch for
             | it... I have seen that option on other comments, though.
             | Maybe dang can take a look at their account.
        
         | hantusk wrote:
         | Just use TEXT in other databases as well. It really shouldn't
         | matter much in modern dbs
        
         | simnim wrote:
         | What's wrong with TEXT type for postgres, mysql, etc? In
         | Postgres you don't need to declare a length for varchar either.
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2020-09-14 23:00 UTC)