[HN Gopher] Things I wished more developers knew about databases
       ___________________________________________________________________
        
       Things I wished more developers knew about databases
        
       Author : zdw
       Score  : 559 points
       Date   : 2020-04-22 04:45 UTC (18 hours ago)
        
 (HTM) web link (medium.com)
 (TXT) w3m dump (medium.com)
        
       | dirtydroog wrote:
       | I find SQL is quite like Python. You know what you want to do,
       | you can take a stab at attempting a couple of times but
       | eventually you'll consult stackoverflow and kick yourself for not
       | figuring it out. Repeat the above in two weeks when you've
       | forgotten it.
        
       | Jaruzel wrote:
       | Since when did Medium.com offer to log you in with your Google
       | account like this?
       | 
       | http://www.jaruzel.com/files/medium-google.png
       | 
       | I've got uBlock and Cookie Cleaner running - but it's clearly
       | doing some sort of cross-site shenanigans to create this pop up.
       | 
       | Time to stop visiting Medium.com I think.
        
         | benatkin wrote:
         | Since a long time ago. This is old news. iFrame sandboxing is
         | now a two-way street, depending on configuration. Not only can
         | an iFrame be prevented from accessing the parent frame, a
         | parent frame can be prevented from accessing the child frame.
         | It is still vulnerable to clickjacking, but to reduce the
         | impact of that, using one-tap sign up only allows the most
         | basic Google permissions.
         | https://news.ycombinator.com/item?id=17044518
         | https://developers.google.com/identity/one-tap/web
        
         | delecti wrote:
         | I've been seeing that for ages, months at least, maybe years.
        
         | [deleted]
        
         | cyral wrote:
         | > Time to stop visiting Medium.com I think.
         | 
         | Click the X and move on. It's helpful to be able to sign up for
         | a site so quickly with such integration.
        
       | stuaxo wrote:
       | I just want my databases to be normalised, and my queries
       | optimised.
       | 
       | Stuff like Django CMS is terribly architected, so code is ping-
       | ponging between DB and code.
       | 
       | This would have been a non-issue if all the APIs were built with
       | batching in mind, and the data was normalised in the first place.
        
         | jtchang wrote:
         | Just to be clear you are talking about the CMS and not django
         | itself right? Django itself is fantastic.
        
       | indymike wrote:
       | Nice! Perfect timing. I just started writing a new greenfield
       | project and decided to use Postgres instead of a NoSQL after five
       | years of NoSQL.
        
       | [deleted]
        
       | [deleted]
        
       | AtlasBarfed wrote:
       | One thing I've noticed that in the medium term of a software
       | service (2-5 years) is that your software should have the ability
       | to do double-writes to and flip reads between two different
       | datastores.
       | 
       | That will afford migration with as close to transparent migration
       | with as reduced a downtime or no downtime.
        
         | earhart wrote:
         | That gets tricky if one of the writes fails.
        
           | hobs wrote:
           | Generally you get two phase commit involved here, or an
           | asynch mirror that you can restore over if there's failure.
        
             | smilliken wrote:
             | Two phase commit makes a promise it can't keep. CAP theorem
             | is not amused.
             | 
             | It's slow, complex, a leaky abstraction, and ahould be
             | avoided for better consistency models.
        
               | hobs wrote:
               | I generally couldn't agree more, but yet its the first
               | tool in the toolbox for a lot of databases.
        
         | smitty1e wrote:
         | That sort of flexibility does not sound as though it would
         | scale well as the data get big.
        
           | madhadron wrote:
           | Define "big." Double writing in some form is the only option
           | for live migration and you either bolt it on later or plan
           | for it now.
        
             | smitty1e wrote:
             | When the overhead of the double writing becomes substantial
             | enough affect the architecture.
        
               | madhadron wrote:
               | Ah. So bigger than Facebook or Google.
        
         | jatone wrote:
         | wat. particularly today with logical replication this is pretty
         | much unnecessary.
        
       | falcolas wrote:
       | (The 80/20 rule applies below, some developers do care)
       | 
       | Developers... just don't care. They want to spin up an ORM, point
       | it at a URI, and forget about it.
       | 
       | I've fought this for over a decade now as a DBA, SRE, DevOps, and
       | architect. Most of the developers don't want to deal with
       | anything infrastructure-wise; they want to spend all the time
       | they can just focusing on the problem they're writing software to
       | solve.
       | 
       | Observeability, reliability, scalability - these are all words
       | that are translated into either "someone else's problem" or
       | "unproductive busywork" in their minds.
        
         | rietta wrote:
         | I sometimes feel in the minority. I love databases. When I work
         | in the Ruby on Rails ORM ActiveRecord I can actually visualize
         | the SQL it is generating in my head and also do all sorts of
         | tricks when needed.
        
           | petepete wrote:
           | That's the key. ORMs get a bad name but most of the time you
           | just want to display a list of things, or one thing in more
           | depth or maybe create a new thing.
           | 
           | ORMs unfortunately, have a habit of getting in the way when
           | you want to do something they don't natively support. When
           | they just ignore things the database provides people just end
           | up reinventing the wheel. Rails' implementation of enums is a
           | good example of this.
        
             | lovegoblin wrote:
             | > Rails' implementation of enums is a good example of this.
             | 
             | The advantage of ActiveRecord enums (vs db-native) is that
             | you can change the list of valid values without having to
             | run a whole ALTER TYPE - and all the overhead that would
             | entail in doing without downtime in a production db.
        
         | balls187 wrote:
         | Probably because they don't need to think about those things.
         | 
         | I started my career as an Embedded Software Engineer, and
         | memory allocation, and clock cycles hand to be managed. Our
         | software ran on systems with limited memory, and had to fit in
         | one 60hz cycle. We supported VAX system that used VAX Floating
         | point, and had to be cognizant of both floating point
         | conversions AND endian byte encoding.
         | 
         | These days, such concepts are basically just trivia answers for
         | interview questions.
         | 
         | I had to think about those things because it was required.
         | System software would crash, and debugging it on an expensive
         | government flight sim 1000 miles away was impossible.
         | 
         | Perhaps for the developers you work with, they don't need to
         | think about those things because they're not required to. After
         | all they have access to a dedicated DBA/SRE/DevOps/Architect
         | guru.
        
         | hermitdev wrote:
         | Not meaning to come off as trite, but I know my response will
         | sound like I am.
         | 
         | But, what you described is exactly why senior devs/engineers
         | make 2x, 3x, 4x a junior.
         | 
         | Seniors make more because they have that extra knowledge you're
         | talking about that can take a decade or more to accumulate. The
         | idiosyncrasies between different DBMS or even languages or even
         | specific versions of languages. That knowledge doesn't come
         | cheap. It literally takes upon years to develop.
         | 
         | This is partially why I think bootcamps are bullshit. I'm
         | sorry, but there is no way to become a competent full-stack
         | developer, especially if you're not targeting web/browser. Give
         | me someone with 3 months of bootcamp C++, I'd be surprised if
         | they can get a nontrivial program to link, let alone compile.
        
         | testnew2 wrote:
         | I don't care and that's why I use raw sql lol. I don't want to
         | learn some orm bs
        
         | throwaway13337 wrote:
         | All of those things are unproductive until after your project
         | is successful which is usually not a guarantee when building
         | the software.
         | 
         | You're not gonna need it.
        
           | hobs wrote:
           | Observeability, reliability, scalability - really easy to
           | bolt on once you are successful.
           | 
           | There's a middle ground, and a DBA turned SRE turned
           | Architect is probably ok with compromises, those are all
           | roles where "it depends" is a bylaw.
        
             | aeyes wrote:
             | How is scalability easy to bolt on? Most companies end up
             | rewriting their entires stack, multiple times.
        
             | falcolas wrote:
             | So, they really aren't that easy to bolt on, if not
             | considered from the beginning. Monoliths, for example, are
             | a real PITA to make reliable and scalable.
             | 
             | Worse, once your company is successful, there will be an
             | endless list of features to add to your product, meaning
             | nobody has the time to "bolt those features on". How the
             | product begins, is how the product often lives on well past
             | it's expected lifespan.
        
               | hobs wrote:
               | My apologies that the sarcasm was a bit too subtle - no
               | they are not.
        
             | AndrewKemendo wrote:
             | Those are absolutely not easy to bolt on.
        
         | jl6 wrote:
         | Application developers don't care about data - they want the
         | database to be a dumb storage unit because they only care about
         | their application functionality.
         | 
         | If you can convince your application developers of the prime
         | importance of data, they will start caring about their
         | database.
         | 
         | Applications come and go but data lives forever.
        
         | davedx wrote:
         | As a developer: I do care, but it's hard for me to focus on
         | building software if I also need to think about DBA tasks,
         | DevOps tasks, and so on. These things all take time, patience,
         | and energy. I've just spent most of today running and re-
         | running a CloudFormation template to create a SQL database.
         | Most. Of. A. Day. It's partly because I'm not a DevOps expert
         | and even if I wanted to be one, that would also take time,
         | patience and energy.
         | 
         | There's a reason people have specializations...
        
           | cookiecaper wrote:
           | > _It 's partly because I'm not a DevOps expert and even if I
           | wanted to be one, that would also take time, patience and
           | energy._
           | 
           | It's this exact divergence that creates the disconnect. If
           | someone doesn't understand and doesn't have to care about the
           | whole experience, they're going to focus on their side and
           | stop when their side is good enough.
           | 
           | On the other hand, if that same someone is going to be
           | regularly developing the application, making changes to the
           | database, and triggering deployments, they will find a way to
           | make the process flow. They'll make it adaptable enough that
           | it's not a day-long pain any time they need to run a
           | migration or spin up a test DB.
           | 
           | The whole toolkit can and should be available because every
           | piece of the stack opens up new possibilities. You want, and
           | at least at some level, can have, people who know this well
           | enough to make good use of it. Nitpicking over "not my
           | specialization" is the antithesis of a smooth engineering
           | process.
        
         | ffdjjjffjj wrote:
         | You'd be surprised how much rank and file developers can care
         | about observability and reliability. The key to unlocking this
         | is making them responsible for how their code runs in
         | production by adding them to the on call schedule.
        
           | vsareto wrote:
           | Too bad there isn't an equivalent on-call punishment for
           | management and others around the dev process who also make
           | mistakes and bad decisions.
           | 
           | As far as I can tell, they get paid more for doing less.
        
             | vajrabum wrote:
             | I'd guess that problem might go away if you make their
             | bonuses contingent on a sleep numbers (along with the SLAs)
             | for whoever has to run the application or it's downstreams
             | whether that's a dev or an ops type. Base the number on
             | pagerduty calls or something like so they can't game the
             | number.
        
               | edraferi wrote:
               | That is a wonderful idea. Managers are responsible for
               | the performance of their teams. NOBODY wants to sandbag
               | sleep numbers, so it'd be a decent comp metric.
        
         | FpUser wrote:
         | add to that "do not want to declare/worry about types", "cant
         | be bothered about RAM usage/cache coherency/etc", "concurrency"
         | etc. etc. All of those are "premature optimizations".
        
           | yjftsjthsd-h wrote:
           | > All of those are "premature optimizations".
           | 
           | And they are, until they aren't. And I really do mean that in
           | both directions: Much of the time, the simplest, dumbest,
           | most naive solution _is_ 100% fine for realistic load for the
           | forseeable future. And then some of the time it breaks
           | terribly and you do need to spend effort optimizing it,
           | whatever that entails.
        
             | FpUser wrote:
             | To be clear I am not advocating optimizing everything down
             | to the last bit here. Problem here is when THERE IS a need
             | many would not even have a slightest clue where to begin as
             | all those "low level" concepts were relegated into
             | oblivion.
        
         | jordache wrote:
         | >Observeability, reliability, scalability - these are all words
         | that are translated into either "someone else's problem" or
         | "unproductive busywork" in their minds.
         | 
         | hence why you work in a team.. everyone provide their domain
         | expertise.. we don't need dba to be dictating FE devs on
         | strategies for immutable state management in the UI.
        
         | arvinsim wrote:
         | Perhaps development work has reached complexity where it
         | demands full-time attention?
         | 
         | Perhaps there is a reason why DBA, SRE, Devops and Architects
         | are separate roles?
        
           | cookiecaper wrote:
           | > Perhaps there is a reason why DBA, SRE, Devops and
           | Architects are separate roles?
           | 
           | It's mainly an artifact of the way we've broken up degree
           | tracks, and the boundaries that each group is taught to stay
           | inside, lest any particular group actually ends up culpable
           | for a failure.
           | 
           | Make fun of the "full-stack rockstar ninja" all you want, but
           | the reality is that it is possible to have a functional
           | understanding of all of these areas. It's just a matter of
           | having the confidence and doing the legwork.
           | 
           | To the extent that more of your people have a working
           | knowledge of these domains, you'll not only have a better end
           | product, but a _much_ easier time getting stuff done.
           | 
           | There's room for hyper-specialized expert consultants in each
           | field, of course, but the myth of the myth of the full-stack
           | developer exists primarily for political convenience. Most of
           | this stuff is not any harder than the rest of it, and can be
           | learned by anyone willing to sit down and learn it.
        
             | msla wrote:
             | The phrases "functional understanding" and "working
             | knowledge" are gigantic sucking tarpits.
             | 
             | Before I got my current job, I felt confident in my
             | knowledge of computer networking at the LAN level. I knew I
             | wasn't going into the telecom world and I knew I didn't
             | have the knowledge to debug BGP or ensure a CO was doing
             | everything right, but LANs? Sure. No problem. I knew DHCP,
             | Ethernet, TCP/IP, even stuff like PPP which is more niche
             | now. Heck, I'd even passed a college course on the subject.
             | OK... set up an ICMP server and make it useful. That's LAN,
             | right? Certainly gonna be used on a LAN.
             | 
             | I'm not saying it was hard. I'm saying that I'd never
             | touched ICMP before except for ping and didn't know what
             | the more advanced stuff even was. Did I have a "working
             | knowledge" of basic networking? Did I have a "functional
             | understanding" of how to get a building full of computers
             | to talk to each other?
             | 
             | It's always something. I thought I had a good, working
             | knowledge of networking. Someone who'd actually _done_
             | networking in a corporate setting would have disagreed, and
             | pointed to a list of things I 'd never touched because
             | those things aren't useful in SoHo LANs and aren't
             | theoretical enough for a classroom. Multiply that by a few
             | dozen topic headings and watch people sink under the load.
        
               | cookiecaper wrote:
               | Unless you got yourself into a situation where you were
               | expected to set up a whole office with the same speed and
               | expertise as a full-time network engineer based on some
               | gross misrepresentation of your skillset, I don't see how
               | this story is particularly relevant. Maybe it's a good
               | cautionary tale about presuming that SoHo is the limit of
               | networking?
               | 
               | Technical topics are indeed both very deep and very
               | broad. The level of sophistication and depth is how you
               | choose your specialization, but that doesn't mean you're
               | never allowed to learn anything else. You should learn
               | enough about each field to know the shores when you're
               | standing there, to be able to communicate with the
               | "natives"/specialists, and to know when you're getting
               | out of the shallow end. This expectation should exist for
               | everyone: DBA, application developer, devops, network,
               | whatever. They should all know the territory and be able
               | to work together cohesively to identify the best place to
               | take something down deep.
               | 
               | Depending on the constraints of the project, leaving the
               | shallows means either a) developing more proficiency
               | directly and getting deeper yourself; or b) acknowledging
               | that you need someone with more expertise in that area to
               | take it from there while you go back to handle things in
               | areas you know better.
               | 
               | The thing we must avoid is "well I'm not a network
               | engineer so I don't look at Cisco configs, sorry". That
               | should be replaced with "well I'm not a network engineer,
               | so I have no idea what's happening here, but it's still
               | interesting, can I sit behind you and learn?"
        
               | msla wrote:
               | > Unless you got yourself into a situation where you were
               | expected to set up a whole office with the same speed and
               | expertise as a full-time network engineer based on some
               | gross misrepresentation of your skillset, I don't see how
               | this story is particularly relevant.
               | 
               | Taking a job as one of a business's "computer people"
               | puts you in the path of a whole lot of interesting tasks,
               | even if your main job is programming.
               | 
               | > Maybe it's a good cautionary tale about presuming that
               | SoHo is the limit of networking?
               | 
               | It's certainly that, but I want to expand on this a bit:
               | SoHo is the only stuff most people can play with. For
               | example, I can make programs and package them in Docker
               | containers and run them that way, but I don't know how
               | I'd play with Kubernetes in a realistic fashion. There's
               | whole genres of technology most people can't get
               | realistic access to without some institutional support.
               | It's an effective ceiling on some kinds of knowledge.
               | 
               | As far as learning how to learn, I agree with you. I
               | think a lot of it comes down to vocabulary: Once you know
               | the terms the experts use, you can bootstrap effectively
               | and learn more terms and bootstrap even more effectively.
               | Plus, words have a way of coming back to you at odd
               | intervals, effectively dropping you hints when you see
               | something you vaguely recognize.
               | 
               | Maybe we should all have Word Of The Day calendars.
        
               | cookiecaper wrote:
               | > _For example, I can make programs and package them in
               | Docker containers and run them that way, but I don 't
               | know how I'd play with Kubernetes in a realistic
               | fashion._
               | 
               | Minikube: https://kubernetes.io/docs/tasks/tools/install-
               | minikube/
               | 
               | Alternately, you'd take advantage of $CLOUD_PROVIDER's
               | initial signup credit and start cloud instance
               | equivalents.
               | 
               | Nowadays most things have good virtual environments
               | floating around (you can even download virtualized
               | mainframes if you want). A little bit of time tinkering
               | with such environments will take you surprisingly far --
               | _especially_ in fields like network engineering, where
               | even most professionals don 't know how to experiment.
        
           | commandlinefan wrote:
           | What, you're not a full-stack rockstar ninja?
        
             | mgkimsal wrote:
             | In either case, someone needs to have moderate expertise in
             | multiple areas of tech. If it's wrapped up in one person,
             | the business needs backups to deal with the bus factor. If
             | it's spread across multiple people, you now need to be
             | selecting for people with moderate tech expertise _AND_ the
             | ability to communicate and work together with other people
             | effectively. And you need someone to manage their process
             | (in some manner - doesn 't necessarily mean micromanaging).
             | 
             | So the 'full-stack rockstar', if they have good
             | communication skills, can exist and be valuable, but still
             | isn't the best long term solution for a business of any
             | size.
        
           | falcolas wrote:
           | A DBA can't fix a developer's code. A good 3/4 of the list
           | linked in TFA are things a DBA or Architect can't affect.
           | 
           | At some point, a developer needs to be responsible for how
           | they use a tool that's been provided for them. Abstractions
           | break.
        
             | gedy wrote:
             | Well likewise, many of the DBAs I've worked with don't try
             | and understand the software needs, and want the world to
             | revolve around what is optimal for them.
        
         | vsareto wrote:
         | Many interests are pulling developers' attention in several
         | different areas all the time.
         | 
         | Database, security, accessibility, performance, infrastructure,
         | tooling and productivity, business concerns, workflow processes
         | (agile), language concerns, new things
         | 
         | All of these like to say "if only the developer could do
         | $MY_AREA better, they'd be better developers and we'd have
         | better software". Each of them wants to pile on more
         | requirements for what devs ought to know.
         | 
         | Let's say we agree that devs should know all of these things.
         | After you tally up every area's demand, you're probably looking
         | at a 10 year timeline before someone's decent in all of these
         | areas by spending approx. 1 year on each of these areas doing
         | meaningful work (as opposed to contrived tutorials).
         | 
         | Even then, things change constantly, so you'd have to
         | continually practice all of these things. Who knows if we'll
         | add a new category next year? Then it becomes an 11 year
         | timeline.
         | 
         | At some point, developer responsibility has to stop.
        
           | harikb wrote:
           | Not really. That is exactly the point the article is trying
           | to make. Developers need to care about these things - _enough
           | to know who to go get help from_. That is the minimum. Also
           | 10 years is an exaggeration. 2 years working on a non-trivial
           | backend should expose one to these problems.
           | 
           | From what I have seen, products built without caring about
           | these will usually get rebuilt a year from the original
           | release - either by the same company or by a competitor who
           | killed them.
        
             | mgkimsal wrote:
             | > 2 years working on a non-trivial backend should expose
             | one to these problems.
             | 
             | You can be exposed to them, but without understanding them,
             | and experiencing both good, bad, and really bad 'solutions'
             | to them, and understand the impact (on the business, on the
             | code, on security, on maintainability, etc)... you just
             | can't really _get_ all that in 2 years.
             | 
             | I know plenty of people who've been 'exposed' to certain
             | type of tech problems, and the solutions they decide on are
             | objectively really bad for any metric other than "stop this
             | error from showing up on the screen right now".
             | 
             | I've been doing this for a bit over 25 years,
             | professionally now, and... there's a lot I don't 'get' with
             | current stuff. But I've seen and lived enough projects, in
             | enough different situations, to have a good idea of impact
             | of tech decisions, and to understand how to make tradeoff
             | decisions.
             | 
             | I had someone call me up to 'fix' a problem in a system I'd
             | given then 15 years earlier. It was still running, more or
             | less the same, and... spelunking your own code 15 years
             | later gives you a new perspective on the impact and value
             | of decisions you make. Many of the things people get hung
             | up on (code style, tabs/spaces/etc, particular naming
             | conventions, etc) provide pretty much no value in digging
             | in to old code that no one has touched or thought about in
             | a decade. Correct comments, sample data, repeatable tests
             | hold so much more value, but are harder to get people to
             | commit to following through on.
        
               | bdamm wrote:
               | I wish I could slap anyone who gives a hoot about tabs vs
               | spaces. Fortunately modern languages like go are removing
               | the version control problem that not caring about style
               | and using auto-formatting IDEs produces.
        
               | PunksATawnyFill wrote:
               | The solution isn't to force style on programmers within
               | the language. It's to fix the shitty diff tools that flag
               | whitespace changes as significant.
        
               | lowercased wrote:
               | when people care whether you have
               | if(foo) {
               | 
               | or                   if (foo) {
               | 
               | or                   if (foo)          {
               | 
               | you end up in pointless arguments. I don't care about
               | this sort of formatting very much (I have my own default
               | style developed over years), but I do care when other
               | people care about it, often to the exclusion of other
               | factors.
               | 
               | "but we need these tools so that we don't argue about how
               | to format code". Well... you could... just _not argue_
               | about it in the first place.
        
               | lowercased wrote:
               | I'm working in a couple of projects where there's a bunch
               | of linter-checker things that prevent any PR merges
               | (another... imo somewhat over-used tool) and... I split
               | my time between Java, PHP, various SQL engines and
               | various JS frameworks (react, extjs, vue, etc) and I'm
               | constantly battling different mental models with various
               | IDEs always showing different colored squiggles and
               | highlights telling me all the ways I'm "wrong" about the
               | code I've just typed.
               | 
               | Can't use double quotes! Always to use double quotes!
               | "Prefer string/template interpolation" in JS Always do
               | string concatenation per another project's standards.
               | 
               | Lots of different frameworks, languages, projects and
               | companies all force different types of ceremony on
               | formatting. "Hey, just let the tooling tell you!" turns
               | in to constant UI distractions telling you that you're
               | 'wrong' degrades (my?) performance. And... the value of
               | most of these formatting things is pretty low, long term.
               | I know that's heresy to some folks, and everyone I talk
               | to sort of agrees, then says "yeah, but I really think
               | standard XYZ is a good thing", but... it's nearly all
               | preference, just like tabs/spaces.
        
               | yjftsjthsd-h wrote:
               | That sounds like bad tooling. Why not have every project
               | use https://editorconfig.org/ and then have your IDE
               | auto-format? It shouldn't be popping up and making _you_
               | fix it, it should fix it for you.
        
               | lowercased wrote:
               | So there are tools that autoformat as you type?
               | 
               | let longVariableName = "hello " + this.name;
               | 
               | magically becomes
               | 
               | const longVariableName = `hello ${this.name}`;
               | 
               | without visually flagging it, just autocorrects as I
               | type?
               | 
               | And... I can just quickly swap settings for different
               | clients? Because one has eslint _block_ any PRs that don
               | 't have "prefer template" rules followed but another
               | client doesn't like that style, and don't want that style
               | in their code because it conflicts with existing style.
        
               | yjftsjthsd-h wrote:
               | I don't think that counts as a formatting issue. Yes, if
               | your clients have hard rules about different coding
               | styles at that level then it's not a technology problem
               | (nor is it likely solvable with technology). I assumed we
               | were talking about formatting issues like tabs-vs-spaces,
               | in which case yes every single project could be different
               | but auto-fixed.
        
               | lowercased wrote:
               | thx. sorry, i sounded a bit snarky before and wasn't
               | meaning to be. it's just easier for people to focus on
               | visual issues vs operational/functionality. and switching
               | between multiple projects/clients/standards illustrates
               | to me how relatively unimportant some of these things are
               | (but of course just imo).
        
               | infogulch wrote:
               | I would encourage you should check out editorconfig more
               | closely. The whole idea is that each project has a file
               | that defines the simple formatting rules for various file
               | types in the project/directory tree and your editor will
               | automatically follow them on a per-project basis. It's
               | surprisingly well supported across editors.
        
               | wwweston wrote:
               | > using auto-formatting IDEs produces.
               | 
               | I strongly suspect a lot of the remainder of the tabs vs
               | spaces reminder is actually about how it either
               | constrains editor / IDE choices OR requires and
               | investment of time to deal with whichever choice someone
               | else made.
               | 
               | "Why do you care, my IDE just handles this" is pretty
               | close to saying "use my IDE," on top of "use my
               | convention."
        
               | infogulch wrote:
               | That's why people are asking to push this problem down to
               | the language level the way Go has done. Go defines both
               | the "correct" format of the code as well as a standard
               | way for any/every editor to enforce it (gofmt etc). That
               | eliminates the double-headed subjectivity of both "use my
               | IDE" AND "use my convention" down to just "use the
               | standard convention defined by the language". And people
               | _love_ it because finally we can stop arguing about the
               | stupid color of paint for the bikeshed and just bloody
               | build the damn thing.
        
             | vsareto wrote:
             | >Not really. That is exactly the point the article is
             | trying to make. Developers need to care about these things
             | - _enough to know who to go get help from_.
             | 
             | That isn't it. They aren't telling you to learn about
             | database details for the sole purpose of "DBA handles
             | databases. Go ask the DBA database questions".
             | 
             | It's more than that. They're offloading specific knowledge
             | onto the dev and then making them accountable for it. It's
             | a reaction to common questions and an attempt to answer
             | them all at once by teaching devs answers to common
             | questions. This is a noble goal, but it's problematic _in
             | aggregate from multiple perspectives_.
             | 
             | The aggregate goal of all of these areas trying to teach
             | developers their own specialties is to make developers the
             | masters of low-hanging fruit.
        
             | [deleted]
        
           | hatchnyc wrote:
           | I have done exactly this and found it to be a frustrating and
           | thoroughly thankless exercise. No stakeholder for development
           | teams care about any of this, and your allocation of shared
           | resources plummets as people realize they can dump problems
           | on your team and instead focus their attention on the ones
           | who do not care.
        
           | debaserab2 wrote:
           | I don't agree. If you're designing data structures in a code
           | base you shoulder some of the responsibility for the
           | persistence characteristics of that data.
           | 
           | There's a lot of devs that think database design is the same
           | as starting a new ORM class and generating a migration file.
           | 
           | > Database, security, accessibility, performance,
           | infrastructure, tooling and productivity, business concerns,
           | workflow processes (agile), language concerns, new things
           | 
           | Yes, these are all things that devs should strive to know as
           | much about as possible. Software isn't easy. It takes a long
           | time to become an expert. 10 years sounds about right.
           | 
           | > Who knows if we'll add a new category next year?
           | 
           | Skill domains do come and go, but I think the ones you've
           | listed are solid staples of web development for the past
           | decade and likely will still be for a decade more.
        
             | vsareto wrote:
             | > I don't agree. If you're designing data structures in a
             | code base you shoulder some of the responsibility for the
             | persistence characteristics of that data.
             | 
             | There's usually not a relationship that goes the other way
             | though, for example, developers don't tell DBAs to pick up
             | code so they can write the models in our language in
             | addition to the underlying SQL. This highlights a trend of
             | increasing responsibilities pushed onto the developer.
             | 
             | >Yes, these are all things that devs should strive to know
             | as much about as possible. Software isn't easy. It takes a
             | long time to become an expert. 10 years sounds about right.
             | 
             | But in some cases it's specialists designating what an
             | expert developer should know. It's giving away control in
             | some respects. This turns into new job requirements and a
             | higher barrier for entry. The growth will need to stop at
             | some point.
        
               | AsyncAwait wrote:
               | Agree with this. There's the whole DevOps nowdays as
               | well, which basically just shifts what used to be an
               | entirely separate, full-time, role onto the developer.
               | Adding DBA to that sounds like it would benefit noone,
               | except perhaps business owners looking for short-term
               | savings at the expense of productivity, ala open-office
               | floor plans.
        
               | debaserab2 wrote:
               | Keep in mind that full time role used to be a hell of a
               | lot of work. Take the project I'm working on right now: h
               | ttps://gist.github.com/ajbdev/11f494906ebb6aa3f6bdf35f243
               | 7e...
               | 
               | In ~50 lines of code I can erect a load balanced set of
               | web servers with my app binary preloaded, secured in a
               | private VPN subnet with an auto-scaling policy attached
               | to it.
               | 
               | 15 years ago this project would have meant:
               | 
               | - Racking new hardware somewhere on premise
               | 
               | - Configuring several routers and networking equipment
               | 
               | - Tweaking the different software responsible for the
               | different app layers (load balancing, web servers, etc)
               | 
               | - Hoping we forecasted demand correctly and whatever
               | hardware we racked can handle a traffic spike or growth
               | surge
               | 
               | You needed a lot more expertise back then.
               | 
               | And really you can trade all of what I'm doing for a
               | higher level of abstraction that requires even less
               | knowledge of the underlying technology (netlify, heroku,
               | amplify, etc) if you're willing to pay a little more.
        
               | debaserab2 wrote:
               | > There's usually not a relationship that goes the other
               | way though, for example, developers don't tell DBAs to
               | pick up code so they can write the models in our language
               | in addition to the underlying SQL. This highlights a
               | trend of increasing responsibilities pushed onto the
               | developer.
               | 
               | Yes, because as a developer you're the one who has the
               | responsibility of implementing the business requirements
               | of the app. There's no trend here; this is the way it's
               | always been. The buck stops with development for a lot of
               | things. The developer is in a unique position to respond
               | to many incidents because they have an intimate
               | understanding of how the business requirements wed with
               | the technology in ways someone like a DBA does not.
               | 
               | DBA's have plenty of responsibilities of their own, such
               | as handling a 3AM alarm that goes off when some part on
               | the application starts hammering the DB from some poorly
               | designed N+1 query problem in the codebase. Often times
               | when the DBA tries to teach the developer it's because
               | he's sick of getting those 3AM wake-up calls.
               | 
               | > But in some cases it's specialists designating what an
               | expert developer should know. It's giving away control in
               | some respects. This turns into new job requirements and a
               | higher barrier for entry. The growth will need to stop at
               | some point.
               | 
               | That's a strange mindset to have. Different jobs have
               | different technologies, and as a developer you learn how
               | to work with them. The specialists aren't designating
               | anything, the job you're responsible for is.
        
               | vsareto wrote:
               | >Often times when the DBA tries to teach the developer
               | it's because he's sick of getting those 3AM wake-up
               | calls.
               | 
               | And if he just learned to code, he could write all the
               | queries himself, check in his own code, and never have
               | 3AM calls. I see no reason why a DBA couldn't manage to
               | write queries in source, especially with some hand-
               | holding w.r.t. source control and check-in rituals. If
               | they are already doing too much, just hire or train more
               | of them. DBAs are equally as smart as devs. Loop DBAs
               | into the business requirements.
               | 
               | >That's a strange mindset to have. Different jobs have
               | different technologies, and as a developer you learn how
               | to work with them. The specialists aren't designating
               | anything, the job you're responsible for is.
               | 
               | This is very push/pull about who should own what and I
               | personally think devs are often responsible for too much.
               | In silo'd places, make no mistake that input is gathered
               | about who owns what. In more collaborative environments,
               | you see the other sides willing to step in and share the
               | responsibilities; blurred lines, but they are still
               | there. What we've adopted is an over-reliance that a dev
               | will just learn many things and we hope he's good enough
               | at all of them that we don't really need deep knowledge
               | or need to delegate work.
        
           | AndrewKemendo wrote:
           | 10, 11 year timeline to what?
           | 
           | You should be constantly learning everything about the whole
           | stack so that you can actually build functional, reliable,
           | manageable and maintainable systems.
           | 
           | I expect a competent developer to be able to build a modern
           | multi-page web application, with a HTML/JS front end,
           | relational database back end, appropriately configured
           | certificates and DNS/CNAME/URL, build basic uptime and
           | application monitoring and do a basic SQL ETL data retrieval
           | process.
           | 
           | That seems like a reasonable bar, and while the specific
           | tools have changed over the years, that stack is basically
           | the same as it's been since the 90s.
        
             | StandardFuture wrote:
             | You seem to be contradicting yourself without realizing it.
             | 
             | > You should be constantly learning everything
             | 
             | This is an in-progress action i.e. the developer is still
             | learning.
             | 
             | > I expect a competent developer to be able to build
             | 
             | This is now considering a "learned" developer.
             | 
             | I am not sure you are making the point you think you are
             | making. The point I _think_ you are trying to make is your
             | expectations of what an experience developer should _know_.
             | But, you seem to be expressing it as what a new developer
             | should be _doing_.
             | 
             | While the core discussion of the article might be in
             | regards to what developers do and do not know, I can't help
             | but notice that a developer knowing about something does
             | not necessarily allow them to be productive in that area of
             | their knowledge (especially relative to another co-worker
             | with both the knowledge and the dedicated focus in that
             | area).
             | 
             | Also very important to note: the comment you are addressing
             | seems to be referring to knowledge that is _local_
             | (particular customer necessity /problems, particular
             | architecture choices for infrastructure, particular product
             | design decisions, particular ways to answer the quirky
             | CEO/CTO in a way that _they_ understand, etc.). There is a
             | lot of locale-based knowledge that a developer must learn
             | at a company /job/project and can even change over time
             | (temporal-locales).
             | 
             | Globally-applicable knowledge like frontend, backend, and
             | general CS concepts are for sure a reasonable expectation
             | of an experienced developer. But, there is a delicate
             | balance a developer must take in the _real_ working world
             | that is subject to _not_ attempting to master every aspect
             | of the product /business (especially if it overlaps with
             | someone else's job/focus) just because you have a high-
             | level understanding of the global concepts. In other words,
             | it is not necessarily efficient for a developer to know
             | every aspect of every language and every database in the
             | company unless that actually buys the company more
             | customers and money.
             | 
             | I would expect any decent manager to understand this very
             | basic principle. Everyone in the company trying to be a
             | master at everyone else's job does not help the company
             | make more money. Being reasonable about expectations in the
             | moment is also a critical asset of working together to make
             | money. :)
        
               | jatone wrote:
               | hes not contradicting himself, the fact is there are a
               | large percentage of developers who just don't care. its
               | not a matter of still learning. they just don't care to
               | learn.
        
               | StandardFuture wrote:
               | To be clear, I was not trying to say his points were void
               | of validity. I was trying to append some clarity to
               | properly differentiating between a developers general
               | knowledge base, a developers temporary knowledge base,
               | and the actual day-to-day _doings_ of the developer. They
               | are not equivalent sets of things even if they intersect.
               | 
               | As a personal anecdote, I know I have _learned_ many
               | things on a project that really helped improve the code,
               | that today I am not able to recall and would have to go
               | back and re-learn with a minimum of a refresher. This
               | happens a lot too. :)
        
             | vsareto wrote:
             | > 10, 11 year timeline to what?
             | 
             | To be decent at each of those areas I listed.
             | 
             | Why those 10 specific things?
             | 
             | Those are all areas I've seen "Developers should learn X"
             | calls-to-action.
             | 
             | Sometimes it's outright loathing over developers that don't
             | know specific topics. Other times it's wishful thinking, or
             | maybe a nice-to-have.
             | 
             | If you, as a developer, read all of those posts, they
             | probably all make fair points about the importance of
             | knowing each of those things as they relate to development.
             | You decide there is some merit to learning it.
             | 
             | So you decide to make a to-do list to go learn each of
             | those topics, because you want to listen to the blog posts
             | and be a good developer, and do some real work to prove you
             | know it. That's what's going to take you a while.
        
           | throwaheyy wrote:
           | Exactly. Without abstractions, developers would never be able
           | to get anything done. Every task would be dozens of new
           | rabbit holes.
        
       | Dowwie wrote:
       | "The fastest way to access to a row in a database is by its
       | primary key. If you have better ways to identify records,
       | sequential IDs may make the most significant column in tables a
       | meaningless value. Please pick a globally unique natural primary
       | key (e.g. a username) where possible."
       | 
       | Has anyone had a problem due to surrogate keys?
        
         | ComodoHacker wrote:
         | I can agree with everything in the article except this one.
         | 
         | >Has anyone had a problem due to surrogate keys?
         | 
         | There's one problem with surrogate keys: they are not
         | convenient to users (too long and not meaningful).
         | 
         | There are two problems with natural primary keys, and you are
         | guaranteed to hit one of them at some point.
         | 
         | 1. It turns out your key isn't actually unique. To resolve the
         | collision you have to replace natural key with a surrogate for
         | one of conflicting entities. Which is not always possible
         | without risk of another collision.
         | 
         | 2. It turns out your key isn't persistent. You have to change
         | it for some entities, but you can't because of so many FKs.
        
           | Scarbutt wrote:
           | _There 's one problem with surrogate keys: they are not
           | convenient to users (too long and not meaningful)._
           | 
           | You shouldn't expose those keys to users though.
        
       | hermitdev wrote:
       | > You are lucky if 99.999% of the time network is not a problem.
       | 
       | This reminds me of one time I was having networking issues
       | (around a CRUD GUI). Oddly, I identified the issue the (business)
       | day before my users did. I was working on a rather large change
       | (OS updates, 32->64 bit) and noticed that multicast updates
       | broke. The timing was funny. I was working towards a minimal
       | reproducible example already, and I had a small test app that
       | showed that multicast was broke on my PC. So, I ran down and
       | tried the app on their computers, and it worked... between the
       | two of them, but not the company et large.
       | 
       | That's when I vaguely remembered an email for Network Ops about a
       | router change on my user's floor over a weekend. I went by the
       | Ops team (they didn't like email for some reason) and told them
       | what I was seeing.
       | 
       | The short of it was, yeah, the router change screwed my 2 users
       | on the floor (they got put on the printer VLAN by accident, which
       | didn't receive multicast). Separately, the issue I saw was a bug
       | in how we built a 3rd party lib on Windows that provided 64-bit
       | multicast support.
        
       | mattferderer wrote:
       | > AUTOINCREMENT'ing can be harmful
       | 
       | I'll add that they should never be trusted to not jump around
       | either! I imagine everyone makes this mistake at least once in
       | their life.
       | 
       | There is a very high chance that the database will skip a few
       | numbers from time to time. You will then have someone from an
       | accounting department asking where Record #XX is.
        
         | stronglikedan wrote:
         | I'd posit that autoincrements are fine for primary keys, but
         | primary keys aren't fine for auditing.
        
           | csours wrote:
           | Auto-increments are fine for primary keys, until they aren't.
           | I think the list of items from the linked article are things
           | that may cause delayed problems.
        
           | LoSboccacc wrote:
           | how would you generate hole-free sequences for use cases like
           | that of parents (invoicing)?
        
             | danparsonson wrote:
             | You could do something like this.
             | 
             | Create a table with two columns:
             | 
             | 1. auto-incrementing primary key
             | 
             | 2. integer for the actual number you're generating, let's
             | call it 'IDValue'
             | 
             | Seed the new table with a single row with IDValue set to
             | one less than your minimum value (say 0), then use the
             | following process to generate a new number:
             | 
             | 1. Insert a new row into the table, with a known invalid
             | value (e.g. -1) for IDValue (note this must not be the same
             | as the IDValue from your initial row)
             | 
             | 2. Get the primary key of the newly inserted row
             | 
             | 3. Get all the rows from the table (in primary key order)
             | with primary key < the new id. This will consist of one or
             | more rows of prior valid values (or just the initial seed),
             | followed by one or more rows that are either valid or
             | invalid values (other clients may be running through this
             | process concurrently and finishing at different times) -
             | something like this:
             | 
             | Key / IDValue
             | 
             | 61 / 1119
             | 
             | 62 / 1120
             | 
             | 64 / 1121
             | 
             | 65 / -1
             | 
             | 67 / 1123
             | 
             | 70 / -1
             | 
             | 71 / -1
             | 
             | (your row is the next one after this)
             | 
             | 4. Your new IDValue == the last valid IDValue in that set
             | of rows + the number of rows between that and your new row
             | + 1 - update your row with this new value - in the above
             | example, 1123 + 2 + 1 - i.e. 1126
             | 
             | 5. Delete the first unbroken sequence of valid rows except
             | for the latest one, to keep the table small but leave at
             | least one valid IDValue (IDValues 1119 and 1120 in the
             | above example) - just something like DELETE FROM table
             | WHERE Id < 64 (in this case) should be safe.
             | 
             | The database takes care of atomically creating rows which
             | is the tricky bit, and then you can generate your own
             | number at your leisure regardless of gaps in the Key
             | numbering sequence.
        
             | uhoh-itsmaciek wrote:
             | Triggers are a great option in many situations.
        
             | marcosdumay wrote:
             | Usually, you don't. What kind of messed up requirement is
             | that? Do you want the sky to be green at sunrise tomorrow
             | too?
             | 
             | On the exceptional case where the user is on the clear and
             | there is an unavoidable reason to create this beast, you
             | create a separate numbering application, that uses database
             | transactions that are independent from the ones of the main
             | application and only cares about numbering. It will still
             | have holes, but few enough that you can manually inspect
             | every so often and explain why they happened.
        
             | hderms wrote:
             | Not the OP but perhaps if only accounting cared about it
             | you could assign these IDs separately as some asynchronous
             | single threaded process. This would only work if insert
             | rates were low enough or the assignment of the gapless
             | sequence only needed to be eventually concistent over some
             | window of time (i.e. before next payroll we must have
             | assigned gapless IDs to all the new user accounts)
        
             | balls187 wrote:
             | If it was a business requirement that you have perfectly
             | sequential invoice numbers with no gaps, do it at the
             | application level, not at the storage level.
             | 
             | Let the database do what it's great at doing: efficiently
             | store and retrieve data.
        
               | seppel wrote:
               | A database is also exceptionally good at doing
               | transactional stuff like atomically incrementing
               | something. I'd even say: This is something that belongs
               | in the database and not in some brittle application
               | logic.
        
               | mtVessel wrote:
               | Generally, I agree. But in many RDBMSs, auto-increment
               | features explicitly do not guarantee gap-less sequential
               | ordering. If anything throws, it's easy to end up with
               | discarded numbers, in which case you need to catch all
               | errors, inspect the state of the sequence and/or the
               | entity you're populating, and re-seed the sequence. At
               | this point, I'd argue you've already left the pristine
               | gardens of set theory and wandered into the thorny
               | brambles of app dev.
        
               | LoSboccacc wrote:
               | ah! TIL:
               | 
               | MYSQL - https://stackoverflow.com/questions/449346/mysql-
               | auto-increm...
               | 
               | POSTGRES -
               | https://stackoverflow.com/questions/2095917/sequences-
               | not-af...
               | 
               | thanks!
        
               | vivekseth wrote:
               | I don't think it is not possible to enforce perfectly
               | sequential numbers at the application level.
               | 
               | This is the same problem as multiple threads trying to
               | increment the same global variable. Unless there is
               | mutual exclusion while the variable is being
               | read/incremented there will be race conditions.
               | 
               | I think the only way to enforce mutual exclusion for
               | applications would be at the database layer (or any other
               | layer where there is 1 resource that is shared between
               | each application peer).
        
               | dkhenry wrote:
               | I think by "application level" they are saying "using
               | transactions at the application level". If you require a
               | strict sequence of numbers then you BEGIN, READ, INSERT,
               | COMMIT, that is really what you want so just do it
               | explicitly. If the COMMIT fails because of a duplicate
               | key then you start over.
        
         | hobs wrote:
         | There's a simple fix - Don't you ever expose primary keys to
         | the users. Ever. Seriously, ever.
         | 
         | A primary key is not an order id, it isnt a person identifier,
         | it isnt a paycheck - its a thing the database should be using
         | behind the scenes.
         | 
         | All of the things I just mentioned change - besides the primary
         | key.
         | 
         | You'll never have this problem if you separate your business
         | logic from your keys.
        
           | Terr_ wrote:
           | Quite. Not only does it protect you from architecture pain
           | later, but it also allows you to create business-keys that
           | humans have an easier time reading, comparing, and typing.
           | 
           | The implicit requirement of a global always-online single
           | master counter is great until it isn't. Perhaps one day it
           | becomes a performance bottleneck, or you have to support
           | systems that create provisional items offline, or some law or
           | client-contract stipulates everything for them must live in a
           | daughter-system hosted inside/not-inside a particular
           | country...
        
           | Macha wrote:
           | This is why I'm skeptical of the suggestion to prefer a
           | natural primary key like a username. It works fine... until
           | they day the business asks for changeable usernames because
           | BigClient is now LargeClient and can't stand anything to
           | still have their old brand identity.
        
             | hobs wrote:
             | Even in the natural key's playground (in my mind) of data
             | warehousing, you still have to manage changing dimensions,
             | and the inevitable reality that your fact table was
             | actually another dimension table all long.
        
       | wenc wrote:
       | I never realized this before but many excellent developers
       | struggle with SQL beyond simple SELECT statements. I have a
       | colleague who is by all accounts a deeply technical person but
       | one day he confessed to me that he didn't really grok SQL and
       | that he'd rather work with a "real" procedural programming
       | language to just store and retrieve data.
       | 
       | Part of it may be due to the fact SQL isn't really a programming
       | language but a declarative DSL for manipulating sets and tables.
       | Things like GROUP BYs and PARTITION BYs (window functions) that
       | come naturally to mathematical types/functional programmers are
       | less intuitive to procedural programmers.
       | 
       | I suspect this was what attracted developers to noSQL databases
       | like Mongo in the first place -- it's more attuned to a
       | programmatic mindset.
       | 
       | (this is not universally true of course -- many programmers have
       | no issues with SQL at all.)
        
         | MaxBarraclough wrote:
         | > I suspect this was what attracted developers to noSQL
         | databases like Mongo in the first place -- it's more attuned to
         | a programmatic mindset.
         | 
         | Well, it's more attuned to the dynamically typed mindset, sure.
         | Programmers who understand the value of static type systems
         | should understand the value of relational schemas.
        
           | andybak wrote:
           | I disagree. There's more to relationships that typing. I'm
           | very pro-dynamic languages and still chafe at static typing
           | but the wonder of the relational model fits nicely with my
           | liking for declarative and functional approaches.
           | 
           | (EDIT - and as another data point I dislike SQL's syntax. The
           | semantics are bearable but the syntax just makes my brain
           | melt)
        
             | MaxBarraclough wrote:
             | They're not identical concepts, but both relational schemas
             | and statically typed programming languages provide
             | assurances about basic structure. Mongo, and Python, offer
             | no such assurances, and leave it to the developer to get it
             | right.
             | 
             | Agree that SQL's syntax is rather bad.
        
               | MaBeuLux88 wrote:
               | MongoDB supports Schema Validation since MongoDB 3.2
               | (Dec. 2015) and JSON Schema since MongoDB 3.6 (Nov. 2017)
               | so MongoDB can enforce types strictly if you want to at
               | the collection level. NoSQL doesn't mean NoSchema or
               | NoTypes. https://docs.mongodb.com/manual/core/schema-
               | validation/#json...
        
               | MaxBarraclough wrote:
               | So they bolted it on eventually then. The relational
               | databases went the other way and bolted-on schemaless
               | data [0] [1]. We can still compare the schema-based and
               | schemaless approaches.
               | 
               | [0] https://www.postgresql.org/docs/10/datatype-json.html
               | 
               | [1] https://docs.microsoft.com/en-
               | us/sql/t-sql/functions/json-qu...
        
         | brightball wrote:
         | I firmly believe that every developer should spend 2-3 weeks
         | early in their career working with nothing but SQL. It will pay
         | huge dividends for the rest of it.
         | 
         | IMO a lot of the issue is that developers for many years using
         | Java or PHP, were using SQL to handle everything. The
         | application language was a pass through later between the
         | client and the database.
         | 
         | Your goal was to accomplish as much as possible in a single
         | query and then to simply return the results of that query to
         | the interface. That meant formatting numbers or currency in
         | your SQL. Optimizing inserts or updates to be handled in a
         | single query. Grouping, counting, left/inner joins, having
         | statements to filter on aggregate results. More than 1 or 2
         | queries for the primary area of the screen was both a rare and
         | foreign experience.
         | 
         | And then ORMs started to slowly integrate themselves into the
         | flow of various frameworks to automate the repetitive things
         | around CRUD tasks. Then to address scaling & bloat problems we
         | saw an uptick in REST APIs, microservices that further made
         | those ORMs the norm...and then many developers started actively
         | trying to stay within those API constraints to an almost
         | religious degree which led to a nested payload becoming
         | acceptable fueling the whole "NoSQL" situation, along with the
         | idea that it was somehow better to repeat the same data
         | thousands of times over.
         | 
         | A whole lot of people pushed back against this and eventually,
         | it mostly ran its course. I've often seen resistance to SQL to
         | be driven by _fear_ of SQL more so than anything else. As soon
         | as people get a basic comfort level with SQL, it become almost
         | automatic.
        
           | techbio wrote:
           | I have a side-project to solve Kakuro puzzles with SQL logic,
           | not because it's useful, but because it seemed possible and
           | distinctively challenging.
           | 
           | I'll show off the POC if I ever get around to smoothing the
           | rough edges.
        
           | Matrixik wrote:
           | Here is one nice example: https://sivers.org/pg
        
           | dominotw wrote:
           | > I firmly believe that every developer should spend 2-3
           | weeks early in their career working with nothing but SQL. It
           | will pay huge dividends for the rest of it.
           | 
           | i did tons of sql couple of years ago on a reporting team.
           | Now I do android dev fulltime and don't remember any SQL
           | beyond basics, highly doubt it will all come back to me if i
           | tried.
        
             | baq wrote:
             | it doesn't matter, you'll google the details. many people
             | aren't aware that sql isn't just a stupid way of filtering
             | tables and can do things server-side that you'd spend hours
             | implementing and testing on the application side with worse
             | performance in the end most of the time.
        
             | anotheracct_ wrote:
             | It's unfortunate that your education failed you so much.
        
           | TremendousJudge wrote:
           | > The application language was a pass through later between
           | the client and the database.
           | 
           | This style of doing things resulted in spaghetti style
           | unmanageable databases, filled with an unknowable number of
           | triggers and procedures, all written in PL/SQL (which is
           | much, much worse than either Java or PHP). The reason why
           | ORMs started to become popular is that you can write your
           | application without filling your DB with arcane and
           | inscrutable logic
        
             | karatestomp wrote:
             | But going too far the other way is how you end up with
             | performance 100-1000x worse than it should be. Which is an
             | actual thing that happens quite often in the wild. Also
             | often ends up heavily dependent on some ORM or framework,
             | making rewrites or multi-client DB access dangerous and
             | painful.
        
               | eternauta3k wrote:
               | You can create pretty complex SQL queries with
               | SqlAlchemy's ORM, without going back and forth to the DB.
        
             | jtdev wrote:
             | In my observations, ORM use has a perspicuous relationship
             | to piles of arcane spaghetti code.
             | 
             | Not to mention, 50%+ of ORM managed DB schemas that I've
             | observed don't have proper constraints, indexes,
             | relationships, etc. Because the developers using the ORM
             | think it's a magical tool that makes understanding SQL and
             | relational databases optional.
        
               | oftenwrong wrote:
               | When you give an object the power to access to the db,
               | you invariably end up with db access everywhere you can
               | pass that object.
        
             | harikb wrote:
             | There is an entire world between ORM and PL/SQL.
             | Programmatically constructing SQL statements is also a
             | thing. Just because someone writes SQL does not mean SQL
             | needs to be spread through out code or that we need to have
             | lot of logic in PL/SQL. Of course, there will be cases
             | where a store procedure is desired (any kind of validation
             | that cannot be expressed as fkeys, canonicalization of some
             | core data components etc). I worked on DBs writing SQL for
             | several years with only minimal code as stored procedures.
             | 
             | Of course the database vendor and culture also plays a
             | role. We were primarily a mysql/postgresql shop
        
               | resu_nimda wrote:
               | How did you minimize roundtrips between server and DB, or
               | did you find that they were not a big concern?
               | 
               | I'm working on a project with a Postgres database, and as
               | it gets more complex I'm moving more stuff into stored
               | procedures, pretty much wherever a single action requires
               | multiple statements in series (e.g. check if this thing
               | exists, check a value, get the id of some other thing, on
               | success update another table).
               | 
               | Of course I would prefer to leave as much of it in the
               | middle tier as possible because the ergonomics are better
               | there but I don't want to sacrifice performance.
        
               | monocasa wrote:
               | I think they're saying that you profile your changes on
               | an individual query.
               | 
               | Yeah, if all of your queries are slow, you've got more
               | work to do, but that doesn't change the overall process
               | of optimization.
        
             | coliveira wrote:
             | This is trying to avoid the reality that the data is in a
             | database. A database is not just a flat storage area that
             | can be poked by code from outside. It has a lot of
             | relationships and integrity constraints that need to be
             | maintained by the DB itself, and all this is expressed in
             | SQL/stored procs. Trying to avoid this is basically
             | removing the power of a DB and converting it into just an
             | expensive and cumbersome storage area.
        
             | breischl wrote:
             | But now you're filling your application with arcane and
             | inscrutable logic, with an extra layer of abstraction via
             | the ORM to make it even less scrutable.
             | 
             | I think one should view a SQL DB like a microservice.
             | Instead of REST endpoints (or gRPC or whatever), create
             | stored procedures. These define a strong contract with your
             | DB, the capabilities that it provides to your app(s). Now
             | you know what the query and insert patterns are, and can
             | tweak the table layout under the covers without screwing up
             | your application code.
             | 
             | Of course you _can_ abuse this into a spaghetti monolith,
             | just like you can evolve a microservice into a spaghetti
             | monolith, but you shouldn 't. There's no technology that
             | will prevent you from making poor architectural decisions,
             | you just have to not go down those dark paths.
        
               | wenc wrote:
               | > Instead of REST endpoints (or gRPC or whatever), create
               | 
               | > stored procedures. These define a strong contract with
               | your DB
               | 
               | Exactly. It took me years to grasp this, but when I did
               | my code became much simpler.
               | 
               | A REST API (that returns JSON) has to contort a tabular
               | data structure into a loosely-typed hierarchical data
               | structure (JSON) which has to be read back, reconstructed
               | and in many instances type-checked (e.g. DATETIMEs are
               | not native to JSON, nuanced datatypes like DECIMAL(18,3)
               | are lost).
               | 
               | Whereas a SQL interface returns data in its native
               | tabular format with all the correct types.
        
             | wenc wrote:
             | There's a middle way which is very powerful: SQL views
             | (just SQL queries; no triggers or procedures)
             | 
             | Here's a powerful mindset trick: think of _SQL views as an
             | sort of a REST API_ , but whose access language is SQL and
             | not HTTP, and that returns data in a table rather than JSON
             | (hierarchical).
             | 
             | I once tried to build a REST API to a database, and someone
             | told me I already had a battle-tested and highly performant
             | API that outperformed REST at scale -- it's called SQL. A
             | SQL view is a dynamic lens into the underlying tables, so
             | even if the underlying tables/schemas were to change, your
             | consumers don't care as long as they can access the SQL
             | View.
             | 
             | SQL views are also composable: you can build SQL views on
             | top of other SQL views, and any changes made in the base
             | views are propagated throughout. Need to add/transform a
             | field? Do it in the view. Need pull in auxiliary data?
             | Bring it in through a JOIN in the view. I've built many
             | systems by composing SQL views and they're very
             | maintainable and very flexible. They're kind of like
             | function compositions but on tabular data.
             | 
             | The rule of thumb is: always access a database through a
             | view, never the underlying raw tables. In computer science,
             | a great many maintainability issues are alleviated through
             | a layer of abstraction/indirection, and SQL views provide
             | exactly that.
             | 
             | This centralization of the core logic becomes especially
             | powerful if the database is accessed from multiple
             | consumers (webapps, analytics backends, Tableau, ML tools,
             | etc.) The "API" remains consistent throughout.
        
               | cgh wrote:
               | Agreed, and a good example of this is implementing
               | search. You can define a view on top of your searchable
               | entities that includes the urls to the entities, as well
               | as searchable metadata (entity descriptions or whatever).
               | So when you add new searchable items, you just update the
               | view to include them and the code to select from the view
               | doesn't change.
        
               | deckard1 wrote:
               | > highly performant API that outperformed REST at scale
               | -- it's called SQL
               | 
               | You are conflating many disparate things here.
               | 
               | SQL is a language (DSL) for accessing data. REST is a
               | protocol and a data transport method (one could surmise a
               | way to do REST without HTTP, but when reasonable people
               | refer to REST they mean HTTP (over TCP (over IP
               | (etc.)))).
               | 
               | Even REST is not an API. You can't do anything with a GET
               | or a POST without other abstractions built on top of
               | that. So I don't understand how anyone could make
               | performance claims beyond something like "HTTP is slow"
               | and "binary transport is faster", with regards to SQL vs.
               | REST/HTTP.
               | 
               | SQL does not define how you receive your data. Databases
               | have different methods of sending SQL and responding to
               | SQL. Oracle, MSSQL, MySQL, etc.
               | 
               | > This centralization of the core logic becomes
               | especially powerful if the database is accessed from
               | multiple consumers
               | 
               | That's the entire point of an API. Any API. REST APIs,
               | even.
        
               | cryptonector wrote:
               | Not only that, but VIEWs can have INSTEAD OF triggers,
               | which then lets you build powerful abstractions in SQL.
        
               | wenc wrote:
               | Amazing. I didn't know it was possible to write to a
               | VIEW.
        
               | cryptonector wrote:
               | Changes everything no?
        
               | jwdunne wrote:
               | This is a very interesting comment!
               | 
               | Two questions:
               | 
               | Do you have any example code that shows how this works? I
               | get what you're saying intuitively but example code will
               | help me bring it to table.
               | 
               | What about cross cutting concerns? I've found stored
               | procedures to be a performant solution here. By version
               | controlling them, and limiting to pure functions, I found
               | them quite maintainable. Would you instead just define a
               | new view, or extend an existing one, or refactor into a
               | separate view that's then joined into the existing views?
               | 
               | I haven't delved as far as views, admittedly. One app
               | featured a bit of complicated logic and eschewing the ORM
               | in favour of raw SQL helped (instead of getting tangled
               | up in Demeter chains). Despite new developers, who have
               | used purely ORM for years, shitting their pants at the
               | raw SQL, both of us who worked on it felt it was the
               | right call. We feel much better about leveraging more of
               | the database in new projects.
               | 
               | In fact, when we took our experience to a Django project,
               | my colleague wrote a Manager method in such a way that an
               | ORM favouring developer questioned because it looked too
               | much like SQL. But it was the obvious implementation to
               | us after using raw SQL. And, after benchmarking, the most
               | performant.
        
               | wenc wrote:
               | Briefly,
               | 
               | 1. Let me try with a simple example. Suppose you have a
               | fact table A with fields (ItemID, Item, Amt) where Amt is
               | in USD. Rule of thumb is: don't expose A to the consumer;
               | instead write a SQL View V_A and expose that instead:
               | CREATE VIEW V_A AS SELECT ItemID, Item, Amt FROM A
               | 
               | Then suppose a European counterpart wants to use the same
               | API but needs the amounts to be in Euros. You can write
               | another view: (in practice the conversion 0.92 shouldn't
               | be a static number, this is just for illustration)
               | CREATE VIEW V_A_EURO AS SELECT ItemID, Item, Amt * 0.92
               | AS AmtEUR FROM V_A
               | 
               | Expose this to the Europeans. You can keep stacking views
               | on top of other views. Your U.S. consumers will always
               | see the data through the lens of V_A and your European
               | consumers will always see it through V_A_Euro.
               | 
               | Suppose the underlying table A now changes. There's been
               | a merger and the company now stops reporting currencies
               | in USD, and everything is now in British Pounds so your
               | DBA adds a field AmtGBP and starts populating that field
               | instead. Amt still contains historical data, but moving
               | forward the data in Amt will be NULLs; AmtGBP is the new
               | internal baseline currency. From a VIEW perspective, all
               | you have to do is:                 ALTER VIEW V_A AS
               | SELECT ItemID, Item, ISNULL(Amt, AmtGBP * 1.23) AS Amt
               | FROM A
               | 
               | Your V_A and V_A_EURO consumers (could be Tableau, Excel,
               | other SQL views, etc.) will still happily receive data
               | per usual, unaware of the internal changes (the British
               | are coming!) that have occurred. Contract kept.
               | Table A <- View V_A <- View V_A_Euro
               | 
               | 2. Cross cutting concerns come in many forms so not sure
               | if I can address. Stored Procedures are definitely an
               | acceptable abstraction -- they accept parameters and can
               | return tabular results just like VIEWs. They do however
               | work in a procedural manner (like subroutines) and can
               | produce side effects, which is sometimes necessary to
               | accomplish very specific tasks. VIEWs on the other hand
               | are more similar to pure functions (unless random number
               | generation is involved) with no side effects. Because
               | views are dynamic, they flex with your data and VIEW
               | definitions.
        
               | projektfu wrote:
               | Are you also promoting CQRS?
        
             | jayd16 wrote:
             | This is the exact fear of SQL that's being talked about.
             | You can make a mess of anything but its not hard to have a
             | well maintained relational DB with minimal cruft.
        
           | lhorie wrote:
           | With the whole shelter-at-home thing, I had a chance to work
           | on a simple app for my kids. One of the corollaries of
           | "simple" was avoiding an ORM.
           | 
           | Implementing logic is single SQL queries can definitely be a
           | bit challenging, but I thought it was also quite rewarding
           | and liberating - Raw SQL is incredibly powerful!
        
             | beckingz wrote:
             | At first I missed for loops.
             | 
             | Then I was empowered and did not miss for loops.
             | 
             | Now I still miss for loops.
        
         | vnorilo wrote:
         | I agree that this is in part why nosql became popular. I'm
         | quite experienced with systems programming but databases were
         | never central to my interests.
         | 
         | So when I needed one for my personal stuff, I went with
         | CouchDB, because it's just a very nice RESTful API around a
         | data structure I understand well.
        
         | corey_moncure wrote:
         | I'm an SQL hater in remediation.
         | 
         | In a given week I might work with all of the following: SQL,
         | C#, Python, JS (Kendo, Vue, React), XSL, bash, and more. I'm a
         | quick learner and I pick things up fast, always have, I've got
         | a deadline and I don't have the time or capacity to fully
         | internalize the minutiae of all the technologies I have to work
         | with. In other words I depend on the tools to show me the
         | options at my disposal and construct syntactically correct
         | expressions.
         | 
         | What I really dislike about SQL is common to most systems I
         | dislike, where the tools aren't discoverable to me. In SQL it
         | boils down to the fundamental syntactical requirement to put
         | the SELECT clause before the FROM clause. So I have to build up
         | my statement in this weird spiral pattern where I change
         | something deeper in before I know what I can SELECT in the
         | first place. The ability to give tables shorthand names with
         | MS-SQL, i.e.
         | 
         | `left join [dbo].[sometable] st`
         | 
         | ...is very helpful but I gather this isn't common to all
         | dialects. Working with XML in SQL is a nightmare as the tool
         | cannot tell you whether you can do obvious things like pass an
         | sql:variable into nodes() until you actually query the server.
         | (spoiler: you can't.)
         | 
         | I vastly prefer the functional programming approach to working
         | with data e.g. C#'s lambda style linq. The tool shows me all
         | the pieces I have to work with, and all I have to do is piece
         | them together the right way.
        
           | throwaway_pdp09 wrote:
           | > In SQL it boils down to the fundamental syntactical
           | requirement to put the SELECT clause before the FROM clause
           | 
           | No offence but if you are at the level of struggling with the
           | syntax then you should not be using SQL until you have more
           | experience. There have been plenty of well-founded critiques
           | of the crappy syntax of SQL, but it's not ultimately that
           | hard. There are worse things you'll have to cope with such as
           | the implications of 3-valued logic that comes with nulls.
           | 
           | The giving-tables-alias-names feature has been standard
           | probably since the very first standard was released. In some
           | cases such as self-joins, it is necessary.
           | 
           | If I can give you a piece of advice, arrange things using the
           | 'with' clause and learn to break things down as simple as
           | possible at each step. Let the optimiser sort things out for
           | you. If performance is poor, look at the query plan. Also
           | understand you don't have to write everything in one huge
           | statement. Spool intermediate results off to temp tables if
           | that helps (not @tables but #tables - @tables have problems
           | with them).
           | 
           | As for the XML, I guess that may be better handled outside
           | the DB. IMO XML should never be made part of SQL. Good luck.
        
           | Viliam1234 wrote:
           | Reading SQL is like reading German, where the last word in a
           | long sentence determines the meaning of the entire sentence.
           | 
           | An SQL statement starts with "select ABC.XYZ", but you have
           | no idea what it means, because only one screen later it is
           | written that "ABC" is actually an alias for
           | "T_ACCOUNT_BUSINESS_CREDITS" or something. The logical order
           | would be "from ... where ... select ...".
           | 
           | Imagine a programming language designed like SQL. It would
           | look approximately like this:                 function foo(a)
           | {         return s;         b = join(a);         s =
           | concatenate("[", b, "]");         by the way, a is list of
           | strings, b is string, s is string;         also, function
           | "join" is imported from "lists", and "concatenate" is
           | imported from "string" module;         actually, don't return
           | the result, just tell me how many characters it would have;
           | }
        
         | kyllo wrote:
         | TFA isn't even about SQL the language at all though, it's about
         | the scalability and reliability characteristics of databases,
         | especially in a distributed environment.
        
         | txcwpalpha wrote:
         | When I was in undergrad I was part of program that was heavily
         | programming focused, but was actually part of the business
         | school. Several of the classes I took were _heavily_ SQL
         | focused, with at least one class that had every single
         | assignment requiring extensive designing database schemas and
         | writing SQL. Now, years later, I still think that those classes
         | were some of the most valuable to my career as a programmer.
         | 
         | What's interesting to me is that apparently the CS program at
         | my university hardly did anything with SQL, and I notice too
         | that most programmers I meet "in the wild" are lacking in SQL
         | skills, as you mentioned. It's led to some interesting
         | interview situations where I really struggle with any questions
         | about algorithms (my college courses didn't cover algorithms at
         | all) while the interviewer will tell me that I have the best
         | SQL skills of anyone they interviewed.
         | 
         | Speaking with others that went to other universities, I've
         | heard that it's similar elsewhere for the
         | "business/programming" to include SQL classes but eschew
         | algorithms, while CS programs will ignore SQL but focus heavily
         | on algorithms. It seems to me like both programs could benefit
         | from meeting in the middle a bit.
        
           | carlps wrote:
           | I have a very similar background. My degrees are in Business
           | Computer Systems and we had SQL hammered into us. We learned
           | how to program with Java and .NET, but all from a very high
           | level. It wasn't until a couple years into my career after
           | school that I started learning DS+A fundamentals on my own.
           | It blew my mind. I agree that there should be more of a mid-
           | point available.
        
         | MathCodeLove wrote:
         | As a programmer who is admittedly attracted to Mongo, a large
         | part of it is simply its ease of integration. I'd happily spend
         | more time familiarizing myself with SQL if it were less of a
         | pain to integrate into my projects.
         | 
         | The first time ever using Mongo I had a cloud cluster connected
         | and working in about 10 minutes after signing up. Trying to
         | integrate SQLite took me around 1.5 hours before it was
         | functional. To this day I have yet to set up a real cloud SQL
         | database because the one time I tried it with PostreSQL I just
         | couldnt get it to work.
        
           | izacus wrote:
           | Is initial integration cost really the most important metric
           | you look at when designing a system that will probably run
           | for years and will have to be maintained and scaled?
        
         | pbowyer wrote:
         | I like SQL when I'm not writing reporting queries. GROUP BYs
         | bite me (With MySQL 8 I end up reaching for the ANY_VALUE()
         | function), and I end up with more subqueries than I feel I
         | should need.
         | 
         | When working with time-indexed data I feel I'm forcing the
         | database to do something it doesn't want to. E.g. if I want to
         | answer the query "How many sales are there per day this month?"
         | and I want an entry for every day in the month, even when there
         | were zero sales. Or another query asking "Which days didn't
         | have sales?" and listing the days. I haven't found a way to do
         | this in-database. I end up answering in code based on the data
         | I fetch.
         | 
         | Definitely feel it's me not SQL, but not found the answer.
        
           | JamesSwift wrote:
           | The answer is likely that your storage schema is incorrect.
           | You have things stored in OLTP (i.e. app database) but trying
           | to read it as OLAP (i.e. reporting database). Once you
           | reimagine the data in the OLAP style then these kinds of
           | queries are simplistic.
           | 
           | EDIT: specifically for your example, in an OLAP style you
           | would generate a Times table and then foreign key the sales
           | table to it based on the date. Then you can easily query
           | against that Times table as the filter/bucket for your
           | queries.
        
             | pbowyer wrote:
             | Yes, agreed. It was designed for OLTP not OLAP, and I have
             | to get my mindset into that.
             | 
             | For days as my example has, would the Time table be
             | generated for say 1970-2050? Given months are of different
             | lengths and there are leap years, I'm assuming this is
             | needed, rather than storing a single year.
        
               | jacques_chester wrote:
               | I found Kimball's book to be interesting and helpful.
               | 
               | I think the Kimball approach gets bogged down in temporal
               | evolution (Slowly Changing Dimension type 1? 2? 3? 4?
               | 5?), but that's more about the underlying absence of
               | meaningful bitemporalism than the dimensional schema
               | approach per se.
        
               | JamesSwift wrote:
               | Yes, the time table is a pregenerated set of all possible
               | moments in time for whatever resolution you care about.
               | So, for example, if you cared about day resolution it
               | might be                 date_key | year_num | month_num
               | | day_num | quarter_num | week_num | month_name |
               | month_name_short       ...       20200101   2020       1
               | 1         1             1          january      jan
               | 20200102   2020       1           2         1
               | 1          january      jan       ...       20200401
               | 2020       4           1         2             14
               | april        apr       ...
               | 
               | You would FK on the `date_key` and add as many columns as
               | you need to support querying against the dimension. I
               | also like to add a proper `datetime` representation of
               | the date so I can easily do a date range query.
        
           | EvanAnderson wrote:
           | Your example is a fun one and not too hard to do.
           | 
           | Thinking about the intersection of sets is important for
           | grokking what you're trying to do in that example (and with
           | SQL in general). For me, at least, SQL suddenly made a lot of
           | sense when the idea that I was working with intersections and
           | subsets of sets.
           | 
           | In your example you have (1) a set of all dates in a range,
           | and (2) a set of sales totals for days that had sales. Set 1
           | could be a "numbers table" or generated with something like
           | "generate_series()" in PostgreSQL. Set 2 is made by
           | summarizing the data in a sales table by date using "GROUP
           | BY" and "SUM".
           | 
           | Then you're just looking at JOIN'ing those sets and
           | COAELSCE'ing the NULL returned from days when there are no
           | sales into 0.
        
             | will_pseudonym wrote:
             | I learned SQL working in Access. There wasn't anything like
             | generate_series(), and I remember having to do things like
             | above by having a start and end date, cross joined with a
             | table of nonnegative integers I made, and creating a field
             | which added an integer to the start date, subject to being
             | no greater than the end date.
        
             | pbowyer wrote:
             | > Set 1 could be a "numbers table" or generated with
             | something like "generate_series()"
             | 
             | I've done it this way when doing it per-minute or per-hour
             | - i.e. for 60 minutes or 24 hours; fixed, constant ranges.
             | I picked this example because (to me at least) it's more
             | difficult :)
             | 
             | Calendar months have different lengths, so something would
             | have to tell it which month to use (and account for leap
             | years). But say you wanted the last 60, 90 or 120 days? I
             | suppose you could first create a temporary table to be the
             | "numbers table", custom made for the range you want to
             | fetch - is there an alternative?
        
               | EvanAnderson wrote:
               | In many databases you would create a function to fill the
               | role of that "numbers table" and, effectively, map that
               | function onto rows. That function can express whatever
               | "richness" you need (i.e. civil calendar month, phase of
               | the moon, etc... >smile<) that might not be easily
               | expressed as a series.
        
               | will_pseudonym wrote:
               | There are date functions in most SQL packages which can
               | create the limiting factors you're looking for, just as
               | in procedural languages. Once you have the list of dates
               | you're interested in, the problem becomes simple.
        
               | Amezarak wrote:
               | I have done this many times. My preferred solution was to
               | create a recursive CTE beginning with the desired start
               | date and ending with the desired end date. I often based
               | these on MIN and MAX subqueries, but obviously you could
               | also pick arbitrary dates.
               | 
               | You can use days, months, weeks, years, or whatever you
               | wanted as the "increment" in the CTE using the DATEADD
               | function. Then (for nulls) you simply LEFT JOIN your CTE
               | with the desired date part (aggregated) of your table.
               | This gives you your first answer, and a simply filter
               | will get you your second answer.
               | 
               | It's also common for people to just create all these date
               | tables beforehand as actual, materialized tables. In my
               | opinion, this is less elegant (what happens in 2101?!
               | Somebody had better remember to add to the table!) but it
               | naturally works just as well and probably saves some
               | perf.
        
           | alexilliamson wrote:
           | You can (IMO) pretty succinctly enumerate all the days in a
           | month with a recursive CTE. The example in the following link
           | does it for weekdays, but it's the same idea:
           | https://www.sqlservertutorial.net/sql-server-basics/sql-
           | serv...
        
           | mulmen wrote:
           | For anything other than the absolute simplest case you should
           | have two databases. One for OLTP (the application(s)) and one
           | for OLAP (the reporting).
           | 
           | If your reporting queries are hard or complex it's because
           | you did a bad job architecting the reporting tables.
           | Reporting queries should almost always be the simplest form
           | of query if you designed your warehouse properly.
           | 
           | Days with sales is easy with a simple SUM or COUNT and GROUP
           | BY.
           | 
           | Days without sales is easy if you think about this
           | dimensionally.                 SELECT dim_days.day_of_month,
           | COUNT(fact_orders.order_id)       /*         Select from
           | dim_days first because you want every day.       */
           | FROM dim_days       /*         Outer join to the fact table
           | to pull in the data you have and add that to your dates.
           | */       LEFT JOIN fact_orders       ON dim_days.order_day =
           | fact_orders.order_day       /*         Filter results by the
           | desired range.       */       WHERE dim_days.month = 4
           | AND dim_days.year = 2020;
           | 
           | As I continue to say:
           | 
           | "SQL is easy. Data is hard."
        
           | sk5t wrote:
           | A more functional mindset can definitely help here. Think of
           | your "sales per day" model in terms of starting with a
           | sequence of days--startingDay up to startingDay + n--as the
           | input to a function that maps to an aggregate of that day's
           | activity.
           | 
           | Aggregate functions in SQL are IMHO quite awesome once you
           | develop a comfort level to stop worrying about them per se. I
           | wouldn't like to try to get Excel to tell me--or write the
           | code to do manually--something like "show me the standard
           | deviation in units sold by day of week over the last ten
           | summers."
        
             | pbowyer wrote:
             | I find thinking about analytic queries like these as
             | map/filter/reduce easier than thinking in terms of the SQL!
        
               | taffer wrote:
               | SQL and functional programming constructs are actually
               | quite similar:
               | 
               | SELECT -> map
               | 
               | FROM -> stream
               | 
               | JOIN -> flatMap
               | 
               | WHERE -> filter
               | 
               | GROUP BY -> collect
               | 
               | ORDER BY -> sorted
               | 
               | https://blog.jooq.org/2015/08/13/common-sql-clauses-and-
               | thei...
        
         | bryanrasmussen wrote:
         | I'm pretty sure I would be good at SQL if I devoted time to it,
         | because I've worked in other declarative DSLs and generally
         | outperformed other programmers using them, but basically every
         | place I've ever worked at that used SQL extensively had already
         | employed SQL experts that just handled the problems on that
         | level so that it didn't seem worthwhile bothering with it.
        
         | fimdomeio wrote:
         | I struggle with lack of experience with SQL by always being
         | told that I should always use an ORM or I would regret it in
         | the future when I would change database technology. I'm in the
         | future now and spend a lot of time debbuging the ORM and the
         | sql statements it produces, when I could split that work in
         | half by not using the orm at all. Would also have a lot more
         | experience with sql so there would probably be less bugs in the
         | first place.
        
           | sk5t wrote:
           | Swapping out database platforms is pretty deep into YAGNI
           | territory for most. Unless you _know_ you are selling an on-
           | prem software product to some customers who will demand MSSQL
           | and others who will demand Oracle, or whatever, this  "swap
           | databases" justification for scrupulously using an ORM is not
           | well grounded in reality.
        
           | btilly wrote:
           | If you're writing a CRUD application, an ORM saves a lot of
           | headaches.
           | 
           | If you're doing complex reporting queries, an ORM is strictly
           | worse. And yes, I've seen developers, architects, and authors
           | of ORMs that believed otherwise. They are wrong.
           | 
           | As an example, very, very few ORMs can make the distinction
           | between                   SELECT ...         FROM foo
           | LEFT JOIN bar             ON foo.id = bar.foo_id
           | AND bar.category_id = 5           LEFT JOIN baz
           | ON bar.id = baz.bar_id         ...
           | 
           | and                   SELECT ...         FROM foo
           | LEFT JOIN bar             ON foo.id = bar.foo_id
           | LEFT JOIN baz             ON bar.id = baz.bar_id
           | AND bar.category_id = 5         ...
           | 
           | (And if you get into things like analytic queries, just
           | forget about it.)
        
             | jakearmitage wrote:
             | Curious, what is the difference here?
        
               | [deleted]
        
               | gtsteve wrote:
               | I believe they may have intended to write this query with
               | INNER JOINs instead of LEFT JOINs.
               | 
               | If so, the result of the query would be identical but the
               | second one would likely have performance problems given
               | the category_id filter is not being applied at the point
               | where bar is joined. A clever database engine might find
               | an optimisation but I wouldn't count on it.
               | 
               | This illustrates the subtle problem that ORMs can
               | introduce - all logical tests pass but issues emerge over
               | time. I found that typically every ORM based problem was
               | resolved by rewriting it using SQL.
               | 
               | I used to use ORMs extensively, but my philosophy now is
               | that you can make simple queries by using a query
               | generator (i.e. connection.GetByID("MyTable", 100)) and
               | complex queries have to be written by hand. I would not
               | choose to use an ORM again.
        
               | btilly wrote:
               | No, left joins were intended here. They mean different
               | things.
               | 
               | With inner joins the semantics are the same. If the
               | performance is different, the database query optimizer is
               | broken.
        
               | btilly wrote:
               | The difference is whether the condition on bar is a
               | filter is on the join from bar or baz.
        
               | saltcured wrote:
               | The extra "AND expr" in the join clause makes it more
               | strict, which means the outer join can produce more nulls
               | in the joined tuple. In a sequence like A left join B
               | left join C, moving the extra join filter between the two
               | joins is the difference between getting tuples like
               | (a..., null..., null...) vs (a..., b..., null...).
        
         | gwbas1c wrote:
         | Part of the issue is that a complicated database can handle the
         | same SQL query many different ways based on indexes and other
         | configurations.
         | 
         | This kind of "magic" isn't always clear when programmers are
         | mostly used to working with data structures and procedural
         | code.
         | 
         | The other problem, IMO, is that programming languages are very
         | poor at bridging the difference between the SQL domain and the
         | language domain. We really need plugins for compilers because
         | ORM libraries often are harder to learn than the database
         | itself.
        
           | takeda wrote:
           | > The other problem, IMO, is that programming languages are
           | very poor at bridging the difference between the SQL domain
           | and the language domain. We really need plugins for compilers
           | because ORM libraries often are harder to learn than the
           | database itself.
           | 
           | Actually in past year I discovered that JetBrains IDEs (in my
           | case PyCharm) have nice feature that seems like not everyone
           | is aware. It is due to their integration with DataGrip. If
           | you configure the IDE to connect to your database, it will
           | start looking for SQL in your strings, it will then do code
           | highlighting, autocomplete table names and fields. If you use
           | refactoring it will understand SQL, if you refactor database
           | it will produce migration statements. It appears to solve all
           | the issues that ORM supposed to solve, and you still have
           | full control since SQL didn't need to be abstracted from you.
           | 
           | Edit: this video shows how it works in action (forwarded to a
           | relevant part): https://youtu.be/_FlpiNno088?t=2868
        
           | treis wrote:
           | >This kind of "magic" isn't always clear when programmers are
           | mostly used to working with data structures and procedural
           | code.
           | 
           | My problem is that it is like some sort of black magic to me.
           | If I write a complex query I have no idea if what is spit
           | back to me is actually what I want. The only way is seeding
           | lots of records and then manually checking that each filter
           | and calculation is doing what I want.
           | 
           | In code complex things can be broken down into more simple
           | items. Then I can reason about and test those building blocks
           | into something I understand and am confident that is working
           | as intended.
        
             | frosted-flakes wrote:
             | Yes, SQL's biggest fault is that it's not very composable.
             | Complex queries end up being long and repetitive, and the
             | order of the parts of a query is totally unintuitive (it
             | should go something like: FROM, GROUP BY, SELECT, ORDER BY
             | rather than SELECT, FROM, GROUP BY, ORDER BY, which makes
             | autocompletion hard).
        
               | ratww wrote:
               | CTEs can help a bit with composition (and readability)
        
               | takeda wrote:
               | QUEL[1] was like that, but thanks to Oracle SQL won.
               | 
               | [1] https://en.wikipedia.org/wiki/QUEL_query_languages
        
               | setr wrote:
               | One thing I never understood is that the SQL language,
               | and its alternatives, share the same theoretical IR --
               | the relational algebra -- it shouldn't be _that_
               | difficult to implement for postgres /mysql alternative
               | relational languages like QUEL or Datalog. Or even just a
               | simplified SQL with a sane, consistent syntax.
               | 
               | I know PG has a bunch of procedural-language
               | alternatives, but afaik, no relational-language
               | alternatives.
               | 
               | At the same time, I'd also expect it to not be _that_
               | difficult to transpile from say mysql to postgres, yet
               | there 's very little in that space, at least in open
               | source (there is however many ORMS that map to _either_
               | mysql or postgres..)
               | 
               | Both of which, amongst the procedural-languages, you'd
               | find a hundred transpilers (JS->C) and VM-languages
               | (Clojure,Scala,etc on jvm) implemented even by
               | bored/curious students. Which makes my suspicion that
               | there's no technical blocker, just a cultural one.
               | 
               | That is, QUEL losing to SQL, such that you can't find any
               | implementation, is an absurd concept -- it _should_ be
               | available on postgres (perhaps requiring some special
               | starting keyword) -- but somehow it is not.
        
           | pjscott wrote:
           | The thing that really worries me when I'm writing SQL is the
           | possibility that the query planner will get frisky and choose
           | some disastrously slow join order, but only once in a while.
           | I've run into _way_ too many hard-to-reproduce performance
           | issues like this.
           | 
           | Having the database automatically figure out how to run a
           | query is a great feature, but most of the time I'd happily
           | just write explicit nested loops for the sake of
           | predictability.
        
           | goostavos wrote:
           | >programming languages are very poor at bridging the
           | difference between the SQL domain and the language domain
           | 
           | Depends a lot on the language. I've lost countless hours to
           | things like JOOQ trying to figure out how to get it to do
           | what I want, or express the query in its quirky not-quite-
           | right DSL, plus dealing with mappings, pojos, auto-
           | generation, and so on.
           | 
           | However, on the other hand, in a dynamic language with just
           | enough support to move the result of your queries into a map
           | of key/value, I feel no friction at all. I'm using little
           | more than a simple jdbc wrapper in Clojure and even after
           | months on the project, I'm still just continuously stoked
           | with seamless the whole thing is.
        
             | oftenwrong wrote:
             | jOOQ can be used in a less-type-safe way. For example,
             | `fetchMaps` [1] does more-or-less what you describe.
             | 
             | However, I have found it worthwhile to learn to use the
             | more advanced features you mention. Extending type safety
             | to queries is incredibly useful. Consider cases when
             | developers are making code and schema changes concurrently
             | that overlap.
             | 
             | [1] https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/R
             | esult...
        
               | Scarbutt wrote:
               | SQL (at least in Postgres) is already type safe.
        
               | oftenwrong wrote:
               | jOOQ can effectively extend the type system of Java to
               | the construction of ad-hoc queries by way of code
               | generation. jOOQ generates Java classes that correspond
               | to the database schema, which can be used in its query
               | building DSL.
               | 
               | For example, if I have a `timestamp with time zone`
               | column in PostgreSQL, that is represented in the
               | generated code. I will be prevented from inserting a Java
               | `String` into that column - I will have to provide an
               | `OffsetDateTime`. (This is how it would be in a typical
               | configuration - it is flexible enough to do pretty much
               | anything).
               | 
               | Another example, let's say I reference some column in a
               | query in one of my feature branches. Meanwhile, somebody
               | has dropped that column on the master branch. When I
               | rebase my feature branch onto master, my build will fail.
        
         | homami wrote:
         | I find SQL INSERT statement not intuitive. I can understand why
         | SQL requires me to declare the field names and then the values
         | of a new row that I am inserting; but it would've been a huge
         | time saver if SQL had a key-value dictionary-like syntax:
         | INSERT INTO "my_table"            "col1": value1,
         | "col2": value2,            ...
        
           | PZ81JUXJE7uJ wrote:
           | MySQL supports a insert/set statement:
           | INSERT INTO tbl SET a = 1, b = 2;
        
             | castorp wrote:
             | How does that look like with a multi-row insert?
        
           | castorp wrote:
           | And when you need to insert more than one row, you repeat the
           | column name over and over again for each row - not a good
           | idea either.
        
         | pjc50 wrote:
         | This is partly why I love LINQ: it's a more C# flavoured way of
         | expressing queries. And it works on objects as well as
         | databases.
         | 
         | Most ORMs are bad for queries (pull over all the objects and
         | look at their properties!), but LINQ will actually turn your
         | code into SQL under the hood with some remarkable machinery.
        
           | fernandotakai wrote:
           | >Most ORMs are bad for queries (pull over all the objects and
           | look at their properties!)
           | 
           | i would change that to ORMs CAN be bad for queries. most ORMs
           | are super configurable and can be changed only to pull
           | specific stuff.
           | 
           | i would argue that most ORMs are a plus to productivity
           | because for most of cases, you just need simple querying
           | (select * from, simple updates, simple deletes).
           | 
           | also, if you really need the power of raw sql you an just use
           | that -- and even then, for most advanced ORMs (django's, for
           | example) exposes a LOT of really advanced sql stuff in
           | python, which, for a lot of developers, is a lot more
           | expressive.
        
           | andybak wrote:
           | Not doubting there are bad ORMs but none of the ones I've
           | dived into is quite that bad (SQL Alchemy and Django's ORM).
           | 
           | I sometimes wonder if the debate about ORM is mainly driven
           | by some traumatic horror witnessed at some point in every
           | developer's career. If we judged programming languages by the
           | same metric... Oh... Actually we kind often do that too!
        
           | uhoh-itsmaciek wrote:
           | Which ORMs are bad for queries? Most popular ORMs these days
           | expose most of sql in a language-specific DSL (and some allow
           | you to splice in bits of raw sql in a semi-structured way as
           | an escape hatch). Sure, you still need to learn to "think in
           | sql", to use these effectively, but the ORM has not been the
           | problem for a long time in most languages.
        
             | ratww wrote:
             | ActiveRecord, Rails ORM was the poster child of ORMs for a
             | while, and is completely abysmal for non-trivial left
             | joins, non-trivial aggregations and some subqueries.
             | 
             | Of course using raw SQL is possible but might force you
             | into converting other parts of the code into raw SQL. There
             | is Arel, but it's just a more verbose SQL in quasi-AST
             | form.
             | 
             | On the other hand, I never had much problems with LINQ or
             | Hibernate.
        
         | slifin wrote:
         | SQL definitely solves a lot of performance issues for typical
         | users who think they can do better
         | 
         | but throwing strings over the wire, trying to get as much as
         | possible without crossing cardinalities and ignoring the n
         | query problem makes SQL at scale disastrous
         | 
         | Devs always get one of these things wrong and not being able to
         | use my programming language in the query makes some things hard
         | to express
         | 
         | Datalog and Datomic are definitely worth learning
         | https://youtu.be/Pz_NvY1kw6I
        
           | nsomaru wrote:
           | could you clarify what throwing strings over the wire and
           | crossing cardinalities means in this context?
        
         | meddlepal wrote:
         | From my experience it's just unfamiliarity with the SQL
         | language. I've spent thousands of hours writing PHP, Java,
         | Scala, Kotlin, Python and Go and doing fairly complex things
         | with them since I started programming.
         | 
         | SQL maybe a few hundred hours? And off and on again rather than
         | constantly so I don't always remember beyond the basics. The
         | concepts make sense to me but the language always feels
         | foreign.
        
         | bpyne wrote:
         | I think you're onto something with the observation about what
         | type of developers take to SQL. We have a spectrum in which
         | people think more like how the processor operates to people who
         | think more in classes of problems and their solutions.
         | 
         | We run into problems when someone with a proclivity to one side
         | meets a problem best solved by thinking on the other side of
         | the spectrum. An example is a young dev in my organization who
         | was given requirements for a new app that required a data
         | store. Not being comfortable with SQL or relational databases,
         | he chose a document store. It wasn't a good fit. Very quickly
         | requirements expanded and caused his code to balloon into a
         | mess of nested loops with lots of if-checks. Performance has
         | progressively decayed as well. A simple multi-table join with
         | filtering would have knocked off 2/3 of the code.
        
         | dirtydroog wrote:
         | I'd put myself in this category. I just never liked databases,
         | they never interested me.
        
         | josephorjoe wrote:
         | When I first encountered SQL, my immediate reaction was to
         | start writing my own lightweight ORM (two days later I learned
         | ORMs were a thing that already existed) because I took one look
         | at that syntax and decided it was insane to work with directly.
         | 
         | I definitely don't think in SQL.
         | 
         | That said, I like working w SQL databases -- just please give
         | me a battle tested ORM...
        
         | EvanAnderson wrote:
         | Some of the most fun I've ever had coding has been creating
         | "complex" SQL queries. The syntax is something that can only be
         | overcome with memorization, but it becomes second-nature fairly
         | quickly.
         | 
         | Until I started thinking about SQL as manipulating sets it
         | never "clicked". Once it did, though, a whole world of
         | applications filled with mismatched procedural thinking mapped
         | on to SQL was revealed to me, and my own work became much
         | easier.
        
           | resu_nimda wrote:
           | The first time I accidentally typed "select 8 from users..."
           | and, instead of an error, it returned a column called "8"
           | with a bunch of rows with the value "8", my mind was blown.
        
           | eropple wrote:
           | Despite our industry's habit of treating mathy whiteboard
           | things as indicative of one's "programming talent", real
           | application of type, set, and graph theory are not tested
           | nearly as much as they end up being in practice, and I think
           | that developers' reticence to get deep into SQL is
           | symptomatic of the weaknesses there.
        
         | hknd wrote:
         | I used to write SQL at Google/FB, and I did and still do
         | struggle with it.
         | 
         | I think it's because I usually touch SQL like once a month and
         | by the time I need it I already forgot everything :shrug:
        
         | crazygringo wrote:
         | SQL is very much like CSS to me.
         | 
         | It's declarative, the primitives seem entirely non-intuitive,
         | it often takes a lot of fiddling to get what you want, the
         | behind-the-scenes execution is mostly a black box, and while
         | it's supposed to work the same on different implementations (of
         | browsers/databases), there are tons of little gotcha quirks.
         | 
         | All in all, they're both entirely different skill sets from
         | traditional programming, and also where experience counts for a
         | ton more than just normal logical thinking.
        
           | taffer wrote:
           | When you look at SQL from a logical/set-based perspective, it
           | is by no means unintuitive. Basically, all you do is join all
           | the tables you need and then filter out everything you don't
           | need and maybe do an aggregation here and there.
        
             | 411111111111111 wrote:
             | SQL can do a lot more though.
             | 
             | Triggers, functions, procedures, access control.....
        
               | astura wrote:
               | Plus indices, views, synonyms, constraints, and the
               | entire concept of data normalization...
        
               | beckingz wrote:
               | Turing complete!
        
             | outworlder wrote:
             | I had way less trouble in college with relational algebra,
             | compared with SQL. SQL is by no means intuitive.
             | Projection, which is what you do last, comes first in the
             | select statement. Then there are all the join types.
             | 
             | Relational algebra, which is what SQL is ultimately based
             | on, is much more elegant.
        
             | alexbanks wrote:
             | And with CSS, basically all you do is tell the browser how
             | things look.
        
               | paulryanrogers wrote:
               | Conceptually yes. The devil is in the details: tables,
               | flexbox, grid, div-soup, inconsistent naming, etc.
        
             | crazygringo wrote:
             | How about the following:
             | 
             | - When to use JOIN vs a subquery?
             | 
             | - When is a subquery actually a correlated subquery? Will
             | this destroy your performance? Or is it a critical feature?
             | 
             | - Should you put constraints in the JOIN or in the WHERE?
             | Will the distinction drastically affect performance?
             | 
             | - When do you use WHERE vs HAVING?
             | 
             | - Is the NULL from the join because no joined row was
             | found, or because the joined row had a NULL value itself?
             | 
             | Etc. etc. The basic concepts are simple, but the
             | implementation details quickly become very complex,
             | particularly when you're ensuring high performance with
             | indices, and making sure the query uses the indices.
             | 
             | And all of the questions I pose above have clear answers...
             | but the answers certainly aren't obvious from SQL basic
             | concepts.
             | 
             | And also, while JOIN seems like it _ought_ to be intuitive,
             | in real life it seems like it 's like pointers in C -- some
             | people get it pretty quickly, other people struggle
             | forever.
        
               | polygotdomain wrote:
               | You need to spend time working with the language and
               | understanding how the set based operations used in SQL
               | work. Declarative languages means you can express things
               | in multiple ways and get the same results. This is an
               | incredibly important aspect of SQL
               | 
               | >the basic concepts are simple, but the implementation
               | details quickly become very complex
               | 
               | This is no different than programming. Assuming that SQL
               | doesn't have complexity because you can only SELECT,
               | INSERT, UPDATE or DELETE is going to have you banging
               | your head against the wall. Tackle the complexity in SQL
               | like you'd tackle the complexity in your programming
               | language of choice; read the docs, work through examples,
               | and read how other people solve the problem. There's ton
               | out there for SQL
               | 
               | >When do you use WHERE vs HAVING?
               | 
               | HAVINGs allow you to add a condition to an aggregate
               | function. So SUM(myColumn) > 5 would be something you put
               | in a HAVING clause. Honestly, this is pretty clear cut.
               | 
               | >Should you put constraints in the JOIN or in the WHERE?
               | Will the distinction drastically affect performance?
               | 
               | The first thing to understand is a condition in a join
               | versus a where might return a different result set,
               | specifically on anything other than an INNER join. The
               | impact to performance will depend on the rest of your
               | query, your data, and your index coverage. For simple
               | cases, there is likely no difference. For complex ones,
               | there may be an impact
               | 
               | > Is the NULL from the join because no joined row was
               | found, or because the joined row had a NULL value itself?
               | 
               | An inner join shouldn't produce a null. That's why it's
               | an inner join, as the data needs to exist in both places.
               | If you want to "test" whether a join found a row, look at
               | the field you were joining to and see if it's a non-null
               | value. Nulls won't join to Nulls unless you've change
               | some settings in most RDBMS. If you're looking at other
               | fields to determine the presence of a row from a join,
               | make sure you're looking at a non-nullable field.
               | 
               | >When to use JOIN vs a subquery?
               | 
               | A better way to phrase this would be when to just join
               | the table, vs writing a sub query and joining to that.
               | When is a question of the complexity of the query and
               | performance characteristics, and that can't be answered
               | in the abstract. The most important thing is that in a
               | large number of cases you can do both, and knowing how to
               | express things in both ways is powerful.
               | 
               | >And all of the questions I pose above have clear answers
               | 
               | No they don't. Any time you're wondering about how
               | different SQL impacts performance, there's absolutely a
               | huge "it depends" angle on it, because how you've
               | structured the tables, index coverage, and the volume of
               | data, can have a significant impact. This is why DBAs
               | still have jobs, because the database is an incredibly
               | complex system. You seem to be complaining that SQL
               | shouldn't be complex, yet are not willing to accept that
               | it is more complex that you've assumed it to be. It's
               | complex. You don't need to know everything if your just a
               | dev, but don't just assume it's simple.
               | 
               | >while JOIN seems like it ought to be intuitive
               | 
               | I'd check the diagram here ->
               | https://stackoverflow.com/questions/13997365/sql-joins-
               | as-ve... Half those joins aren't needed as you can re-
               | order a right join into a left join. For 95% of
               | development Inner joins and left joins are all you need.
               | The other 5% is an outer join and that's mainly needed in
               | report writing, not app development.
        
               | Drakar1903 wrote:
               | I will be very glad if you actually answer these
               | questions, in a separate comment. I'm driven to write
               | this by nothing but the desire to know.
        
               | jayd16 wrote:
               | Like everything else, write it the simple/elegant way
               | then profile it and tweak if you have to.
               | 
               | Once you're at the point where you have to worry about
               | these things, tuning the SQL is still probably much less
               | complex than writing the query in your app language or
               | figuring out how a NOSQL db can do these joins.
        
               | izacus wrote:
               | In reality it rarely works this way - there's plenty of
               | systems which are falling apart due to "death of thousand
               | cuts" type issues. You run a profiler and most of the
               | queries are slow and there's no one obvious part to
               | optimize - because developers over the years ignored
               | basic optimizations and there are inefficiencies
               | everywhere.
               | 
               | E.g., for a quick practice run, try optimizing Wordpress
               | without making it a static page via caching - how many
               | queries will you have to optimize and how much of a
               | codebase rewrite will it be to make it significantly more
               | performant?
        
               | jatone wrote:
               | in reality it does work that way.
               | 
               | the problem you describe is completely different and its
               | more a measure of system health in its entirety of the
               | DB. it can be easier to fix or harder depending on the
               | precise cause. but that's a standard debugging skill
               | which isn't strictly in the DBA skillset.
        
               | taffer wrote:
               | > The basic concepts are simple, but the implementation
               | details quickly become very complex
               | 
               | True, but those kinds of questions come up in every
               | language: Should I use an array or a dictionary? Should
               | people have references to projects or should projects
               | have references to people or both? Is money a float, an
               | int, a decimal or should I write my own money class?
               | Should I memoize the results? Is it thread-safe?
               | 
               | As you can see, this could go on forever, pretty much for
               | any language.
        
               | UncleMeat wrote:
               | They do. And language designers seek to smooth the edges
               | and develop ways to encourage devs to write clear and
               | intuitive code. When a language says "well, these are
               | hairy questions that you should just figure out" we tend
               | to criticize those languages unless they have clear
               | reasons for that decision. It should be _obvious_ that
               | something is thread-compatible.  "Well, if you model it
               | in pi-calculus it is easy" is a crappy way of handling
               | criticism.
               | 
               | "Its easy if you think about it mathematically" is not
               | enough.
        
               | taffer wrote:
               | In which language is it obvious not to put money into a
               | float? In which language is it obvious if I want a
               | synchronizedSortedMap or an arrayList?
               | 
               | There are simply things you have to learn to use a
               | technology. If you want to write Java, you need to know
               | what a variable is, what a for loop is and what
               | Inheritance/Interfaces are. Likewise, in the case of SQL,
               | it means you need to know concepts like normalization,
               | ACID and joins. Just poking around until the code works
               | won't do it.
        
               | UncleMeat wrote:
               | No language is perfect. But surely you'd agree that there
               | is a spectrum here. In C++ you have to think "should this
               | be a pointer or a reference" all the time. In most modern
               | languages, you never do. And although there are clear
               | methods to write C++ well, it is rightly criticized for
               | being overly complex and unintuitive. For example,
               | std::set is ordered. That's a basic mismatch with
               | expectations. Sure, you can just check the docs. But how
               | much code would be more efficient if they had just spent
               | a bit of time making things easier to work with?
        
           | acdha wrote:
           | > It's declarative, the primitives seem entirely non-
           | intuitive, it often takes a lot of fiddling to get what you
           | want, the behind-the-scenes execution is mostly a black box,
           | and while it's supposed to work the same on different
           | implementations (of browsers/databases), there are tons of
           | little gotcha quirks.
           | 
           | My reaction to that is that it's similar in a different way:
           | everyone needs to use it but many developers don't take it
           | seriously, avoid learning how it works, and then complain
           | that it's unintuitive (i.e. not something they had already
           | known) and hard to use because they're basically just poking
           | randomly until they get close enough to what they want.
           | 
           | It's true that not every implementation is the same but ...
           | where else is that not true? This is also true of operating
           | systems, filesystems, every library implementing a standard
           | format ("Why doesn't this PDF open in ..."), etc. I would
           | have trouble supporting the belief that databases are an
           | outlier in this regard.
        
             | fools wrote:
             | If I write a C program, especially with the appropriate
             | compiler warnings enabled, or some filesystem code, there
             | is a high chance that it will work across platforms with no
             | further coding necessary. Anecdotally, the same cannot be
             | said for CSS or SQL.
             | 
             | It doesn't matter much for SQL, because I always know what
             | database I'm using, but for CSS, it's a massive pain.
        
               | acdha wrote:
               | That's a charming belief about C which is only true if
               | you work on the same operating system, processor, and
               | don't do anything complicated. If those are not true, as
               | I've experienced many times over the years, you will
               | learn otherwise about things like differences in floating
               | point behaviour, memory allocation and access patterns,
               | which filesystem and/or locking semantics, etc.
               | 
               | In other words, it's about as true as it is for CSS and
               | SQL, where that kind of simple use is also stable.
        
               | fools wrote:
               | Yes, I write fairly boring applications. But the point is
               | that C is mostly cross-platform for those boring
               | applications, whereas the the "cross-platformness" of CSS
               | and SQL fall apart for even the simplest of tasks.
        
               | acdha wrote:
               | Again, I think you're viewing this through the lens of
               | relative experience. I have seen tons of C code which
               | required substantial fixes to work portably (bonus points
               | when this was exploitable) and I have plenty of CSS and
               | SQL which hasn't had to be touched in years.
        
           | izacus wrote:
           | This is... one of the best comparisons I've seen and sums up
           | the reason why I dislike SQL as well (although I know how to
           | use it).
           | 
           | Sometimes it really feels like you're trying to give
           | instructions to someone via chat which gets Google translated
           | to chinese, japanese and russian on the way - it's this very
           | lossy communication channel where you need to tweak the
           | language "just so" to get maximum performance. I think it's
           | no wonder that newer DB designes opted for more direct and
           | tailored APIs.
        
             | jacques_chester wrote:
             | Databases are quite transparent about what they do. The
             | impression of "just so"-ness is a property of the observer,
             | not the system.
        
           | cat199 wrote:
           | > it often takes a lot of fiddling to get what you want
           | 
           | and yet, for any nontrivial operations, opting for a few
           | simple K:V stores instead means one basically ends up
           | implementing an unrolled SQL engine processing loop to do the
           | application-specific select statements you would otherwise
           | need.
           | 
           | there's a reason SQLite took over from dbm files ...
        
         | forgingahead wrote:
         | I don't think it's as elaborate as you're saying -- the
         | clearest road into a productive job for many years has been
         | "join a fast growing tech company who uses <cool_framework> to
         | do the magic, so the tech team is always building more business
         | logic". When you've been working with (eg:) ActiveRecord and
         | Ruby on your database data, direct SQL just doesn't come up
         | until you're trying to wrangle more efficiency in your queries
         | or do more complicated joins.
         | 
         | It certainly makes sense for all developers to learn more SQL!
         | But it just hasn't been part of the daily work requirement for
         | many.
         | 
         | As for the attraction of noSQL databases, I think perhaps it's
         | more of a new-hotness trend rather than a serious technical
         | decision. NoSQL has its place, but like most of this kind of
         | tooling, it's often applied incorrectly because of a lack of
         | thinking through the business logic.
        
         | Ididntdothis wrote:
         | SQL feels to me like bash or regular expressions. You can do
         | amazing things if you do it full time. But if you do it only a
         | few times per month or year you quickly forget all the
         | subtleties and it gets hard to understand even the stuff you
         | wrote half a year ago. I guess in the end things have become so
         | complex that as a dev you can't be good at everything. I often
         | wish there were dedicated database guys but if you have one
         | usually that person is sysadmin and won't help much with coding
         | against the database.
        
           | rurp wrote:
           | This echos my experience. I've written some complicated SQL
           | queries, but I do it rarely enough that I always have to re-
           | learn a lot of it.
           | 
           | There have been a number of instances where I have had to
           | look something up seemingly for the first time, found a good
           | Stack Overflow answer, and then chuckled to myself because I
           | had already upvoted that exact answer at some point in the
           | past. SQL queries are a pretty common source of this, along
           | with uncommon git and terminal commands.
        
             | Ididntdothis wrote:
             | Very true about git. I would add C++ to that lis. Done full
             | time it's super powerful. But later or with less experience
             | you look at the code and think "WTF?" .
        
         | echlebek wrote:
         | SQL really is a programming language though. It's just not
         | imperative.
        
         | coliveira wrote:
         | In the minds of many developers, SQL is not a "real programming
         | language". That's why they don't see the value of spending the
         | time learning it. Many also believe that SQL is something that
         | they should better avoid, using an ORM, and let it be handled
         | by libraries written by "experts". All of this contributes to
         | developers not having a firm grasp of SQL.
        
           | mey wrote:
           | Yet they are unwittingly willing to learn a different DSL,
           | the ORM's DSL.
        
       | chaps wrote:
       | Here's a fun bug I had a few years ago -
       | 
       | Had a postgres database which was using pgbouncer for connection
       | pooling. The most senior developer (24yo or so) we had on the
       | project was using Go to connect to the database to write some
       | simple reports, but each report took hours to run, and often had
       | to sleep for 30+ minutes. So, after a while, pgbouncer would kill
       | their connection, and their report would die. No other
       | application did this among the many that we had connect to that
       | DB, so it was definitely strange.
       | 
       | Found out pretty early on in troubleshooting it that they had no
       | mechanism to keep the connection alive, which makes total sense
       | for why his app died. So, they put the library standard keepalive
       | function in a loop if the report wasn't doing anything.. but that
       | didn't fix it.. it made no friggin' sense. After bashing my head
       | against that for a while, I finally threw my hands up and asked
       | if they could just run a "SELECT 1" as a keepalive instead of
       | whatever the Go library was doing. Got a bit of pushback, but
       | just told him to do it and walked away. That ended up fixing the
       | problem.
       | 
       | Turns out the Go library was trying to be clever in its
       | keepalives (can't remember what it was doing exactly), in that it
       | made some silly assumptions that there was nothing in the middle
       | managing connections.
       | 
       | I like to think that dev learned a lot about trust in "magical"
       | libraries after that.
        
         | kerng wrote:
         | Go code often reinvents/reimplements a lot of things from
         | scratch, reintroducing problems that have been addressed long
         | ago in other systems.
         | 
         | It's like this new trend, let's rewrite everything in Go to be
         | cool. Financially makes little to no sense.
        
         | eeZah7Ux wrote:
         | > The most senior developer (24yo or so)
         | 
         | I see the problem there.
        
           | lanius wrote:
           | Age is a just a number. John Carmack created Doom in his
           | early 20s.
        
             | throwaway3563 wrote:
             | Every good rule has an exception.
        
           | osrec wrote:
           | I've seen older developers that call themselves senior, but
           | lack basic knowledge. I've seen younger developers, wise
           | _well_ beyond their years. Age simply isn 't a big factor in
           | how you judge a developer.
        
             | brabel wrote:
             | Just because you saw a few exception does not mean the rule
             | does not hold in general. Or are you saying most people
             | don't learn with time (a corollary of your theory that age
             | is not a big factor)?
        
               | osrec wrote:
               | I would say that I have seen no discernible pattern, so I
               | have learnt that it is imprudent to judge a developer by
               | their age. People do learn over time, but some people
               | learn more "cogently" than others, i.e. some get more out
               | of one year's worth of learning than others get in 10.
        
               | codegladiator wrote:
               | > most people don't learn with time
               | 
               | I think this is correct. The curve flattens with age (in
               | my opinion).
               | 
               | > you saw a few exception
               | 
               | I think the seniors you met (the knowledgeable ones)
               | /were/ the exceptions. Most seniors I have seen have
               | convinced themselves that they know all there is to know.
        
         | cpufry wrote:
         | which library?
        
           | atian wrote:
           | Sounds like pgx.
           | 
           | Spent forever debugging this myself.
           | 
           | https://github.com/jackc/pgx/issues/494
        
         | nuggien wrote:
         | Nothing about go is magical, and you probably solved the
         | problem blindly.
        
           | chaps wrote:
           | Wow, okay. You alright over there?
           | 
           | Edit, because meh: I'm making no claims about go itself. No
           | idea what makes you think that's what I'm saying, since I'm
           | clearly talking about a library, and not even any stdlibs.
           | "Magic" is just a term useful for describing systems that
           | sweep much of their abstractions under the carpet in a way
           | that probably has gotchas. Granted, the term itself is
           | magical.
           | 
           | In terms of fixing the problem, I knew for a fact that the
           | keepalives that I was seeing were nothing like what I've seen
           | in the past, at many companies, across Oracle, postgres, and
           | MySQL, all who've implemented "SELECT 1" for the sake keep
           | alives, by devs who've been in the field for much longer than
           | me. The suggestion was by no means blind, unless you consider
           | implementing a widely used method for this exact purpose,
           | "blind". Had I gone a different route in fixing the problem
           | within the stable, existing system, it would have likely
           | broken many of the other database connections by many teams.
           | I'll pass on that, since frankly, even ignoring the risk of
           | such a change, the dev should have done the investigation
           | themselves.
           | 
           | Your post was unnecessarily aggressive and seems to come from
           | me having struck a nerve somehow. Genuinely hoping you're
           | doing alright. Peace.
        
             | nuggien wrote:
             | I have no affection towards go or any language. They're
             | just tools. You sounded elitist calling something magic and
             | pointing out someone's age as part of your point. And your
             | passive aggressiveness to my response is proof of it. I
             | "genuinely" hope you're doing alright too.
        
               | chaps wrote:
               | Heh, I wasn't being passive aggressive, I was being
               | serious. Take a breath, man.
               | 
               | Also "magic" is _not_ a new term:
               | https://en.wikipedia.org/wiki/Magic_(programming)
        
             | bdcravens wrote:
             | I presume they took it as an attack on Go. Truth is, it's
             | an attack on the library developer who themselves may have
             | found their keep-alive solution by stumbling blindly on it.
        
         | user5994461 wrote:
         | was the go library fixed?
        
           | chaps wrote:
           | No clue!
        
         | taf2 wrote:
         | Interesting I've run into similar issues when we put network
         | load balancer from aws in front of a db. It has a fixed tcp
         | connection timeout so similar if the query takes a long time
         | it'll disconnect. We fixed this at the socket level by ensuring
         | our linux syscnf was set to keep connections alive at an
         | interval below network load balancer. Was a tricky problem to
         | figure out.
        
         | harikb wrote:
         | Go sql/database uses its own connection pool. But still that
         | shouldn't create any problems. I have seen the reverse where
         | apps that assumed temporary tables stick around from statement
         | to statement without an explict `txn` (which regular postgres
         | connections don't need) clearly failed. But I have not seen the
         | issue you talk about.
         | 
         | My wild guess would be that the Go code never closed the
         | result/rows which caused either a pool exhaustion / left the
         | connections hanging and eventually got a timeout. Consider this
         | similar to how http client's need to close the response.Body or
         | else connections can't be reused.
        
           | randomdata wrote:
           | The database/sql package isn't very magical. The comment
           | about magical library makes me think it was some other
           | package.
        
           | hermitdev wrote:
           | At a previous employer, I was forced to use a certain 3rd
           | party ODBC library. Under certain circumstances, it would
           | just do `exit(1)`, with nothing even to stderr. Very
           | frustrating and annoying to debug/fix. Had I physically known
           | the developer responsible for that behavior, I'd probably
           | have faced murder charges and plead temporary insanity.
        
       | xwdv wrote:
       | These sort of articles are very effective, if you nail at least
       | one thing in your checklist the reader does not know or is
       | confident about, it often encourages them to continue reading
       | further. I know I did.
        
       | tenant wrote:
       | An odd thing that happened to me yesterday with the PostgreSQL
       | ODBC driver (psqlODBC) v9.3.400. It wouldn't let me insert a
       | string longer than 255 characters long into a character varying
       | field into a local v9.4 database on windows using a recordset
       | update. I didn't have a problem pasting it in via pgadmin.
       | Altered the field to text and the problem went away. I've a
       | suspicion that there is a limit on text in the tens of thousands
       | of characters length too though despite both those fields being
       | essentially the same thing and limitless.
        
         | echlebek wrote:
         | That sounds like a bug in the driver. I'm pretty sure varchar
         | on postgres is just an alias for text, which has no limit.
         | 
         | edit: varchar(n) will issue an error past n bytes, varchar
         | without n is the same as text. character varying(n) is like
         | varchar(n).
         | 
         | https://www.postgresql.org/docs/9.4/datatype-character.html
        
       | gdubs wrote:
       | There seems to be a wealth of information here. I wonder, though,
       | if the goal would be better served breaking this up into a
       | series. Or perhaps applying the 80/20 rule to the list to come up
       | with truly "a few" items that will have the highest leverage.
       | Otherwise, I think it will provoke a lot of discussion and
       | analysis from people who are experts in databases, but it might
       | remain impenetrable for the core audience: the majority of
       | developers who ignore this stuff (according to the author.)
        
       | paxos_failure wrote:
       | Very useful list! Related to this, I've recently discovered
       | https://databass.dev, a database internals book, which also
       | covers a few things that were mentioned in this list, and some
       | more.
        
       | Buttons840 wrote:
       | I learned SQL working on a "database as a product". The database
       | was filled with medical ontologies. It was a perfect environment
       | for learning. We were always looking for obscure things in that
       | database, and rarely changed data, just selects for days. In the
       | end I once wrote a query spanning about 100 lines that used
       | common table expressions and found it quite maintainable.
       | 
       | Where can an SQL beginner find such a database to experiment
       | with? I was fortunate because my job provided me this database to
       | experiment with, but what about those who are not so fortunate?
        
         | war1025 wrote:
         | CTEs are great. It's such a nicer syntax than subqueries.
        
       | tonymet wrote:
       | The most important and overlooked characteristic is that most
       | RDBMS use b-tree for their table-space, meaning all the
       | operations (including search / lookup) are O(log n). For online
       | (OLTP) applications, this means you will have to shard sooner or
       | later (assuming your audience is growing)
        
         | mtVessel wrote:
         | I agree with your first statement, but could you explain how
         | b-trees logically lead to sharding (at some growth point)? What
         | storage structure doesn't lead to sharding eventually?
        
       | xupybd wrote:
       | I could be wrong but it seems to me that she is confusing not
       | storing state with Idempotent calls to the db. Everything you
       | store in the db is some form of state.
        
       | dmtroyer wrote:
       | As a developer, I am glad there are other people in the world who
       | understand this.
        
       | danielovichdk wrote:
       | I love a good old database
        
       | [deleted]
        
       | a_c wrote:
       | Learn about modelling. Database is more than just storing data.
       | Drink less koolaid of NoSQL, any NoSQL. It is trading initial
       | result with future development time. SQL has been battlefield
       | tested. No amount of "convenience" is more convenient than
       | learning the fundamentals.
        
         | strbean wrote:
         | I'm disappointed that there isn't more criticism of the SQL
         | _language_. The whole NoSQL buzz got me excited, then turned
         | out to actually mean NoRelational.
         | 
         | It is wild that we are still using a language that looks and
         | feels like COBOL, and any criticism is met with drive-by
         | disapproval (downvotes and no comments) or an argument about
         | why relational databases are important.
         | 
         | SQL is a deeply flawed language by standards that are pretty
         | much universal today - we on HN discuss them daily on posts
         | about new programming languages. For example, one of the top
         | level comments here:
         | 
         | > In SQL it boils down to the fundamental syntactical
         | requirement to put the SELECT clause before the FROM clause. So
         | I have to build up my statement in this weird spiral pattern
         | where I change something deeper in before I know what I can
         | SELECT in the first place.
         | 
         | If this were the case for a new language post on HN, the author
         | would get run out of town for this reason alone. And yet, the
         | person who wrote that comment feels the need to hedge by saying
         | they are "an SQL hater in remediation".
         | 
         | Relational data does require a mode of thinking that many
         | programmers are not practiced with. Wouldn't it help to have a
         | language for working with it that isn't outright terrible?
        
           | nouveaux wrote:
           | In regards to 'select' before 'from', there could be a
           | similar argument to be made for declaring imports at the top
           | of a file. When you write a program, you may not know what
           | libraries you need. Variable declaration at the beginning of
           | a function is also a common pattern.
           | 
           | The SQL language stood the test of time where as COBOL did
           | not. I think it says something about how well it was
           | designed.
           | 
           | I strongly suspect that developers have such a hard time with
           | it is because of ORM. ORM trains you to think of the database
           | as objects. Another poster mentioned that treating databases
           | like sets made SQL click. I think that is a good way of
           | thinking about SQL.
           | 
           | To be fair, SQL does have some archaic things in there that
           | make it annoying to work with. However, once it clicks, it's
           | fits naturally with how a relational db works.
        
             | setr wrote:
             | >Variable declaration at the beginning of a function is
             | also a common pattern
             | 
             | What language requires you to declare variables at the
             | beginning? C even stopped doing that. People choose to do
             | that, but ime that's after the writing phase, to make it
             | more readable
             | 
             | >When you write a program, you may not know what libraries
             | you need
             | 
             | Sort of -- the difference is that when I'm writing code in
             | my IDE, I know all database objects available to me. It's
             | in the schema. A library I import once (to the project
             | itself), and the IDE can always assist from then-on. But
             | SQL is designed such that despite the library (schema)
             | being imported, the IDE can't actually assist, unless I
             | write the code out of order (eg start with SELECT * FROM
             | table and then start working)
             | 
             | >However, once it clicks, it's fits naturally with how a
             | relational db works
             | 
             | I think you've misunderstood the complaint -- the
             | relational language is a very strong concept, which has
             | stood the test of time, and is difficult to complain about
             | -- it does its job well, and fits naturally with how a
             | relational db works.
             | 
             | SQL the language however is:
             | 
             | A hodgepodge of random keywords tossed about in a totally
             | inconsistent fashion (eg postgres overlay:
             | OVERLAY('Txxxxas' PLACING 'hom' FROM 2 FOR 4) -- postgres
             | at least generally offers a consistent comma-separated
             | syntax for every functions
             | 
             | has weird and technically unnecessary limitations (like
             | SELECT being evaluated _after_ the WHERE clause, so you can
             | 't use the aliases defined in select clause)
             | 
             | putting SELECT _before_ the FROM (disabling IDE auto-
             | complete support)
             | 
             | the stuffing of a 3-value logic system into a 2-value logic
             | interface, so boolean operations break silently and produce
             | nonsense in the face of a database with NULL values,
             | because no mapping of NOT (TRUE, FALSE, NULL) AND NULL
             | makes sense. [0]
             | 
             | It generally composes very poorly, leading to redundant,
             | long and convoluted queries
             | 
             | The MODEL is fantastic -- the language is not.
             | 
             | [0] https://sigmodrecord.org/publications/sigmodRecord/1709
             | /pdfs...
        
           | a_c wrote:
           | SQL stood the test of time. SQL is widely adopted. Once you
           | get the hang of it, it can be applied on a wide range of
           | RDBMS.
           | 
           | I would argue that countless productivity has been lost to
           | learning yet another query language for yet another NoSQL db.
           | Mongo has its own query language. Cassandra has its own.
           | Neo4j has its own. What not. Guess what, few engineers need
           | these to solve their actual problem. Be it building an
           | application, a library, a SaaS, a tool with wide database
           | support. The yet another query language is an imaginary
           | solution to an imaginary problem (most of the time). Your
           | problem isn't big enough to use whatever NoSQL of choice.
           | 
           | When developers starting out, they get attracted to
           | technologies with the most marketing money. They beat around
           | the bush of learning things that might not matter anymore 5
           | years down the road.
           | 
           | Is english the best language? No, not even close. Should one
           | learn it? Probably.
           | 
           | Is esperanto a better language? I don't know, maybe. People
           | invent it for a reason (a problem to solve) after all. Should
           | one learn it? Probably no.
        
             | strbean wrote:
             | > SQL stood the test of time. SQL is widely adopted.
             | 
             | I'm highly dubious of "it is widely used so there"
             | arguments. I think in the case of SQL, the fact that ORMs
             | and SQL generators are ubiquitous are evidence that a huge
             | proportion of engineers would rather not write it, and part
             | of why it is still widely used (most of the time, most
             | people can avoid actually touching it).
             | 
             | > I would argue that countless productivity has been lost
             | to learning yet another query language for yet another
             | NoSQL db. Mongo has its own query language. Cassandra has
             | its own. Neo4j has its own. What not. Guess what, few
             | engineers need these to solve their actual problem.
             | 
             | This is exactly one of the points I addressed in my
             | comment... none of those are even relational databases. And
             | this is where the discussion ends up every time someone
             | says SQL sucks.
             | 
             | > Is english the best language? No, not even close. Should
             | one learn it? Probably.
             | 
             | The vast majority of the time, when you are using English,
             | you have no control over the receiving end. The vast
             | majority of time, when you are using SQL, you / your org
             | grabbed an SQL RDBMS to use.
             | 
             | Furthermore, I don't see why a modern, reasonable query
             | language couldn't transpile to SQL easily when necessary,
             | making all path-dependency / adoption arguments void.
        
         | tingol wrote:
         | Any tips where to look for db modelling learning resources?
        
           | a_c wrote:
           | I actually don't know. There was this previous discussion
           | https://news.ycombinator.com/item?id=22324691. Might be some
           | gems there.
           | 
           | For me, it is the 3rd normal form and the adage of "Normalize
           | till it hurts, denormalize till it works".
        
         | adrianmonk wrote:
         | > _Learn about modelling._
         | 
         | So true. I worked at a database-centric place for a while, so I
         | read up on database stuff to fit in.
         | 
         | Learning about first, second, and third normal forms was very
         | enlightening. If you learn these thoroughly enough that they
         | are second nature, it really helps you see modeling mistakes
         | that you might be making. It just becomes a lot easier to think
         | clearly about how to lay out data. Just like there are code
         | smells, if you learn normalization, you will immediately detect
         | data structure smells.
         | 
         | Third normal form is not always the one and only right answer
         | for modeling data, but it's a pretty good starting point from
         | which you can refine and adjust if needed. Just because you
         | understand normalization doesn't mean you have to do it all the
         | time, of course. But if your data isn't normalized, it should
         | be for a specific reason (performance), not by accident or
         | because you don't know how to keep it organized. It is one of
         | those things where knowing the rules gives you the freedom to
         | know when it's right to break them.
         | 
         | It's even useful when thinking about organizing data in RAM.
         | The situation is a little bit different because in RAM you
         | follow pointers instead of doing joins. But there are still
         | cases where it helps. For example, you might have one big
         | struct, and you realize it should be two different structs
         | because you are filling the same (redundant) data into multiple
         | instances. And you know how to fix it.
         | 
         | Another very useful concept from modeling is choosing keys. One
         | of the lessons of databases is that if you don't use synthetic
         | keys, you're going to have a bad time. When you choose real-
         | world data (like first name and last name as a key for a table
         | of people), you're going to have problems like non-uniqueness.
         | 
         | Relational databases aren't the last word on how to organize
         | and store data, but there are just a lot of good ideas that
         | pick up if you learn about them.
        
       | jasonhansel wrote:
       | I would add: when using SQL, the default transaction isolation
       | level is probably not what you'd expect.
        
       ___________________________________________________________________
       (page generated 2020-04-22 23:00 UTC)