[HN Gopher] Things I wished more developers knew about databases ___________________________________________________________________ Things I wished more developers knew about databases Author : zdw Score : 559 points Date : 2020-04-22 04:45 UTC (18 hours ago) (HTM) web link (medium.com) (TXT) w3m dump (medium.com) | dirtydroog wrote: | I find SQL is quite like Python. You know what you want to do, | you can take a stab at attempting a couple of times but | eventually you'll consult stackoverflow and kick yourself for not | figuring it out. Repeat the above in two weeks when you've | forgotten it. | Jaruzel wrote: | Since when did Medium.com offer to log you in with your Google | account like this? | | http://www.jaruzel.com/files/medium-google.png | | I've got uBlock and Cookie Cleaner running - but it's clearly | doing some sort of cross-site shenanigans to create this pop up. | | Time to stop visiting Medium.com I think. | benatkin wrote: | Since a long time ago. This is old news. iFrame sandboxing is | now a two-way street, depending on configuration. Not only can | an iFrame be prevented from accessing the parent frame, a | parent frame can be prevented from accessing the child frame. | It is still vulnerable to clickjacking, but to reduce the | impact of that, using one-tap sign up only allows the most | basic Google permissions. | https://news.ycombinator.com/item?id=17044518 | https://developers.google.com/identity/one-tap/web | delecti wrote: | I've been seeing that for ages, months at least, maybe years. | [deleted] | cyral wrote: | > Time to stop visiting Medium.com I think. | | Click the X and move on. It's helpful to be able to sign up for | a site so quickly with such integration. | stuaxo wrote: | I just want my databases to be normalised, and my queries | optimised. | | Stuff like Django CMS is terribly architected, so code is ping- | ponging between DB and code. | | This would have been a non-issue if all the APIs were built with | batching in mind, and the data was normalised in the first place. | jtchang wrote: | Just to be clear you are talking about the CMS and not django | itself right? Django itself is fantastic. | indymike wrote: | Nice! Perfect timing. I just started writing a new greenfield | project and decided to use Postgres instead of a NoSQL after five | years of NoSQL. | [deleted] | [deleted] | AtlasBarfed wrote: | One thing I've noticed that in the medium term of a software | service (2-5 years) is that your software should have the ability | to do double-writes to and flip reads between two different | datastores. | | That will afford migration with as close to transparent migration | with as reduced a downtime or no downtime. | earhart wrote: | That gets tricky if one of the writes fails. | hobs wrote: | Generally you get two phase commit involved here, or an | asynch mirror that you can restore over if there's failure. | smilliken wrote: | Two phase commit makes a promise it can't keep. CAP theorem | is not amused. | | It's slow, complex, a leaky abstraction, and ahould be | avoided for better consistency models. | hobs wrote: | I generally couldn't agree more, but yet its the first | tool in the toolbox for a lot of databases. | smitty1e wrote: | That sort of flexibility does not sound as though it would | scale well as the data get big. | madhadron wrote: | Define "big." Double writing in some form is the only option | for live migration and you either bolt it on later or plan | for it now. | smitty1e wrote: | When the overhead of the double writing becomes substantial | enough affect the architecture. | madhadron wrote: | Ah. So bigger than Facebook or Google. | jatone wrote: | wat. particularly today with logical replication this is pretty | much unnecessary. | falcolas wrote: | (The 80/20 rule applies below, some developers do care) | | Developers... just don't care. They want to spin up an ORM, point | it at a URI, and forget about it. | | I've fought this for over a decade now as a DBA, SRE, DevOps, and | architect. Most of the developers don't want to deal with | anything infrastructure-wise; they want to spend all the time | they can just focusing on the problem they're writing software to | solve. | | Observeability, reliability, scalability - these are all words | that are translated into either "someone else's problem" or | "unproductive busywork" in their minds. | rietta wrote: | I sometimes feel in the minority. I love databases. When I work | in the Ruby on Rails ORM ActiveRecord I can actually visualize | the SQL it is generating in my head and also do all sorts of | tricks when needed. | petepete wrote: | That's the key. ORMs get a bad name but most of the time you | just want to display a list of things, or one thing in more | depth or maybe create a new thing. | | ORMs unfortunately, have a habit of getting in the way when | you want to do something they don't natively support. When | they just ignore things the database provides people just end | up reinventing the wheel. Rails' implementation of enums is a | good example of this. | lovegoblin wrote: | > Rails' implementation of enums is a good example of this. | | The advantage of ActiveRecord enums (vs db-native) is that | you can change the list of valid values without having to | run a whole ALTER TYPE - and all the overhead that would | entail in doing without downtime in a production db. | balls187 wrote: | Probably because they don't need to think about those things. | | I started my career as an Embedded Software Engineer, and | memory allocation, and clock cycles hand to be managed. Our | software ran on systems with limited memory, and had to fit in | one 60hz cycle. We supported VAX system that used VAX Floating | point, and had to be cognizant of both floating point | conversions AND endian byte encoding. | | These days, such concepts are basically just trivia answers for | interview questions. | | I had to think about those things because it was required. | System software would crash, and debugging it on an expensive | government flight sim 1000 miles away was impossible. | | Perhaps for the developers you work with, they don't need to | think about those things because they're not required to. After | all they have access to a dedicated DBA/SRE/DevOps/Architect | guru. | hermitdev wrote: | Not meaning to come off as trite, but I know my response will | sound like I am. | | But, what you described is exactly why senior devs/engineers | make 2x, 3x, 4x a junior. | | Seniors make more because they have that extra knowledge you're | talking about that can take a decade or more to accumulate. The | idiosyncrasies between different DBMS or even languages or even | specific versions of languages. That knowledge doesn't come | cheap. It literally takes upon years to develop. | | This is partially why I think bootcamps are bullshit. I'm | sorry, but there is no way to become a competent full-stack | developer, especially if you're not targeting web/browser. Give | me someone with 3 months of bootcamp C++, I'd be surprised if | they can get a nontrivial program to link, let alone compile. | testnew2 wrote: | I don't care and that's why I use raw sql lol. I don't want to | learn some orm bs | throwaway13337 wrote: | All of those things are unproductive until after your project | is successful which is usually not a guarantee when building | the software. | | You're not gonna need it. | hobs wrote: | Observeability, reliability, scalability - really easy to | bolt on once you are successful. | | There's a middle ground, and a DBA turned SRE turned | Architect is probably ok with compromises, those are all | roles where "it depends" is a bylaw. | aeyes wrote: | How is scalability easy to bolt on? Most companies end up | rewriting their entires stack, multiple times. | falcolas wrote: | So, they really aren't that easy to bolt on, if not | considered from the beginning. Monoliths, for example, are | a real PITA to make reliable and scalable. | | Worse, once your company is successful, there will be an | endless list of features to add to your product, meaning | nobody has the time to "bolt those features on". How the | product begins, is how the product often lives on well past | it's expected lifespan. | hobs wrote: | My apologies that the sarcasm was a bit too subtle - no | they are not. | AndrewKemendo wrote: | Those are absolutely not easy to bolt on. | jl6 wrote: | Application developers don't care about data - they want the | database to be a dumb storage unit because they only care about | their application functionality. | | If you can convince your application developers of the prime | importance of data, they will start caring about their | database. | | Applications come and go but data lives forever. | davedx wrote: | As a developer: I do care, but it's hard for me to focus on | building software if I also need to think about DBA tasks, | DevOps tasks, and so on. These things all take time, patience, | and energy. I've just spent most of today running and re- | running a CloudFormation template to create a SQL database. | Most. Of. A. Day. It's partly because I'm not a DevOps expert | and even if I wanted to be one, that would also take time, | patience and energy. | | There's a reason people have specializations... | cookiecaper wrote: | > _It 's partly because I'm not a DevOps expert and even if I | wanted to be one, that would also take time, patience and | energy._ | | It's this exact divergence that creates the disconnect. If | someone doesn't understand and doesn't have to care about the | whole experience, they're going to focus on their side and | stop when their side is good enough. | | On the other hand, if that same someone is going to be | regularly developing the application, making changes to the | database, and triggering deployments, they will find a way to | make the process flow. They'll make it adaptable enough that | it's not a day-long pain any time they need to run a | migration or spin up a test DB. | | The whole toolkit can and should be available because every | piece of the stack opens up new possibilities. You want, and | at least at some level, can have, people who know this well | enough to make good use of it. Nitpicking over "not my | specialization" is the antithesis of a smooth engineering | process. | ffdjjjffjj wrote: | You'd be surprised how much rank and file developers can care | about observability and reliability. The key to unlocking this | is making them responsible for how their code runs in | production by adding them to the on call schedule. | vsareto wrote: | Too bad there isn't an equivalent on-call punishment for | management and others around the dev process who also make | mistakes and bad decisions. | | As far as I can tell, they get paid more for doing less. | vajrabum wrote: | I'd guess that problem might go away if you make their | bonuses contingent on a sleep numbers (along with the SLAs) | for whoever has to run the application or it's downstreams | whether that's a dev or an ops type. Base the number on | pagerduty calls or something like so they can't game the | number. | edraferi wrote: | That is a wonderful idea. Managers are responsible for | the performance of their teams. NOBODY wants to sandbag | sleep numbers, so it'd be a decent comp metric. | FpUser wrote: | add to that "do not want to declare/worry about types", "cant | be bothered about RAM usage/cache coherency/etc", "concurrency" | etc. etc. All of those are "premature optimizations". | yjftsjthsd-h wrote: | > All of those are "premature optimizations". | | And they are, until they aren't. And I really do mean that in | both directions: Much of the time, the simplest, dumbest, | most naive solution _is_ 100% fine for realistic load for the | forseeable future. And then some of the time it breaks | terribly and you do need to spend effort optimizing it, | whatever that entails. | FpUser wrote: | To be clear I am not advocating optimizing everything down | to the last bit here. Problem here is when THERE IS a need | many would not even have a slightest clue where to begin as | all those "low level" concepts were relegated into | oblivion. | jordache wrote: | >Observeability, reliability, scalability - these are all words | that are translated into either "someone else's problem" or | "unproductive busywork" in their minds. | | hence why you work in a team.. everyone provide their domain | expertise.. we don't need dba to be dictating FE devs on | strategies for immutable state management in the UI. | arvinsim wrote: | Perhaps development work has reached complexity where it | demands full-time attention? | | Perhaps there is a reason why DBA, SRE, Devops and Architects | are separate roles? | cookiecaper wrote: | > Perhaps there is a reason why DBA, SRE, Devops and | Architects are separate roles? | | It's mainly an artifact of the way we've broken up degree | tracks, and the boundaries that each group is taught to stay | inside, lest any particular group actually ends up culpable | for a failure. | | Make fun of the "full-stack rockstar ninja" all you want, but | the reality is that it is possible to have a functional | understanding of all of these areas. It's just a matter of | having the confidence and doing the legwork. | | To the extent that more of your people have a working | knowledge of these domains, you'll not only have a better end | product, but a _much_ easier time getting stuff done. | | There's room for hyper-specialized expert consultants in each | field, of course, but the myth of the myth of the full-stack | developer exists primarily for political convenience. Most of | this stuff is not any harder than the rest of it, and can be | learned by anyone willing to sit down and learn it. | msla wrote: | The phrases "functional understanding" and "working | knowledge" are gigantic sucking tarpits. | | Before I got my current job, I felt confident in my | knowledge of computer networking at the LAN level. I knew I | wasn't going into the telecom world and I knew I didn't | have the knowledge to debug BGP or ensure a CO was doing | everything right, but LANs? Sure. No problem. I knew DHCP, | Ethernet, TCP/IP, even stuff like PPP which is more niche | now. Heck, I'd even passed a college course on the subject. | OK... set up an ICMP server and make it useful. That's LAN, | right? Certainly gonna be used on a LAN. | | I'm not saying it was hard. I'm saying that I'd never | touched ICMP before except for ping and didn't know what | the more advanced stuff even was. Did I have a "working | knowledge" of basic networking? Did I have a "functional | understanding" of how to get a building full of computers | to talk to each other? | | It's always something. I thought I had a good, working | knowledge of networking. Someone who'd actually _done_ | networking in a corporate setting would have disagreed, and | pointed to a list of things I 'd never touched because | those things aren't useful in SoHo LANs and aren't | theoretical enough for a classroom. Multiply that by a few | dozen topic headings and watch people sink under the load. | cookiecaper wrote: | Unless you got yourself into a situation where you were | expected to set up a whole office with the same speed and | expertise as a full-time network engineer based on some | gross misrepresentation of your skillset, I don't see how | this story is particularly relevant. Maybe it's a good | cautionary tale about presuming that SoHo is the limit of | networking? | | Technical topics are indeed both very deep and very | broad. The level of sophistication and depth is how you | choose your specialization, but that doesn't mean you're | never allowed to learn anything else. You should learn | enough about each field to know the shores when you're | standing there, to be able to communicate with the | "natives"/specialists, and to know when you're getting | out of the shallow end. This expectation should exist for | everyone: DBA, application developer, devops, network, | whatever. They should all know the territory and be able | to work together cohesively to identify the best place to | take something down deep. | | Depending on the constraints of the project, leaving the | shallows means either a) developing more proficiency | directly and getting deeper yourself; or b) acknowledging | that you need someone with more expertise in that area to | take it from there while you go back to handle things in | areas you know better. | | The thing we must avoid is "well I'm not a network | engineer so I don't look at Cisco configs, sorry". That | should be replaced with "well I'm not a network engineer, | so I have no idea what's happening here, but it's still | interesting, can I sit behind you and learn?" | msla wrote: | > Unless you got yourself into a situation where you were | expected to set up a whole office with the same speed and | expertise as a full-time network engineer based on some | gross misrepresentation of your skillset, I don't see how | this story is particularly relevant. | | Taking a job as one of a business's "computer people" | puts you in the path of a whole lot of interesting tasks, | even if your main job is programming. | | > Maybe it's a good cautionary tale about presuming that | SoHo is the limit of networking? | | It's certainly that, but I want to expand on this a bit: | SoHo is the only stuff most people can play with. For | example, I can make programs and package them in Docker | containers and run them that way, but I don't know how | I'd play with Kubernetes in a realistic fashion. There's | whole genres of technology most people can't get | realistic access to without some institutional support. | It's an effective ceiling on some kinds of knowledge. | | As far as learning how to learn, I agree with you. I | think a lot of it comes down to vocabulary: Once you know | the terms the experts use, you can bootstrap effectively | and learn more terms and bootstrap even more effectively. | Plus, words have a way of coming back to you at odd | intervals, effectively dropping you hints when you see | something you vaguely recognize. | | Maybe we should all have Word Of The Day calendars. | cookiecaper wrote: | > _For example, I can make programs and package them in | Docker containers and run them that way, but I don 't | know how I'd play with Kubernetes in a realistic | fashion._ | | Minikube: https://kubernetes.io/docs/tasks/tools/install- | minikube/ | | Alternately, you'd take advantage of $CLOUD_PROVIDER's | initial signup credit and start cloud instance | equivalents. | | Nowadays most things have good virtual environments | floating around (you can even download virtualized | mainframes if you want). A little bit of time tinkering | with such environments will take you surprisingly far -- | _especially_ in fields like network engineering, where | even most professionals don 't know how to experiment. | commandlinefan wrote: | What, you're not a full-stack rockstar ninja? | mgkimsal wrote: | In either case, someone needs to have moderate expertise in | multiple areas of tech. If it's wrapped up in one person, | the business needs backups to deal with the bus factor. If | it's spread across multiple people, you now need to be | selecting for people with moderate tech expertise _AND_ the | ability to communicate and work together with other people | effectively. And you need someone to manage their process | (in some manner - doesn 't necessarily mean micromanaging). | | So the 'full-stack rockstar', if they have good | communication skills, can exist and be valuable, but still | isn't the best long term solution for a business of any | size. | falcolas wrote: | A DBA can't fix a developer's code. A good 3/4 of the list | linked in TFA are things a DBA or Architect can't affect. | | At some point, a developer needs to be responsible for how | they use a tool that's been provided for them. Abstractions | break. | gedy wrote: | Well likewise, many of the DBAs I've worked with don't try | and understand the software needs, and want the world to | revolve around what is optimal for them. | vsareto wrote: | Many interests are pulling developers' attention in several | different areas all the time. | | Database, security, accessibility, performance, infrastructure, | tooling and productivity, business concerns, workflow processes | (agile), language concerns, new things | | All of these like to say "if only the developer could do | $MY_AREA better, they'd be better developers and we'd have | better software". Each of them wants to pile on more | requirements for what devs ought to know. | | Let's say we agree that devs should know all of these things. | After you tally up every area's demand, you're probably looking | at a 10 year timeline before someone's decent in all of these | areas by spending approx. 1 year on each of these areas doing | meaningful work (as opposed to contrived tutorials). | | Even then, things change constantly, so you'd have to | continually practice all of these things. Who knows if we'll | add a new category next year? Then it becomes an 11 year | timeline. | | At some point, developer responsibility has to stop. | harikb wrote: | Not really. That is exactly the point the article is trying | to make. Developers need to care about these things - _enough | to know who to go get help from_. That is the minimum. Also | 10 years is an exaggeration. 2 years working on a non-trivial | backend should expose one to these problems. | | From what I have seen, products built without caring about | these will usually get rebuilt a year from the original | release - either by the same company or by a competitor who | killed them. | mgkimsal wrote: | > 2 years working on a non-trivial backend should expose | one to these problems. | | You can be exposed to them, but without understanding them, | and experiencing both good, bad, and really bad 'solutions' | to them, and understand the impact (on the business, on the | code, on security, on maintainability, etc)... you just | can't really _get_ all that in 2 years. | | I know plenty of people who've been 'exposed' to certain | type of tech problems, and the solutions they decide on are | objectively really bad for any metric other than "stop this | error from showing up on the screen right now". | | I've been doing this for a bit over 25 years, | professionally now, and... there's a lot I don't 'get' with | current stuff. But I've seen and lived enough projects, in | enough different situations, to have a good idea of impact | of tech decisions, and to understand how to make tradeoff | decisions. | | I had someone call me up to 'fix' a problem in a system I'd | given then 15 years earlier. It was still running, more or | less the same, and... spelunking your own code 15 years | later gives you a new perspective on the impact and value | of decisions you make. Many of the things people get hung | up on (code style, tabs/spaces/etc, particular naming | conventions, etc) provide pretty much no value in digging | in to old code that no one has touched or thought about in | a decade. Correct comments, sample data, repeatable tests | hold so much more value, but are harder to get people to | commit to following through on. | bdamm wrote: | I wish I could slap anyone who gives a hoot about tabs vs | spaces. Fortunately modern languages like go are removing | the version control problem that not caring about style | and using auto-formatting IDEs produces. | PunksATawnyFill wrote: | The solution isn't to force style on programmers within | the language. It's to fix the shitty diff tools that flag | whitespace changes as significant. | lowercased wrote: | when people care whether you have | if(foo) { | | or if (foo) { | | or if (foo) { | | you end up in pointless arguments. I don't care about | this sort of formatting very much (I have my own default | style developed over years), but I do care when other | people care about it, often to the exclusion of other | factors. | | "but we need these tools so that we don't argue about how | to format code". Well... you could... just _not argue_ | about it in the first place. | lowercased wrote: | I'm working in a couple of projects where there's a bunch | of linter-checker things that prevent any PR merges | (another... imo somewhat over-used tool) and... I split | my time between Java, PHP, various SQL engines and | various JS frameworks (react, extjs, vue, etc) and I'm | constantly battling different mental models with various | IDEs always showing different colored squiggles and | highlights telling me all the ways I'm "wrong" about the | code I've just typed. | | Can't use double quotes! Always to use double quotes! | "Prefer string/template interpolation" in JS Always do | string concatenation per another project's standards. | | Lots of different frameworks, languages, projects and | companies all force different types of ceremony on | formatting. "Hey, just let the tooling tell you!" turns | in to constant UI distractions telling you that you're | 'wrong' degrades (my?) performance. And... the value of | most of these formatting things is pretty low, long term. | I know that's heresy to some folks, and everyone I talk | to sort of agrees, then says "yeah, but I really think | standard XYZ is a good thing", but... it's nearly all | preference, just like tabs/spaces. | yjftsjthsd-h wrote: | That sounds like bad tooling. Why not have every project | use https://editorconfig.org/ and then have your IDE | auto-format? It shouldn't be popping up and making _you_ | fix it, it should fix it for you. | lowercased wrote: | So there are tools that autoformat as you type? | | let longVariableName = "hello " + this.name; | | magically becomes | | const longVariableName = `hello ${this.name}`; | | without visually flagging it, just autocorrects as I | type? | | And... I can just quickly swap settings for different | clients? Because one has eslint _block_ any PRs that don | 't have "prefer template" rules followed but another | client doesn't like that style, and don't want that style | in their code because it conflicts with existing style. | yjftsjthsd-h wrote: | I don't think that counts as a formatting issue. Yes, if | your clients have hard rules about different coding | styles at that level then it's not a technology problem | (nor is it likely solvable with technology). I assumed we | were talking about formatting issues like tabs-vs-spaces, | in which case yes every single project could be different | but auto-fixed. | lowercased wrote: | thx. sorry, i sounded a bit snarky before and wasn't | meaning to be. it's just easier for people to focus on | visual issues vs operational/functionality. and switching | between multiple projects/clients/standards illustrates | to me how relatively unimportant some of these things are | (but of course just imo). | infogulch wrote: | I would encourage you should check out editorconfig more | closely. The whole idea is that each project has a file | that defines the simple formatting rules for various file | types in the project/directory tree and your editor will | automatically follow them on a per-project basis. It's | surprisingly well supported across editors. | wwweston wrote: | > using auto-formatting IDEs produces. | | I strongly suspect a lot of the remainder of the tabs vs | spaces reminder is actually about how it either | constrains editor / IDE choices OR requires and | investment of time to deal with whichever choice someone | else made. | | "Why do you care, my IDE just handles this" is pretty | close to saying "use my IDE," on top of "use my | convention." | infogulch wrote: | That's why people are asking to push this problem down to | the language level the way Go has done. Go defines both | the "correct" format of the code as well as a standard | way for any/every editor to enforce it (gofmt etc). That | eliminates the double-headed subjectivity of both "use my | IDE" AND "use my convention" down to just "use the | standard convention defined by the language". And people | _love_ it because finally we can stop arguing about the | stupid color of paint for the bikeshed and just bloody | build the damn thing. | vsareto wrote: | >Not really. That is exactly the point the article is | trying to make. Developers need to care about these things | - _enough to know who to go get help from_. | | That isn't it. They aren't telling you to learn about | database details for the sole purpose of "DBA handles | databases. Go ask the DBA database questions". | | It's more than that. They're offloading specific knowledge | onto the dev and then making them accountable for it. It's | a reaction to common questions and an attempt to answer | them all at once by teaching devs answers to common | questions. This is a noble goal, but it's problematic _in | aggregate from multiple perspectives_. | | The aggregate goal of all of these areas trying to teach | developers their own specialties is to make developers the | masters of low-hanging fruit. | [deleted] | hatchnyc wrote: | I have done exactly this and found it to be a frustrating and | thoroughly thankless exercise. No stakeholder for development | teams care about any of this, and your allocation of shared | resources plummets as people realize they can dump problems | on your team and instead focus their attention on the ones | who do not care. | debaserab2 wrote: | I don't agree. If you're designing data structures in a code | base you shoulder some of the responsibility for the | persistence characteristics of that data. | | There's a lot of devs that think database design is the same | as starting a new ORM class and generating a migration file. | | > Database, security, accessibility, performance, | infrastructure, tooling and productivity, business concerns, | workflow processes (agile), language concerns, new things | | Yes, these are all things that devs should strive to know as | much about as possible. Software isn't easy. It takes a long | time to become an expert. 10 years sounds about right. | | > Who knows if we'll add a new category next year? | | Skill domains do come and go, but I think the ones you've | listed are solid staples of web development for the past | decade and likely will still be for a decade more. | vsareto wrote: | > I don't agree. If you're designing data structures in a | code base you shoulder some of the responsibility for the | persistence characteristics of that data. | | There's usually not a relationship that goes the other way | though, for example, developers don't tell DBAs to pick up | code so they can write the models in our language in | addition to the underlying SQL. This highlights a trend of | increasing responsibilities pushed onto the developer. | | >Yes, these are all things that devs should strive to know | as much about as possible. Software isn't easy. It takes a | long time to become an expert. 10 years sounds about right. | | But in some cases it's specialists designating what an | expert developer should know. It's giving away control in | some respects. This turns into new job requirements and a | higher barrier for entry. The growth will need to stop at | some point. | AsyncAwait wrote: | Agree with this. There's the whole DevOps nowdays as | well, which basically just shifts what used to be an | entirely separate, full-time, role onto the developer. | Adding DBA to that sounds like it would benefit noone, | except perhaps business owners looking for short-term | savings at the expense of productivity, ala open-office | floor plans. | debaserab2 wrote: | Keep in mind that full time role used to be a hell of a | lot of work. Take the project I'm working on right now: h | ttps://gist.github.com/ajbdev/11f494906ebb6aa3f6bdf35f243 | 7e... | | In ~50 lines of code I can erect a load balanced set of | web servers with my app binary preloaded, secured in a | private VPN subnet with an auto-scaling policy attached | to it. | | 15 years ago this project would have meant: | | - Racking new hardware somewhere on premise | | - Configuring several routers and networking equipment | | - Tweaking the different software responsible for the | different app layers (load balancing, web servers, etc) | | - Hoping we forecasted demand correctly and whatever | hardware we racked can handle a traffic spike or growth | surge | | You needed a lot more expertise back then. | | And really you can trade all of what I'm doing for a | higher level of abstraction that requires even less | knowledge of the underlying technology (netlify, heroku, | amplify, etc) if you're willing to pay a little more. | debaserab2 wrote: | > There's usually not a relationship that goes the other | way though, for example, developers don't tell DBAs to | pick up code so they can write the models in our language | in addition to the underlying SQL. This highlights a | trend of increasing responsibilities pushed onto the | developer. | | Yes, because as a developer you're the one who has the | responsibility of implementing the business requirements | of the app. There's no trend here; this is the way it's | always been. The buck stops with development for a lot of | things. The developer is in a unique position to respond | to many incidents because they have an intimate | understanding of how the business requirements wed with | the technology in ways someone like a DBA does not. | | DBA's have plenty of responsibilities of their own, such | as handling a 3AM alarm that goes off when some part on | the application starts hammering the DB from some poorly | designed N+1 query problem in the codebase. Often times | when the DBA tries to teach the developer it's because | he's sick of getting those 3AM wake-up calls. | | > But in some cases it's specialists designating what an | expert developer should know. It's giving away control in | some respects. This turns into new job requirements and a | higher barrier for entry. The growth will need to stop at | some point. | | That's a strange mindset to have. Different jobs have | different technologies, and as a developer you learn how | to work with them. The specialists aren't designating | anything, the job you're responsible for is. | vsareto wrote: | >Often times when the DBA tries to teach the developer | it's because he's sick of getting those 3AM wake-up | calls. | | And if he just learned to code, he could write all the | queries himself, check in his own code, and never have | 3AM calls. I see no reason why a DBA couldn't manage to | write queries in source, especially with some hand- | holding w.r.t. source control and check-in rituals. If | they are already doing too much, just hire or train more | of them. DBAs are equally as smart as devs. Loop DBAs | into the business requirements. | | >That's a strange mindset to have. Different jobs have | different technologies, and as a developer you learn how | to work with them. The specialists aren't designating | anything, the job you're responsible for is. | | This is very push/pull about who should own what and I | personally think devs are often responsible for too much. | In silo'd places, make no mistake that input is gathered | about who owns what. In more collaborative environments, | you see the other sides willing to step in and share the | responsibilities; blurred lines, but they are still | there. What we've adopted is an over-reliance that a dev | will just learn many things and we hope he's good enough | at all of them that we don't really need deep knowledge | or need to delegate work. | AndrewKemendo wrote: | 10, 11 year timeline to what? | | You should be constantly learning everything about the whole | stack so that you can actually build functional, reliable, | manageable and maintainable systems. | | I expect a competent developer to be able to build a modern | multi-page web application, with a HTML/JS front end, | relational database back end, appropriately configured | certificates and DNS/CNAME/URL, build basic uptime and | application monitoring and do a basic SQL ETL data retrieval | process. | | That seems like a reasonable bar, and while the specific | tools have changed over the years, that stack is basically | the same as it's been since the 90s. | StandardFuture wrote: | You seem to be contradicting yourself without realizing it. | | > You should be constantly learning everything | | This is an in-progress action i.e. the developer is still | learning. | | > I expect a competent developer to be able to build | | This is now considering a "learned" developer. | | I am not sure you are making the point you think you are | making. The point I _think_ you are trying to make is your | expectations of what an experience developer should _know_. | But, you seem to be expressing it as what a new developer | should be _doing_. | | While the core discussion of the article might be in | regards to what developers do and do not know, I can't help | but notice that a developer knowing about something does | not necessarily allow them to be productive in that area of | their knowledge (especially relative to another co-worker | with both the knowledge and the dedicated focus in that | area). | | Also very important to note: the comment you are addressing | seems to be referring to knowledge that is _local_ | (particular customer necessity /problems, particular | architecture choices for infrastructure, particular product | design decisions, particular ways to answer the quirky | CEO/CTO in a way that _they_ understand, etc.). There is a | lot of locale-based knowledge that a developer must learn | at a company /job/project and can even change over time | (temporal-locales). | | Globally-applicable knowledge like frontend, backend, and | general CS concepts are for sure a reasonable expectation | of an experienced developer. But, there is a delicate | balance a developer must take in the _real_ working world | that is subject to _not_ attempting to master every aspect | of the product /business (especially if it overlaps with | someone else's job/focus) just because you have a high- | level understanding of the global concepts. In other words, | it is not necessarily efficient for a developer to know | every aspect of every language and every database in the | company unless that actually buys the company more | customers and money. | | I would expect any decent manager to understand this very | basic principle. Everyone in the company trying to be a | master at everyone else's job does not help the company | make more money. Being reasonable about expectations in the | moment is also a critical asset of working together to make | money. :) | jatone wrote: | hes not contradicting himself, the fact is there are a | large percentage of developers who just don't care. its | not a matter of still learning. they just don't care to | learn. | StandardFuture wrote: | To be clear, I was not trying to say his points were void | of validity. I was trying to append some clarity to | properly differentiating between a developers general | knowledge base, a developers temporary knowledge base, | and the actual day-to-day _doings_ of the developer. They | are not equivalent sets of things even if they intersect. | | As a personal anecdote, I know I have _learned_ many | things on a project that really helped improve the code, | that today I am not able to recall and would have to go | back and re-learn with a minimum of a refresher. This | happens a lot too. :) | vsareto wrote: | > 10, 11 year timeline to what? | | To be decent at each of those areas I listed. | | Why those 10 specific things? | | Those are all areas I've seen "Developers should learn X" | calls-to-action. | | Sometimes it's outright loathing over developers that don't | know specific topics. Other times it's wishful thinking, or | maybe a nice-to-have. | | If you, as a developer, read all of those posts, they | probably all make fair points about the importance of | knowing each of those things as they relate to development. | You decide there is some merit to learning it. | | So you decide to make a to-do list to go learn each of | those topics, because you want to listen to the blog posts | and be a good developer, and do some real work to prove you | know it. That's what's going to take you a while. | throwaheyy wrote: | Exactly. Without abstractions, developers would never be able | to get anything done. Every task would be dozens of new | rabbit holes. | Dowwie wrote: | "The fastest way to access to a row in a database is by its | primary key. If you have better ways to identify records, | sequential IDs may make the most significant column in tables a | meaningless value. Please pick a globally unique natural primary | key (e.g. a username) where possible." | | Has anyone had a problem due to surrogate keys? | ComodoHacker wrote: | I can agree with everything in the article except this one. | | >Has anyone had a problem due to surrogate keys? | | There's one problem with surrogate keys: they are not | convenient to users (too long and not meaningful). | | There are two problems with natural primary keys, and you are | guaranteed to hit one of them at some point. | | 1. It turns out your key isn't actually unique. To resolve the | collision you have to replace natural key with a surrogate for | one of conflicting entities. Which is not always possible | without risk of another collision. | | 2. It turns out your key isn't persistent. You have to change | it for some entities, but you can't because of so many FKs. | Scarbutt wrote: | _There 's one problem with surrogate keys: they are not | convenient to users (too long and not meaningful)._ | | You shouldn't expose those keys to users though. | hermitdev wrote: | > You are lucky if 99.999% of the time network is not a problem. | | This reminds me of one time I was having networking issues | (around a CRUD GUI). Oddly, I identified the issue the (business) | day before my users did. I was working on a rather large change | (OS updates, 32->64 bit) and noticed that multicast updates | broke. The timing was funny. I was working towards a minimal | reproducible example already, and I had a small test app that | showed that multicast was broke on my PC. So, I ran down and | tried the app on their computers, and it worked... between the | two of them, but not the company et large. | | That's when I vaguely remembered an email for Network Ops about a | router change on my user's floor over a weekend. I went by the | Ops team (they didn't like email for some reason) and told them | what I was seeing. | | The short of it was, yeah, the router change screwed my 2 users | on the floor (they got put on the printer VLAN by accident, which | didn't receive multicast). Separately, the issue I saw was a bug | in how we built a 3rd party lib on Windows that provided 64-bit | multicast support. | mattferderer wrote: | > AUTOINCREMENT'ing can be harmful | | I'll add that they should never be trusted to not jump around | either! I imagine everyone makes this mistake at least once in | their life. | | There is a very high chance that the database will skip a few | numbers from time to time. You will then have someone from an | accounting department asking where Record #XX is. | stronglikedan wrote: | I'd posit that autoincrements are fine for primary keys, but | primary keys aren't fine for auditing. | csours wrote: | Auto-increments are fine for primary keys, until they aren't. | I think the list of items from the linked article are things | that may cause delayed problems. | LoSboccacc wrote: | how would you generate hole-free sequences for use cases like | that of parents (invoicing)? | danparsonson wrote: | You could do something like this. | | Create a table with two columns: | | 1. auto-incrementing primary key | | 2. integer for the actual number you're generating, let's | call it 'IDValue' | | Seed the new table with a single row with IDValue set to | one less than your minimum value (say 0), then use the | following process to generate a new number: | | 1. Insert a new row into the table, with a known invalid | value (e.g. -1) for IDValue (note this must not be the same | as the IDValue from your initial row) | | 2. Get the primary key of the newly inserted row | | 3. Get all the rows from the table (in primary key order) | with primary key < the new id. This will consist of one or | more rows of prior valid values (or just the initial seed), | followed by one or more rows that are either valid or | invalid values (other clients may be running through this | process concurrently and finishing at different times) - | something like this: | | Key / IDValue | | 61 / 1119 | | 62 / 1120 | | 64 / 1121 | | 65 / -1 | | 67 / 1123 | | 70 / -1 | | 71 / -1 | | (your row is the next one after this) | | 4. Your new IDValue == the last valid IDValue in that set | of rows + the number of rows between that and your new row | + 1 - update your row with this new value - in the above | example, 1123 + 2 + 1 - i.e. 1126 | | 5. Delete the first unbroken sequence of valid rows except | for the latest one, to keep the table small but leave at | least one valid IDValue (IDValues 1119 and 1120 in the | above example) - just something like DELETE FROM table | WHERE Id < 64 (in this case) should be safe. | | The database takes care of atomically creating rows which | is the tricky bit, and then you can generate your own | number at your leisure regardless of gaps in the Key | numbering sequence. | uhoh-itsmaciek wrote: | Triggers are a great option in many situations. | marcosdumay wrote: | Usually, you don't. What kind of messed up requirement is | that? Do you want the sky to be green at sunrise tomorrow | too? | | On the exceptional case where the user is on the clear and | there is an unavoidable reason to create this beast, you | create a separate numbering application, that uses database | transactions that are independent from the ones of the main | application and only cares about numbering. It will still | have holes, but few enough that you can manually inspect | every so often and explain why they happened. | hderms wrote: | Not the OP but perhaps if only accounting cared about it | you could assign these IDs separately as some asynchronous | single threaded process. This would only work if insert | rates were low enough or the assignment of the gapless | sequence only needed to be eventually concistent over some | window of time (i.e. before next payroll we must have | assigned gapless IDs to all the new user accounts) | balls187 wrote: | If it was a business requirement that you have perfectly | sequential invoice numbers with no gaps, do it at the | application level, not at the storage level. | | Let the database do what it's great at doing: efficiently | store and retrieve data. | seppel wrote: | A database is also exceptionally good at doing | transactional stuff like atomically incrementing | something. I'd even say: This is something that belongs | in the database and not in some brittle application | logic. | mtVessel wrote: | Generally, I agree. But in many RDBMSs, auto-increment | features explicitly do not guarantee gap-less sequential | ordering. If anything throws, it's easy to end up with | discarded numbers, in which case you need to catch all | errors, inspect the state of the sequence and/or the | entity you're populating, and re-seed the sequence. At | this point, I'd argue you've already left the pristine | gardens of set theory and wandered into the thorny | brambles of app dev. | LoSboccacc wrote: | ah! TIL: | | MYSQL - https://stackoverflow.com/questions/449346/mysql- | auto-increm... | | POSTGRES - | https://stackoverflow.com/questions/2095917/sequences- | not-af... | | thanks! | vivekseth wrote: | I don't think it is not possible to enforce perfectly | sequential numbers at the application level. | | This is the same problem as multiple threads trying to | increment the same global variable. Unless there is | mutual exclusion while the variable is being | read/incremented there will be race conditions. | | I think the only way to enforce mutual exclusion for | applications would be at the database layer (or any other | layer where there is 1 resource that is shared between | each application peer). | dkhenry wrote: | I think by "application level" they are saying "using | transactions at the application level". If you require a | strict sequence of numbers then you BEGIN, READ, INSERT, | COMMIT, that is really what you want so just do it | explicitly. If the COMMIT fails because of a duplicate | key then you start over. | hobs wrote: | There's a simple fix - Don't you ever expose primary keys to | the users. Ever. Seriously, ever. | | A primary key is not an order id, it isnt a person identifier, | it isnt a paycheck - its a thing the database should be using | behind the scenes. | | All of the things I just mentioned change - besides the primary | key. | | You'll never have this problem if you separate your business | logic from your keys. | Terr_ wrote: | Quite. Not only does it protect you from architecture pain | later, but it also allows you to create business-keys that | humans have an easier time reading, comparing, and typing. | | The implicit requirement of a global always-online single | master counter is great until it isn't. Perhaps one day it | becomes a performance bottleneck, or you have to support | systems that create provisional items offline, or some law or | client-contract stipulates everything for them must live in a | daughter-system hosted inside/not-inside a particular | country... | Macha wrote: | This is why I'm skeptical of the suggestion to prefer a | natural primary key like a username. It works fine... until | they day the business asks for changeable usernames because | BigClient is now LargeClient and can't stand anything to | still have their old brand identity. | hobs wrote: | Even in the natural key's playground (in my mind) of data | warehousing, you still have to manage changing dimensions, | and the inevitable reality that your fact table was | actually another dimension table all long. | wenc wrote: | I never realized this before but many excellent developers | struggle with SQL beyond simple SELECT statements. I have a | colleague who is by all accounts a deeply technical person but | one day he confessed to me that he didn't really grok SQL and | that he'd rather work with a "real" procedural programming | language to just store and retrieve data. | | Part of it may be due to the fact SQL isn't really a programming | language but a declarative DSL for manipulating sets and tables. | Things like GROUP BYs and PARTITION BYs (window functions) that | come naturally to mathematical types/functional programmers are | less intuitive to procedural programmers. | | I suspect this was what attracted developers to noSQL databases | like Mongo in the first place -- it's more attuned to a | programmatic mindset. | | (this is not universally true of course -- many programmers have | no issues with SQL at all.) | MaxBarraclough wrote: | > I suspect this was what attracted developers to noSQL | databases like Mongo in the first place -- it's more attuned to | a programmatic mindset. | | Well, it's more attuned to the dynamically typed mindset, sure. | Programmers who understand the value of static type systems | should understand the value of relational schemas. | andybak wrote: | I disagree. There's more to relationships that typing. I'm | very pro-dynamic languages and still chafe at static typing | but the wonder of the relational model fits nicely with my | liking for declarative and functional approaches. | | (EDIT - and as another data point I dislike SQL's syntax. The | semantics are bearable but the syntax just makes my brain | melt) | MaxBarraclough wrote: | They're not identical concepts, but both relational schemas | and statically typed programming languages provide | assurances about basic structure. Mongo, and Python, offer | no such assurances, and leave it to the developer to get it | right. | | Agree that SQL's syntax is rather bad. | MaBeuLux88 wrote: | MongoDB supports Schema Validation since MongoDB 3.2 | (Dec. 2015) and JSON Schema since MongoDB 3.6 (Nov. 2017) | so MongoDB can enforce types strictly if you want to at | the collection level. NoSQL doesn't mean NoSchema or | NoTypes. https://docs.mongodb.com/manual/core/schema- | validation/#json... | MaxBarraclough wrote: | So they bolted it on eventually then. The relational | databases went the other way and bolted-on schemaless | data [0] [1]. We can still compare the schema-based and | schemaless approaches. | | [0] https://www.postgresql.org/docs/10/datatype-json.html | | [1] https://docs.microsoft.com/en- | us/sql/t-sql/functions/json-qu... | brightball wrote: | I firmly believe that every developer should spend 2-3 weeks | early in their career working with nothing but SQL. It will pay | huge dividends for the rest of it. | | IMO a lot of the issue is that developers for many years using | Java or PHP, were using SQL to handle everything. The | application language was a pass through later between the | client and the database. | | Your goal was to accomplish as much as possible in a single | query and then to simply return the results of that query to | the interface. That meant formatting numbers or currency in | your SQL. Optimizing inserts or updates to be handled in a | single query. Grouping, counting, left/inner joins, having | statements to filter on aggregate results. More than 1 or 2 | queries for the primary area of the screen was both a rare and | foreign experience. | | And then ORMs started to slowly integrate themselves into the | flow of various frameworks to automate the repetitive things | around CRUD tasks. Then to address scaling & bloat problems we | saw an uptick in REST APIs, microservices that further made | those ORMs the norm...and then many developers started actively | trying to stay within those API constraints to an almost | religious degree which led to a nested payload becoming | acceptable fueling the whole "NoSQL" situation, along with the | idea that it was somehow better to repeat the same data | thousands of times over. | | A whole lot of people pushed back against this and eventually, | it mostly ran its course. I've often seen resistance to SQL to | be driven by _fear_ of SQL more so than anything else. As soon | as people get a basic comfort level with SQL, it become almost | automatic. | techbio wrote: | I have a side-project to solve Kakuro puzzles with SQL logic, | not because it's useful, but because it seemed possible and | distinctively challenging. | | I'll show off the POC if I ever get around to smoothing the | rough edges. | Matrixik wrote: | Here is one nice example: https://sivers.org/pg | dominotw wrote: | > I firmly believe that every developer should spend 2-3 | weeks early in their career working with nothing but SQL. It | will pay huge dividends for the rest of it. | | i did tons of sql couple of years ago on a reporting team. | Now I do android dev fulltime and don't remember any SQL | beyond basics, highly doubt it will all come back to me if i | tried. | baq wrote: | it doesn't matter, you'll google the details. many people | aren't aware that sql isn't just a stupid way of filtering | tables and can do things server-side that you'd spend hours | implementing and testing on the application side with worse | performance in the end most of the time. | anotheracct_ wrote: | It's unfortunate that your education failed you so much. | TremendousJudge wrote: | > The application language was a pass through later between | the client and the database. | | This style of doing things resulted in spaghetti style | unmanageable databases, filled with an unknowable number of | triggers and procedures, all written in PL/SQL (which is | much, much worse than either Java or PHP). The reason why | ORMs started to become popular is that you can write your | application without filling your DB with arcane and | inscrutable logic | karatestomp wrote: | But going too far the other way is how you end up with | performance 100-1000x worse than it should be. Which is an | actual thing that happens quite often in the wild. Also | often ends up heavily dependent on some ORM or framework, | making rewrites or multi-client DB access dangerous and | painful. | eternauta3k wrote: | You can create pretty complex SQL queries with | SqlAlchemy's ORM, without going back and forth to the DB. | jtdev wrote: | In my observations, ORM use has a perspicuous relationship | to piles of arcane spaghetti code. | | Not to mention, 50%+ of ORM managed DB schemas that I've | observed don't have proper constraints, indexes, | relationships, etc. Because the developers using the ORM | think it's a magical tool that makes understanding SQL and | relational databases optional. | oftenwrong wrote: | When you give an object the power to access to the db, | you invariably end up with db access everywhere you can | pass that object. | harikb wrote: | There is an entire world between ORM and PL/SQL. | Programmatically constructing SQL statements is also a | thing. Just because someone writes SQL does not mean SQL | needs to be spread through out code or that we need to have | lot of logic in PL/SQL. Of course, there will be cases | where a store procedure is desired (any kind of validation | that cannot be expressed as fkeys, canonicalization of some | core data components etc). I worked on DBs writing SQL for | several years with only minimal code as stored procedures. | | Of course the database vendor and culture also plays a | role. We were primarily a mysql/postgresql shop | resu_nimda wrote: | How did you minimize roundtrips between server and DB, or | did you find that they were not a big concern? | | I'm working on a project with a Postgres database, and as | it gets more complex I'm moving more stuff into stored | procedures, pretty much wherever a single action requires | multiple statements in series (e.g. check if this thing | exists, check a value, get the id of some other thing, on | success update another table). | | Of course I would prefer to leave as much of it in the | middle tier as possible because the ergonomics are better | there but I don't want to sacrifice performance. | monocasa wrote: | I think they're saying that you profile your changes on | an individual query. | | Yeah, if all of your queries are slow, you've got more | work to do, but that doesn't change the overall process | of optimization. | coliveira wrote: | This is trying to avoid the reality that the data is in a | database. A database is not just a flat storage area that | can be poked by code from outside. It has a lot of | relationships and integrity constraints that need to be | maintained by the DB itself, and all this is expressed in | SQL/stored procs. Trying to avoid this is basically | removing the power of a DB and converting it into just an | expensive and cumbersome storage area. | breischl wrote: | But now you're filling your application with arcane and | inscrutable logic, with an extra layer of abstraction via | the ORM to make it even less scrutable. | | I think one should view a SQL DB like a microservice. | Instead of REST endpoints (or gRPC or whatever), create | stored procedures. These define a strong contract with your | DB, the capabilities that it provides to your app(s). Now | you know what the query and insert patterns are, and can | tweak the table layout under the covers without screwing up | your application code. | | Of course you _can_ abuse this into a spaghetti monolith, | just like you can evolve a microservice into a spaghetti | monolith, but you shouldn 't. There's no technology that | will prevent you from making poor architectural decisions, | you just have to not go down those dark paths. | wenc wrote: | > Instead of REST endpoints (or gRPC or whatever), create | | > stored procedures. These define a strong contract with | your DB | | Exactly. It took me years to grasp this, but when I did | my code became much simpler. | | A REST API (that returns JSON) has to contort a tabular | data structure into a loosely-typed hierarchical data | structure (JSON) which has to be read back, reconstructed | and in many instances type-checked (e.g. DATETIMEs are | not native to JSON, nuanced datatypes like DECIMAL(18,3) | are lost). | | Whereas a SQL interface returns data in its native | tabular format with all the correct types. | wenc wrote: | There's a middle way which is very powerful: SQL views | (just SQL queries; no triggers or procedures) | | Here's a powerful mindset trick: think of _SQL views as an | sort of a REST API_ , but whose access language is SQL and | not HTTP, and that returns data in a table rather than JSON | (hierarchical). | | I once tried to build a REST API to a database, and someone | told me I already had a battle-tested and highly performant | API that outperformed REST at scale -- it's called SQL. A | SQL view is a dynamic lens into the underlying tables, so | even if the underlying tables/schemas were to change, your | consumers don't care as long as they can access the SQL | View. | | SQL views are also composable: you can build SQL views on | top of other SQL views, and any changes made in the base | views are propagated throughout. Need to add/transform a | field? Do it in the view. Need pull in auxiliary data? | Bring it in through a JOIN in the view. I've built many | systems by composing SQL views and they're very | maintainable and very flexible. They're kind of like | function compositions but on tabular data. | | The rule of thumb is: always access a database through a | view, never the underlying raw tables. In computer science, | a great many maintainability issues are alleviated through | a layer of abstraction/indirection, and SQL views provide | exactly that. | | This centralization of the core logic becomes especially | powerful if the database is accessed from multiple | consumers (webapps, analytics backends, Tableau, ML tools, | etc.) The "API" remains consistent throughout. | cgh wrote: | Agreed, and a good example of this is implementing | search. You can define a view on top of your searchable | entities that includes the urls to the entities, as well | as searchable metadata (entity descriptions or whatever). | So when you add new searchable items, you just update the | view to include them and the code to select from the view | doesn't change. | deckard1 wrote: | > highly performant API that outperformed REST at scale | -- it's called SQL | | You are conflating many disparate things here. | | SQL is a language (DSL) for accessing data. REST is a | protocol and a data transport method (one could surmise a | way to do REST without HTTP, but when reasonable people | refer to REST they mean HTTP (over TCP (over IP | (etc.)))). | | Even REST is not an API. You can't do anything with a GET | or a POST without other abstractions built on top of | that. So I don't understand how anyone could make | performance claims beyond something like "HTTP is slow" | and "binary transport is faster", with regards to SQL vs. | REST/HTTP. | | SQL does not define how you receive your data. Databases | have different methods of sending SQL and responding to | SQL. Oracle, MSSQL, MySQL, etc. | | > This centralization of the core logic becomes | especially powerful if the database is accessed from | multiple consumers | | That's the entire point of an API. Any API. REST APIs, | even. | cryptonector wrote: | Not only that, but VIEWs can have INSTEAD OF triggers, | which then lets you build powerful abstractions in SQL. | wenc wrote: | Amazing. I didn't know it was possible to write to a | VIEW. | cryptonector wrote: | Changes everything no? | jwdunne wrote: | This is a very interesting comment! | | Two questions: | | Do you have any example code that shows how this works? I | get what you're saying intuitively but example code will | help me bring it to table. | | What about cross cutting concerns? I've found stored | procedures to be a performant solution here. By version | controlling them, and limiting to pure functions, I found | them quite maintainable. Would you instead just define a | new view, or extend an existing one, or refactor into a | separate view that's then joined into the existing views? | | I haven't delved as far as views, admittedly. One app | featured a bit of complicated logic and eschewing the ORM | in favour of raw SQL helped (instead of getting tangled | up in Demeter chains). Despite new developers, who have | used purely ORM for years, shitting their pants at the | raw SQL, both of us who worked on it felt it was the | right call. We feel much better about leveraging more of | the database in new projects. | | In fact, when we took our experience to a Django project, | my colleague wrote a Manager method in such a way that an | ORM favouring developer questioned because it looked too | much like SQL. But it was the obvious implementation to | us after using raw SQL. And, after benchmarking, the most | performant. | wenc wrote: | Briefly, | | 1. Let me try with a simple example. Suppose you have a | fact table A with fields (ItemID, Item, Amt) where Amt is | in USD. Rule of thumb is: don't expose A to the consumer; | instead write a SQL View V_A and expose that instead: | CREATE VIEW V_A AS SELECT ItemID, Item, Amt FROM A | | Then suppose a European counterpart wants to use the same | API but needs the amounts to be in Euros. You can write | another view: (in practice the conversion 0.92 shouldn't | be a static number, this is just for illustration) | CREATE VIEW V_A_EURO AS SELECT ItemID, Item, Amt * 0.92 | AS AmtEUR FROM V_A | | Expose this to the Europeans. You can keep stacking views | on top of other views. Your U.S. consumers will always | see the data through the lens of V_A and your European | consumers will always see it through V_A_Euro. | | Suppose the underlying table A now changes. There's been | a merger and the company now stops reporting currencies | in USD, and everything is now in British Pounds so your | DBA adds a field AmtGBP and starts populating that field | instead. Amt still contains historical data, but moving | forward the data in Amt will be NULLs; AmtGBP is the new | internal baseline currency. From a VIEW perspective, all | you have to do is: ALTER VIEW V_A AS | SELECT ItemID, Item, ISNULL(Amt, AmtGBP * 1.23) AS Amt | FROM A | | Your V_A and V_A_EURO consumers (could be Tableau, Excel, | other SQL views, etc.) will still happily receive data | per usual, unaware of the internal changes (the British | are coming!) that have occurred. Contract kept. | Table A <- View V_A <- View V_A_Euro | | 2. Cross cutting concerns come in many forms so not sure | if I can address. Stored Procedures are definitely an | acceptable abstraction -- they accept parameters and can | return tabular results just like VIEWs. They do however | work in a procedural manner (like subroutines) and can | produce side effects, which is sometimes necessary to | accomplish very specific tasks. VIEWs on the other hand | are more similar to pure functions (unless random number | generation is involved) with no side effects. Because | views are dynamic, they flex with your data and VIEW | definitions. | projektfu wrote: | Are you also promoting CQRS? | jayd16 wrote: | This is the exact fear of SQL that's being talked about. | You can make a mess of anything but its not hard to have a | well maintained relational DB with minimal cruft. | lhorie wrote: | With the whole shelter-at-home thing, I had a chance to work | on a simple app for my kids. One of the corollaries of | "simple" was avoiding an ORM. | | Implementing logic is single SQL queries can definitely be a | bit challenging, but I thought it was also quite rewarding | and liberating - Raw SQL is incredibly powerful! | beckingz wrote: | At first I missed for loops. | | Then I was empowered and did not miss for loops. | | Now I still miss for loops. | vnorilo wrote: | I agree that this is in part why nosql became popular. I'm | quite experienced with systems programming but databases were | never central to my interests. | | So when I needed one for my personal stuff, I went with | CouchDB, because it's just a very nice RESTful API around a | data structure I understand well. | corey_moncure wrote: | I'm an SQL hater in remediation. | | In a given week I might work with all of the following: SQL, | C#, Python, JS (Kendo, Vue, React), XSL, bash, and more. I'm a | quick learner and I pick things up fast, always have, I've got | a deadline and I don't have the time or capacity to fully | internalize the minutiae of all the technologies I have to work | with. In other words I depend on the tools to show me the | options at my disposal and construct syntactically correct | expressions. | | What I really dislike about SQL is common to most systems I | dislike, where the tools aren't discoverable to me. In SQL it | boils down to the fundamental syntactical requirement to put | the SELECT clause before the FROM clause. So I have to build up | my statement in this weird spiral pattern where I change | something deeper in before I know what I can SELECT in the | first place. The ability to give tables shorthand names with | MS-SQL, i.e. | | `left join [dbo].[sometable] st` | | ...is very helpful but I gather this isn't common to all | dialects. Working with XML in SQL is a nightmare as the tool | cannot tell you whether you can do obvious things like pass an | sql:variable into nodes() until you actually query the server. | (spoiler: you can't.) | | I vastly prefer the functional programming approach to working | with data e.g. C#'s lambda style linq. The tool shows me all | the pieces I have to work with, and all I have to do is piece | them together the right way. | throwaway_pdp09 wrote: | > In SQL it boils down to the fundamental syntactical | requirement to put the SELECT clause before the FROM clause | | No offence but if you are at the level of struggling with the | syntax then you should not be using SQL until you have more | experience. There have been plenty of well-founded critiques | of the crappy syntax of SQL, but it's not ultimately that | hard. There are worse things you'll have to cope with such as | the implications of 3-valued logic that comes with nulls. | | The giving-tables-alias-names feature has been standard | probably since the very first standard was released. In some | cases such as self-joins, it is necessary. | | If I can give you a piece of advice, arrange things using the | 'with' clause and learn to break things down as simple as | possible at each step. Let the optimiser sort things out for | you. If performance is poor, look at the query plan. Also | understand you don't have to write everything in one huge | statement. Spool intermediate results off to temp tables if | that helps (not @tables but #tables - @tables have problems | with them). | | As for the XML, I guess that may be better handled outside | the DB. IMO XML should never be made part of SQL. Good luck. | Viliam1234 wrote: | Reading SQL is like reading German, where the last word in a | long sentence determines the meaning of the entire sentence. | | An SQL statement starts with "select ABC.XYZ", but you have | no idea what it means, because only one screen later it is | written that "ABC" is actually an alias for | "T_ACCOUNT_BUSINESS_CREDITS" or something. The logical order | would be "from ... where ... select ...". | | Imagine a programming language designed like SQL. It would | look approximately like this: function foo(a) | { return s; b = join(a); s = | concatenate("[", b, "]"); by the way, a is list of | strings, b is string, s is string; also, function | "join" is imported from "lists", and "concatenate" is | imported from "string" module; actually, don't return | the result, just tell me how many characters it would have; | } | kyllo wrote: | TFA isn't even about SQL the language at all though, it's about | the scalability and reliability characteristics of databases, | especially in a distributed environment. | txcwpalpha wrote: | When I was in undergrad I was part of program that was heavily | programming focused, but was actually part of the business | school. Several of the classes I took were _heavily_ SQL | focused, with at least one class that had every single | assignment requiring extensive designing database schemas and | writing SQL. Now, years later, I still think that those classes | were some of the most valuable to my career as a programmer. | | What's interesting to me is that apparently the CS program at | my university hardly did anything with SQL, and I notice too | that most programmers I meet "in the wild" are lacking in SQL | skills, as you mentioned. It's led to some interesting | interview situations where I really struggle with any questions | about algorithms (my college courses didn't cover algorithms at | all) while the interviewer will tell me that I have the best | SQL skills of anyone they interviewed. | | Speaking with others that went to other universities, I've | heard that it's similar elsewhere for the | "business/programming" to include SQL classes but eschew | algorithms, while CS programs will ignore SQL but focus heavily | on algorithms. It seems to me like both programs could benefit | from meeting in the middle a bit. | carlps wrote: | I have a very similar background. My degrees are in Business | Computer Systems and we had SQL hammered into us. We learned | how to program with Java and .NET, but all from a very high | level. It wasn't until a couple years into my career after | school that I started learning DS+A fundamentals on my own. | It blew my mind. I agree that there should be more of a mid- | point available. | MathCodeLove wrote: | As a programmer who is admittedly attracted to Mongo, a large | part of it is simply its ease of integration. I'd happily spend | more time familiarizing myself with SQL if it were less of a | pain to integrate into my projects. | | The first time ever using Mongo I had a cloud cluster connected | and working in about 10 minutes after signing up. Trying to | integrate SQLite took me around 1.5 hours before it was | functional. To this day I have yet to set up a real cloud SQL | database because the one time I tried it with PostreSQL I just | couldnt get it to work. | izacus wrote: | Is initial integration cost really the most important metric | you look at when designing a system that will probably run | for years and will have to be maintained and scaled? | pbowyer wrote: | I like SQL when I'm not writing reporting queries. GROUP BYs | bite me (With MySQL 8 I end up reaching for the ANY_VALUE() | function), and I end up with more subqueries than I feel I | should need. | | When working with time-indexed data I feel I'm forcing the | database to do something it doesn't want to. E.g. if I want to | answer the query "How many sales are there per day this month?" | and I want an entry for every day in the month, even when there | were zero sales. Or another query asking "Which days didn't | have sales?" and listing the days. I haven't found a way to do | this in-database. I end up answering in code based on the data | I fetch. | | Definitely feel it's me not SQL, but not found the answer. | JamesSwift wrote: | The answer is likely that your storage schema is incorrect. | You have things stored in OLTP (i.e. app database) but trying | to read it as OLAP (i.e. reporting database). Once you | reimagine the data in the OLAP style then these kinds of | queries are simplistic. | | EDIT: specifically for your example, in an OLAP style you | would generate a Times table and then foreign key the sales | table to it based on the date. Then you can easily query | against that Times table as the filter/bucket for your | queries. | pbowyer wrote: | Yes, agreed. It was designed for OLTP not OLAP, and I have | to get my mindset into that. | | For days as my example has, would the Time table be | generated for say 1970-2050? Given months are of different | lengths and there are leap years, I'm assuming this is | needed, rather than storing a single year. | jacques_chester wrote: | I found Kimball's book to be interesting and helpful. | | I think the Kimball approach gets bogged down in temporal | evolution (Slowly Changing Dimension type 1? 2? 3? 4? | 5?), but that's more about the underlying absence of | meaningful bitemporalism than the dimensional schema | approach per se. | JamesSwift wrote: | Yes, the time table is a pregenerated set of all possible | moments in time for whatever resolution you care about. | So, for example, if you cared about day resolution it | might be date_key | year_num | month_num | | day_num | quarter_num | week_num | month_name | | month_name_short ... 20200101 2020 1 | 1 1 1 january jan | 20200102 2020 1 2 1 | 1 january jan ... 20200401 | 2020 4 1 2 14 | april apr ... | | You would FK on the `date_key` and add as many columns as | you need to support querying against the dimension. I | also like to add a proper `datetime` representation of | the date so I can easily do a date range query. | EvanAnderson wrote: | Your example is a fun one and not too hard to do. | | Thinking about the intersection of sets is important for | grokking what you're trying to do in that example (and with | SQL in general). For me, at least, SQL suddenly made a lot of | sense when the idea that I was working with intersections and | subsets of sets. | | In your example you have (1) a set of all dates in a range, | and (2) a set of sales totals for days that had sales. Set 1 | could be a "numbers table" or generated with something like | "generate_series()" in PostgreSQL. Set 2 is made by | summarizing the data in a sales table by date using "GROUP | BY" and "SUM". | | Then you're just looking at JOIN'ing those sets and | COAELSCE'ing the NULL returned from days when there are no | sales into 0. | will_pseudonym wrote: | I learned SQL working in Access. There wasn't anything like | generate_series(), and I remember having to do things like | above by having a start and end date, cross joined with a | table of nonnegative integers I made, and creating a field | which added an integer to the start date, subject to being | no greater than the end date. | pbowyer wrote: | > Set 1 could be a "numbers table" or generated with | something like "generate_series()" | | I've done it this way when doing it per-minute or per-hour | - i.e. for 60 minutes or 24 hours; fixed, constant ranges. | I picked this example because (to me at least) it's more | difficult :) | | Calendar months have different lengths, so something would | have to tell it which month to use (and account for leap | years). But say you wanted the last 60, 90 or 120 days? I | suppose you could first create a temporary table to be the | "numbers table", custom made for the range you want to | fetch - is there an alternative? | EvanAnderson wrote: | In many databases you would create a function to fill the | role of that "numbers table" and, effectively, map that | function onto rows. That function can express whatever | "richness" you need (i.e. civil calendar month, phase of | the moon, etc... >smile<) that might not be easily | expressed as a series. | will_pseudonym wrote: | There are date functions in most SQL packages which can | create the limiting factors you're looking for, just as | in procedural languages. Once you have the list of dates | you're interested in, the problem becomes simple. | Amezarak wrote: | I have done this many times. My preferred solution was to | create a recursive CTE beginning with the desired start | date and ending with the desired end date. I often based | these on MIN and MAX subqueries, but obviously you could | also pick arbitrary dates. | | You can use days, months, weeks, years, or whatever you | wanted as the "increment" in the CTE using the DATEADD | function. Then (for nulls) you simply LEFT JOIN your CTE | with the desired date part (aggregated) of your table. | This gives you your first answer, and a simply filter | will get you your second answer. | | It's also common for people to just create all these date | tables beforehand as actual, materialized tables. In my | opinion, this is less elegant (what happens in 2101?! | Somebody had better remember to add to the table!) but it | naturally works just as well and probably saves some | perf. | alexilliamson wrote: | You can (IMO) pretty succinctly enumerate all the days in a | month with a recursive CTE. The example in the following link | does it for weekdays, but it's the same idea: | https://www.sqlservertutorial.net/sql-server-basics/sql- | serv... | mulmen wrote: | For anything other than the absolute simplest case you should | have two databases. One for OLTP (the application(s)) and one | for OLAP (the reporting). | | If your reporting queries are hard or complex it's because | you did a bad job architecting the reporting tables. | Reporting queries should almost always be the simplest form | of query if you designed your warehouse properly. | | Days with sales is easy with a simple SUM or COUNT and GROUP | BY. | | Days without sales is easy if you think about this | dimensionally. SELECT dim_days.day_of_month, | COUNT(fact_orders.order_id) /* Select from | dim_days first because you want every day. */ | FROM dim_days /* Outer join to the fact table | to pull in the data you have and add that to your dates. | */ LEFT JOIN fact_orders ON dim_days.order_day = | fact_orders.order_day /* Filter results by the | desired range. */ WHERE dim_days.month = 4 | AND dim_days.year = 2020; | | As I continue to say: | | "SQL is easy. Data is hard." | sk5t wrote: | A more functional mindset can definitely help here. Think of | your "sales per day" model in terms of starting with a | sequence of days--startingDay up to startingDay + n--as the | input to a function that maps to an aggregate of that day's | activity. | | Aggregate functions in SQL are IMHO quite awesome once you | develop a comfort level to stop worrying about them per se. I | wouldn't like to try to get Excel to tell me--or write the | code to do manually--something like "show me the standard | deviation in units sold by day of week over the last ten | summers." | pbowyer wrote: | I find thinking about analytic queries like these as | map/filter/reduce easier than thinking in terms of the SQL! | taffer wrote: | SQL and functional programming constructs are actually | quite similar: | | SELECT -> map | | FROM -> stream | | JOIN -> flatMap | | WHERE -> filter | | GROUP BY -> collect | | ORDER BY -> sorted | | https://blog.jooq.org/2015/08/13/common-sql-clauses-and- | thei... | bryanrasmussen wrote: | I'm pretty sure I would be good at SQL if I devoted time to it, | because I've worked in other declarative DSLs and generally | outperformed other programmers using them, but basically every | place I've ever worked at that used SQL extensively had already | employed SQL experts that just handled the problems on that | level so that it didn't seem worthwhile bothering with it. | fimdomeio wrote: | I struggle with lack of experience with SQL by always being | told that I should always use an ORM or I would regret it in | the future when I would change database technology. I'm in the | future now and spend a lot of time debbuging the ORM and the | sql statements it produces, when I could split that work in | half by not using the orm at all. Would also have a lot more | experience with sql so there would probably be less bugs in the | first place. | sk5t wrote: | Swapping out database platforms is pretty deep into YAGNI | territory for most. Unless you _know_ you are selling an on- | prem software product to some customers who will demand MSSQL | and others who will demand Oracle, or whatever, this "swap | databases" justification for scrupulously using an ORM is not | well grounded in reality. | btilly wrote: | If you're writing a CRUD application, an ORM saves a lot of | headaches. | | If you're doing complex reporting queries, an ORM is strictly | worse. And yes, I've seen developers, architects, and authors | of ORMs that believed otherwise. They are wrong. | | As an example, very, very few ORMs can make the distinction | between SELECT ... FROM foo | LEFT JOIN bar ON foo.id = bar.foo_id | AND bar.category_id = 5 LEFT JOIN baz | ON bar.id = baz.bar_id ... | | and SELECT ... FROM foo | LEFT JOIN bar ON foo.id = bar.foo_id | LEFT JOIN baz ON bar.id = baz.bar_id | AND bar.category_id = 5 ... | | (And if you get into things like analytic queries, just | forget about it.) | jakearmitage wrote: | Curious, what is the difference here? | [deleted] | gtsteve wrote: | I believe they may have intended to write this query with | INNER JOINs instead of LEFT JOINs. | | If so, the result of the query would be identical but the | second one would likely have performance problems given | the category_id filter is not being applied at the point | where bar is joined. A clever database engine might find | an optimisation but I wouldn't count on it. | | This illustrates the subtle problem that ORMs can | introduce - all logical tests pass but issues emerge over | time. I found that typically every ORM based problem was | resolved by rewriting it using SQL. | | I used to use ORMs extensively, but my philosophy now is | that you can make simple queries by using a query | generator (i.e. connection.GetByID("MyTable", 100)) and | complex queries have to be written by hand. I would not | choose to use an ORM again. | btilly wrote: | No, left joins were intended here. They mean different | things. | | With inner joins the semantics are the same. If the | performance is different, the database query optimizer is | broken. | btilly wrote: | The difference is whether the condition on bar is a | filter is on the join from bar or baz. | saltcured wrote: | The extra "AND expr" in the join clause makes it more | strict, which means the outer join can produce more nulls | in the joined tuple. In a sequence like A left join B | left join C, moving the extra join filter between the two | joins is the difference between getting tuples like | (a..., null..., null...) vs (a..., b..., null...). | gwbas1c wrote: | Part of the issue is that a complicated database can handle the | same SQL query many different ways based on indexes and other | configurations. | | This kind of "magic" isn't always clear when programmers are | mostly used to working with data structures and procedural | code. | | The other problem, IMO, is that programming languages are very | poor at bridging the difference between the SQL domain and the | language domain. We really need plugins for compilers because | ORM libraries often are harder to learn than the database | itself. | takeda wrote: | > The other problem, IMO, is that programming languages are | very poor at bridging the difference between the SQL domain | and the language domain. We really need plugins for compilers | because ORM libraries often are harder to learn than the | database itself. | | Actually in past year I discovered that JetBrains IDEs (in my | case PyCharm) have nice feature that seems like not everyone | is aware. It is due to their integration with DataGrip. If | you configure the IDE to connect to your database, it will | start looking for SQL in your strings, it will then do code | highlighting, autocomplete table names and fields. If you use | refactoring it will understand SQL, if you refactor database | it will produce migration statements. It appears to solve all | the issues that ORM supposed to solve, and you still have | full control since SQL didn't need to be abstracted from you. | | Edit: this video shows how it works in action (forwarded to a | relevant part): https://youtu.be/_FlpiNno088?t=2868 | treis wrote: | >This kind of "magic" isn't always clear when programmers are | mostly used to working with data structures and procedural | code. | | My problem is that it is like some sort of black magic to me. | If I write a complex query I have no idea if what is spit | back to me is actually what I want. The only way is seeding | lots of records and then manually checking that each filter | and calculation is doing what I want. | | In code complex things can be broken down into more simple | items. Then I can reason about and test those building blocks | into something I understand and am confident that is working | as intended. | frosted-flakes wrote: | Yes, SQL's biggest fault is that it's not very composable. | Complex queries end up being long and repetitive, and the | order of the parts of a query is totally unintuitive (it | should go something like: FROM, GROUP BY, SELECT, ORDER BY | rather than SELECT, FROM, GROUP BY, ORDER BY, which makes | autocompletion hard). | ratww wrote: | CTEs can help a bit with composition (and readability) | takeda wrote: | QUEL[1] was like that, but thanks to Oracle SQL won. | | [1] https://en.wikipedia.org/wiki/QUEL_query_languages | setr wrote: | One thing I never understood is that the SQL language, | and its alternatives, share the same theoretical IR -- | the relational algebra -- it shouldn't be _that_ | difficult to implement for postgres /mysql alternative | relational languages like QUEL or Datalog. Or even just a | simplified SQL with a sane, consistent syntax. | | I know PG has a bunch of procedural-language | alternatives, but afaik, no relational-language | alternatives. | | At the same time, I'd also expect it to not be _that_ | difficult to transpile from say mysql to postgres, yet | there 's very little in that space, at least in open | source (there is however many ORMS that map to _either_ | mysql or postgres..) | | Both of which, amongst the procedural-languages, you'd | find a hundred transpilers (JS->C) and VM-languages | (Clojure,Scala,etc on jvm) implemented even by | bored/curious students. Which makes my suspicion that | there's no technical blocker, just a cultural one. | | That is, QUEL losing to SQL, such that you can't find any | implementation, is an absurd concept -- it _should_ be | available on postgres (perhaps requiring some special | starting keyword) -- but somehow it is not. | pjscott wrote: | The thing that really worries me when I'm writing SQL is the | possibility that the query planner will get frisky and choose | some disastrously slow join order, but only once in a while. | I've run into _way_ too many hard-to-reproduce performance | issues like this. | | Having the database automatically figure out how to run a | query is a great feature, but most of the time I'd happily | just write explicit nested loops for the sake of | predictability. | goostavos wrote: | >programming languages are very poor at bridging the | difference between the SQL domain and the language domain | | Depends a lot on the language. I've lost countless hours to | things like JOOQ trying to figure out how to get it to do | what I want, or express the query in its quirky not-quite- | right DSL, plus dealing with mappings, pojos, auto- | generation, and so on. | | However, on the other hand, in a dynamic language with just | enough support to move the result of your queries into a map | of key/value, I feel no friction at all. I'm using little | more than a simple jdbc wrapper in Clojure and even after | months on the project, I'm still just continuously stoked | with seamless the whole thing is. | oftenwrong wrote: | jOOQ can be used in a less-type-safe way. For example, | `fetchMaps` [1] does more-or-less what you describe. | | However, I have found it worthwhile to learn to use the | more advanced features you mention. Extending type safety | to queries is incredibly useful. Consider cases when | developers are making code and schema changes concurrently | that overlap. | | [1] https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/R | esult... | Scarbutt wrote: | SQL (at least in Postgres) is already type safe. | oftenwrong wrote: | jOOQ can effectively extend the type system of Java to | the construction of ad-hoc queries by way of code | generation. jOOQ generates Java classes that correspond | to the database schema, which can be used in its query | building DSL. | | For example, if I have a `timestamp with time zone` | column in PostgreSQL, that is represented in the | generated code. I will be prevented from inserting a Java | `String` into that column - I will have to provide an | `OffsetDateTime`. (This is how it would be in a typical | configuration - it is flexible enough to do pretty much | anything). | | Another example, let's say I reference some column in a | query in one of my feature branches. Meanwhile, somebody | has dropped that column on the master branch. When I | rebase my feature branch onto master, my build will fail. | homami wrote: | I find SQL INSERT statement not intuitive. I can understand why | SQL requires me to declare the field names and then the values | of a new row that I am inserting; but it would've been a huge | time saver if SQL had a key-value dictionary-like syntax: | INSERT INTO "my_table" "col1": value1, | "col2": value2, ... | PZ81JUXJE7uJ wrote: | MySQL supports a insert/set statement: | INSERT INTO tbl SET a = 1, b = 2; | castorp wrote: | How does that look like with a multi-row insert? | castorp wrote: | And when you need to insert more than one row, you repeat the | column name over and over again for each row - not a good | idea either. | pjc50 wrote: | This is partly why I love LINQ: it's a more C# flavoured way of | expressing queries. And it works on objects as well as | databases. | | Most ORMs are bad for queries (pull over all the objects and | look at their properties!), but LINQ will actually turn your | code into SQL under the hood with some remarkable machinery. | fernandotakai wrote: | >Most ORMs are bad for queries (pull over all the objects and | look at their properties!) | | i would change that to ORMs CAN be bad for queries. most ORMs | are super configurable and can be changed only to pull | specific stuff. | | i would argue that most ORMs are a plus to productivity | because for most of cases, you just need simple querying | (select * from, simple updates, simple deletes). | | also, if you really need the power of raw sql you an just use | that -- and even then, for most advanced ORMs (django's, for | example) exposes a LOT of really advanced sql stuff in | python, which, for a lot of developers, is a lot more | expressive. | andybak wrote: | Not doubting there are bad ORMs but none of the ones I've | dived into is quite that bad (SQL Alchemy and Django's ORM). | | I sometimes wonder if the debate about ORM is mainly driven | by some traumatic horror witnessed at some point in every | developer's career. If we judged programming languages by the | same metric... Oh... Actually we kind often do that too! | uhoh-itsmaciek wrote: | Which ORMs are bad for queries? Most popular ORMs these days | expose most of sql in a language-specific DSL (and some allow | you to splice in bits of raw sql in a semi-structured way as | an escape hatch). Sure, you still need to learn to "think in | sql", to use these effectively, but the ORM has not been the | problem for a long time in most languages. | ratww wrote: | ActiveRecord, Rails ORM was the poster child of ORMs for a | while, and is completely abysmal for non-trivial left | joins, non-trivial aggregations and some subqueries. | | Of course using raw SQL is possible but might force you | into converting other parts of the code into raw SQL. There | is Arel, but it's just a more verbose SQL in quasi-AST | form. | | On the other hand, I never had much problems with LINQ or | Hibernate. | slifin wrote: | SQL definitely solves a lot of performance issues for typical | users who think they can do better | | but throwing strings over the wire, trying to get as much as | possible without crossing cardinalities and ignoring the n | query problem makes SQL at scale disastrous | | Devs always get one of these things wrong and not being able to | use my programming language in the query makes some things hard | to express | | Datalog and Datomic are definitely worth learning | https://youtu.be/Pz_NvY1kw6I | nsomaru wrote: | could you clarify what throwing strings over the wire and | crossing cardinalities means in this context? | meddlepal wrote: | From my experience it's just unfamiliarity with the SQL | language. I've spent thousands of hours writing PHP, Java, | Scala, Kotlin, Python and Go and doing fairly complex things | with them since I started programming. | | SQL maybe a few hundred hours? And off and on again rather than | constantly so I don't always remember beyond the basics. The | concepts make sense to me but the language always feels | foreign. | bpyne wrote: | I think you're onto something with the observation about what | type of developers take to SQL. We have a spectrum in which | people think more like how the processor operates to people who | think more in classes of problems and their solutions. | | We run into problems when someone with a proclivity to one side | meets a problem best solved by thinking on the other side of | the spectrum. An example is a young dev in my organization who | was given requirements for a new app that required a data | store. Not being comfortable with SQL or relational databases, | he chose a document store. It wasn't a good fit. Very quickly | requirements expanded and caused his code to balloon into a | mess of nested loops with lots of if-checks. Performance has | progressively decayed as well. A simple multi-table join with | filtering would have knocked off 2/3 of the code. | dirtydroog wrote: | I'd put myself in this category. I just never liked databases, | they never interested me. | josephorjoe wrote: | When I first encountered SQL, my immediate reaction was to | start writing my own lightweight ORM (two days later I learned | ORMs were a thing that already existed) because I took one look | at that syntax and decided it was insane to work with directly. | | I definitely don't think in SQL. | | That said, I like working w SQL databases -- just please give | me a battle tested ORM... | EvanAnderson wrote: | Some of the most fun I've ever had coding has been creating | "complex" SQL queries. The syntax is something that can only be | overcome with memorization, but it becomes second-nature fairly | quickly. | | Until I started thinking about SQL as manipulating sets it | never "clicked". Once it did, though, a whole world of | applications filled with mismatched procedural thinking mapped | on to SQL was revealed to me, and my own work became much | easier. | resu_nimda wrote: | The first time I accidentally typed "select 8 from users..." | and, instead of an error, it returned a column called "8" | with a bunch of rows with the value "8", my mind was blown. | eropple wrote: | Despite our industry's habit of treating mathy whiteboard | things as indicative of one's "programming talent", real | application of type, set, and graph theory are not tested | nearly as much as they end up being in practice, and I think | that developers' reticence to get deep into SQL is | symptomatic of the weaknesses there. | hknd wrote: | I used to write SQL at Google/FB, and I did and still do | struggle with it. | | I think it's because I usually touch SQL like once a month and | by the time I need it I already forgot everything :shrug: | crazygringo wrote: | SQL is very much like CSS to me. | | It's declarative, the primitives seem entirely non-intuitive, | it often takes a lot of fiddling to get what you want, the | behind-the-scenes execution is mostly a black box, and while | it's supposed to work the same on different implementations (of | browsers/databases), there are tons of little gotcha quirks. | | All in all, they're both entirely different skill sets from | traditional programming, and also where experience counts for a | ton more than just normal logical thinking. | taffer wrote: | When you look at SQL from a logical/set-based perspective, it | is by no means unintuitive. Basically, all you do is join all | the tables you need and then filter out everything you don't | need and maybe do an aggregation here and there. | 411111111111111 wrote: | SQL can do a lot more though. | | Triggers, functions, procedures, access control..... | astura wrote: | Plus indices, views, synonyms, constraints, and the | entire concept of data normalization... | beckingz wrote: | Turing complete! | outworlder wrote: | I had way less trouble in college with relational algebra, | compared with SQL. SQL is by no means intuitive. | Projection, which is what you do last, comes first in the | select statement. Then there are all the join types. | | Relational algebra, which is what SQL is ultimately based | on, is much more elegant. | alexbanks wrote: | And with CSS, basically all you do is tell the browser how | things look. | paulryanrogers wrote: | Conceptually yes. The devil is in the details: tables, | flexbox, grid, div-soup, inconsistent naming, etc. | crazygringo wrote: | How about the following: | | - When to use JOIN vs a subquery? | | - When is a subquery actually a correlated subquery? Will | this destroy your performance? Or is it a critical feature? | | - Should you put constraints in the JOIN or in the WHERE? | Will the distinction drastically affect performance? | | - When do you use WHERE vs HAVING? | | - Is the NULL from the join because no joined row was | found, or because the joined row had a NULL value itself? | | Etc. etc. The basic concepts are simple, but the | implementation details quickly become very complex, | particularly when you're ensuring high performance with | indices, and making sure the query uses the indices. | | And all of the questions I pose above have clear answers... | but the answers certainly aren't obvious from SQL basic | concepts. | | And also, while JOIN seems like it _ought_ to be intuitive, | in real life it seems like it 's like pointers in C -- some | people get it pretty quickly, other people struggle | forever. | polygotdomain wrote: | You need to spend time working with the language and | understanding how the set based operations used in SQL | work. Declarative languages means you can express things | in multiple ways and get the same results. This is an | incredibly important aspect of SQL | | >the basic concepts are simple, but the implementation | details quickly become very complex | | This is no different than programming. Assuming that SQL | doesn't have complexity because you can only SELECT, | INSERT, UPDATE or DELETE is going to have you banging | your head against the wall. Tackle the complexity in SQL | like you'd tackle the complexity in your programming | language of choice; read the docs, work through examples, | and read how other people solve the problem. There's ton | out there for SQL | | >When do you use WHERE vs HAVING? | | HAVINGs allow you to add a condition to an aggregate | function. So SUM(myColumn) > 5 would be something you put | in a HAVING clause. Honestly, this is pretty clear cut. | | >Should you put constraints in the JOIN or in the WHERE? | Will the distinction drastically affect performance? | | The first thing to understand is a condition in a join | versus a where might return a different result set, | specifically on anything other than an INNER join. The | impact to performance will depend on the rest of your | query, your data, and your index coverage. For simple | cases, there is likely no difference. For complex ones, | there may be an impact | | > Is the NULL from the join because no joined row was | found, or because the joined row had a NULL value itself? | | An inner join shouldn't produce a null. That's why it's | an inner join, as the data needs to exist in both places. | If you want to "test" whether a join found a row, look at | the field you were joining to and see if it's a non-null | value. Nulls won't join to Nulls unless you've change | some settings in most RDBMS. If you're looking at other | fields to determine the presence of a row from a join, | make sure you're looking at a non-nullable field. | | >When to use JOIN vs a subquery? | | A better way to phrase this would be when to just join | the table, vs writing a sub query and joining to that. | When is a question of the complexity of the query and | performance characteristics, and that can't be answered | in the abstract. The most important thing is that in a | large number of cases you can do both, and knowing how to | express things in both ways is powerful. | | >And all of the questions I pose above have clear answers | | No they don't. Any time you're wondering about how | different SQL impacts performance, there's absolutely a | huge "it depends" angle on it, because how you've | structured the tables, index coverage, and the volume of | data, can have a significant impact. This is why DBAs | still have jobs, because the database is an incredibly | complex system. You seem to be complaining that SQL | shouldn't be complex, yet are not willing to accept that | it is more complex that you've assumed it to be. It's | complex. You don't need to know everything if your just a | dev, but don't just assume it's simple. | | >while JOIN seems like it ought to be intuitive | | I'd check the diagram here -> | https://stackoverflow.com/questions/13997365/sql-joins- | as-ve... Half those joins aren't needed as you can re- | order a right join into a left join. For 95% of | development Inner joins and left joins are all you need. | The other 5% is an outer join and that's mainly needed in | report writing, not app development. | Drakar1903 wrote: | I will be very glad if you actually answer these | questions, in a separate comment. I'm driven to write | this by nothing but the desire to know. | jayd16 wrote: | Like everything else, write it the simple/elegant way | then profile it and tweak if you have to. | | Once you're at the point where you have to worry about | these things, tuning the SQL is still probably much less | complex than writing the query in your app language or | figuring out how a NOSQL db can do these joins. | izacus wrote: | In reality it rarely works this way - there's plenty of | systems which are falling apart due to "death of thousand | cuts" type issues. You run a profiler and most of the | queries are slow and there's no one obvious part to | optimize - because developers over the years ignored | basic optimizations and there are inefficiencies | everywhere. | | E.g., for a quick practice run, try optimizing Wordpress | without making it a static page via caching - how many | queries will you have to optimize and how much of a | codebase rewrite will it be to make it significantly more | performant? | jatone wrote: | in reality it does work that way. | | the problem you describe is completely different and its | more a measure of system health in its entirety of the | DB. it can be easier to fix or harder depending on the | precise cause. but that's a standard debugging skill | which isn't strictly in the DBA skillset. | taffer wrote: | > The basic concepts are simple, but the implementation | details quickly become very complex | | True, but those kinds of questions come up in every | language: Should I use an array or a dictionary? Should | people have references to projects or should projects | have references to people or both? Is money a float, an | int, a decimal or should I write my own money class? | Should I memoize the results? Is it thread-safe? | | As you can see, this could go on forever, pretty much for | any language. | UncleMeat wrote: | They do. And language designers seek to smooth the edges | and develop ways to encourage devs to write clear and | intuitive code. When a language says "well, these are | hairy questions that you should just figure out" we tend | to criticize those languages unless they have clear | reasons for that decision. It should be _obvious_ that | something is thread-compatible. "Well, if you model it | in pi-calculus it is easy" is a crappy way of handling | criticism. | | "Its easy if you think about it mathematically" is not | enough. | taffer wrote: | In which language is it obvious not to put money into a | float? In which language is it obvious if I want a | synchronizedSortedMap or an arrayList? | | There are simply things you have to learn to use a | technology. If you want to write Java, you need to know | what a variable is, what a for loop is and what | Inheritance/Interfaces are. Likewise, in the case of SQL, | it means you need to know concepts like normalization, | ACID and joins. Just poking around until the code works | won't do it. | UncleMeat wrote: | No language is perfect. But surely you'd agree that there | is a spectrum here. In C++ you have to think "should this | be a pointer or a reference" all the time. In most modern | languages, you never do. And although there are clear | methods to write C++ well, it is rightly criticized for | being overly complex and unintuitive. For example, | std::set is ordered. That's a basic mismatch with | expectations. Sure, you can just check the docs. But how | much code would be more efficient if they had just spent | a bit of time making things easier to work with? | acdha wrote: | > It's declarative, the primitives seem entirely non- | intuitive, it often takes a lot of fiddling to get what you | want, the behind-the-scenes execution is mostly a black box, | and while it's supposed to work the same on different | implementations (of browsers/databases), there are tons of | little gotcha quirks. | | My reaction to that is that it's similar in a different way: | everyone needs to use it but many developers don't take it | seriously, avoid learning how it works, and then complain | that it's unintuitive (i.e. not something they had already | known) and hard to use because they're basically just poking | randomly until they get close enough to what they want. | | It's true that not every implementation is the same but ... | where else is that not true? This is also true of operating | systems, filesystems, every library implementing a standard | format ("Why doesn't this PDF open in ..."), etc. I would | have trouble supporting the belief that databases are an | outlier in this regard. | fools wrote: | If I write a C program, especially with the appropriate | compiler warnings enabled, or some filesystem code, there | is a high chance that it will work across platforms with no | further coding necessary. Anecdotally, the same cannot be | said for CSS or SQL. | | It doesn't matter much for SQL, because I always know what | database I'm using, but for CSS, it's a massive pain. | acdha wrote: | That's a charming belief about C which is only true if | you work on the same operating system, processor, and | don't do anything complicated. If those are not true, as | I've experienced many times over the years, you will | learn otherwise about things like differences in floating | point behaviour, memory allocation and access patterns, | which filesystem and/or locking semantics, etc. | | In other words, it's about as true as it is for CSS and | SQL, where that kind of simple use is also stable. | fools wrote: | Yes, I write fairly boring applications. But the point is | that C is mostly cross-platform for those boring | applications, whereas the the "cross-platformness" of CSS | and SQL fall apart for even the simplest of tasks. | acdha wrote: | Again, I think you're viewing this through the lens of | relative experience. I have seen tons of C code which | required substantial fixes to work portably (bonus points | when this was exploitable) and I have plenty of CSS and | SQL which hasn't had to be touched in years. | izacus wrote: | This is... one of the best comparisons I've seen and sums up | the reason why I dislike SQL as well (although I know how to | use it). | | Sometimes it really feels like you're trying to give | instructions to someone via chat which gets Google translated | to chinese, japanese and russian on the way - it's this very | lossy communication channel where you need to tweak the | language "just so" to get maximum performance. I think it's | no wonder that newer DB designes opted for more direct and | tailored APIs. | jacques_chester wrote: | Databases are quite transparent about what they do. The | impression of "just so"-ness is a property of the observer, | not the system. | cat199 wrote: | > it often takes a lot of fiddling to get what you want | | and yet, for any nontrivial operations, opting for a few | simple K:V stores instead means one basically ends up | implementing an unrolled SQL engine processing loop to do the | application-specific select statements you would otherwise | need. | | there's a reason SQLite took over from dbm files ... | forgingahead wrote: | I don't think it's as elaborate as you're saying -- the | clearest road into a productive job for many years has been | "join a fast growing tech company who uses <cool_framework> to | do the magic, so the tech team is always building more business | logic". When you've been working with (eg:) ActiveRecord and | Ruby on your database data, direct SQL just doesn't come up | until you're trying to wrangle more efficiency in your queries | or do more complicated joins. | | It certainly makes sense for all developers to learn more SQL! | But it just hasn't been part of the daily work requirement for | many. | | As for the attraction of noSQL databases, I think perhaps it's | more of a new-hotness trend rather than a serious technical | decision. NoSQL has its place, but like most of this kind of | tooling, it's often applied incorrectly because of a lack of | thinking through the business logic. | Ididntdothis wrote: | SQL feels to me like bash or regular expressions. You can do | amazing things if you do it full time. But if you do it only a | few times per month or year you quickly forget all the | subtleties and it gets hard to understand even the stuff you | wrote half a year ago. I guess in the end things have become so | complex that as a dev you can't be good at everything. I often | wish there were dedicated database guys but if you have one | usually that person is sysadmin and won't help much with coding | against the database. | rurp wrote: | This echos my experience. I've written some complicated SQL | queries, but I do it rarely enough that I always have to re- | learn a lot of it. | | There have been a number of instances where I have had to | look something up seemingly for the first time, found a good | Stack Overflow answer, and then chuckled to myself because I | had already upvoted that exact answer at some point in the | past. SQL queries are a pretty common source of this, along | with uncommon git and terminal commands. | Ididntdothis wrote: | Very true about git. I would add C++ to that lis. Done full | time it's super powerful. But later or with less experience | you look at the code and think "WTF?" . | echlebek wrote: | SQL really is a programming language though. It's just not | imperative. | coliveira wrote: | In the minds of many developers, SQL is not a "real programming | language". That's why they don't see the value of spending the | time learning it. Many also believe that SQL is something that | they should better avoid, using an ORM, and let it be handled | by libraries written by "experts". All of this contributes to | developers not having a firm grasp of SQL. | mey wrote: | Yet they are unwittingly willing to learn a different DSL, | the ORM's DSL. | chaps wrote: | Here's a fun bug I had a few years ago - | | Had a postgres database which was using pgbouncer for connection | pooling. The most senior developer (24yo or so) we had on the | project was using Go to connect to the database to write some | simple reports, but each report took hours to run, and often had | to sleep for 30+ minutes. So, after a while, pgbouncer would kill | their connection, and their report would die. No other | application did this among the many that we had connect to that | DB, so it was definitely strange. | | Found out pretty early on in troubleshooting it that they had no | mechanism to keep the connection alive, which makes total sense | for why his app died. So, they put the library standard keepalive | function in a loop if the report wasn't doing anything.. but that | didn't fix it.. it made no friggin' sense. After bashing my head | against that for a while, I finally threw my hands up and asked | if they could just run a "SELECT 1" as a keepalive instead of | whatever the Go library was doing. Got a bit of pushback, but | just told him to do it and walked away. That ended up fixing the | problem. | | Turns out the Go library was trying to be clever in its | keepalives (can't remember what it was doing exactly), in that it | made some silly assumptions that there was nothing in the middle | managing connections. | | I like to think that dev learned a lot about trust in "magical" | libraries after that. | kerng wrote: | Go code often reinvents/reimplements a lot of things from | scratch, reintroducing problems that have been addressed long | ago in other systems. | | It's like this new trend, let's rewrite everything in Go to be | cool. Financially makes little to no sense. | eeZah7Ux wrote: | > The most senior developer (24yo or so) | | I see the problem there. | lanius wrote: | Age is a just a number. John Carmack created Doom in his | early 20s. | throwaway3563 wrote: | Every good rule has an exception. | osrec wrote: | I've seen older developers that call themselves senior, but | lack basic knowledge. I've seen younger developers, wise | _well_ beyond their years. Age simply isn 't a big factor in | how you judge a developer. | brabel wrote: | Just because you saw a few exception does not mean the rule | does not hold in general. Or are you saying most people | don't learn with time (a corollary of your theory that age | is not a big factor)? | osrec wrote: | I would say that I have seen no discernible pattern, so I | have learnt that it is imprudent to judge a developer by | their age. People do learn over time, but some people | learn more "cogently" than others, i.e. some get more out | of one year's worth of learning than others get in 10. | codegladiator wrote: | > most people don't learn with time | | I think this is correct. The curve flattens with age (in | my opinion). | | > you saw a few exception | | I think the seniors you met (the knowledgeable ones) | /were/ the exceptions. Most seniors I have seen have | convinced themselves that they know all there is to know. | cpufry wrote: | which library? | atian wrote: | Sounds like pgx. | | Spent forever debugging this myself. | | https://github.com/jackc/pgx/issues/494 | nuggien wrote: | Nothing about go is magical, and you probably solved the | problem blindly. | chaps wrote: | Wow, okay. You alright over there? | | Edit, because meh: I'm making no claims about go itself. No | idea what makes you think that's what I'm saying, since I'm | clearly talking about a library, and not even any stdlibs. | "Magic" is just a term useful for describing systems that | sweep much of their abstractions under the carpet in a way | that probably has gotchas. Granted, the term itself is | magical. | | In terms of fixing the problem, I knew for a fact that the | keepalives that I was seeing were nothing like what I've seen | in the past, at many companies, across Oracle, postgres, and | MySQL, all who've implemented "SELECT 1" for the sake keep | alives, by devs who've been in the field for much longer than | me. The suggestion was by no means blind, unless you consider | implementing a widely used method for this exact purpose, | "blind". Had I gone a different route in fixing the problem | within the stable, existing system, it would have likely | broken many of the other database connections by many teams. | I'll pass on that, since frankly, even ignoring the risk of | such a change, the dev should have done the investigation | themselves. | | Your post was unnecessarily aggressive and seems to come from | me having struck a nerve somehow. Genuinely hoping you're | doing alright. Peace. | nuggien wrote: | I have no affection towards go or any language. They're | just tools. You sounded elitist calling something magic and | pointing out someone's age as part of your point. And your | passive aggressiveness to my response is proof of it. I | "genuinely" hope you're doing alright too. | chaps wrote: | Heh, I wasn't being passive aggressive, I was being | serious. Take a breath, man. | | Also "magic" is _not_ a new term: | https://en.wikipedia.org/wiki/Magic_(programming) | bdcravens wrote: | I presume they took it as an attack on Go. Truth is, it's | an attack on the library developer who themselves may have | found their keep-alive solution by stumbling blindly on it. | user5994461 wrote: | was the go library fixed? | chaps wrote: | No clue! | taf2 wrote: | Interesting I've run into similar issues when we put network | load balancer from aws in front of a db. It has a fixed tcp | connection timeout so similar if the query takes a long time | it'll disconnect. We fixed this at the socket level by ensuring | our linux syscnf was set to keep connections alive at an | interval below network load balancer. Was a tricky problem to | figure out. | harikb wrote: | Go sql/database uses its own connection pool. But still that | shouldn't create any problems. I have seen the reverse where | apps that assumed temporary tables stick around from statement | to statement without an explict `txn` (which regular postgres | connections don't need) clearly failed. But I have not seen the | issue you talk about. | | My wild guess would be that the Go code never closed the | result/rows which caused either a pool exhaustion / left the | connections hanging and eventually got a timeout. Consider this | similar to how http client's need to close the response.Body or | else connections can't be reused. | randomdata wrote: | The database/sql package isn't very magical. The comment | about magical library makes me think it was some other | package. | hermitdev wrote: | At a previous employer, I was forced to use a certain 3rd | party ODBC library. Under certain circumstances, it would | just do `exit(1)`, with nothing even to stderr. Very | frustrating and annoying to debug/fix. Had I physically known | the developer responsible for that behavior, I'd probably | have faced murder charges and plead temporary insanity. | xwdv wrote: | These sort of articles are very effective, if you nail at least | one thing in your checklist the reader does not know or is | confident about, it often encourages them to continue reading | further. I know I did. | tenant wrote: | An odd thing that happened to me yesterday with the PostgreSQL | ODBC driver (psqlODBC) v9.3.400. It wouldn't let me insert a | string longer than 255 characters long into a character varying | field into a local v9.4 database on windows using a recordset | update. I didn't have a problem pasting it in via pgadmin. | Altered the field to text and the problem went away. I've a | suspicion that there is a limit on text in the tens of thousands | of characters length too though despite both those fields being | essentially the same thing and limitless. | echlebek wrote: | That sounds like a bug in the driver. I'm pretty sure varchar | on postgres is just an alias for text, which has no limit. | | edit: varchar(n) will issue an error past n bytes, varchar | without n is the same as text. character varying(n) is like | varchar(n). | | https://www.postgresql.org/docs/9.4/datatype-character.html | gdubs wrote: | There seems to be a wealth of information here. I wonder, though, | if the goal would be better served breaking this up into a | series. Or perhaps applying the 80/20 rule to the list to come up | with truly "a few" items that will have the highest leverage. | Otherwise, I think it will provoke a lot of discussion and | analysis from people who are experts in databases, but it might | remain impenetrable for the core audience: the majority of | developers who ignore this stuff (according to the author.) | paxos_failure wrote: | Very useful list! Related to this, I've recently discovered | https://databass.dev, a database internals book, which also | covers a few things that were mentioned in this list, and some | more. | Buttons840 wrote: | I learned SQL working on a "database as a product". The database | was filled with medical ontologies. It was a perfect environment | for learning. We were always looking for obscure things in that | database, and rarely changed data, just selects for days. In the | end I once wrote a query spanning about 100 lines that used | common table expressions and found it quite maintainable. | | Where can an SQL beginner find such a database to experiment | with? I was fortunate because my job provided me this database to | experiment with, but what about those who are not so fortunate? | war1025 wrote: | CTEs are great. It's such a nicer syntax than subqueries. | tonymet wrote: | The most important and overlooked characteristic is that most | RDBMS use b-tree for their table-space, meaning all the | operations (including search / lookup) are O(log n). For online | (OLTP) applications, this means you will have to shard sooner or | later (assuming your audience is growing) | mtVessel wrote: | I agree with your first statement, but could you explain how | b-trees logically lead to sharding (at some growth point)? What | storage structure doesn't lead to sharding eventually? | xupybd wrote: | I could be wrong but it seems to me that she is confusing not | storing state with Idempotent calls to the db. Everything you | store in the db is some form of state. | dmtroyer wrote: | As a developer, I am glad there are other people in the world who | understand this. | danielovichdk wrote: | I love a good old database | [deleted] | a_c wrote: | Learn about modelling. Database is more than just storing data. | Drink less koolaid of NoSQL, any NoSQL. It is trading initial | result with future development time. SQL has been battlefield | tested. No amount of "convenience" is more convenient than | learning the fundamentals. | strbean wrote: | I'm disappointed that there isn't more criticism of the SQL | _language_. The whole NoSQL buzz got me excited, then turned | out to actually mean NoRelational. | | It is wild that we are still using a language that looks and | feels like COBOL, and any criticism is met with drive-by | disapproval (downvotes and no comments) or an argument about | why relational databases are important. | | SQL is a deeply flawed language by standards that are pretty | much universal today - we on HN discuss them daily on posts | about new programming languages. For example, one of the top | level comments here: | | > In SQL it boils down to the fundamental syntactical | requirement to put the SELECT clause before the FROM clause. So | I have to build up my statement in this weird spiral pattern | where I change something deeper in before I know what I can | SELECT in the first place. | | If this were the case for a new language post on HN, the author | would get run out of town for this reason alone. And yet, the | person who wrote that comment feels the need to hedge by saying | they are "an SQL hater in remediation". | | Relational data does require a mode of thinking that many | programmers are not practiced with. Wouldn't it help to have a | language for working with it that isn't outright terrible? | nouveaux wrote: | In regards to 'select' before 'from', there could be a | similar argument to be made for declaring imports at the top | of a file. When you write a program, you may not know what | libraries you need. Variable declaration at the beginning of | a function is also a common pattern. | | The SQL language stood the test of time where as COBOL did | not. I think it says something about how well it was | designed. | | I strongly suspect that developers have such a hard time with | it is because of ORM. ORM trains you to think of the database | as objects. Another poster mentioned that treating databases | like sets made SQL click. I think that is a good way of | thinking about SQL. | | To be fair, SQL does have some archaic things in there that | make it annoying to work with. However, once it clicks, it's | fits naturally with how a relational db works. | setr wrote: | >Variable declaration at the beginning of a function is | also a common pattern | | What language requires you to declare variables at the | beginning? C even stopped doing that. People choose to do | that, but ime that's after the writing phase, to make it | more readable | | >When you write a program, you may not know what libraries | you need | | Sort of -- the difference is that when I'm writing code in | my IDE, I know all database objects available to me. It's | in the schema. A library I import once (to the project | itself), and the IDE can always assist from then-on. But | SQL is designed such that despite the library (schema) | being imported, the IDE can't actually assist, unless I | write the code out of order (eg start with SELECT * FROM | table and then start working) | | >However, once it clicks, it's fits naturally with how a | relational db works | | I think you've misunderstood the complaint -- the | relational language is a very strong concept, which has | stood the test of time, and is difficult to complain about | -- it does its job well, and fits naturally with how a | relational db works. | | SQL the language however is: | | A hodgepodge of random keywords tossed about in a totally | inconsistent fashion (eg postgres overlay: | OVERLAY('Txxxxas' PLACING 'hom' FROM 2 FOR 4) -- postgres | at least generally offers a consistent comma-separated | syntax for every functions | | has weird and technically unnecessary limitations (like | SELECT being evaluated _after_ the WHERE clause, so you can | 't use the aliases defined in select clause) | | putting SELECT _before_ the FROM (disabling IDE auto- | complete support) | | the stuffing of a 3-value logic system into a 2-value logic | interface, so boolean operations break silently and produce | nonsense in the face of a database with NULL values, | because no mapping of NOT (TRUE, FALSE, NULL) AND NULL | makes sense. [0] | | It generally composes very poorly, leading to redundant, | long and convoluted queries | | The MODEL is fantastic -- the language is not. | | [0] https://sigmodrecord.org/publications/sigmodRecord/1709 | /pdfs... | a_c wrote: | SQL stood the test of time. SQL is widely adopted. Once you | get the hang of it, it can be applied on a wide range of | RDBMS. | | I would argue that countless productivity has been lost to | learning yet another query language for yet another NoSQL db. | Mongo has its own query language. Cassandra has its own. | Neo4j has its own. What not. Guess what, few engineers need | these to solve their actual problem. Be it building an | application, a library, a SaaS, a tool with wide database | support. The yet another query language is an imaginary | solution to an imaginary problem (most of the time). Your | problem isn't big enough to use whatever NoSQL of choice. | | When developers starting out, they get attracted to | technologies with the most marketing money. They beat around | the bush of learning things that might not matter anymore 5 | years down the road. | | Is english the best language? No, not even close. Should one | learn it? Probably. | | Is esperanto a better language? I don't know, maybe. People | invent it for a reason (a problem to solve) after all. Should | one learn it? Probably no. | strbean wrote: | > SQL stood the test of time. SQL is widely adopted. | | I'm highly dubious of "it is widely used so there" | arguments. I think in the case of SQL, the fact that ORMs | and SQL generators are ubiquitous are evidence that a huge | proportion of engineers would rather not write it, and part | of why it is still widely used (most of the time, most | people can avoid actually touching it). | | > I would argue that countless productivity has been lost | to learning yet another query language for yet another | NoSQL db. Mongo has its own query language. Cassandra has | its own. Neo4j has its own. What not. Guess what, few | engineers need these to solve their actual problem. | | This is exactly one of the points I addressed in my | comment... none of those are even relational databases. And | this is where the discussion ends up every time someone | says SQL sucks. | | > Is english the best language? No, not even close. Should | one learn it? Probably. | | The vast majority of the time, when you are using English, | you have no control over the receiving end. The vast | majority of time, when you are using SQL, you / your org | grabbed an SQL RDBMS to use. | | Furthermore, I don't see why a modern, reasonable query | language couldn't transpile to SQL easily when necessary, | making all path-dependency / adoption arguments void. | tingol wrote: | Any tips where to look for db modelling learning resources? | a_c wrote: | I actually don't know. There was this previous discussion | https://news.ycombinator.com/item?id=22324691. Might be some | gems there. | | For me, it is the 3rd normal form and the adage of "Normalize | till it hurts, denormalize till it works". | adrianmonk wrote: | > _Learn about modelling._ | | So true. I worked at a database-centric place for a while, so I | read up on database stuff to fit in. | | Learning about first, second, and third normal forms was very | enlightening. If you learn these thoroughly enough that they | are second nature, it really helps you see modeling mistakes | that you might be making. It just becomes a lot easier to think | clearly about how to lay out data. Just like there are code | smells, if you learn normalization, you will immediately detect | data structure smells. | | Third normal form is not always the one and only right answer | for modeling data, but it's a pretty good starting point from | which you can refine and adjust if needed. Just because you | understand normalization doesn't mean you have to do it all the | time, of course. But if your data isn't normalized, it should | be for a specific reason (performance), not by accident or | because you don't know how to keep it organized. It is one of | those things where knowing the rules gives you the freedom to | know when it's right to break them. | | It's even useful when thinking about organizing data in RAM. | The situation is a little bit different because in RAM you | follow pointers instead of doing joins. But there are still | cases where it helps. For example, you might have one big | struct, and you realize it should be two different structs | because you are filling the same (redundant) data into multiple | instances. And you know how to fix it. | | Another very useful concept from modeling is choosing keys. One | of the lessons of databases is that if you don't use synthetic | keys, you're going to have a bad time. When you choose real- | world data (like first name and last name as a key for a table | of people), you're going to have problems like non-uniqueness. | | Relational databases aren't the last word on how to organize | and store data, but there are just a lot of good ideas that | pick up if you learn about them. | jasonhansel wrote: | I would add: when using SQL, the default transaction isolation | level is probably not what you'd expect. ___________________________________________________________________ (page generated 2020-04-22 23:00 UTC)