[HN Gopher] Foundations of Databases (1995)
___________________________________________________________________
Foundations of Databases (1995)

Author : harperlee
Score : 167 points
Date : 2021-04-14 16:22 UTC (6 hours ago)

(HTM) web link (webdam.inria.fr)
(TXT) w3m dump (webdam.inria.fr)

| ExcavateGrandMa wrote:
| Behavior DB, let's call it that way :D
| macando wrote:
| It's incredible how little attention is paid to data modeling and querying in the education of people entering the software engineering workforce.
|
| Getting your database model right, on the logical and physical level, will make developing and deploying any data-driven app simpler and easier. Getting it wrong? No modern programming language or architectural pattern will save you from the worst kinds of bugs, workarounds and bad performance.
|
| Do yourself a favor and read this or any other legendary DB book.
| adamnemecek wrote:
| > this or any other legendary DB book.
|
| What are the other legendary DB books?
| macando wrote:
| https://www.amazon.com/Modeling-Essentials-Third-Graeme-Sims...
|
| My favorite. Not an easy read, it took me weeks to complete it.
|
| https://www.amazon.com/Date-Database-Writings-2000-2006-Chri...
|
| Relatively unknown but great book if you want to read the thoughts of a DB heavyweight Christopher Date who collaborated with Edgar Codd.
|
| https://www.amazon.com/NoSQL-Distilled-Emerging-Polyglot-Per...
|
| The best book to start exploring the land outside the relational world. An easy read, can be completed in two to three days.
| vishnugupta wrote:
| Copy pasting my comment[1] from an earlier thread.
|
| A couple of years ago I spent quite some time trying to evaluate the tech stack (and general engineering culture) of merger/acquisition targets of my employer. It was quite a fun exercise, all said and done.
| I encountered all sorts, from a small-team startup who had their tech more or less sorted out to a largish organisation who relied on IBM's ESB, which exactly one person on their team understood!!
|
| I discovered this exact method during the third tech evaluation exercise. When the team began explaining various modules top-down and user-flows etc., I politely interrupted them and asked for DB schema. It was just on a whim because I was bored of the typical one-way session interrupted by me asking minor questions. Once I had the hang of their schema, the rest of the session was literally me telling them what their control and user flows were and them validating it.
|
| Since then it's become my magic wand to understand a new company or team. Just go directly to the schema and work backwards.
|
| Conversely, I've begun paying more attention to data modelling. Because once a data model is fixed it's very hard to change, and once enough data accumulates the inertia just increases, and instead of changing the data model (for fear of data migration etc.) the tendency is to beat the use cases to fit the data model. It's not your usual fail-fast-and-iterate thing.
|
| [1] https://news.ycombinator.com/item?id=24137997
| macando wrote:
| > I politely interrupted them and asked for DB schema.
|
| In order of importance:
|
| 1. DB schema
|
| 2. List of dependencies
|
| 3. List of 3rd party integrations
|
| Depending on the domain, 2. and 3. can be switched.
| macintux wrote:
| Many years ago, my employer was transitioning ERP systems from...something CA-owned whose name escapes me and I'd probably have pretty awful memories dredged up if I actually went looking for it.
|
| Anyway, we were lucky enough to have a very talented intern who was assigned the task of understanding the database schema behind the CA product in order to migrate our data to the new system.
|
| It was...horrific. Absolute nightmare of spaghetti.
| I'd like to say I've seen worse, but I've never seen anything even in the same ballpark.
|
| I think we finally gave up after weeks of digging into it and just started over with a mostly clean slate.
| msluyter wrote:
| To wit: I made it through a master's in CS without a database class.
|
| This reminds me of the famous Rob Pike quote:
|
| "Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming."
|
| I've often found that if I'm coding something and the code starts looking increasingly gnarly, that rethinking the data structures / data model will clean up the code.
| gautamdivgi wrote:
| We had two books when I was doing my bachelor's in CS many moons ago (late 90's). One was by Silberschatz & Galvin and the other by Ullman. The first one was used for relational algebra, various normal forms and introducing table design, keys, constraints, etc. The second was used to teach the theory of how a DB is implemented internally - B+ trees, deadlock handling, etc. That was a hard course and it was mandatory.
|
| I can see not electing to have databases as a course for a master's though. A master's is meant to let you specialize and be more choice driven.
| andrewl wrote:
| On that topic, Fred Brooks, author of _The Mythical Man-Month,_ said "Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious."
| slver wrote:
| "Data structures, not algorithms, are central to programming."
|
| Can we unify this to simply "data structures are essential to algorithms"? It's really weird to put these in opposition, when an algorithm with no data and data without an algorithm make no sense.
|
| Data structures are encoded with use cases in mind, those use cases at least at the very low level are their algorithms.
| Twisol wrote:
| I think programming is more than just algorithms. System design and architecture have a real impact on the longevity and maintainability of your system, and I'd argue that data models (not _structures_ per se, the distinction being logical vs. physical) are foundational to a good design. In contrast, algorithms play a much more focused role.
|
| Put differently:
|
| > Data structures are encoded with use cases in mind, those use cases at least at the very low level are their algorithms.
|
| As you've lampshaded, it's the use cases that are most fundamental. The data model should be designed to serve those use cases. The data structures and algorithms reflect the physical reality of the logical data model.
| macando wrote:
| > "Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming."
|
| I'm saving this quote.
|
| > I've often found that if I'm coding something and the code starts looking increasingly gnarly that rethinking the data structures / data model will clean up the code.
|
| I've faced this over and over again. Writing algorithms is hard when the underlying data is not optimal. It simply invites writing complicated code and workarounds.
| AlphaSite wrote:
| Although it says data structures, isn't it more important to talk about the structure of the data than the choice of data structure?
| bshipp wrote:
| Every one of my data scraping projects is littered with files titled "database.db.bak1", "database.db.bak2".... for this exact reason. "Oh I scraped 1000 pages and realized I missed an entire nested data structure....better blow away the DB and try again."
| achn wrote:
| This is very true, but also a hugely unfortunate reality.
| It amazes me that we have come this far with "object" graph-oriented data models backed with relational DBs. They are very seldom conducive to one another.
| rgbimbochamp wrote:
| Plugging in Andy Pavlo's Database lectures @ CMU which are completely free on YouTube. Great guy and great lectures.
| alistairw wrote:
| Agreed. I finished watching them at the start of this year and it's significantly helped me in using and making choices about databases. I've also since started implementing my own database for learning off the back of them. Couldn't be more grateful for those courses being public.
|
| link if anyone is interested: https://youtu.be/oeYBdghaIjc
| golergka wrote:
| And hands down best music choice of all the CS lectures I've ever seen.
| tinmandespot wrote:
| Seconded. I have been watching them over the past 2 weeks - he's a brilliant guy and a great teacher.
| jenkstom wrote:
| We don't need "more theory". The vast majority of computer science theory goes unapplied in solving real-world problems. While I'm sure this is great for postgrad computer science students, it's not very useful for the everyday programmers out there.
| dudeman13 wrote:
| >We don't need "more theory". The vast majority of computer science theory goes unapplied in solving real-world problems
|
| The sheer amount of rewriting that is being done every day around the world because people didn't bother understanding what they were doing (and how they should have thought more before trying to make it work) disagrees with this.
| ggambetta wrote:
| ...and this is how we ended with NoSQL.
| marcodiego wrote:
| >We don't need "more theory".
|
| >it's not very useful for the everyday programmers out there
|
| So, programmers' talent is not being used to its fullest. The ones who can't understand basic theory can be easily condemned to always be mere users of tools while never improving them.
| They run a greater risk of becoming obsolete or devalued.
| anonymousDan wrote:
| Anyone recommend a good book specifically on distributed databases (not more general distributed systems stuff like e.g. Kleppmann's DDIA)?
| jkaptur wrote:
| In case someone isn't aware, Designing Data-Intensive Applications is a very good introduction to distributed databases, even if it doesn't specialize in them.
| anonymousDan wrote:
| Yeah I don't disagree, I've just read it already :)
| artificial wrote:
| I'll second this. It's an excellent intro that's very approachable.
| convolvatron wrote:
| I got a lot of value out of Distributed Databases: Principles and Systems, Stefano Ceri, Giuseppe Pelagatti, McGraw-Hill, 1985
|
| it adopts a very specific approach, and is obviously missing some newer techniques, but it certainly leaves you with a feeling of 'yeah, i can build this'
| e12e wrote:
| Not a book... But foundationdb documentation (and source) might be worth looking at? https://apple.github.io/foundationdb/
| pcthrowaway wrote:
| Not a book, but the podcast series "Scaling Postgres" looks really good.
| throwaway823882 wrote:
| Not a book, but I would add to the foreword: "Try not to use them."
| victor106 wrote:
| Genuinely curious: What are some challenges with using distributed databases?
| biggestdummy wrote:
| Tons of complexity behind this, but the simple problem is source of truth. When you store data in multiple places, if they become unsynchronized, which is the true answer??? Within a monolithic system, you don't have differing clocks, network partitions that can be very weird, etc. And, a related problem, which is race conditions. What if you are reading one copy of the data while the other replica is being updated? Most complexity, imo, arises from this fundamental problem and the various techniques to deal with it.
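A minimal sketch of the source-of-truth problem described in the comment above. Everything here is an illustrative assumption rather than how any particular database works: the `Replica` class, the 5-second clock skew, and the last-write-wins merge rule are hypothetical names and choices made only to show a stale read and a lost update.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy key-value replica storing (timestamp, value) per key (hypothetical)."""
    name: str
    clock_skew: float = 0.0  # assumed: seconds this node's clock runs ahead
    data: dict = field(default_factory=dict)

    def local_write(self, key, value, true_time):
        # The node stamps the write with its own, possibly skewed, clock.
        stamp = true_time + self.clock_skew
        self.data[key] = (stamp, value)
        return stamp, value

    def apply_remote(self, key, stamp, value):
        # Last-write-wins: keep whichever version has the larger timestamp.
        current = self.data.get(key)
        if current is None or stamp > current[0]:
            self.data[key] = (stamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


a = Replica("a", clock_skew=5.0)  # this node's clock runs 5 s fast
b = Replica("b")

# Real time 0: a client writes through replica a; replication to b is in flight.
w1 = a.local_write("balance", 100, true_time=0.0)
stale = b.read("balance")  # b has not seen the write yet

# Real time 1: another client writes through replica b, later in real time.
w2 = b.local_write("balance", 80, true_time=1.0)

# Replication finally delivers each write to the other node.
b.apply_remote("balance", *w1)
a.apply_remote("balance", *w2)

final_a = a.read("balance")
final_b = b.read("balance")
print(stale, final_a, final_b)
```

Under these assumptions the read from `b` returns nothing because replication has not arrived, and both replicas later agree on 100 even though the write of 80 happened afterwards in real time: the fast clock on `a` makes its older write win.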
| bshipp wrote:
| This is a valid comment that I'd expand to say "...without exhausting all the tools available."
|
| For example, if you're facing lengthy search queries and are looking at partitioning the database to speed them up, many databases include helpful internal tools that should be attempted first. Proper indexing is a textbook all on its own, and the appropriate design of full-text search dictionaries and queries is another.
|
| The one that shocked me a year or two ago was implementing full-text search in a gigantic sqlite DB. I was preparing to migrate it to postgresql or elasticsearch because my traditional "ilike" queries, even indexed, were ridiculously slow, and I was not aware FTS was an included library.
|
| Although FTS pretty much doubled the size of the DB, a three minute query dropped to a few seconds. Since read queries are non-blocking in sqlite, this allowed me to continue using the DB without needing a bigger, more complex solution.
| decebalus1 wrote:
| Database Internals: A Deep Dive into How Distributed Data Systems Work by Alex Petrov
| ChrisArchitect wrote:
| (1995)
| slver wrote:
| Year programmers have ignored relational theory since:
| senthil_rajasek wrote:
| Previous hn post with comments
|
| https://news.ycombinator.com/item?id=19726520
| slyrus wrote:
| Was lucky enough to take Serge's class when he was visiting UC Berkeley back in the 90's. And yet I'm still writing SQL :( Curious to know what the state of the art in solving the then-thornier theoretical problems (negation mostly) is.
| tingletech wrote:
| I think this was my textbook when I took a relational database class at UCSD Extension back in the 90s.
___________________________________________________________________
(page generated 2021-04-14 23:00 UTC)