[HN Gopher] Ask HN: What could a modern database do that Postgre...
       ___________________________________________________________________
        
       Ask HN: What could a modern database do that PostgreSQL and MySQL
       can't
        
       If there were a general-purpose OLTP SQL database that was
       developed today, what features or design decisions would it pick
       that PostgreSQL and MySQL cannot adapt for historic reasons?  Or
       put another way, what are some cutting edge OLTP database
       techniques/features/architectures that PostgreSQL and MySQL would
       have a hard time supporting?
        
       Author : eatonphil
       Score  : 121 points
       Date   : 2021-09-05 16:11 UTC (6 hours ago)
        
       | aserafini wrote:
       | One thing I find really interesting from Redis that I would love
       | in a relational database is the concept of 'blocking' queries,
       | which block until they get a result.
       | 
       | For example: https://redis.io/commands/BLPOP
       | 
        | If I could do something like:
        | 
        |         Select id, msg
        |         From job
        |         Where id > 1224
       | 
       | And have that query block until there actually was a job with an
       | id > 1224, it would open up some interesting use cases.
        
       | truenindb wrote:
        | Basically they aren't going to impress your non-technical boss
        | who is your boss for some inexplicable reason. Postgres has been
        | able to function as a key-value store (which is what most NoSQL
        | "engines" basically are: write-once, append-only key-value
        | stores) for quite a long time now. Writing code to query your
        | key-value store is another story.
       | 
       | The rest of my admittedly very snarky, but arguably highly
       | accurate commentary on various veins of discussion on this topic
       | will be left in a time capsule below for future generations of
       | people who actually have a legitimate interest in computer
       | programming to discover after the Taliban are driven out of the
       | computer industry.
       | 
       | Theoretically you can replace hiring smart programmers with
       | magical artificial intelligence that reads your mind and then
       | uses the computer to bend the world to your iron will, but if
       | someone else could produce such a magical artificially
       | intelligent database why would they share it with you? Perhaps
       | they are an omnipotent omnipresent omnibenevolent entity that
       | loves you just a little bit more because of your good looks and
       | high intelligence?
       | 
       | Is this different from wanting to do a persistent query?
       | Basically, since you have a programmable computer that you can
       | program to talk to the database program, you write a computer
       | program that periodically queries the database and then takes the
       | action when the condition is met.
       | 
       | SQL is an implementation of a mathematical descriptive language
       | for relationships. The whole point is that temporal logic like
       | "wait until this happens, then do this" can be kept somewhat
       | separate from logic describing the data you are tracking.
       | 
       | You have SQL that describes what it is you want to store, and
       | particular questions you want to ask about what it is you want to
       | store, and then the job of the database program is to figure out
       | how to store the data safely and efficiently and answer your
       | questions quickly. How you write the SQL that describes the way
       | you want to store the data depends some on what kind of questions
       | you want to ask, and this is what an actually skilled "Database
       | Application Programmer" can figure out for you.
       | 
        | Some proprietary (and probably also Postgres) databases do
        | provide support for the kind of thing you are asking to do here
        | in the form of what are called "Stored Procedures". Your average
       | corporation accumulates an utter shitload of these stored
       | procedures that various non-technical technical question askers
       | in different departments don't tell each other about and they are
       | often doing the same thing in different ways at different times.
       | Then later they crash the database and break the application
       | itself because there is insufficient technical oversight and
       | communication at the actual decision making levels of the
       | corporate bureaucracy.
       | 
       | Long story short, do all of this stuff in persistent queries done
       | outside of the database and tracked in a shared medium like a
       | wiki page, or even better a physical notebook routinely reviewed
       | as part of management actually doing some managing.
       | https://en.wikipedia.org/wiki/Gantt_chart
       | 
       | This is an indication that you are not using a Model View
       | Controller approach to building your database client. There are
       | some python ORMs that were doing this automatically correctly a
        | decade ago, but there are presumably still several corporations
       | with hundreds of millions of dollars using Enterprise Java Beans
       | in 2021, also maybe some people with nuclear arsenals as well so
       | you shouldn't consider yourself too behind the times.
        
       | svcrunch wrote:
       | Support for vector indexes and vector similarity queries (dense,
       | and sparse-dense).
       | 
       | I believe PostgreSQL and MySQL will support this within a few
       | years.
        
         | gk1 wrote:
         | In the meantime there's Pinecone (https://www.pinecone.io) to
         | put alongside your DB.
         | 
         | Also there is at least one plugin for Postgres that I know of,
         | PGVector, but no idea how it performs.
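          | 
          | Usage looks roughly like this (a sketch; the table is made up
          | and I haven't benchmarked it):
          | 
          |         CREATE EXTENSION vector;
          |         CREATE TABLE items (id bigserial PRIMARY KEY,
          |                             embedding vector(3));
          |         -- nearest neighbours by L2 distance
          |         SELECT id FROM items
          |         ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
          |         LIMIT 5;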
        
       | jen20 wrote:
       | I'm somewhat surprised that "zero downtime failover" does not
       | feature more prominently in responses to this - that would be the
       | biggest thing I'd be looking for in a database for OLTP.
        
       | ww520 wrote:
        | What is modern? If we think about how databases run better on
        | modern hardware, the trend goes toward plenty of RAM and fast
        | disks (SSD/memory-disk) that don't have the disk seek problem of
        | the old days. Modern databases being developed today aim for
        | those environments.
        
       | tdhz77 wrote:
        | Does your data model change often? PostgreSQL and MySQL will
        | make it very difficult to make fundamental changes. If you need
        | to make changes to your data model, the structure of relational
        | databases will make it difficult for you to move fast.
       | 
       | Linear relationships must be defined. Your database doesn't do
       | much for you. You must define every relationship between tables.
        
         | hpeinar wrote:
         | As some of the commenters here have already voiced, I also
         | struggle a bit to understand the issue with using relational
          | databases if the data is actually relational. Which, for
          | probably like 95% of projects, it is.
         | 
         | The argument that "it will slow us down" seems so vague, I
         | can't really see how defining the schemas and relationships
          | slows you down. You need to know what data you are dealing with
         | anyway even when prototyping, how else would you create views
         | for the users? Tables? Are profile fields for users also open
         | ended? I can add any number of fields with random values?
         | 
         | And "Your database doesn't do much for you?" So what exactly
         | does NoSQL database do for me that PostgreSQL (or MySQL) won't?
         | I'm really curious, I'm not attacking anyone, I just want to
         | know, maybe I've been using the wrong database my whole life.
         | 
         | But I think that database as anything else in your project
         | should be selected by best fit not by mere "it's a cool
         | buzzword, lets use that". Most of the projects out there have
         | relational data and should be using relational database because
         | it fits the core data model.
         | 
          | I don't have anything against NoSQL databases, they obviously
          | have their usage and place, but I do have something against
          | choosing them because they are "cool" and "fast to prototype
          | on" if the project itself has 100% relational data.
         | 
         | Also the tools fastest to prototype with are the ones you know
         | best, it doesn't mean that tool is best fit for the project.
        
           | throwawayboise wrote:
           | > You need to know what data you are dealing with anyway
           | 
           | Exactly. At some point, somewhere, you need to understand
           | your data and how different attributes or objects relate to
           | one another. You can do this in a variety of ways, but if you
           | don't understand this you don't understand what you are
           | building.
        
         | smitty1e wrote:
          | PostgreSQL provides JSON functions if your data have more of a
         | document feel to them.
         | 
         | https://www.postgresql.org/docs/9.5/functions-json.html
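          | 
          | For example, with a jsonb column you get containment queries
          | and GIN indexing (a minimal sketch; table and keys made up):
          | 
          |         CREATE TABLE events (id serial, doc jsonb);
          |         CREATE INDEX ON events USING gin (doc);
          |         SELECT doc->>'type' FROM events
          |         WHERE doc @> '{"status": "active"}';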
        
           | axegon_ wrote:
           | Postgres was first to the party but MySQL supports that too
           | since 5.6 or 5.7, I can't remember.
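            | 
            | The JSON type itself landed in MySQL 5.7. The equivalent
            | there looks something like this (sketch, names made up):
            | 
            |         CREATE TABLE events (id INT PRIMARY KEY, doc JSON);
            |         SELECT id FROM events
            |         WHERE doc->>'$.status' = 'active';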
        
         | traceroute66 wrote:
          | > Does your data model change often? PostgreSQL and MySQL will
          | make it very difficult to make fundamental changes.
         | 
         | I hear this argument a lot and I struggle with it.
         | 
         | It is an argument that, at least for me, falls into the same
         | category as "you should design your schema in a portable
         | manner".
         | 
         | The "portable schema" argument is easy to disprove because if
         | you don't design your schema in accordance with the features
         | available in your database then you are setting yourself up for
         | a big performance fail (e.g. for those of you familiar with the
         | work of Tom Kyte of Oracle, a party trick of his was detailed
         | evidence-based demonstrations of why you should use Oracle
         | features in your schema design vs generic schemas ... but the
         | same applies in the open-source world, e.g. Postgres[1]).
         | 
         | The problem I have with the "use noSQL because your data model
         | changes often" is that you then become heavily reliant on your
         | upstream devs who are coding the app layer to behave themselves
         | because you are no longer in a position to enforce or validate
          | their actions at database layer. It also potentially puts you
          | at risk of losing the database's position as "source of truth"
          | - because if upstream can change your data model at a whim, it
          | means you could easily lose visibility of data elements
          | overnight.
         | 
         | To me, relational databases will always have a place in the
         | world and I don't think people should blindly follow
          | alternative models just because it's the bandwagon of the day.
         | ACID compliance is, AFAIK, not available anywhere else other
         | than an RDBMS setting.
         | 
         | NoSQL, graph databases etc. also have a place in the world of
         | course, but only if you understand the limitations and
         | tradeoffs you are accepting. It is quite possible that many
         | people would be better off with RDBMS.
         | 
         | [1] https://wiki.postgresql.org/wiki/Don%27t_Do_This
        
         | fny wrote:
         | This is no longer a good reason since Postgres and MySQL
         | support JSON with indexing.
         | 
         | Also, you don't actually need to define relationships. Rails
         | for eons never did this at the database level.
        
       | gkoberger wrote:
       | Oh I've been thinking a lot about this! It might not be the
       | answer you're looking for since they're all very UX-related and
       | not particularly database-specific, however I have a few ideas.
       | 
       | Sometimes I feel like databases were created by people who never
       | built a website before. Most websites are pretty similar, and
       | databases historically have never felt (to me at least) "modern".
       | I always feel like I'm fighting against them, and making
       | usability concessions for the sake of performance.
       | 
       | First, the ability to subscribe to external data sets. I feel
       | like I spend so much time writing crappy syncing code with
       | external APIs (like Clearbit, GitHub, etc), and it would be so
       | much nicer if I could just "connect" with them and know it will
       | be fairly up to date.
       | 
        | I also think there are so many things everyone finds themselves
        | redoing for no reason. For example, almost every site on the
        | internet has a user database with sessions, and each user has an
        | email (that must be valid + verified), a password (that's hashed
        | + salted) and 2FA (which everyone is basically implementing
        | themselves). It'd be so nice if the database just "knew" it was
        | a user, and you were tweaking the presets rather than building
        | it from scratch.
       | 
       | Every single company has similar workflows they each solve
       | themselves (often in insecure ways): they all have a production
       | database, staging environments, migrations, direct access for
       | customer support to fix things, local db access/clones for
       | development, etc. I'd LOVE a database that was created with all
       | these use-cases in mind... such as a way to connect to a DB
       | locally but scrub sensitive data, take care of migrations
       | seamlessly, etc.
       | 
       | This might be a bit too "in a magical world"-y, but I'd love to
       | not have to think about tradeoffs. Kind of like an automatic car,
       | I'd love my database to be able to shift based on the types of
       | data and amount of read/writes. At my company, we have 3-4
       | different databases for different reasons (Mongo, REDIS,
       | ElasticSearch, ClickHouse), and it gets really difficult to keep
       | all the data synced and connect them behind the scenes. I'd love
       | to just never have to think about the low-level data store ever
       | again, and have the DB do all the work without us having to
       | worry.
       | 
       | There's a number of primitives that I think are used a lot, and
       | it'd be amazing if they were built in. For example, time. It'd be
       | great to easily get the difference between two times, or total
       | the times of a bunch of rows. Airtable has a time primitive, and
       | it's amazing how much friendlier it is to use.
       | 
       | Overall, I'd also love it to just feel a lot more like Airtable,
       | including an Airtable-like interface for working with it (right
       | down to the ability to create custom views and create on-the-fly
       | forms people can submit data to). I honestly use Airtable for
       | most of my DB needs these days (for one-off small projects), and
       | it's such a delight to use.
       | 
       | Maybe I'm underestimating the importance but... I feel like
       | databases are pretty performant these days. I hope that we can
       | start seeing dramatic UX improvements, since we don't have to
       | optimize for performance the same way we have in the past.
        
         | laurent92 wrote:
          | What I would like best in a database would be for it to live
          | in a file, in the data directory of the website. Snapshot the
          | filesystem, snapshot the db. SSH the filesystem over and the
          | db exists elsewhere. I wouldn't have to deal with the db at
          | all, since it would be part of the website files. Most
          | Wordpress sites would fit in SQLite easily. Using Wordpress
          | would be like opening a Word document. Postgresql can't do
          | that, since the binaries are CPU-specific and under BSD
          | license.
        
           | magicalhippo wrote:
           | The SQLite WASM version[1] showcased here a few times does
           | just that.
           | 
           | [1]: http://static.wiki/
        
       | nicoburns wrote:
       | Subscriptions. Databases like Firebase will automatically push
       | changes to query results down to clients. You can add this to
       | Postgres with tools like Hasura, but it's poll based and not very
       | efficient. It's a super-useful feature for keeping UIs in sync
       | with database state.
        
         | XCSme wrote:
         | Can't you add something like this to MySQL using triggers or
         | some similar system?
        
           | spiffytech wrote:
           | Hasura evaluated Postgres' listen/notify feature to power
           | their subscriptions, but chose polling instead:
           | 
           | https://github.com/hasura/graphql-
           | engine/blob/master/archite...
           | 
           | > Listen/Notify: Requires instrumenting all tables with
           | triggers, events consumed by consumer (the web-server) might
           | be dropped in case of the consumer restarting or a network
           | disruption.
           | 
           | It was substantially non-trivial for them to implement
           | subscriptions that were both robust and efficient using this
           | approach.
           | 
           | Many applications need the extra step on top of listen/notify
           | of relaying subscriptions to an untrusted client (e.g., a
           | browser). I'd like to see more DBs bake that feature in, like
           | RethinkDB did.
        
             | nicoburns wrote:
              | The Supabase (https://github.com/supabase/realtime)
             | approach is really interesting. It listens to the logical
             | replication stream. Makes a lot of sense to me.
             | Unfortunately our postgres instances hosted on heroku don't
             | expose this, so I've been unable to try it out.
        
         | pornel wrote:
         | Postgres has LISTEN and NOTIFY. People build DIY pub-sub with
         | this.
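          | 
          | The DIY version is usually a trigger that publishes row
          | changes, something like this (sketch; channel/table names
          | made up, and note NOTIFY payloads are capped at about 8 kB):
          | 
          |         CREATE FUNCTION notify_change() RETURNS trigger AS $$
          |         BEGIN
          |           PERFORM pg_notify('changes',
          |                             row_to_json(NEW)::text);
          |           RETURN NEW;
          |         END $$ LANGUAGE plpgsql;
          | 
          |         CREATE TRIGGER mytable_notify
          |         AFTER INSERT OR UPDATE ON mytable
          |         FOR EACH ROW EXECUTE PROCEDURE notify_change();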
        
         | [deleted]
        
       | thom wrote:
       | I realise I'm straying a bit from core OLTP stuff but also I
       | think removing the historical need for a separate OLAP database
       | is something modern systems should address. Off the top of my
       | head:
       | 
       | 1) Incremental materialized view maintenance, a la Materialize
       | (bonus points for supporting even gnarly bits of SQL like window
       | functions).
       | 
       | 2) Really ergonomic and scalable pub/sub of some sort, a la
       | RethinkDB.
       | 
       | 3) Fine tuned control over query plans if I want it.
       | 
       | 4) Probably very deep Apache Arrow integration.
        
         | taffer wrote:
         | Another interesting approach is HyPer[1]. HyPer uses many new
         | techniques to combine OLTP and OLAP in one database. For
         | example, to achieve good OLAP performance, a columnar storage
         | layout is used, but the columns are chunked for locality to
         | achieve good OLTP performance at the same time. OLTP queries
         | are executed in memory, but cold data that is not used for OLTP
         | is automatically compressed and moved to secondary storage for
         | OLAP.
         | 
         | [1] https://hyper-db.de/
        
         | JohnBooty wrote:
          | > 3) Fine tuned control over query plans if I want it.
         | 
         | I feel like when most people say this, what they really want is
         | a better query planner.
         | 
         | The optimum query plan depends on a lot of dynamically changing
         | factors: system load, free RAM, and of course the data in the
         | tables themselves. Any hints we give the query planner are
         | going to help at certain times and be somewhere between
         | "suboptimial" and "disasterous" at most other times.
         | 
         | It's certainly true that the query planners in major RDBMS
         | could be better or, at least, give insight into why they made
         | the choices they did.
         | 
         | It would be cool if EXPLAIN ANALYZE also perhaps showed other
         | query plans the planner considered but discarded, and why.
         | Imagine if the planner tried multiple plans and adjusted itself
         | accordingly.
         | 
         | For Postgres in particular, I think the user-specified costs in
         | pg.conf like `seq_page_cost` and `random_page_cost` feel like
         | one obvious area for improvement: why am I guessing at these
         | values? Postgres should be determining and adjusting these
         | costs on the fly. But, I could be wrong.
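          | 
          | (For what it's worth, the usual manual tweak is lowering
          | random_page_cost from its default of 4.0, which assumes
          | spinning disks, to something near seq_page_cost on SSDs:)
          | 
          |         -- in postgresql.conf, or per session:
          |         SET random_page_cost = 1.1;
          |         SET seq_page_cost = 1.0;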
        
           | thom wrote:
           | It's true a better query planner is what I want, i.e. one
           | that always does what I want it to do. That not being the
           | case, I will settle for being in control, footguns and all.
           | 
           | I agree the config is hard to wrangle though, you effectively
           | find yourself doing grid search with a bunch of common
           | workloads, it does feel like something the machine should be
           | doing for me (the most common config variable we end up
           | tweaking is "how much money we give Amazon").
        
       | grobbie wrote:
       | CockroachDB is getting a lot of interest these days.
       | 
       | It has broad PGSQL language (and also wire I think) compatibility
       | yet has a clustered peer architecture well suited to running in a
       | dynamic environment like cloud or k8s. Nodes can join dynamically
       | and it can survive them leaving dynamically as long as there's a
        | quorum. Data is distributed across the nodes without the
        | administrator needing to make any shard-rebalance-type
        | interventions.
       | 
       | PGSQL is designed for deployment as a single server with replica
       | servers for HA. It's not really designed for horizontal
       | scalability like Cockroach. You can do it - the foreign data
       | wrappers feature and table partitioning can give you poor man's
       | scale out. Or you can use Citus which won itself a FOSS license
       | earlier this year. And there are other Foss and proprietary
       | approaches too.
       | 
       | MySQL is similar - you can do it, like with their recent router
       | feature, but it has been retrofitted, and it's not as fluid as
       | Cockroach. IIRC MySQL router is similar in configuration to
       | Galera - that is, a static config file containing a list of
       | cluster members.
       | 
        | Listen, I'm sure that the design approach of Cockroach could be
       | retrofitted to PGSQL and MySQL, but I'm pretty sure that doing a
       | good job of it would be a lot of work.
       | 
       | So in answer to your question, I'm not sure that there's all that
       | much RDBMS can't be made to do. Geospatial, Graph, Timeseries,
       | GPU acceleration. Postgres has it all and often the new stuff
       | comes to Postgres first.
       | 
       | By the way I love MySQL and PostgreSQL, and the amazing
       | extensions for PGSQL make it extra awesome. Many are super mature
       | and make pgsql perfect for many many diverse use cases.
       | 
       | For XXL use cases though, CockroachDB is taking a very
       | interesting new path and I think it's worth watching.
        
         | why-el wrote:
         | I don't know if anybody from CockroachDB is reading this but
         | their article[1] on serializable transactions is somewhat
         | questionable, as it compares CockroachDB's Serializable to
         | Postgres's Read Committed, which seems to imply CockroachDB is
         | better. Of course, Postgres has serializable as well. The only
         | novelty in the article is that Cockroach DB runs Serializable
         | by default, so I am not sure what that comparison was doing.
         | 
         | [1] https://www.cockroachlabs.com/docs/stable/demo-
         | serializable....
        
           | ummonk wrote:
           | I don't get that implication from the article at all. What I
           | do get is that the only transaction isolation level available
           | under CockroachDB is serializable. And to show why
           | serializable is valuable, they have to demonstrate using
           | Postgres' default of read committed (because they would be
           | unable to demonstrate this using CockroachDB which doesn't
           | allow any other transaction isolation levels.
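            | 
            | (In Postgres you opt in per transaction, or set it as the
            | default; something like:)
            | 
            |         BEGIN ISOLATION LEVEL SERIALIZABLE;
            |         -- ... statements ...
            |         COMMIT;
            | 
            |         SET default_transaction_isolation = 'serializable';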
        
         | mirekrusin wrote:
        | It always rubs me the wrong way - all those paxos/raft
        | approaches (which are great, but...) simply elect the leader to
        | pick writes. In that sense there is no distribution of
        | computation at all. It's still a single target that has to
        | crunch through updates. Replication is just for reads. Are we
        | going to have something better anytime soon? Like real
        | distribution, where adding more servers distributes writes as
        | well?
        
           | riku_iki wrote:
           | data is also sharded in those databases
        
             | mirekrusin wrote:
              | true, thanks for pointing that out, but this is a bit of
              | cheating, isn't it? i.e. atomic transactions across shards
              | flip over, right? it's basically the ergonomic equivalent
              | of just using multiple databases?
        
               | radicalbyte wrote:
               | You move over into the world of distributed transactions,
               | which can be _really_ expensive.
               | 
               | Thankfully sharding works great for a large number of
               | applications (or in other cases you can accept eventual
               | consistency).
        
               | eis wrote:
               | Not 100% sure what you mean by flipping over (fail?) but
               | at least in CRDB you can of course do cross shard
                | transactions. They won't be as fast as a transaction that
               | can be completely handled by a single leader but it works
               | fine and is transparent to the client.
        
           | FpUser wrote:
            | You are asking for a miracle or, quite simply, a breach of
            | the laws of physics. The only way to horizontally scale
            | write speed is to shard the data to be written somehow. You
            | can easily do it, but then your read queries have to go to
            | multiple servers to assemble a single result. You can't
            | scale both at the same time. There are some in-between
            | solutions like eventual read consistency that rely on
            | traffic easing enough at some point that the conductor can
            | actually synchronize the servers. But if you have a steady
            | stream of read/write requests, the only way you can really
            | scale is to tell amazon to sod off, buy yourself a big fat
            | multicore / multi-CPU server with a nice SSD array
            | (preferably of the Optane type) and watch your processing
            | power suddenly shoot through the roof while the cost greatly
            | decreases ;)
        
           | Nican wrote:
            | There are two ways I see databases doing paxos. The basic
            | way, used by databases like Percona, is a single raft group
            | across the whole database. It can help with high
            | availability, but the database's scale is still kind of
            | constrained to the capability of a single writer.
           | 
           | What you really want is databases like CRDB/Yugabyte/TiDB,
           | which are sharding+raft. Tables are sharded into 128MB
           | chunks, and each chunk has their own raft. The database
           | handles transactions, distributed queries, and auto-balancing
           | transparently.
        
             | rad_gruchalski wrote:
             | Definitely, YugabyteDB falls into the second category.
              | Tables are split into so-called tablets. This splitting can
             | be controlled by choosing the hashing algorithm for the
             | primary key (or transparent row key hash, if no primary key
             | for a table exists). Each tablet is replicated and has its
             | own raft group. Different tables can have different
             | replication factors, the number of tablets to split the
             | table into can be selected and modified.
             | 
             | YugabyteDB recently added support for table spaces on
             | steroids where table spaces are allocated to physical nodes
             | in the cluster. This enables geo-replication features where
             | tables or rows can be placed within selected geo locations.
             | 
             | All the data shifting is done transparently by the
             | database.
        
           | skyde wrote:
            | All those NoSQL or NewSQL databases have more than one
            | leader per table: they have a distinct leader for each
            | partition.
            | 
            | So if you have 6 servers and 2 partitions you could have 3
            | servers for partition #1 and 3 different servers for
            | partition #2.
            | 
            | If you want extra performance you could make the servers
            | simply store a key->value mapping using quorum writes like
            | Cassandra does, but to keep data consistency you still have
            | 2 choices:
            | 
            | #1 use optimistic concurrency (an app performing an update
            | will verify if the data has changed since the app last read
            | that data).
            | 
            | #2 use some kind of lease, electing one machine to be the
            | only one allowed to write to that partition for some time
            | period.
            | 
            | Option #1 does not give faster transaction throughput but
            | could offer lower tail latency.
            | 
            | Option #2 brings you back to square one of having a leader,
            | so you might as well just use Paxos/Raft.
        
           | calmoo wrote:
            | CAP theorem: consistency, availability, partition tolerance
            | - choose two. It's a fundamental limitation of scaling a
            | database.
        
         | someelephant wrote:
          | Make sure you thoroughly test your multi-region deploys. Last
          | time we tried, the system was not stable when a region went
          | down. Also beware of this anti-pattern:
          | https://www.cockroachlabs.com/docs/stable/topology-patterns....
        
           | bogomipz wrote:
           | Interesting. So you followed the advice of deploying to
           | greater than two regions and your DB didn't survive when you
           | lost a region or you only deployed to two and got bit by the
           | anti-pattern?
        
         | alberth wrote:
         | Sounds like Mnesia, which has existed for 20+ years.
         | 
         | https://erlang.org/doc/man/mnesia.html
        
           | ramchip wrote:
           | Mnesia isn't SQL-compatible, can't scale horizontally (each
           | node has a full copy of the DB + writes hit all nodes), needs
           | external intervention to recover from netsplits, and expects
           | a static cluster.
           | 
           | It's pretty much the opposite of what the parent described
           | IMHO. It's meant more like a configuration store than a
           | proper DB. That's how RabbitMQ uses it for instance, it
           | stores metadata in Mnesia and messages in a separate store,
           | but they're working on replacing Mnesia with their own more
           | modern store based on Raft.
        
           | rad_gruchalski wrote:
           | That's a huge oversimplification. Mnesia is distributed but
            | that's about it. There's so much more to SQL than selecting
           | data. Transactions, user defined functions, stored
           | procedures, custom types...
        
             | alberth wrote:
             | None of what you detail though is included in what the GP
             | describes as a modern database.
             | 
                | I'm not trying to nitpick but what GP describes aligns
                | with what Mnesia provides.
        
               | rad_gruchalski wrote:
               | The GP mentioned CockroachDB though. There's a lot of
               | implicit functionality behind that name. Even just behind
               | mentioning Postgres.
               | 
               | The distributed part of modern sql is fairly recent.
        
         | traceroute66 wrote:
         | > It has broad PGSQL language compatibility
         | 
         | Depends how you define broad. :)
         | 
          | Many key features are missing, including but not limited to:
          | 
          |     - UDFs and sprocs
          |     - Useful datatypes such as TSTZRANGE
          |     - More limited constraints
         | 
          | I was recently looking at Cockroach as a PGSQL replacement
          | because of its distributed nature. But the equivalent
          | featureset is still lagging badly, unfortunately.
        
           | rad_gruchalski wrote:
            | May I suggest looking at YugabyteDB, which is a distributed
            | SQL database built on the actual PostgreSQL 11.2 engine? It's
            | way ahead of CockroachDB in terms of feature support.
           | 
           | YugabyteDB has UDFs, stored procedures, distributed
           | transactions, the range types are working from what I can
           | tell, at least the example from here: https://wiki.postgresql
           | .org/wiki/Extract_days_from_range_typ... works right out of
           | the box, just copy paste. Postgres extensions are of course
           | working. Orafce, postgis (with extra work needed for gin
           | indexes, afair...), custom extensions.
           | 
            | YugabyteDB == Postgres. The query planner, analyzer and
           | executor are all Postgres. Mind you, some features are not
           | readily available because handling them properly in a
           | distributed manner takes effort. Those unsupported features
           | are disabled on the grammar level, before being worked on.
           | But - unsupported features will not corrupt your data.
           | Missing features are enabled very fast.
           | 
           | For example, I have recently contributed foreign data wrapper
           | support: https://github.com/yugabyte/yugabyte-db/pull/9650
            | (enables postgres_fdw at a usable level) and am working on
            | table inheritance now.
           | 
           | Yugabyte is an amazing bit of technology and more people
           | should know about it. By the way - it's Apache 2 with no
           | strings attached.
           | 
           | For clarification: I'm not associated with Yugabyte in any
            | way other than as a very happy user and contributor. Yugabyte
           | organizes the Distributed SQL Summit later this month:
           | https://distributedsql.org/. Might be a good opportunity to
           | learn more about it. Second clarification: I'll be speaking
           | at the event.
        
           | tuckerconnelly wrote:
           | Are you still using postgres? Thinking about a similar
           | transition, would really appreciate any shared learnings :)
        
         | dvtkrlbs wrote:
          | Not a database expert, but here are my thoughts after using
          | CockroachDB for a mid-sized project:
          | 
          | - Single-cluster mode is really nice for local deployment.
          | 
          | - The WebUI is a cool idea, but I wish it had more info.
          | 
          | - The biggest problem is Postgres compatibility; there are
          | some quirks that were annoying. Examples: the default integer
          | types having different sizes, and CockroachDB having a really
          | nice IF NOT EXISTS clause for CREATE TYPE but no Postgres-
          | compatible way of doing it (I think this one is on Postgres;
          | only being able to do it with plpgsql scripts is cumbersome).
          | 
          | - For me, the neutral one: IT HAS ZERO DOWNTIME SCHEMA
          | CHANGES. But if you are just coming from Postgres and using a
          | regular migration tool, the serious limitations on
          | transactions and schema changes, which could leave your
          | database in an inconsistent state, are scary. (The docs do
          | document the behavior, but I would still expect a runtime
          | warning for it.)
        
           | Supermancho wrote:
           | CDB was great until we started doing table creation on the
           | minute and hundreds of inserts on those tables. When we tried
           | to drop tables on a schedule, CDB never could catch up and
           | would generally just crash (3 nodes). I really don't like the
           | magic you have to do with CDB for things that your commodity
            | DBs can be expected to do.
        
       | deergomoo wrote:
        | I have yet to actually use it for anything, but I like the sound
        | of
       | what EdgeDB is doing [0][1].
       | 
       | The tl;dr is that its schema definitions and query language align
       | much more closely to typical app logic, theoretically eliminating
       | the draw of an ORM and being very good at things that are
       | unwieldy with SQL, like retrieving sets of related records.
       | 
       | [0] https://www.edgedb.com/showcase/data-modeling
       | 
       | [1] https://www.edgedb.com/showcase/edgeql
        
       | truculent wrote:
       | I would be interested in more sophisticated type definitions.
       | That is, algebraic data types.
       | 
       | One can achieve this to a degree by storing the data as JSON, but
       | it would be nice to be able to remove the chance of introducing
       | errors when converting to/from JSON.
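        | 
        | The closest I've gotten in plain SQL is a tagged union with
        | CHECK constraints (a sketch, names made up), which is still a
        | long way from real sum types:
        | 
        |         CREATE TABLE payment (
        |           kind text NOT NULL CHECK (kind IN ('card', 'bank')),
        |           card_number text,
        |           iban text,
        |           CHECK ((kind = 'card') = (card_number IS NOT NULL)),
        |           CHECK ((kind = 'bank') = (iban IS NOT NULL))
        |         );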
        
       | jake_morrison wrote:
       | VoltDB is a good example of rethinking relational databases for
       | for modern technology. Traditional databases assume that
       | everything is stored on disk, with a bit of RAM available for
       | cacheing. VoltDB assumes that everything is stored in RAM first,
       | using checkpointing to disk and replication to get durability.
       | 
       | https://www.usenix.org/legacy/events/lisa11/tech/slides/ston...
       | 
       | Have a look at Michael Stonebraker, he's a database genius, and
       | keeps starting "NewSQL" companies which use the relational model,
       | but specialized for different applications, e.g. column stores.
       | 
       | https://en.wikipedia.org/wiki/Michael_Stonebraker
        
         | winrid wrote:
         | Neat. Storing everything in ram and checkpointing to disk is
         | how a lot of game servers work, too. :)
        
       | znpy wrote:
       | be multi-master and horizontally scalable out of the box.
        
       | kweinber wrote:
       | Automatic versioning of data and CDC (change data capture)
       | broadcast of data deltas to external systems.
        
         | refset wrote:
         | > Automatic versioning of data
         | 
         | More specifically, I would argue for the ability to run
         | arbitrarily complex, non-locking, fully-consistent queries
         | against all historic versions of the database, a.k.a. "the
         | database as a value" (also "transaction time" or "system time"
         | temporal queries)
        
           | sam_lowry_ wrote:
           | Isn't this part of the SQL standard already, to some extent?
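            | 
            | From what I recall, SQL:2011 system-versioned tables cover
            | the "transaction time" part, and MariaDB implements them,
            | roughly like this (table made up):
            | 
            |         CREATE TABLE accounts (
            |           id INT PRIMARY KEY,
            |           balance DECIMAL(12,2)
            |         ) WITH SYSTEM VERSIONING;
            | 
            |         SELECT * FROM accounts
            |         FOR SYSTEM_TIME AS OF TIMESTAMP'2021-01-01 00:00:00';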
        
         | reilly3000 wrote:
          | CDC is built-in via the binlog; it's the basis of most
          | replication schemes but can be consumed by anything.
        
       | laurencerowe wrote:
       | More from a document database rather than an OLTP standpoint, but
       | working with JSON data in Postgres I'd really like to see:
       | 
        | * Exclusion Constraints support for GIN indexes (which support
        | Arrays and JSONB.) This would allow unique keys on array
        | elements without having to use triggers to put the individual
        | array elements in another table. (See the GiST sketch below.)
       | 
       | * Array element foreign keys, which I think is a subset of the
       | Inclusion Constraints proposal (which would additionally allow
       | constraints into ranges.)
       | 
       | * A faster procedural language. While plv8 is an improvement over
       | plpgsql, it still seems substantially slower than running the
       | equivalent in a separate NodeJS process. (Possibly just build
       | issues for me.)
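        | 
        | For reference, here's the GiST sketch mentioned in the first
        | bullet - what exclusion constraints look like today, which I'd
        | like extended to GIN (table made up):
        | 
        |         CREATE EXTENSION btree_gist;  -- for "room WITH ="
        |         CREATE TABLE booking (
        |           room int,
        |           during tstzrange,
        |           EXCLUDE USING gist (room WITH =, during WITH &&)
        |         );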
        
       | Chyzwar wrote:
        | 1. High performance read/write of Scylla/Cassandra with high
        | availability[1]. It has some limitations for OLTP workloads and
        | requires careful planning. Postgres without Citus is not really
        | HA, and even then companies often invent custom sharding/cluster
        | management systems.
       | 
        | 2. A powerful graph query language like Cypher. It might not
        | perform well in real life, but my personal experience left me
        | amazed[2]. There are a number of issues with SQL that could be
        | addressed[3], but the current standard is far too prevalent.
       | 
        | 3. Zero impedance mismatch between database and application
        | representation. In a database like Smalltalk GemStone it is a
        | really seamless experience for a developer to write database
        | code and application code[4]. To some extent, MongoDB's success
        | can be attributed to this aspect.
       | 
        | 4. Datomic temporal capabilities[5]. It is hard to maintain
        | temporal tables in postgres. There are some use cases where you
        | really want to query at a point in time. Strictly not an OLTP
        | feature, but I can see this being useful in many scenarios.
        | 
        | [1] https://www.youtube.com/watch?v=Fo1dPRqbF-Q
        | [2] https://www.youtube.com/watch?v=pMjwgKqMzi8&t=726s
        | [3] https://www.edgedb.com/blog/we-can-do-better-than-sql
        | [4] https://www.youtube.com/watch?v=EyBkLbNlzbM
        | [5] https://www.youtube.com/watch?v=7lm3K8zVOdY
        
       | Torwald wrote:
       | Write a cheque for an Oracle consultant.
        
       | jb1991 wrote:
       | Look at Datomic.
        
       | masa331 wrote:
        | This title kind of implies that PG or MySQL aren't modern or
        | modern enough, which I think is very wrong. Look at what they
        | bring in every update. I think they are quite modern!
        
         | randomdata wrote:
          | Postgres was more modern when it used QUEL. SQL was a big step
          | backwards technically (although beneficial from a marketing
          | standpoint).
        
         | kixiQu wrote:
          | The title seems fair to me. Anything that's actually _used_
          | has certain commitments it made earlier in its lifecycle from
          | which it now can't deviate, even if later developments made
          | the commitments problematic.
        
           | tmh88j wrote:
           | >Anything that's actually used has certain commitments it
           | made earlier in its lifecycle from which it now can't deviate
           | 
           | Perhaps I'm misunderstanding your comment, but MySQL has
           | definitely deprecated and removed features over the years.
           | https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html
        
           | jffry wrote:
           | I think it's unfair. "Modern" is in no way equivalent to what
           | the thread is really asking, which is what could you do if
           | you built a new database today with all of the knowledge but
           | none of the existing commitments of a large mainstream
           | database.
        
             | Retric wrote:
              | It's fair because these databases made trade-offs for
              | older ideas, presumably for good reasons. The question
              | isn't what they could have done differently up to now, but
              | rather what new ideas the designers might have included if
              | those ideas had existed back then.
        
               | jffry wrote:
               | Modern software development practices and knowledge are
               | orthogonal to feature set / compatibility guarantees. The
               | title asks about the former and the text asks about the
               | latter, and I don't agree it's correct to conflate the
               | two
        
       | shekispeaks wrote:
        | Automatic indexes. Adding indexes is a guessing game. It's a bit
        | of abstraction leakage. Imagine if a product engineer did not
        | have to think about how the data is laid out on disk.
        
         | Upitor wrote:
          | What? I wouldn't want this. How would you even automate it?
         | Creating the right indexes is the same as creating the right
         | tables and columns in your data model: It depends on the
         | business purpose and usage of the data.
        
           | iudqnolq wrote:
           | Business usage is observed over time instead of known up
           | front though. In a way this is sort of like the query
           | planner. Couldn't your usage be observed and used to
           | determine appropriate indexes?
        
             | karmajunkie wrote:
             | i'd be more interested in an automatic index "suggester"
             | based on observation and slow query analysis. there's also
             | the matter of new use cases where you'd absolutely want to
             | be able to create them manually.
        
               | iudqnolq wrote:
               | Absolutely. I'm increasingly in favor of generated code
               | you check into source control.
        
               | adrianmonk wrote:
               | It has been a million years since I've used it, but I
               | think MS SQL Server has this.
               | 
               | Here's some documentation I just found (at
               | https://docs.microsoft.com/en-us/sql/relational-
               | databases/pe...):
               | 
               | > _The Missing Indexes report shows potentially missing
               | indexes that the Query Optimizer identified during query
               | compilation. However, these recommendations should not be
               | taken at face value. Microsoft recommends that indexes
               | with a score greater than 100,000 should be evaluated for
               | creation, as those have the highest anticipated
               | improvement for user queries._
               | 
               | > _Tip_
               | 
               | > _Always evaluate if a new index suggestion is
               | comparable to an existing index in the same table, where
               | the same practical results can be achieved simply by
               | changing an existing index instead of creating a new
               | index. For example, given a new suggested index on
               | columns C1, C2 and C3, first evaluate if there is an
               | existing index over columns C1 and C2. If so, then it may
               | be preferable to simply add column C3 to the existing
               | index (preserving the order of pre-existing columns) to
               | avoid creating a new index._
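                | 
                | If I remember right, the raw data behind that report
                | comes from DMVs you can query directly (untested, from
                | memory):
                | 
                |     SELECT d.statement, d.equality_columns,
                |            s.avg_user_impact
                |     FROM sys.dm_db_missing_index_details d
                |     JOIN sys.dm_db_missing_index_groups g
                |       ON g.index_handle = d.index_handle
                |     JOIN sys.dm_db_missing_index_group_stats s
                |       ON s.group_handle = g.index_group_handle
                |     ORDER BY s.avg_user_impact DESC;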
        
           | CharlesW wrote:
           | > _Creating the right indexes is the same as creating the
           | right tables and columns in your data model: It depends on
           | the business purpose and usage of the data._
           | 
           | Right, so the way this works is that the database collects
           | instrumentation data and, over time, automatically applies
           | strategies (indexing improvements being one of them) to
           | improve its performance.
           | 
           | https://arxiv.org/abs/2007.14244
        
       | polskibus wrote:
        | Postgresql does not have a real clustered index (one that is
        | automatically maintained) like MS SQL has (and other DBs). Such
        | a feature is important for a lot of popular workloads.
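        | 
        | Postgres's CLUSTER is only a one-shot physical sort (index name
        | made up here):
        | 
        |         CLUSTER orders USING orders_created_at_idx;
        |         -- not maintained: later writes land wherever, until you
        |         -- run CLUSTER again (which takes an exclusive lock)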
        
       | klysm wrote:
       | I think automatically maintaining materialized views isn't well
       | supported for a lot of reasons. But I also don't think that it's
       | impossible for something like Postgres to support it more in the
       | future.
        
         | reilly3000 wrote:
         | PipelineDB!
        
       | reilly3000 wrote:
       | At least speaking for Postgres, it _is_ modern in that it's very
       | actively developed with all kinds of innovations layered on top
       | of the core system. It can be a timeseries db, a real-time db,
       | horizontally sharded, a self-contained REST/graphql API server, a
       | graph db, an interface to many APIs via foreign data wrappers,
       | and much more. In itself it has many analytics functions, and
       | most cloud OLAP dbs use its syntax over a columnar storage
       | scheme. I'm afraid most would-be database projects would be
       | cloned by a Postgres extension before they could make a dent in
       | its share.
        
       | tbrock wrote:
       | Partition a table without jumping through hoops?
       | 
       | A reasonable replication story?
       | 
       | How much time you got?
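        | 
        | Case in point: even with declarative partitioning (Postgres
        | 10+) you still have to create every partition yourself,
        | something like this (names made up):
        | 
        |         CREATE TABLE metrics (ts timestamptz, v float8)
        |           PARTITION BY RANGE (ts);
        |         CREATE TABLE metrics_2021 PARTITION OF metrics
        |           FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');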
        
       | pankajdoharey wrote:
        | High volume data ingestion and tunable consistency. Most sites
        | start out with these two databases but eventually end up porting
        | significant sections of their database to Cassandra.
        
       | pornel wrote:
       | I'm missing something as easy to deploy as SQLite, but without
       | its performance problems (like the stop-the-world write lock).
        | Basically, I'd love an embeddable Postgres with single-file
        | storage.
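        | 
        | WAL mode takes some of the edge off, for what it's worth
        | (readers no longer block during a write, though there's still
        | only one writer at a time):
        | 
        |         PRAGMA journal_mode = WAL;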
        
       | lifepillar wrote:
       | One thing PostgreSQL would likely not be able to adapt to, at
       | least without significant effort, is dropping MVCC in favor of
       | more traditional locking protocols.
       | 
       | While MVCC is fashionable nowadays, and more or less every
       | platform offers it at least as an option, my experience, and also
       | opinions I have heard from people using SQL Server and similar
       | platforms professionally, is that for true OLTP at least, good
       | ol' locking-based protocols in practice outperform MVCC-based
       | protocols (when transactions are well programmed).
       | 
       | The "inconvenient truth" [0] that maintaining multiple versions
       | of records badly affects performance might in the future make
       | MVCC less appealing. There's ongoing research, such as [0], to
       | improve things, but it's not clear to me at this point that MVCC
       | is a winning idea.
       | 
       | [0] https://dl.acm.org/doi/10.1145/3448016.3452783
        
         | truenindb wrote:
         | There is absolutely no reason at all that using a Model View
         | Controller architecture should say anything about your
         | persistence layer. Model View Controller is an abstraction for
         | managing code that generates things that are put on a screen.
         | It says that you should have code that represents underlying
         | data separate from code that represents different ways to show
         | the data to the user and also separate from code that
         | represents different ways for the user to specify how they
         | would like to view or modify the data.
         | 
         | The Model portion of MVC should entirely encapsulate whether
         | you are using a relational database or a fucking abacus to
         | store your state. Obviously serializing and deserializing to an
         | Abacus will negatively impact user experience, but
         | theoretically it may be a more reliable data store than
            | something named after a Czech writer famous for his
            | portrayal of Hofbureaucralypse Now.
        
           | nextaccountic wrote:
           | What's being discussed is MVCC, or multiversion concurrency
           | control
           | 
           | https://en.wikipedia.org/wiki/Multiversion_concurrency_contr.
           | ..
        
         | Andys wrote:
         | Having time travel capabilities (eg. cockroachdb) is a really
         | useful side effect of MVCC though. Postgres once had this
         | capability. Need for garbage collection/vacuuming is a
         | downside.
         | 
          | I think it all depends on the pattern of reads, writes, and
          | the types of writes. MySQL's InnoDB is often faster than
          | postgres but under some usage patterns suffers from
          | significant lock contention. (I have found it gets worse as
          | you add more indexes.)
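          | 
          | The CockroachDB version looks like this (table name made up):
          | 
          |         SELECT * FROM orders AS OF SYSTEM TIME '-10s';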
        
         | throwawayboise wrote:
         | The downside of "good ol' locking" is that you can end up with
         | more fights and possibly deadlocks over who gets access and who
         | has to wait.
         | 
         | With Postgres/Oracle MVCC model, readers don't block writers
         | and writers don't block readers.
         | 
         | It's true that an awareness of the data concurrency model,
         | whatever it is, is essential for developers to be able to write
         | transactions that work as they intended.
        
         | ww520 wrote:
          | I don't know. Lock-based concurrency control has problems
          | scaling up concurrent access among readers and writers for
          | read consistency. Of course Oracle with MVCC still beats
          | everybody performance-wise.
        
         | zmmmmm wrote:
         | I sort of want the opposite. Except for extremely high velocity
         | mutable data, why do we ever drop an old version of any record?
         | I want the whole database to look more like git commits -
         | completely immutable, versionable, every change attributable to
         | a specific commit, connection, client, user.
         | 
         | So much complexity and stress and work at the moment comes from
         | the fear of data loss or corruption. Schema updates,
         | migrations, backups, all the distributed computing stuff where
         | every node has to assume every other node could have mutated
         | the data .... And then there are countless applications full of
         | "history" type tables to reinstate audit trails for the mutable
         | data. It's kind of ridiculous when you think about it.
         | 
         | It all made sense when storage was super expensive but these
          | days all the countermeasures we have to implement to deal with
         | mutable state are far more expensive than just using more disk
         | space.
        
       | pat2man wrote:
       | Cockroachdb: automatic multiple primaries + sharding.
        
       | zinclozenge wrote:
       | 1. Automatic backups to an S3 compatible storage, out of the box.
       | 
        | 2. Progressive automatic scalability. As load increases or
        | storage runs out, the DB should be able to scale automatically.
        | NewSQL databases do this already.
       | 
       | 3. Tiered storage.
       | 
       | 4. Support streaming, stream processing, in memory data
       | structures, etc. I feel like this is one of those weird things
       | but I keep wishing this were possible when I work on side
       | projects or startups. I don't want to have to spin up
       | mysql/postgres, kafka/pulsar, flink/whatever/stream processing,
        | redis, etc separately because that would be prohibitively
        | expensive when you're just getting off the ground unless you
        | have VC money.
       | So I find myself wishing I could deploy something that would do
       | all of those things, that could also scale somewhat until the
       | need to break everything into their own infrastructure, and if it
       | was wire compatible with popular projects then that would be
       | perfect. Will it happen? I doubt it, but it would be lovely
       | assuming it worked well.
        
         | rc_mob wrote:
         | I mean, it took me just a couple of hours to write a script
         | to back up a postgres db to s3. Not a lot of work.
        
         | tlack wrote:
         | Great list. For #4, you should take a look at Erlang or Elixir
         | - they have "just enough" concurrency with streaming primitives
         | and a functional style that makes it easy to use. They make a
         | lot of "bolt on" stuff you need for a regular startup (Redis,
         | Celery, etc) superfluous.
        
       | mamcx wrote:
       | Too much focus on "scalability", which only matters for a very
       | narrow niche and is lateral to the DB engine, so I will
       | instead focus on real progress/improvements for RDBMSs (one of
       | my dreams is building this):
       | 
       | - Algebraic data types, removal of NULLs.
       | 
       | - Include a relational language, not just a partial query
       | language (SQL). (I'm making one at https://tablam.org, just to
       | give the idea.)
       | 
       | - So it is fully relational (you can store tables in tables,
       | and because of that you can model trees with tables, etc.)
       | 
       | - SQL stays as an interface for compatibility and such, but
       | the above is for everything else, because:
       | 
       | - The engine is not one big black box but a composite of
       | blocks, so:
       | 
       | -- The inner (and only) black box is the full ACID storage
       | layer, concerned with managing PAGEs, WALs, etc., written in a
       | lang like Rust.
       | 
       | -- The user-facing/high-level storage layer sits above this. I
       | think this allows it to be coded in the language above,
       | because there exists:
       | 
       | -- A pluggable language interface (a "WASM for databases/VMs")
       | that others (like SQL) compile to. And probably WASM for
       | stored procedures and/or extensions. THEN:
       | 
       | -- This allows compiling "SELECT field FROM table" CLIENT-SIDE
       | and checking it! (once supplied with the schema definition).
       | AND ALSO:
       | 
       | - Because it does not have a limited query language but a full
       | one, you can code a new INDEX with it. Note that doing this in
       | any language is simple (ignoring the complexity of storage and
       | ACID - this is where a high-level interface is needed), but
       | impossible in current RDBMSs.
       | 
       | - Because the DB is truly, fully relational, you can do
       | "SELECT * FROM Index"
       | 
       | - Then, you can add a cargo-like package manager to share code
       | with the community
       | 
       | - Then, you can "db-pkg add basic-auth" to install stuff like
       | auth modules that are actually used, unlike the security
       | included in old databases for use cases not many care about
       | 
       | - Allow real-time subscriptions to data/schema changes
       | 
       | - Make it HTTP-native, so it is already
       | REST/GraphQL/WebSocket/etc. endpoint-capable, and
       | 
       | - Go the extra mile with the idea of Apache Avro or similar
       | and make the description of the DB schema integral to it, so
       | you can compile interfaces to the db
       | 
       | - Store the schema changes, so it has built-in MIGRATION
       | support (git-like?)
       | 
       | - Then auto-generate DOCS with something like swagger?
       | 
       | ----
       | 
       | In relation to the engine itself:
       | 
       | - The storage is mixed row/columnar (PAX-like) to support
       | mixed workloads
       | 
       | - The engine, like sqlite, is a single library. Server support
       | is another exe, and the package manager is what installs
       | support for operations
       | 
       | - The DB is stored in a single file?
       | 
       | - We want to store:
       | 
       | -- Rows/Tables: BTrees + PAX like today, nothing out of the
       | ordinary
       | 
       | -- LOGs/Metrics: these are the same as the WAL! An RDBMS
       | already has it, but it is buried. Surface it, so you can do
       | "SELECT * FROM my_wal"
       | 
       | -- Vectors: the same as PAX storage but with only 1 column
       | 
       | -- Trees: something you can do if the DB is truly relational
       | and allows storing tables/algebraic types in it
       | 
       | IF the storage has a high-level interface and a full-featured
       | ("WASM-like") language interface to it exists, you can add
       | optimizations to the query planner and to the code that
       | manipulates the data without having to get into the depths of
       | the engine.
       | 
       | This means that people who want to disable the query planner
       | INSTEAD NEED to improve it - IF the query planner is a
       | surfaced component of the engine that they can tweak.
        
         | threeseed wrote:
         | Scalability and high availability are not niche.
         | 
         | Everyone wants this and most production environments need it.
        
       | mikewarot wrote:
       | Relationships between tables, as supported by Microsoft
       | Access... it did cascaded deletes and things automagically.
       | 
       | Persistent queries, where any change to the answer is propagated,
       | like a subscription to a feed.
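       | 
       | The closest thing in Postgres today is LISTEN/NOTIFY plus a
       | trigger - a rough sketch, not a true persistent query (table
       | and channel names made up):
       | 
       |     CREATE FUNCTION notify_job_change() RETURNS trigger AS $$
       |     BEGIN
       |       PERFORM pg_notify('job_changes', NEW.id::text);
       |       RETURN NEW;
       |     END;
       |     $$ LANGUAGE plpgsql;
       | 
       |     CREATE TRIGGER job_notify
       |       AFTER INSERT OR UPDATE ON job
       |       FOR EACH ROW EXECUTE FUNCTION notify_job_change();
       | 
       |     -- each interested client runs:
       |     LISTEN job_changes;
       | 
       | The client gets an event, not the updated answer to its query
       | - it still has to re-run the query itself.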
        
         | majewsky wrote:
         | ON DELETE CASCADE has been in SQL since approximately forever.
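         | 
         | For example (made-up tables):
         | 
         |     CREATE TABLE orders (id int PRIMARY KEY);
         |     CREATE TABLE order_items (
         |       order_id int REFERENCES orders (id) ON DELETE CASCADE,
         |       sku      text
         |     );
         |     -- deleting an order now deletes its items too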
        
       | spicybright wrote:
       | I'm extremely out of my expertise here, but I'll see if I can
       | spark some conversation.
       | 
       | While possible with older SQL databases through your own code,
       | distributed sharding and keeping multiple databases in sync
       | would, I think, be useful at the DB level vs. the user-code
       | level.
       | 
       | You can certainly argue that shouldn't be part of the database
       | software though.
        
         | bob1029 wrote:
         | I think it should be the job of the application logic to handle
         | replication, validation and recovery of persisted business
         | objects. Deferring this responsibility to some middleware is
         | not something I am a big fan of.
         | 
         | There are powerful arguments for using some off-the-shelf
         | solution, but I also like being able to set breakpoints in the
         | logic that ties all of the computers together.
        
           | MrStonedOne wrote:
           | There are things the db side would be better positioned to
           | handle and understand wrt replication.
           | 
           | Moving it to application code requires a trade-off, and it
           | also means everybody copying the same code into all of
           | their applications for no reason other than somebody
           | decided the application should be the one to handle it.
        
           | anticristi wrote:
           | I can't argue against that _iff_ the team has this
           | capability.
           | 
           | Unfortunately, I have seen too many teams end up
           | implementing an informally-specified, bug-ridden, slow
           | implementation of half of PostgreSQL.
        
         | cornel_io wrote:
         | If you've ever used Spanner you'll quickly decide that it
         | absolutely makes sense for the DB to handle almost all of
         | this stuff itself. It's a wonderful system that is sadly
         | prohibitively expensive for many (most?) use cases outside
         | of Google.
        
       | zz865 wrote:
       | My biggest problem with databases is always versioning, i.e.
       | renaming a column will break old clients. If there were a way
       | to have multiple schema versions, so you could upgrade the
       | database first and the clients later, it would be the best.
       | 
       | EDIT: yes, thanks for the comments; views, creating API
       | layers, and adding instead of subtracting all work, but I
       | believe they're all workarounds for the underlying problem.
       | Fixing versioning would make everyone's lives easier.
        
         | DanHulton wrote:
         | Consider using Postgres views! You can create a view into your
         | data and use that for reading and writing. Then, when you need
         | to change the underlying data model, you create a new view with
         | the renamed column. Old clients will target the old view, new
         | clients will target the new view. Then, when all old clients
         | are removed, you can remove the old view safely.
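         | 
         | Roughly, with hypothetical names (simple single-table views
         | in Postgres are auto-updatable, so both reads and writes go
         | through):
         | 
         |     -- v1 exposes the original column name
         |     CREATE VIEW job_v1 AS SELECT id, msg FROM job;
         | 
         |     -- after renaming msg -> message in the base table,
         |     -- keep v1 working and add v2:
         |     CREATE OR REPLACE VIEW job_v1 AS
         |       SELECT id, message AS msg FROM job;
         |     CREATE VIEW job_v2 AS SELECT id, message FROM job;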
        
         | withinboredom wrote:
         | The idea is not to validate your schema in a client at
         | runtime, just like you don't care if a new key is present in
         | a JSON blob. You should be able to add new features without
         | breaking things.
        
         | johnchristopher wrote:
         | Besides using an ORM that would hide the renaming, or views,
         | why not build an API layer?
        
           | _3u10 wrote:
           | Because Postgres views enforce API versioning in ways that
           | don't break older clients. Pretty much any schema
           | modification that would break an older client is forbidden by
           | REPLACE VIEW. It is certainly possible to break older clients
           | but you have to work at it.
           | 
           | Also... you don't want to build an API layer because
           | postgrest builds one for you. One that is close to on par
           | with GraphQL (arbitrary joins / filters)
        
       | darksaints wrote:
       | Get rid of NULLs and ternary logic.
        
       | m_ke wrote:
       | A combination of SQL, Redis, ES, and Columnar stores in a single
       | package.
        
       | rkarthik007 wrote:
       | When building YugabyteDB, we reuse the "upper half" of PostgreSQL
       | just like Amazon Aurora PostgreSQL and hence support most of the
       | functionality in PG (including advanced ones like triggers,
       | stored procedures, full suite of indexes like
       | partial/expression/function indexes, extensions, etc).
       | 
       | We think a lot about this exact question. Here are some of the
       | things YugabyteDB can do as a "modern database" that a
       | PostgreSQL/MySQL cannot (or will struggle to):
       | 
       | * High availability with resilience / zero data loss on failures
       | and upgrades. This is because of the inherent architecture,
       | whereas with traditional leader-follower replication you could
       | lose data and with solutions like Patroni, you can lose
       | availability / optimal utilization of the cluster resources.
       | 
       | * Scaling the database. This includes scaling transactions,
       | connections and data sets *without* complicating the app (like
       | having to read from replicas some times and from the primary
       | other times depending on the query). Scaling connections is also
       | important for lambdas/functions style apps in the cloud, as they
       | could all try to connect to the DB in a short burst.
       | 
       | * Replicating data across regions. Use cases like geo-
       | partitioning and multi-region sync replication to tolerate a
       | region failure without compromising ACID properties. Some
       | folks think this is far-fetched - it's not. Examples: the
       | recent fire at an OVH datacenter and the Texas snowstorm both
       | caused regional outages.
       | 
       | * Built-in async replication. Typically, async replication of
       | data is "external" to DBs like PG and MySQL. In YugabyteDB, since
       | replication is a first-class feature, it is supported out of the
       | box.
       | 
       | * Follower reads / reads from the nearest region with
       | programmatic bounds - i.e. read stale data for a particular
       | query from the local region if the data is no more than x
       | seconds old.
       | 
       | * We recently enhanced the JDBC driver to be cluster aware,
       | eliminating the need to maintain an external load balancer
       | because each node of the cluster is "aware" of the other nodes at
       | all times - including node failures / add / remove / etc.
       | 
       | * Finally, we give users control over how data is distributed
       | across nodes - for example, do you want to preserve ASC/DESC
       | ordering of the PKs, or use a HASH-based distribution of data?
       | (See the sketch below.)
       | 
       | There are a few others, but this should give an idea.
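       | 
       | A rough sketch of the follower-read and distribution points in
       | YSQL (setting names can vary by version - check the docs):
       | 
       |     -- range-ordered vs. hash-distributed primary keys:
       |     CREATE TABLE ev_range (id int, PRIMARY KEY (id ASC));
       |     CREATE TABLE ev_hash  (id int, PRIMARY KEY (id HASH));
       | 
       |     -- follower reads with a staleness bound, for read-only
       |     -- transactions:
       |     SET yb_read_from_followers = true;
       |     SET yb_follower_read_staleness_ms = 30000;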
       | 
       | (Disclosure: I am the cto/co-founder of Yugabyte)
        
       | talolard wrote:
       | I think AWS Aurora is the most prominent thing that comes to
       | mind. The key value proposition is the separation of storage
       | from compute. That unlocks many promising features:
       | "serverless", a better parallelization story for OLAP queries,
       | etc.
        
         | azinman2 wrote:
         | Don't both Postgres and MySQL do this?
        
         | thom wrote:
         | The copy-on-write workflow for clones in Aurora is particularly
         | nice in a dev/staging environment.
        
         | Jgrubb wrote:
         | I read that as "the key: value proposition.." so I probably
         | need to take a break.
        
       | sandGorgon wrote:
       | Search - really the holy grail. Combine full-text search (with
       | tf-idf or BM25 support, not the kind that postgres/mysql
       | does).
       | 
       | This is a tricky engineering problem - having two kinds of
       | indexes in the same database. But disk space is cheap. Network
       | is expensive (especially if you're on AWS).
        
       ___________________________________________________________________
       (page generated 2021-09-05 23:00 UTC)