[HN Gopher] Ask HN: It's 2023, how do you choose between MySQL a... ___________________________________________________________________ Ask HN: It's 2023, how do you choose between MySQL and Postgres? Imagine you are starting a new project, and your backing datastore options are restricted to mysql or postgres (and/or their cloud- tailored equivalents.) What sort of functional requirements would cause you to choose one over the other? Author : debo_ Score : 110 points Date : 2023-05-11 18:52 UTC (4 hours ago) | erulabs wrote: | Use MySQL if you're expecting to have to do large operational | database tasks re: migrations, maintenances, with no ability to | go offline. gh-ost, percona-osc, the new INSTANT DDL stuff, is | all quite far ahead in MySQL-land. Additionally, Vitess and | Planetscale are making huge strides in MySQL performance. There | are more people and guides in the world to help recover even the | most mutilated of MySQL databases. MySQL gets more love in | extremely large enterprise-level organizations, and it shows. | | Use Postgres if you need some of the quite-good extensions, most | notably PostGIS, or if you just want things to work; most | documentation will be postgres flavored. Postgres gets more love | from web-developers, and it shows. | fzeindl wrote: | This is wrong. MySQL does not support transactional DDL, so you | cannot run migration and abort them in the middle. | | Always use postgresql. It's more logical, more extensible, | saner, supports many extensions and is more predictable. | | MySQL is inconsistent crap, that trades away consistency, | correctness and stability for a little bit of performance in | standard use cases. | | Do yourself a favor and always use postgreSQL. I switched 15 | years ago and never looked back. Have done 15-50 projects since | in psql. | WJW wrote: | With a sane migration tool like pt-osc or gh-ost you can | absolutely abort in the middle. What's more, you can pause in | the middle or even slow down in the middle based on arbitrary | logic (ie, pause migration if replication delay rises above a | certain value). Postgres is nice and transactional DDL has | its place but postgres stopped halfway through IMO. Vanilla | Postgres > vanilla MySQL, but the migration story of MySQL + | tooling is so far beyond Postgres + tooling that it's not | even funny. | | That said, if you don't expect to have tables with 100m+ | rows, even vanilla postgres will be good enough. | [deleted] | amflare wrote: | Spoken like someone who switched 15 years ago and never | looked back. | barrkel wrote: | Production MySQL databases of any significant size use pt-osc | or gh-ost for schema changes, and these can be throttled, | paused, aborted and so on. | gardenhedge wrote: | If Postgres was that much better than MySQL then you would expect | to see exact reasons on why to pick it. Every comment so far has | not listed any reason. | l5ymep wrote: | Postgres has a worse implementation of MVCC. It results in more | bloat being produced, and slightly slower updates in a highly | concurrent environment. Most businesses don't operate at the | scale where this matters. But on the flip side, the tooling and | developer experience is much better. | nicoburns wrote: | A few reasons: | | - Transactional DDL statements (schema modifications) | | - Better support for UPSERT operations | | - Better JSON support (including ability to index into JSON | columns) | | - the RETURNING statement to return data that was | inserted/updated | | In general Postgres is a lot more featureful than MySQL. | eqvinox wrote: | Array data type. | | This has saved my ass a bunch of times. Not even as a column | type, just in complex queries that otherwise became unwieldy | monsters. The usual "started out simple but now it's a | frankenbase" and you're stuck with a shitty schema. (The one | thing worse than refactoring code: refactoring databases!) | | In one case I was able to replace a 15-minute process with | thousands of queries with one single query that aggregated all | the data into a bunch of arrays. It completed in a few seconds. | (Doing it without arrays would have been possible, but | duplicated a lot of data in the result set.) | mort96 wrote: | Ok, here's one: When you give MySQL invalid data, its standard | behavior is to just silently store garbage in your database in | many cases where PostgreSQL would've told you that your data is | invalid. | | MySQL's handling of unicode has also been terrible | historically, with way too many foot guns, but I don't know if | that may be better with recent versions. | | People aren't providing strong reasons because the question | wasn't "what are some objective reasons for picking one over | another", but "how do you pick between them". People are simply | answering the question OP asked, and a lot of people's process | is simply to pick PostgreSQL. | AdamJacobMuller wrote: | A lot of the MySQL issues historically have been fixed. UTF-8 | is better now, invalid data handling is better (by default | even! though your distros default config probably puts you | back in permissive mode!) but regardless I'm still using | Postgres every single time. | | The fact is that MySQL historically was terrible for complex | schemas with complex types while postgres was a pleasure to | work with. MySQL had a huge performance edge for many years | but that came at a cost of resiliency and reliability. MySQL | has greatly improved on these key areas and Postgres has also | made significant performance improvements. Systems these days | are also so powerful that the database probably isn't your | benchmark. | | Regardless, I always use Postgres every single time because I | am just scarred from years of dealing with MySQL. What even | is MySQL is also an open question at this point, there's | MySQL and MariaDB and Percona flavors and the founder of | Percona was just ousted and I can't be bothered to put in | mental energy to untangle all this to even figure out what | MySQL I should be developing against. | | Compare this to Postgres where the community seems to have an | extremely steady hand and constant progress. There's no | forks, there's no infighting, there's no drama, there's a | great regular release schedule with incremental improvements. | bcrosby95 wrote: | In MySQLandia, we do not speak of utf8, we only speak of | utf8mb4. | CaveTech wrote: | Hasnt been the case for a few major versions, unless you want | to anthropomorphise your db and and hold it accountable for | past behaviour. | ok_dad wrote: | > unless you want to anthropomorphise your db and and hold | it accountable for past behaviour | | No one is holding the literal bits that make up the | database executable accountable here, they are indicating | they don't trust the devs of MySQL/MariaDB to do a good | job. Whether or not that is an accurate view on their part | is arguable, but it's pretty clear from context that they | don't think that several if/else statements had it out for | them. | bakugo wrote: | > MySQL's handling of unicode has also been terrible | historically, with way too many foot guns, but I don't know | if that may be better with recent versions. | | Unicode generally "just works" if the charset in use is | utf8mb4. As of MySQL 8.0, this is the default. | mort96 wrote: | Ah, that's good to hear. I haven't looked seriously at | databases other than SQLite for a long time, it would be | interesting to see a more up to date evaluation. | evanelias wrote: | > When you give MySQL invalid data, its standard behavior is | to just silently store garbage | | This is a common misconception, but this hasn't been the case | for over 7 years. MySQL 5.7, released in Oct 2015, changed | its defaults to enable strict sql_mode. All prior versions | have hit end-of-life for support years ago, so there is no | modern version of MySQL with this silent truncation behavior. | | The only reason this problem persists is because _Amazon RDS_ | (all versions and forms, including Aurora) uses nonstandard | default settings which disable strict mode! | | That all said, I do believe Postgres is an excellent | database, and a great choice for quite a large range of use- | cases. But please, let's compare 2023 Postgres with 2023 | MySQL, not 2023 Postgres with much older MySQL. It's only | fair. | justinclift wrote: | > But please, let's compare 2023 Postgres with 2023 MySQL, | not 2023 Postgres with much older MySQL. It's only fair. | | Heh Heh Heh | | On a humorous note, the official MySQL page (in the early | 2000's) comparing MySQL vs other databases had the same | problem. | | They'd list the latest and greatest MySQL version, but | compare it against archaic versions of the others. Clearly | on purpose, because "Marketing" probably. | | Asked them (via official @postgresql.org email address) to | please update that page to more a recent PostgreSQL, for | fairness. And was completely ignored of course. | | So it's kind of amusing to see a request for fairness in | the opposite direction (which I agree with anyway) ~20 | years later. ;) | jbverschoor wrote: | Why do they have these settings? Because shit software used | shit MySQL +data corruption. | | It might be better now. But for me it's in the same | shithole as mongodb, php (old style, so no recovery there | even though it's possible to create proper code) and most | JavaScript. | | Other things is that people don't even want to use Oracle's | MySQL but MariaDB. Why the hell would I want to run a fork | of something, and still keep calling it something else. | | The only reason for MySQL is wordpress | et-al wrote: | If you're doing any sort of spatial logic (e.g. mapping), | you'll want to use PostGIS. | MangoCoffee wrote: | pick the database that you are most familiar with | web3-is-a-scam wrote: | Easy. Always postgres. | | I also use the same logic applied to document databases. Mongo or | Postgres? Postgres | | Also pub sub. Postgres or redis? Postgres | | Use postgres until it's not technically feasible to use it | anymore. | LinuxBender wrote: | For my little hobby sites I use whichever best fits the | application. If an application has a specific need for a function | that is in Postgres and not MySQL then I use Postgres. | | In my former work life we used Percona MySQL for the commercial | support and very fast response to fix bugs and add features, but | we also used Postgres and Oracle. In those cases it was more | important to have awesome DBA's that could do anything with any | database. | iamwil wrote: | The difference is not significant enough to matter for most | projects, esp just starting out. Hence, I mostly choose Postgres, | since I don't like Oracle as a company very much. | dylan604 wrote: | Whenever I see MySQL, my brain automatically sees MariaDB. What | is this Oracle thing you speak of ;-) | evanelias wrote: | MySQL and MariaDB have diverged quite a bit. I recently wrote | a roundup of just the differences in table functionality and | DDL, and despite keeping the post focused to that relatively- | narrow topic, the list of differences is getting VERY long: | https://www.skeema.io/blog/2023/05/10/mysql-vs-mariadb- | schem... | dylan604 wrote: | >MySQL and MariaDB | | Again, what is this thing you are trying to compare? I just | see MariaDB and MariaDB =) | | However, because of my blindness to actual MySQL, I have | totally not paid attention to any differences between the | two. I guess "drop in replacement" isn't actually true any | more. Thanks for the info | [deleted] | MrResearcher wrote: | Postgres doesn't have type hints and it might create a false | impression of robustness until it messes up table statistics and | does FULL SCAN against a table with millions of rows ignoring all | indicies. It happens super rarely though, so if you run anything | critical, you'll probably be down only for a few hours once a | year or two. Be ready for this to happen though. | | Apart from that (and noticeably higher memory consumption), | Postgres is most likely preferable. | endgame wrote: | https://web.archive.org/web/20211206040804/https://blog.sess... | | From a former MySQL developer: | | > let me point out something that I've been saying both | internally and externally for the last five years (although never | on a stage--which explains why I've been staying away from stages | talking about MySQL): MySQL is a pretty poor database, and you | should strongly consider using Postgres instead. | skunkworker wrote: | Postgres for anything with a single database size < 4TB. | | But if you need something that can handle 100TB+, go Vitess(mysql | compatible). | MrThoughtful wrote: | By choosing SQLite. | | No server process and a single file per DB which I can put | wherever I like. | jjav wrote: | > By choosing SQLite. | | It's a good reminder to give some thought to whether one | actually needs MySQL|Postgres. If not, SQLite is the way to go. | Most of my code that uses a DB is using SQLite. | | But obviously, if you actually need MySQL|Postgres then SQLite | is not an option. | mort96 wrote: | I like SQLite. But I really wish its default behavior wasn't to | simply allow garbage data into the database. If I have an int | column, don't let me accidentally store a string. | euroderf wrote: | create table strict | eesmith wrote: | https://www.sqlite.org/stricttables.html | | "In a CREATE TABLE statement, if the "STRICT" table-option | keyword is added to the end, after the closing ")", then | strict typing rules apply to that table. ... The STRICT | keyword at the end of a CREATE TABLE statement is only | recognized by SQLite version 3.37.0 (2021-11-27) and later. | sqlite> create table x (a int); sqlite> insert into x | values ('hello'); sqlite> select * from x; hello | sqlite> drop table x; sqlite> create table x (a int) | strict; sqlite> insert into x values ('hello'); | Runtime error: cannot store TEXT value in INT column x.a (19) | mort96 wrote: | Yeah, I know about that, and I'm doing that on all my | tables these days. It's just sad that the default behaviour | is to allow garbage data, and that if you ever forget to | put 'strict' on your tables you'll have a perfectly | functional application with no sign that anything is wrong | until you suddenly find corrupt data in your database. | lisasays wrote: | SQLite is great for its scope - but not in the same class as a | full-fledged RDBMS. | MrThoughtful wrote: | Can you give a specific example of what you are missing when | using SQLite? | endisneigh wrote: | Multi writer, access through the internet without 3rd party | software, etc. | mort96 wrote: | At a certain scale, you'll want replication or replication, | which SQLite doesn't really do AFAIK. At a scale below | that, you'll probably want to be able to have multiple web | servers talking to one database server, which SQLite | doesn't really do either. I also think SQLite's performance | during heavy write workloads is worse than PostgreSQL's? | | Basically, AFAIK, SQLite becomes problematic once you need | more than one computer to handle requests. | justinclift wrote: | Just to point out, there are now SQLite replication and | various "distributed database" projects which seem to | work fairly well. | | They're probably not as battle tested as the PostgreSQL | ones, but they are around, have users, and are actively | developed. | | The ones I remember off the top of my head: | | * https://litestream.io | | * https://github.com/rqlite/rqlite <-- more of a | "distributed database using RAFT" type of thing | | * https://github.com/canonical/dqlite | Avamander wrote: | Performance, especially after a while and certain size. | lisasays wrote: | Here's a good place to start: | | https://www.sqlite.org/whentouse.html | | https://www.sqlite.org/quirks.html | | Full-scale RDBMSs, especially Postgres, have _lots_ of | goodies that either SQLlite doesn 't have (or which it does | have, but which aren't so richly featured). Once you've | gotten hooked on a few of these, the distinction will feel | a lot more clear. | | Meanwhile the tipping points in favor of SQLite seem to be | embedded systems, and its whole "service-less" architecture | and plain ease of use. Which is why it still gets lots of | love, for those contexts. | snowski3 wrote: | just pick MySQL. If you want 100 half-baked non-production-ready | features pick Postgres. | Doctor_Fegg wrote: | For anything involving location, choose Postgres because PostGIS | is just so good. | frizlab wrote: | Good to know! I'm starting a project that will use location, | and I chose Postgres, I'm happy to hear this :-) | trympet wrote: | The Postgres query optimizer is more powerful than the MySQL | query optimizers [1]. It generally scales better for OLTP. Also | tons of extensions that can accelerate your workload. | | [1] - https://ieeexplore.ieee.org/document/9537400 | spudlyo wrote: | It's also more opinionated than the MySQL query optimizer, in | that you can't give it hints to prevent it from making a | horrible mistake. | mthoms wrote: | Previously - | | https://news.ycombinator.com/item?id=35599118 | | https://news.ycombinator.com/item?id=12166585 | OrvalWintermute wrote: | Friends don't let their friends choose Mysql :) | | A super long time ago (decades) when I was using Oracle regularly | I had to make a decision on which way to go. Although Mysql then | had the mindshare I thought that Postgres was more similar to | Oracle, more standards compliant, and more of a real enterprise | type of DB. The rumor was also that Postgres was heavier than | MySQL. Too many horror stories of lost data (MyIsam), bad | transactions (MyIsam lacks transaction integrity), and the number | of Mysql gotchas being a really long list influenced me. | | In time I actually found out that I had underestimated one of the | most important attributes of Postgres that was a huge strength | over Mysql: the power of community. Because Postgres has a really | superb community that can be found on Libera Chat and elsewhere, | and they are very willing to help out, I think Postgres has a | huge advantage over Mysql. RhodiumToad [Andrew Gierth] | https://github.com/RhodiumToad & davidfetter [David Fetter] | https://www.linkedin.com/in/davidfetter are incredibly helpful | folks. | | I don't know that Postgres' licensing made a huge difference or | not but my perception is that there are a ton of 3rd party | products based on Postgres but customized to specific DB needs | because of the more liberalness of the PG license which is | MIT/BSD derived https://www.postgresql.org/about/licence/ | | Some of the PG based 3rd party DBs: | | Enterprise DB https://www.enterprisedb.com/ - general purpose PG | with some variants | | Greenplum https://greenplum.org/ - Data warehousing | | Crunchydata https://www.crunchydata.com/products/hardened- | postgres - high security Postgres for regulated environments | | Citus https://www.citusdata.com - Distributed DB & Columnar | | Timescale https://www.timescale.com/ | | Why Choose PG today? | | If you want better ACID: Postgres | | If you want more compliant SQL: Postgres | | If you want more customizability to a variety of use-cases: | Postgres using a variant | | If you want the flexibility of using NOSQL at times: Postgres | | If you want more product knowledge reusability for other backend | products: Postgres | GiorgioG wrote: | It's pretty easy, it's always Postgres. MySQL has been awful at | every place I've had the misfortune of using it. | usrbinbash wrote: | > What sort of functional requirements would cause you to choose | one over the other? | | Simple: I don't like having headaches. Therefore, I chose | postgres. | dylan604 wrote: | Not even funny or clever, and definitely not helpful to the | actual question being posed | api_or_ipa wrote: | Sigh. Hoping dang will swoop in and close this flame war... | didip wrote: | Always PostgreSQL. Unless if MySQL protocol is absolutely needed. | adoxyz wrote: | Choose whichever one you/your team is more familiar with. Both | are battle-tested and proven and will likely scale to whatever | needs you have. | TehShrike wrote: | This is the correct answer. | | Whichever one you start out with, you will be annoyed if you | switch to the other one 5 years later. I started out with | mysql, and when I started working on a postgres project, I was | shocked at some of the ways it was lacking (how am I supposed | to store email addresses in a database without collations?). | | But when postgres folks grouse about stuff in mysql, I'm | usually nodding along and saying "yeah, that would be nice". | | They're both great options. If anybody on your team is already | an expert at one of them, use that one. | robertlagrant wrote: | > when I started working on a postgres project, I was shocked | at some of the ways it was lacking (how am I supposed to | store email addresses in a database without collations?) | | How long ago was this? :) | TehShrike wrote: | 3 years ago. From this comment thread | https://news.ycombinator.com/item?id=35908169 I infer that | Postgres still doesn't support collations. | vosper wrote: | Having used both in production, I agree with the above. It's | not going to make or break your business or project. | | I will also add that their are giant companies relying on both | databases with great success. Facebook still runs on MySQL, and | contribute back to it. Youtube I'm not sure about, but it did | run on MySQL for a long time, well after it got massive. I'm | sure examples exist for Postgres (Amazonm since they moved off | Oracle?) | bombcar wrote: | And if you have experience with one and try to use the other, | you may end up foot gunned by something you didn't know about. | DiabloD3 wrote: | Why would I choose MySQL in any year? There is no context you can | provide to this question where I wouldn't always choose Postgres. | muhammadusman wrote: | piggybacking on this: does anyone know a Postgres alternative to | PlanetScale? | fastest963 wrote: | We use YugabyteDB and have had a lot of success with it. We | deploy it across 5 continents and it's remarkably performant. | gmac wrote: | https://neon.tech | racl101 wrote: | Can I actually work with it locally and on a production server? | mgl wrote: | We choose Postgres for extensibility and stability :) | david927 wrote: | I was going to make a similar joke: you look at them and choose | Postgres | entropicgravity wrote: | If you're a gigacorp you might have reasons to go with a | customized version of MySQL otherwise pick Postgres. | api_or_ipa wrote: | Practically speaking, they're very similar. Mysql and Postgres | differ in their approaches to replication & clustering, which can | have a big impact to high availability, high volume database | configurations. | rc_mob wrote: | Its not even a choice anymore. One is far better than all of the | others. | jfb wrote: | What's to choose? Postgres all the way. | herpderperator wrote: | I find PostgreSQL permission management quite convoluted. In | MySQL it is simple to query for what grants a user has, but in | PostgresSQL you need to write 100 lines of SQL to do the same... | and you can't run \du and other commands without psql. Why | couldn't they just come up with `SHOW` shortcuts that work in any | SQL client? | gibsonf1 wrote: | By going with Apache Solr instead | tdy721 wrote: | Easy: SQLite | toomuchtodo wrote: | Always Postgres. It will be more painful to migrate in the future | versus starting with it. | m0llusk wrote: | PostgreSQL is a community thing and MySQL is Oracle. Maybe make | some basic benchmarks for comparison? | johnchristopher wrote: | I choose mysql because they don't want to install extensions (geo | stuff) at work and we are using many PHP things that use mysql. | And WordPress. Choose boring tech or something. | azurelake wrote: | MySQL is still ahead operationally (no vacuum, group replication, | gh-ost, optimizer hints, etc.). I would choose it unless one of | the Postgres extensions is a killer fit for your project. | [deleted] | javajosh wrote: | Go with postgres. Unless you need php/wordpress, then pick mysql. | geenat wrote: | I know it seems dumb, but postgres really needs to add the simple | developer experience stuff like: | | SHOW CREATE TABLE; | | SHOW TABLES; | | SHOW DATABASES; | | SHOW PROCESSLIST; | | CockroachDB added these aliases ages ago. | Izkata wrote: | I forget if there's an equivalent for the first one, but from | psql there is a translation of mysql's "DESC table" as "\d | table", and the rest are: | | \dt | | \l | | SELECT * FROM pg_stat_activity; | spacedcowboy wrote: | This. | | For anything at home, I would use MySQL just for those things. | The psql client _feels_ very primitive by comparison to me - | even though it isn 't. | tbarbugli wrote: | I highly suggest investing time learning psql, autocomplete | works great and it has a ton of useful slash commands. \d for | instance shows you the list of tables. Awesome tool | spacedcowboy wrote: | I don't need it often enough to invest the time. I | generally set up a database as backing store to some | project, fiddle with it until I'm happy it's working at the | scale/performance I want, and then move on to something | else. | | During those few weeks I'm actively using the database on | the project, I can either get frustrated beyond belief with | the CLI for Postgres, or just use what's at hand with | MySQL. In fact, these days SQLite is getting more of my | attention anyway, and I wrote a small CLI for it a decade | or so back (before the sqlite3 client gave us most of the | below) to provide: | | - Timings for the queries (in fact I called it 'tsql') | | - Aligned-column displays, with | separators between | columns and index-indicators | | - Ability to parse some blobs (Plists on the Mac, for | example) and display | | - Things like "SHOW DATABASES", "SHOW TABLES", "SHOW TABLES | LIKE" etc. | | Mainly I wrote it to do some benchmarking, but I eventually | preferred it over sqlite3 as the CLI. | | Note that all this is personal stuff - When I do commercial | stuff, the answer is always "what is best understood by the | people maintaining it afterwards"... | sproketboy wrote: | [dead] | chunk_waffle wrote: | I dunno if this is still true but a couple of years ago MySQL was | cheaper on AWS (RDS/Aurora) than Postgres. | maxmorlocke wrote: | For a typical db.t3.xlarge instance, you're talking about | 29c/hour vs 27.2c per hour. That's $157.68 as the total | difference for one year's runtime, when the whole instance cost | for postgres would be $2540.4 for the year, or about 6%. The | larger the machine, the closer to parity. Given the absolutely | small difference, I hope this isn't the dividing line in any | commercial project. | chunk_waffle wrote: | Again, I don't still know if this is the case, but you could | use smaller instances with MySQL aurora than with Postgres, | given the way our application worked it would have made a big | difference for us if we had used mysql. | | RDS/Aurora was our most expensive resource so we were looking | at ways to cut that cost down and mysql was one option | (though the way the app worked and the extensions that it | relied on made it not a possibility.) | NuSkooler wrote: | PostgreSQL every time, unless you have a specific reason, or as | already pointed out, you're sure you don't just need SQLite. | | PSQL in my experience has vastly better tooling, community, and | is ahead of the curve tech wise. Same with extensions | availability. Or perhaps you need to move away from it to say | CockroachDB, or similar which is much easier. | grzm wrote: | nit: psql is the command line client. postgres or pg are the | more common shortenings of PostgreSQL. | | https://www.postgresql.org/docs/15/app-psql.html | jstx1 wrote: | Also interested in the responses, not because it seems like a | close decision but because I would pick postgres by default for | anything (anything that isn't simple enough to be done in | sqlite). | majestic5762 wrote: | Same. | [deleted] | kgwxd wrote: | I don't use either but I don't think I've seen MySQL mentioned in | any meaningful way in almost a decade at this point. If forced, | knowing only that, I'd choose postgres. | test6554 wrote: | I've been using MariaDB (MySQL) as a hobbyist for years. I just | set up a couple myqsql servers with phpmyadmin on Raspberry PIs | and use them for local development. Basic crud apps, etc. | | I've always assumed that PostgreSQL is a step up, but never | really bothered to look into what I get for the effort. Do I | really get anything if I'm not trying to make apps at scale? | eqvinox wrote: | > I've always assumed that PostgreSQL, but never really | bothered to look into what I get for the effort. | | You're making a (mistaken) assumption that Postgres giving you | a "step up" means that you also have to put in more effort. You | don't, at least not in my experience. Both are database servers | with a bunch of install & setup. There's phppgadmin if you want | an 1:1 replacement for phpmyadmin (no opinion on these, haven't | used either). | | Postgres just gets you farther if you need to at a later point. | | I would recommend you swap out mysql for postgres on your | raspis. You're gaining experience on one of the two. But | experience on Postgres seems to be more useful and valuable | (cf. rest of the HN comments), for the same cost of your time. | 12907835202 wrote: | Same position. | | There's so many things I want to learn I'm not sure postgres is | such a step up from MySQL that it's worth being at the top of | the list. | pierat wrote: | Friends dont let friends use #Horracle software. | | That includes VirtualBox, MySQL, Horracle Cloud. Just step back. | Walk away. Do not pass go, do not collect $20000 lawyers fees for | unintended actions. | za3faran wrote: | That's quite silly. VirtualBox is great, and so is MySQL. | They're also both OSS, so no lawyers in the sense you're | implying. | vermaden wrote: | Its simple AF - I just always pick the well proven PostgreSQL | database. | | ... if that is too big I use SQLite. | CuriouslyC wrote: | The only instance where I'd choose mysql over postgres is if your | database needs are very simple, but you need to be able to scale | hard, and your devops aren't skilled enough to manage an advanced | postgres setup. | tough wrote: | Pick postgres unless you have a good reason not too? | cryptonector wrote: | And even then you pick Postgres. | tough wrote: | Can't really loose with postgres, I concur | milesward wrote: | You choose Postgres. | tobyhinloopen wrote: | Am I the only one who thinks postgresql's timestamp and | timestamptz types are incredibly stupid? | | I just want to either save a local date and time, or an utc | timestamp. Postgresql's timestamp(tz) types do neither and both | at the same time. | claytonjy wrote: | I've avoided `timestamp` and haven't needed localized times | stored in the DB, but in what way is `timestamptz` not exactly | the utc timestamp you expect? | donatj wrote: | Unpopular opinion on HN apparently, but MySQL | | - It's less featureful, and I'd consider that a strong virtue in | the YAGNI camp - less to go wrong, less mental overhead. | | - Maintenance is simpler and far less necessary in my general | experience. | | - Replication is simpler and more reliable. | | - You can tell the query optimizer what to do. When this is | needed, you'll be thankful. It's a godsend. | | That said, I wouldn't run Oracle MySQL. I opt for MariaDB on | smaller projects and AWS Aurora MySQL for larger projects. Aurora | scales insanely well, and replication lag is almost non-existent. | | In my general experience MySQL was always significantly faster | but it's been a number of years since I've worked with Postgres | and the comments here seem to indicate that that may no longer be | the case. YMMV | droobles wrote: | samesies | EamonnMR wrote: | I would disagree on maintenance being simpler. I have never had | Postgres randomly munge a table and require me to run a command | to fix it. | [deleted] | stouset wrote: | > It's less featureful, and I'd consider that a strong virtue | in the YAGNI camp - less to go wrong, less mental overhead. | | This doesn't really hold water in my opinion. | | It's not like PostgreSQL is some minefield of misfeatures and | quirky behavior. Some of these features _exist_ , but have zero | impact on you unless you actually opt to use them. But if you | end up needing to: they're there, and you can just start using | them. | | Compare this to MySQL where _they simply don 't exist no matter | how much you may need them_. Need to apply an index to the | result of a function to quickly fix a performance issue in | prod? Sorry, you can't. Need window functions to accurately | compute some analytics in a sane period of time? Sorry, you | can't. The list of things you _can_ do in PostgreSQL that you | simply can 't with MySQL is massive and grows every day. | | The odds that you'll want, need, or greatly benefit at least | one of these features is not small. Having the flexibility of | knowing these features exist should you ever have a use-case | for them is massive. | evanelias wrote: | Your examples regarding MySQL's features are not correct. | | Need to apply an index to the result of a function? No | problem, use a functional index, supported since October | 2018: https://dev.mysql.com/doc/refman/8.0/en/create- | index.html#cr... | | Need to use a window function? No problem, supported since | April 2018: https://dev.mysql.com/doc/refman/8.0/en/window- | functions.htm... | pawelduda wrote: | What's the ratio of solving DB perf issues by optimizing it and | letting the planner do its work, to telling it what to do? For | me it's like 1000:1. | | And that one case I remember was perfectly solvable the regular | way, with a little more time. | glogla wrote: | I choose Postgres every time, because that is what I am familiar | with. | | But nerdy stuff: | | Postgres stores data linearly (in heap - which has nothing to do | with the heap data structure used for sorting, it just means pile | of data). If you need to have fast access to data, you need to | add secondary indexes - and the secondary indexes point to | location in the heap as "this is where you find the data". | | MySQL stores data in a tree - a table is a tree sorted by primary | key. You can create secondary indexes and instead of a pointer | they contain the primary key value. | | That means for example that data with similar primary key will be | located physically nearby each other, in MySQL but not in | Postgres. At the same time, inserting new data with random (like | UUID) primary key in MySQL will write all over the table, but | will mostly "append at the end" in Postgres. | | Postgres also implements MVCC with Serializable Snapshot | Isolation - so data that someone changes exists in multiple | copies and needs to be cleaned up later - but there's no locking. | MySQL relies on locks instead so there's no duplication but you | might see transactions waiting for each other. I don't remember | if MySQL implements a proper serializable isolation - but that is | not really the default on any database anyway. | | Interestingly, Oracle has very similar design to Postgres (though | it uses rollback segment for old data, so there's no bloat and | vacuum but you might get "snapshot too old" error) while MS SQL | Server is also tree and lock-based database like MySQL. | | Does this impact you? It might, like in cases where MySQL | performs terribly due to UUID keys or Postgres can't vacuum fast | enough due to high volume of updates or something. Or you're | implementing money settlement logic and need proper serilizable | transactions, who know. But it is cool to know the implementation | details. | spudlyo wrote: | This is like asking how you'd choose between Emacs and Vim, Mac | and PC, Monoliths and Microservices, Functional and Object | Oriented .. you're likely going to elicit a lot of passion and | not a ton of objective information. | | For most applications, either choice is going to be just fine. | Use what your team has the most experience with. If you have no | experience, try them both out and go with whatever you're most | comfortable with. | gmac wrote: | Postgres. Fast, full-featured, rock-solid, and a great community. | | I think many of us can't be bothered to go over (again) the | issues we've had with MySQL in the past. The last straw for me | was about ten years ago, when I caught MySQL merrily making up | nonsense results for a query I'd issued that accidentally didn't | make any sense. | | Very likely this particular issue, and others like it, have been | fixed in the meantime. But I just got the sense that MySQL was | developed by people who didn't quite know what they were doing, | and that people who really did know what they were doing weren't | ever likely to be attracted to that to fix it. | serverholic wrote: | [dead] | kiernanmcgowan wrote: | Having used postgres for the past decade, I tried MySQL for a | side project to see whats changed with it. The sad answer is | that it feels like nothing has changed - Oracle seems to have | let what used to be a core technology of the industry languish. | | I'm sure there are use cases where MySQL will be the better | choice over postgres, but the future for the stack looks bleak. | johnny22 wrote: | i think one is also referring to mariadb here and not just | mysql. Maybe that's better enough? I wouldn't know, I just go | with postres. | dunno7456 wrote: | Creating a series of connections very quickly is cheaper in | MySQL and MariaDB than in PostgreSQL. Typically, a connection | poller is used before PostgreSQL to support connection | scalability. | | I'm not sure if there has been a recent breakthrough that has | changed that. I think that still applies today. Correct me if | I'm wrong. | gabereiser wrote: | You can create a series of connections in postgres just as | fast. The connection pooler you are referring to is when | you put pgBounce or pgPool in between your pgdb and your | client software to expand beyond the physical limits of | connections and optimize clustered architectures. MySQL at | scale is replication only. A few commercial offerings for | MySQL like planetscale have brought MySQL into the 21st | century. Postgres has a couple ways of clustering, | sharding, scaling, beyond your Wordpress database. | JohnBooty wrote: | Oracle seems to have let what used to be a core | technology of the industry languish | | I think slowly squeezing the life from MySQL was a very | explicit goal for them. After the big wins (Wal-Mart, etc) | MySQL had 15-20 years ago I think it was _very_ clear MySQL | was going to chip away at more and more of Oracle 's | business. | | I wonder how much Oracle spends on MySQL every year? They're | spending a lot of money to keep MySQL at kind of a "not quite | good enough" state. But they can't kill it outright - it'd be | like boiling a frog fast instead of slow. | | In the end, I wonder what extinguishing MySQL really | accomplished for them. It might have bought them some | breathing room but Postgres quickly filled MySQL's old | segment. | stouset wrote: | Yep. The real question here is: it's 2023, why would you choose | MySQL over PostgreSQL? | | Not that there aren't reasons. There are some. But for starting | out with a new app without a very, very good reason to do | something different? PostgreSQL every day of the week. | bratao wrote: | One big factor that keep us on MySQL is the MyRocks engine. We | have huge databases with billions of rows. The MyRocks enable the | use of it with heavy compression, that PostgreSQL can't handle | it, as it is much slower and uses 30x more disk usage, even with | heavy TOAST tuning and/or ZFS compression. | eqvinox wrote: | To be fair, at the scale of your use case there I really hope | you have a proper DBA who understands multiple database systems | and their details, and is able to make the best choice for your | setup. (At some point the commercial and/or oddball SQL servers | become an option too...) | | For everyone else who's in most cases not even stuffing a | million rows into their database... just stick with Postgres :) | [deleted] | ac2u wrote: | You choose postgres. | craigkerstiens wrote: | Having answered this a ton over the years, don't want to really | take shots at MySQL. But Postgres stands in pretty unique ground. | | 1. It's solid as a really reach data platform (more than just a | relational database). It's extension framework is quite unique | compared to others. It's JSONB support was the first among other | relational databases and is feature rich and performant. Multiple | index types. Transactional DDL. The list goes on. | | 2. No central owner. A lot of open source is source code is open, | but it's maintained by a central company. | | 3. I mentioned extensions, but really that is understated. It can | do really advanced geospatial, full text search, time series, the | list goes on. | | Having explained this a ton of times first 10 years ago - | https://www.craigkerstiens.com/2012/04/30/why-postgres/ and then | again 5 years later with and updated version, most recently tried | to capture more of this in an updated form on the Crunchy Data | blog - https://www.crunchydata.com/why-postgres | ungawatkt wrote: | postgres just because I'm more familiar with it, and the | extension ecosystem (TimescaleDB, PostGIS, etc etc). | | That said, I'm not going to be sad with MySQL, though I'd | probably go with MariaDB just because of full open source (note, | I don't know any details there, being a postgres guy) | duiker101 wrote: | Most answers seem written by fanboys rather than legit answers. | | I would say go with what you know and are most comfortable with. | You are more likely to get the better outcome. | [deleted] | chunk_waffle wrote: | This. | | I've heard countless times that Postgres is better and I've | watched talks where they show how loosey-goosey MySQL is with | some things but I know how to backup, restore, tune, secure and | replicate MySQL. I grok it's permissions in depth more than I | ever have with Postgres and I've even written a mysql plugin in | C so I have that in my toolbox if I need it. So I'd by default, | usually go with MySQL (or in some cases SQLite.) but if I | didn't have to administer or maintain it, and someone else was | handling that I think I'd be fine with Postgres too. | bags43 wrote: | If you are using .NET then Postgres might be better choice. (much | better support for drivers and default ORM). | | If you need replication go with MySQL. | vp8989 wrote: | https://mysqlconnector.net/ is very good. | | Dapper and ^ that works very well IME. | | Agreed about replication. | mixmastamyk wrote: | 1) The choice is Postgres if you care about your data at all. | | 2) Yes, if you are already HUGE and have requirements on Vitesse | then by all means use it. If so, you are not asking this question | --see #1. | | 3) It's a blog or something where it doesn't matter, use a static | site generator. | harryvederci wrote: | I flipped a coin, it landed on its side so I went with SQLite. | carpo wrote: | Am I working on a Wordpress site ? Yes -> MySql No -> Postgres | dijit wrote: | There is almost no good reason to choose MySQL over PostgreSQL | for any operational reason, I did a deep dive many moons ago | (before major improvements in performance to postgres) and people | were _saying_ that MySQL was faster. I found that not to be true | and the differences have only gained even more favour towards | postgres. | | also, I assume you mean MariaDB as MySQL is owned by Oracle and I | would greatly implore anyone and everyone to avoid Oracle as if | it has herpes. | | There are a lot of historic problems with MySQL accepting invalid | data, committing data even when there are constraint issues, and | having very poor transactional isolation, I am not sure if these | have improved. | | Truthfully, the _only_ benefits you gain from using MariaDB or | MySQL are: | | * Memory tables | | * Having inconsistent replicas (which can be useful when you want | your downstream to have less data than your upstream and you know | it won't get updated.) | 0xbadcafebee wrote: | > avoid Oracle as if it has herpes | | herpes isn't that bad. most people will get it in their | lifetime. 1 in 6 people have hsv-2, the less common variant. | trying to avoid herpes is like trying to avoid chickenpox | (although herpes isn't nearly as harmful as chickenpox). | | you should avoid Oracle like it's a blood pathogen. | unethical_ban wrote: | hello, fellow person with herpes! (I assume) | | The worst part about having it is having to talk about having | it. It's really not bad as a condition separate from societal | concern. | enneff wrote: | It's not so bad for most people but if you're one of the | unfortunate few who suffer chronic symptoms it can be truly | awful. Not worth playing that lottery if you can avoid it. | stavros wrote: | I find similar societal concern when I tell friends I use | Oracle. | r2_pilot wrote: | Having Oracle experience on the resume is a positive, I | suppose, but I'm not sure it's been worth the exposure. | [deleted] | za3faran wrote: | > most people will get it in their lifetime | | Citation needed. | yakshaving_jgt wrote: | Here you go. | | https://www.who.int/news/item/28-10-2015-globally-an- | estimat... | hamilyon2 wrote: | As a person who has herpes firmly in his nerves, I would say | don't underestimate herpes. | soperj wrote: | Chickenpox is actually caused by a herpesvirus. herpes | varicella zoster. | noodlesUK wrote: | I think that the only reasons to choose MySQL (or Maria) over | Postgres for a new project _are_ operational. Postgres is | probably the better database in almost all respects, but major | version upgrades are much much more of a pain on Postgres than | on almost any other system I have ever used. That being said, I | would choose Postgres pretty much every time for a new project. | The only reason I would use Maria or MySQL would be if I | thought I later would want to have something like Vitess, for | which I think there isn 't really an equivalent for Postgres. | dijit wrote: | > but major version upgrades are much much more of a pain on | Postgres than on almost any other system I have ever used. | | This is a thread comparing MySQL and Postgres and your claim | is that _postgres_ is harder to do major version upgrades | than anything you have used?? | | Context is important here, have you honestly actually | upgraded a MySQL node? It's a lesson in pain and "major" | version changes happen on minor versions, like the entire | query planner completely trashing performance in 5.6->5.7 | | Postgres has two forms of updates: | | 1) in place binary upgrade. | | Fast, clean, simple, requires that you have the binaries for | the old and the new database. | | 2) dump/restore. | | Serialise the database into text files, load a new database | and deserialise those files into it. | | Slow, but works flawlessly & consistently with relatively low | danger. | | MySQL can only do option 2. | | You can sort of fake an "update" by abusing the fact that | MYSQLs replication offers no guarantees, so you can make a | new server a replica; then roll over. But it is impossible to | know what data was lost in that transition and MySQL will | happily continue without ever telling you. | | I have experienced this behaviour in large e-commerce | retailers. MySQL was very popular for a very long time and I | am _intimately_ aware of operational best practices and how | they are merely patching over an insane system. | evanelias wrote: | MySQL doesn't use SemVer. MySQL 5.6 vs 5.7 are different | "release series", and switching between them is considered | a "major" version change. | | MySQL absolutely fully supports in-place binary upgrades, | saying otherwise is pure FUD. And the upgrade process in | MySQL doesn't even iterate over your table data in any way, | so claiming it will cause "data loss" is also pure FUD. | | At Facebook we automated rolling in-place updates of our | entire fleet, with new point builds of fb-mysql going out | several times a month, to the largest MySQL deployment in | the world. Worked flawlessly and this was a full decade | ago. | | MySQL is widely considered easier to upgrade (relative to | Postgres) because MySQL's built-in replication has always | been logical replication. Replicating from an older-version | primary to a newer-version replica is fully supported. When | upgrading a replica set, the usual dance is "upgrade the | replicas in-place one at a time, promote one of the | replicas to be the new primary while temporarily booting | out the old primary; upgrade the old primary and then | rejoin the replica set". | dijit wrote: | Facebook has, _at minimum_ , 3 _teams_ maintaining MySQL. | including a team who genuinely modifies it into | submission. so much that they needed 1,700 patches to | port their modified version to 8.0. | | It is not relevant to the discussion to discuss how | Facebook has managed to munge it to work reasonably well | by pouring thousands of hours of engineer time into the | effort; and MySQLs in-place upgrades absolutely _do not_ | work the way you describe consistently. | | I know this because I have been in the code, and only | after having _experienced it_. Maybe some of your lovely | colleagues has helped out your particular version to be | marginally more sane. | | It genuinely must be nice having a dozen people who can | work around these issues though, I certainly wouldn't | consider it an operational win, most companies have no DB | automation engineers, or DB performance engineers or | MySQL infrastructure engineers. | | > Replicating from an older-version primary to a newer- | version replica is fully supported. | | Here also be dragons, as eluded to. I know it works quite | often, I have used it. | | FWIW: I ran global AAA online-only game profile systems | on a handful of Postgres machines at about 120k | transactions/s in 2016, I would have needed 5x as many | instances to do it in MySQL, and this was only tiny part | of our hosted infra.. which included a global edge | deployment of game servers, auth servers, matchmaking, | voice bridges and so on. | | and we only had two people responsible for the entire | operation | evanelias wrote: | Please educate me on how my statement about MySQL | upgrades is incorrect, I'd love to hear this. I've been | using MySQL for 20 years, and while 2 of those years were | at Facebook, 18 were not. I've performed MySQL upgrades | in quite a wide range of environments, and what you're | saying here about lack of in-place upgrades or eating | data is simply not aligned with reality. | | I haven't made any comments regarding performance | comparisons, and have also run extremely large DB | footprints with tiny teams, but I don't see how any of | that is relevant to the specific topic of new-version | upgrade procedure! | dijit wrote: | Because it depends so much on your storage engine and | schema, I have never seen it recommended because there | are circumstances where you have data which is | unrepresentative unless you are very careful or you don't | actually use the expressiveness of the DB. | | I mean, I've also seem my share of "ERROR 1071 (42000) at | line xxx: Specified key was too long; max key length is | xxx bytes" randomly that basically means the machine | needs manual recovery. | | God help you if you don't have innodb_file_per_table | enabled to begin with too. | | I know you want me to cite exactly. That will take me | time to find because I stopped caring about MySQL 7 years | ago, but I will dig for you. | evanelias wrote: | > I've also seem my share of "ERROR 1071 (42000) at line | xxx: Specified key was too long; max key length is xxx | bytes" randomly that basically means the machine needs | manual recovery. | | What? This error has nothing to do with upgrades, nothing | to do with manual recovery, and hasn't been a common | problem for many many years. | | In old versions of MySQL, it just meant you needed to | configure a few things to increase the InnoDB index limit | to 3072 bytes, instead of the older limit of 767 bytes: | | innodb_file_per_table=ON innodb_large_prefix=ON | innodb_file_format=barracuda | | and then ensure the table's row_format is DYNAMIC or | COMPRESSED. | | But again, all of this happens by default in all modern | versions of MySQL and MariaDB. | | Should it have been the defaults much earlier? Absolutely | yes, MySQL used to have bad defaults. It doesn't anymore. | dijit wrote: | The error I gave is a similar one to the one I used to | get with "major" upgrades that happened when Ubuntu | decided it was time to upgrade. | | It happens and I seriously never claimed that it was an | ultra common problem, merely that upgrades in Postgres | are more intentional and not painful except for a little | extra work between major versions. The standard upgrade | path within major versions; 9.x or 10.x or 11.x or 12.x | is working just the same as MySQL, except I have much | more experience of MySQL completely fumbling their | "automatic unattended" upgrade or even the mysql_upgrade | command. | | Mostly because in the real world outside of engineering | cultures databasen are massively abused, ISAM tables that | are constantly updated, InnoDB ibdata1 in the terabytes, | poor configs, replicas that have skipped a few queries, | column changes inside a transaction that failed but | actually modified data, it happens. Usually I am called | in to clean the mess. | | Major difference here is that Postgres doesn't leave a | mess, so I never have the kind of issues that I am | describing in this thread with it, and _you_ don't | because I am guessing that you're there when they're | installed, someone with knowledge was actively | maintaining. or you have a lot of people to help with | shortcomings. | | I get it though. you've got your sunk cost knowledge of | MySQL and you've been on large support teams for it. | Maybe you're afraid I'm suggesting that this knowledge | goes out the window. and it has gotten better, but I | wouldn't give my kids watered down led infused soft | drinks just because I had suffered through led poisoning. | I remember coming to blows with you in other threads over | the years because you think MySQL can be saved or is | totally fine, but honestly, just, no. | evanelias wrote: | I'm primarily a software engineer, not a member of "large | support teams". I've also worked for many years as an | independent consultant, brought in when things go wrong, | certainly not when they were first "installed". I'm not | "afraid" of anything concerning my knowledge going "out | the window". If MySQL suddenly disappeared worldwide, I | could happily pivot to some other area of software | engineering, or I could simply retire. Please stop make | assumptions about other people who you know nothing | about. | | I'm responding to you because you're repeatedly posting | factually incorrect items, for years. For example you and | I have directly discussed the "MySQL doesn't use SemVer" | thing before on HN, and yet here you are again in this | thread, claiming 5.6 to 5.7 should be a "minor" upgrade. | | Anyway, to the topic at hand, as others have also | mentioned in this thread: historically the difficulty | with Postgres upgrades has been the lack of cross-version | replication, due to Postgres WAL replication being a low- | level physical replication system. This made it difficult | to perform an upgrade while keeping your site fully | online. Perhaps the newer logical replication support | makes this easier these days. I hope to learn more about | it someday. If you can share your process for upgrading a | Postgres cluster while keeping it online, that would be | helpful and informative. | dijit wrote: | 1. The log-replication method of upgrading can be | performed using the built-in logical replication | facilities as well as using external logical replication | systems such as pglogical, Slony, Londiste, and Bucardo. | Most of which have existed essentially forever. | | 2. Failovers of any database are not instant, but they | are indeed quick! So let's not claim that you can do an | upgrade with zero downtime. | | 3. In-place upgrades are extremely fast and you can test | the speed using a physical replica before hand, usually | it's a couple of seconds though the docs say minutes. | | 4. MySQLs major version being in the minor position is | exactly the kind of "you should be sure you know what | you're doing but we won't make it obvious" territory that | I really despise. | JohnBooty wrote: | While you two have agreed on approximately nothing, this | has been an informative discussion and I do thank you | both. | Volundr wrote: | FWIW while I use Postgres for my own development I've had | to administer a number of MySQL servers for other devs. | Upgrades have always been updating the MySQL package, | restarting MySQL, then running `mysql_upgrade`, and | restart the server again. I'm pretty sure the | mysql_upgrade has even been missed a number of times and | it's worked fine. | | I won't say it's impossible you ran into issues doing | this, but it is the documented and supported upgrade | path. | | I love Postgres, but as someone whose maintained both for | years, upgrades (at small scale) are the one area where | I'd say MySQL has Postgres beat. | LammyL wrote: | Is there a good way to do case-insensitive accent-insensitive | collations yet in postgresql? It's been a holdup for using that | for some use cases like searching for data, like a person's | name, in pgsql when the casing or accents don't match | perfectly. | | Mssql has had this for ever, and I'm pretty sure MySQL has it | as well. | dijit wrote: | Maybe this helps: | https://stackoverflow.com/posts/11007216/revisions ? | | My gut tells me that I would do it in the query itself | though, and not rely on the collation. Maybe I am | misunderstanding. | EwanToo wrote: | Not really, no, it's doable but not easily | za3faran wrote: | MySQL is free, regardless of Oracle's ownership. | srcreigh wrote: | Postgres is >50x slower for range queries(example below) and is | akin to using array-of-pointers (ie Java) whereas MySQL | supports array-of-struct (C). Illustration from Dropbox scaling | talk below. | | Sneak peek photo [1] (from [2]). Just imagine its literally | 500-1000x more convoluted per B-tree leaf node. That's _every | Postgres table_ unless you CLUSTER periodically. | | [1]: https://josipmisko.com/img/clustered-vs-nonclustered- | index.w... | | [2]: https://josipmisko.com/posts/clustered-vs-non-clustered- | inde... | | Mind boggling how many people aren't aware of primary indexes | in MySQL that is not supported at all in Postgres. For certain | data layouts, Postgres pays either 2x storage (covering index | containing every single column), >50x worse performance by | effectively N+1 bombing the disk for range queries, or blocking | your table periodically (CLUSTER). | | In Postgres the messiness loading primary data after reaching | the B-tree leaf nodes pollutes caches and takes longer. This is | because you need to load one 8kb page for every row you want, | instead of one 8kb with 20-30 rows packed together. | | Example: Dropbox file history table. They initially used | autoinc id for primary key in MySQL. This causes everybodys | file changes to be mixed together in chronological order on | disk in a B-Tree. The first optimization they made was to | change the primary key to (ns_id, latest, id) so that each | users (ns_id) latest versions would be grouped together on | disk. | | Dropbox scaling talk: https://youtu.be/PE4gwstWhmc?t=2770 | | If a dropbox user has 1000 files and you can fit 20 file- | version rows on each 8kb disk page (400bytes/row), the | difference in performance for querying across those 1000 files | is 20 + logN disk reads (MySQL) vs 1000 + logN disk reads | (Postgres). AKA 400KiB data loaded (MySQL) vs 8.42MiB loaded | (Postgres). AKA >50x improvement in query time and disk page | cache utilization. | | In Postgres you get two bad options for doing this: 1) Put | every row of the table in the index making it a covering index, | and paying to store all data twice (index and PG heap). No way | to disable the heap primary storage. 2) Take your DB offline | every day and CLUSTER the table. | | Realistically, PG users pay that 50x cost without thinking | about it. Any time you query a list of items in PG even using | an index, you're N+1 querying against your disk and polluting | your cache. | | This is why MySQL _is_ faster than Postgres most of the time. | Hopefully more people become aware of disk data layout and how | it affects query performance. | | There is a hack for Postgres where you store data in an array | within the row. This puts the data contiguously on disk. It | works pretty well, sometimes, but it's hacky. This strategy is | part of the Timescale origin story. | | Open to db perf consulting. email is in my profile. | Shorel wrote: | You are confusing two concepts here. In InnoDB, the tables | are always ordered by the primary key when written to actual | disk storage. | | This is not the same as "having a primary key", Postgres also | has primary keys. It just stores the PK index separately from | the bulk of the data. | | Oracle also has primary keys, even if the order of the rows | is different to the key order. In Oracle, when the rows are | stored in the same order as the keys in the primary index, it | is a special case and these tables are called IOT, index | ordered tables. | | The disadvantages of IOT are that inserts are slower, because | in a normal table, the data is appended to the table, which | is the fastest way to add data, and only the index needs to | be reordered. In an IOT, the entire table storage is | reordered to take the new data into account. | | Select queries, OTOH, are much faster when using IOT, for | obvious reasons, and this is what you describe in your | comment. | | If you use TEXT, BLOB, or JSON fields, even in MySQL, the | actual data is stored separately. | srcreigh wrote: | I said primary _index_ , not primary key (primary key and | primary index is synonymous in mysql Dropbox example). | Primary index is database theory lingo for storing all the | primary row data inside a B-tree. It's synonymous with what | you say IOT although that's a new term for me. | | You're incorrect about IOT reordering the entire table at | least wrt mysql. MySQL uses a B-tree to store rows, so at | most it's insertion sort on a B-tree node and rare b-tree | rebalance. Most b-tree leaf nodes have empty space to allow | for adding new data without shifting more than a few | hundred other rows. Also, non-IOT tables also need to do a | similar process to write to each of its indexes. Last, it's | sort of a tossup since if you're appending to an IOT table | frequently, the right edge of the B-tree is likely cached. | (similarly for any small number of paths through the | primary index B-tree). At worst Postgres heap will need to | surface one new heap disk page for writing, although I'm | sure they have some strategy for caching the pages they | write new data to. | | Sorry to spam this info! Glad to see we both love databases | and I'm always please to see engagement about this topic! | JohnBooty wrote: | Thanks for that informative link. It's rare in these sorts of | discussions. | otabdeveloper4 wrote: | Does Postgres have binlog replication yet? | dijit wrote: | Yes, for over half a decade at least, but "binlog" is a MySQL | term, for postgresql it has the much more apt name: write- | ahead log. | | it is the only official, in-binary replication mechanism. | evanelias wrote: | Postgres WAL replication is a _physical_ replication | stream. MySQL binlog replication is a _logical_ replication | stream, a higher-level abstraction which is independent of | the storage engines. | | Postgres does separately support logical replication now, | but it has some limitations, such as not permitting | replication of DDL: | https://www.postgresql.org/docs/current/logical- | replication-... | asdfman123 wrote: | The main problem with herpes is the stigma against it. Don't | besmirch it by associating with Oracle. | pyuser583 wrote: | Does Postgres have an archive mode? | dijit wrote: | If you say what you're trying to actually achieve I can help | with a solution, but asking if it supports an arbitrary | feature is not going to get the answer you want because | depending on what you're actually using an archive table for, | Postgres might have something already built in but it will | almost assuredly not be _exactly_ like an archive table | storage type. | bellBivDinesh wrote: | MySQL for the quick and dirty and Postgres for anything else | itake wrote: | Why is mysql better for quick and dirty? I feel like pg | extensions offer a lot more "dirtiness" running inside pg than | mysql has. | weaksauce wrote: | yeah that's a weird take. if you want quick and dirty you use | sqlite and if you need something more you go with postgres. | some replication things are nicer in mysql apparently but | postgres is the better option for most workflows ___________________________________________________________________ (page generated 2023-05-11 23:01 UTC)