[HN Gopher] Babelfish: SQL Server-to-Postgres Translation Layer ___________________________________________________________________ Babelfish: SQL Server-to-Postgres Translation Layer Author : ke4qqq Score : 215 points Date : 2020-12-01 17:17 UTC (5 hours ago) (HTM) web link (aws.amazon.com) (TXT) w3m dump (aws.amazon.com) | bdcravens wrote: | As someone who has been fighting with a SQL Server to Postgresql | conversion this sounds AMAZING. Too bad it won't be available | before my conversion is complete (and if it is, that's an even | sadder proposition) | jnsie wrote: | Out of curiosity: besides cost, are there other significant | reasons that drive your desire to switch? | zmmmmm wrote: | I guess it's linked to cost but the pure painfulness of just | having licensing in the way of your infrastructure management | is pretty annoying. | | We literally have had this problem the last few months where | we had a spike in load causing wide spread performance issues | and the obvious answer was to give the server more cores but | "we're not licensed for that" ... so everybody just suffered | through it because temporarily giving the DB more cores for a | few hours was just too painful / costly from a licensing | point of view. | pc86 wrote: | Don't discount the cost, which is borderline astronomical; if | you're on Enterprise, you're paying tens of thousands a year. | For large installation, potentially six figures. And that's | on-prem which is the cheapest way to do it more often than | not. | | A 2-core Enterprise license is nearly $14,000. | bdcravens wrote: | Plenty of reasons, but principally we want to bring our stack | in line with the most common practices (ie, Rails/Postgresql) | for simplicity's sake; this is crucial for our small team. | PeterZaitsev wrote: | This would be even greater news if it would not be vaporware | "Babelfish for PostgreSQL will be available on Github in 2021." | https://babelfish-for-postgresql.github.io/babelfish-for-pos... | orf wrote: | > A commonly used datatype to store monetary values is the MONEY | data type. In SQL Server, the MONEY data type's behavior is fixed | using four digits to the right of the decimal (e.g., $12.8123). | However, in PostgreSQL, the MONEY data type is fixed using two | digits to the right of the decimal. | | > So, when the application tries to store a value of $12.8123, by | example, PostgreSQL will round to $12.81. This subtle difference | will result in a rounding error and break an application if not | correctly addressed. To ensure correctness in Babelfish, we need | to ensure such differences, small and large, are handled with | absolute fidelity. | | How are they going to solve this with just a query translation | layer? Isn't information lost on save? | OJFord wrote: | Presumably by not storing SQL Server's MONEYs in pg MONEYs, but | CASTing to a pg MONEY if pg asks for it. | radiowave wrote: | My guess would be: by not using Postgres's money type. | lowcodetv wrote: | > "...without changing libraries, database schema, or SQL | statements" | | You don't need to manually redefine your table's schema from | SQL Server to PostgreSQL, but Babelfish must translate the | CREATE/ALTER TABLE command on the fly, changing the field to | DECIMAL or FLOAT when it does the migration, no? | dragonwriter wrote: | > How are they going to solve this with just a query | translation layer? | | Well, the translation layer isn't just a query (DQL) | translation layer, its an SQL Translation layer including DDL, | DML, etc. Since both Postgres MONEY and SQL Server MONEY are | 8-byte, fixed-precision decimal types, with the only difference | being the position of the implicit decimal, a translation layer | can use one as the backing store for something that is | logically treated as the other without data loss, though it | will have to be aware of the difference when presenting data | and also when doing conversions to other datatypes, doing math | other than addition/subtraction, etc. | | It would be even easier, I think, to just use, what, | DECIMAL(19,4) in Postgres for SQL Server MONEY, with some | special handling to have the right failure behavior at the edge | of the slightly-narrower range of the SQL Server MONEY type. | [deleted] | xupybd wrote: | I hope this means I can finally connect postgres to excel with | the same ease I can connect SQL server. | pletnes wrote: | What does "connect" mean in this context? Read data from SQL | server into an excel spreadsheet? | pc86 wrote: | Not the GP but yes I've seen a lot of folks go through the | Data tab in Excel and connect to a SQL database to display | data directly. | ComodoHacker wrote: | Excel can connect to PostgreSQL as well. | vetinari wrote: | Yes, it can on Windows, both via psqlODBC and Npgsql, | when you find out which .net runtime version your Excel | version uses. | | On Mac, it is more interesting; there's only ODBC, | Microsoft doesn't support the same psqlODBC you can use | in Windows and you have to purchase one of the supported | commercial ODBC drivers. | temp667 wrote: | Why not do this for Oracle? I've not found SQL Server to be too | bad from the crazy Oracle stuff (light experience only - maybe | bigger players have it worse?). | croddin wrote: | I bet they are working on it. | throw93 wrote: | About 10 years back my company hired consultants, spent close | to 6 months translating SQL queries/stored procs to be Oracle. | The goal was to support both MSSQL Server & Oracle for the on- | premise product. It was quite costly undertaking. Then few | years later they just abandoned Oracle because of maintenance | costs. | | I wonder if anyone starting out today chooses Oracle as their | relational database. | throwaway201103 wrote: | Oracle is very powerful. As a developer I found it easy to | work with. But the cost to run in production is eye-watering. | IDK how their licensing works today but in the past it was | free to use for evaluation or for developing a prototype. | Once you were "gaining business value" i.e. using it even if | only internally to develop a real product, you were supposed | to be paying for it. | lambda_obrien wrote: | Yes, large organizations that aren't tech oriented will use | Oracle every time over other solutions. | | It's not a great solution for tech companies, but they do say | no one ever got fired for choosing Oracle or IBM or whatever, | mainly because you can always pay them to fix your shit and | there will always be someone supporting those products. | the-dude wrote: | > I wonder if anyone starting out today chooses Oracle as | their relational database. | | Your boss does. | rst wrote: | Oracle has peculiarities that you won't see in most other | databases, which may make this a bit harder -- including such | things as an oddball set of date/time types ("date" includes | time to seconds), and treating zero length strings as SQL nulls | -- '' is null | | evaluates to true in Oracle. It also has features like "fast | refresh" on materialized views which might be tricky to emulate | on other db engines. Nothing's impossible, but it's certainly | more of a challenge. | throw93 wrote: | >> '' is null | | This was a major pain for us especially in our java data- | access layer. | RegnisGnaw wrote: | As a DBA, T-SQL is much more standard then PL/SQL. Its probably | step 1 of the plan, with step 2 being Oracle. | linuxhiker wrote: | This is interesting because it will also help Sybase migrations. | SQL Server is the "brand name" but there are still a lot of | people stuck on Sybase. | lukaseder wrote: | The two dialects are very different today | JonathonW wrote: | Will it? MSSQL and Sybase diverged somewhere around 27 years | ago; anything that Sybase and Microsoft did differently since | then would likely be completely incompatible. | Svip wrote: | As someone who converted a 400'000+ LoC Sybase SQL codebase | to Microsoft SQL Server about 5-6 years ago, they did | diverged since around 2000, but not by a lot. We ended up | with a codebase that could support both Sybase and MSSQL, | where when differences occurred, we essentially used compiler | directives, which wasn't that often. | throwaway201103 wrote: | I would guess at least 80% of common SQL and T-SQL from | Sybase is still completely compatible with SQL Server. | | As a side note, I had no idea Sybase still existed at all. | Looks like it's now part of SAP's portfolio. | conroy wrote: | Any idea what language it's written in? | BrentOzar wrote: | "Babelfish is written in C, which is the same programming | language used to develop PostgreSQL. Some parts of Babelfish | are developed using procedural language in PL/pgSQL. Many test | cases are written in PL/pgSQL and T-SQL." | conroy wrote: | Argh, I scanned the article multiple times and missed that | section. Thank you | edoceo wrote: | Heres ms2pg https://edoceo.com/dev/ms2pg | | A tool I made and used over a decade ago when migrating a bunch | of stuff | etaioinshrdlu wrote: | Don't hate me for it, but I'd like this for MySQL to postgres | too. At least as a stepping stone. | | Use case: some of my SQL syntax depends on MySQL but I realize I | made a poor life choice and would rather have transactional DDL | and a myriad of better features on postgres. | ziftface wrote: | Something like this would solve a lot of my problems | leesalminen wrote: | I'd considered migrating MySQL to Postgres on a 200 table | production app more than once. Couldn't find any good tooling | at the time so I just sucked it up and lived with my life | choices. | The_rationalist wrote: | Why not migrate to MariaDB? It's better than MySQL and has easy | and great clustering contrary to Postgres | lukaseder wrote: | Use https://www.jooq.org/translate/ | robocat wrote: | However if the translation layer is designed for consistency, | like Babelfish, it would have to replicate many of the MySQL | "faults" too... e.g. it would need to have the same non- | transactional DDL so it acted "correctly"! | | And it would need to replicate the specialised data types like | utf8mb3. | | To use PostgreSQL features (such as JSON datatype) would | presumably need a separate data connection and transaction, | because if it isn't implemented by MySQL, then the syntax | wouldn't be supported by Babelfish. | | You would get some wins, but you don't get a blend of the best | of both. | etaioinshrdlu wrote: | It would be nice if it could aim for doing the sane thing | over exact MySQL compatibility. But I agree it would be a | minefield... | lutoma wrote: | I would love that. There is so much mediocre but popular | software out there that uses hard-coded MySQL queries instead | of some abstraction layer. | | Having to use MySQL/MariaDB after usually dealing with | PostgreSQL always feels like such a downgrade. | touisteur wrote: | Can't you do it with a foreign data wrapper or the equivalent | in mysql? Maybe keep your mysql-specific queries as views in | mysql and call them from pg? | | Just spitballing, sorry, I love FDWs. | linsomniac wrote: | You might be able to get away with staying with MySQL or a | variant. Years ago I did an "interview problem" to generate a | report of some accounting data of a sample data set in MySQL. | The details are foggy now, but I ended up doing it all using | advanced SQL features that I thought only existed in PostgreSQL | (you know, or Oracle), but whatever engine I ended up using had | it as well. It was probably the latest Percona or Maria? | Conlectus wrote: | MySQL has worse guarantees in several cases, though. For | example, using `SERIALIZABLE` isolation means your queries | can deadlock when run in parallel. | | Likewise, selecting a column not present in a GROUP BY leads | to random values being returned. | pletnes wrote: | For SQL server, AWS can save their customers money by cutting | license cost (to MS). MySQL is already free so I don't see how | they can benefit from such a project. | etaioinshrdlu wrote: | I agree, but there might be a creative way to do it that | increases customer happiness and still is a decent business. | justizin wrote: | i swear the fuck to god if one more piece of technology is called | babelfish i am going to find out who is responsible and toilet | paper their house. | technion wrote: | I currently support the following products named "Integrity": | | - Law firm management software - Document management software - | A DVR appliance | | Previously I supported HPE Integrity hardware. Certain names | just seem way overused. | justizin wrote: | imagine how hilarious douglas adams would find it to try and | google babelfish today lol. | beoberha wrote: | Sounds like they're open sourcing it to get some help on it. I | have to wonder if they've found it not worth the time to make it | fully production ready. | BrentOzar wrote: | > I have to wonder if they've found it not worth the time to | make it fully production ready. | | It sounds like it's not compatible with all T-SQL commands and | data types. It would be tough to get full feature completeness: | Microsoft is constantly adding new stuff in each version, like | the ability to run Java stored procedures, R & Python in the | database, etc. Open sourcing it would let companies expand it | to get specific features coded that they need - that otherwise | might not get attention from the general public. | mjasay wrote: | Correct. The focus is on 100% correctness. As I wrote in the | post: "Over its 35 years in existence, SQL Server has evolved | to meet a wide array of use cases. When first made available | on GitHub, Babelfish won't be able to handle every use case, | but will be able to tackle the most common application | scenarios. Most importantly, Babelfish will meet the | correctness objective. That is, if Babelfish doesn't yet | support specific SQL Server functionality, it will return an | error to the application, rather than defaulting to | PostgreSQL behavior. Why? Because, again, developers (and the | enterprises for which they work) must be able to depend | absolutely on the correctness of SQL Server compatibility." | | At launch Babelfish will be able to handle - with 100% | correctness - the main semantics you'd want. HOWEVER, as you | said (and as mentioned in the post), there's a large surface | area, and a "long tail" of functionality that needs expertise | from us _and_ others to cover. So to do this right, it needs | a community. It will be great for many use cases right from | the start, but to ensure it 's great/perfect for _your_ use | case...well, that just might take your help. Please work with | us on this. | mattashii wrote: | Would you know if Babelfish will be released for RDS for | PostgreSQL as well? | | The only mentions so far are for Aurora for PostgreSQL, or | self-hosted PostgreSQL. | Tostino wrote: | Very good question, seems to be a strange decision. Maybe | it doesn't meet what they deem up to the standards to | include it in RDS due to the nature of how the backend is | hosted (just a guess). Could be something as simple as | the dependencies the extension requires are incompatible | with what the RDS servers run. It would be very nice if | the RDS team had some guidelines on what it takes to get | an extension accepted for RDS with a good framework | people can follow to ensure their extension complies with | the security requirements, etc. Then have a way to submit | an extension to be included and have it be reviewed and | given guidance on any changes needed to comply (for a fee | potentially). | Yeroc wrote: | Interesting. Now where's the Oracle-to-Postgres Translation | project? ;-) | tyldum wrote: | I have used ora2pg with great success. | Grazester wrote: | Their legal team might still be preparing for this one | given Oracle's reputation. | topspin wrote: | SCOTUS hasn't yet ruled on Google LLC v. Oracle America | Inc. One imagines that if Oracle prevails in some | demonstrative manner they may very well believe their | flavor of SQL is a software interface protected by | copywrite. | | So will all the other 'software interface' owners of the | world. | pjmlp wrote: | Apparently everyone keeps forgetting about ISO. | topspin wrote: | That's because everyone knows about the delta between ISO | and real implementations. | pjmlp wrote: | Yet they still forget about pay to play with ISO | standards. | zozbot234 wrote: | IBM might have something to say about that, though. | topspin wrote: | Wouldn't that be ironic. | homarp wrote: | isn't that what enterprisedb is selling ? | linuxhiker wrote: | Yes but it isn't Open Source. | dragonwriter wrote: | Amazon acquiring EDB and rolling some of the currently- | proprietary bits like the Oracle compatibility layer into | open source Postgres while increasing resources to | developing them could be an interesting play. | throw_m239339 wrote: | At that point, thinking that SQL somehow is "portable" from a | DB to another is a bit of a mirage. Way too many differences | between systems, which does influence data modelling itself. | hobs wrote: | It sounds like their goals are pretty lofty - | | >Those are the mechanics, but developers need to be certain | that Babelfish truly speaks SQL Server's language in a | dependable, predictable way. As such, the guiding principle for | Babelfish is correctness, with no compromises. What do I mean | by correctness? Namely, that applications designed to use SQL | Server semantics will behave the same on PostgreSQL as they | would on SQL Server. | | I am assuming they just mean in regards to types and the like - | to get the same performance characteristics out of the code | would be bonkers. | dragonwriter wrote: | > Sounds like they're open sourcing it to get some help on it. | | My take is that they are open sourcing it because they don't | just want to use it to compete with Microsoft to host SQL | Server workloads on a better basis than Microsoft's licensing | policy has let them in the past, but also to just undercut | Microsoft's SQL Server licensing revenue generally. | | As a business strategy, sure, but given the way everyone I've | heard from AWS talks about Microsoft and SQL Server licensing, | I wouldn't be surprised if there was a good deal of spite | involved, too. | coredog64 wrote: | One word: JEDI. | x0x0 wrote: | It feels like this is AWS fighting back with Microsoft after | Microsoft raised fees to run SQL Server on AWS. Not only will | AWS build this for themselves, but they'll contribute for | anyone who wants it. That's a bigger blow to Microsoft than | just facilitating migrations from MS SQL to pg in AWS; this | impacts non-cloud-lifted database revenue. | joshuaellinger wrote: | This is great. I suspect that I have exactly the right use case | for this. | | The two main issues with running SQL Server are (1) you have to | license all the cores on a system and (2) the standard license | only recognizes up to 64GB RAM. So I actually wound up buying a | 3GHz single-socket system for around $10K to save $20K on the SQL | license. | | With this, I can move a couple of the big DBs to another system | that has 32 Cores with 256GB RAM and the entire DB will fit in | memory, put in 5GB ethernet, and gain a tremendous amount of | performance. | | But, more importantly, I can migrate the workload on a case-by- | case basis. Human costs always dwarf my software and hardware | costs. | sqlserver1 wrote: | (2) is not accurate. SQL Server Standard Edition supports up to | 128 GB RAM | | https://docs.microsoft.com/en-us/sql/sql-server/editions-and... ___________________________________________________________________ (page generated 2020-12-01 23:01 UTC)