[HN Gopher] Friendlier SQL with DuckDB
       ___________________________________________________________________
        
       Friendlier SQL with DuckDB
        
       Author : hfmuehleisen
       Score  : 271 points
       Date   : 2022-05-12 14:50 UTC (8 hours ago)
        
 (HTM) web link (duckdb.org)
 (TXT) w3m dump (duckdb.org)
        
       | aerzen wrote:
        | If anyone is interested in improvements to SQL, check out
        | PRQL (https://github.com/prql/prql), a pipelined relational
        | query language.
       | 
       | It supports:
       | 
       | - functions,
       | 
        | - using an alias in the same `select` that defined it,
       | 
       | - trailing commas,
       | 
        | - date literals, f-strings, and other small fixes for things
        | we found unpleasant in SQL.
       | 
       | https://lang.prql.builders/introduction.html
       | 
        | The best part: it compiles into SQL. It's under development,
        | though we will soon release version 0.2, which will be the
        | "you can check it out" version.
        
         | lijogdfljk wrote:
          | This is neat. Have you found this newfound capability at
          | odds with "good SQL"? E.g., I run a fairly large
          | application with a huge DB schema, and more often than not,
          | when the SQL gets huge and ugly it means we're asking too
          | much of the DB. "Too much" meaning it's easier to run into
          | poor indexes, with more chances of pulling in an
          | unexpectedly large number of rows, etc.
          | 
          | My fear with PRQL is that I'd more easily ask too much of
          | the DB, given how easy it looks to write larger and more
          | complex SQL. Thoughts?
        
           | aljazmerzen wrote:
            | That's true - when you hit the 4th CTE you are probably
            | doing something wrong.
            | 
            | But not always. Some analytical queries genuinely need
            | that complexity. Also, during development, you might want
            | to pick only the first 50 rows before joining and
            | grouping, so as not to overload the db. To do this in SQL
            | you need a CTE (or a nested select), but in PRQL you just
            | add a `take 50` transform at the top.
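            | 
            | For instance, the plain-SQL equivalent needs something
            | like this (table and column names here are just
            | illustrative):
            | 
            |     with sample as (
            |         select * from big_table limit 50
            |     )
            |     select g.name, count(*)
            |     from sample s
            |     join groups g on s.group_id = g.id
            |     group by g.name;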
        
       | eyelidlessness wrote:
       | Often efforts and articles like this feel like minor affordances
       | which don't immediately jump out as a big deal, even if they
        | eventually turn out to be really useful down the road. Seeing
        | the article title, that's what I expected. I did not expect
        | to read through the whole thing with my inner voice saying,
        | ever louder and more enthusiastically, "yes! this!" Very
        | cool.
        
       | Oxodao wrote:
        | Came across this a few times but never got around to trying
        | it because the only Golang binding is unofficial and I can't
        | get CGO to work as expected...
        | 
        | It would be really neat to have an official one. This article
        | makes me want to try it even more.
        
         | 1egg0myegg0 wrote:
          | Here is a solved GitHub issue related to CGO for the Go
          | bindings! If you run into another issue, please feel free
          | to post it on their GitHub page!
         | 
         | https://github.com/marcboeker/go-duckdb/issues/4
        
           | Oxodao wrote:
           | Thanks! I didn't see that, I'll give it a try again!
        
         | nlittlepoole wrote:
          | Ran into some similar issues with those bindings. We
          | switched to using the ODBC drivers and it's been great.
          | Those are official, and we just use an ODBC library in Go
          | to connect.
        
       | carlineng wrote:
       | I love these updates. It would be great to see some of the major
       | data warehouse vendors (Snowflake, BigQuery, Redshift) follow
       | suit.
        
       | eis wrote:
        | I was exploring DuckDB just yesterday and it looked very
        | promising, but I was very surprised to find out that indexes
        | are not persisted (and I assume that means they must fit in
        | RAM).
       | 
       | > Unique and primary key indexes are rebuilt upon startup, while
       | user-defined indexes are discarded.
       | 
        | The second part, simply discarding previously defined
        | indexes, is super surprising.
       | 
       | https://duckdb.org/docs/sql/indexes
       | 
        | This was an instant showstopper for me, and I assume for most
        | people whose databases grow to the size at which an OLAP DB
        | becomes interesting in the first place.
       | 
        | Also, the numerous GitHub issues regarding crashes make me
        | hesitant.
       | 
        | But I really like the core idea of DuckDB: a very simple
        | codebase with no dependencies that still provides very good
        | performance. I'd just like to see more SQLite-esque
        | stability/robustness in the future, and I'll surely revisit
        | it at some point.
        
         | 1egg0myegg0 wrote:
         | Persistent indexes are being actively worked on! Stay tuned. As
         | for the crashes - DuckDB is very well tested and used in
         | production in many places. The core functionality is very
         | mature! Let us know if you test it out! Happy to help if I can.
         | 
         | (disclaimer - on the DuckDB team)
        
           | eis wrote:
            | Hi, good to hear that you care about testing. One thing,
            | apart from the GitHub issues, that led me to believe it
            | might not be super stable yet was the benchmark results
            | on https://h2oai.github.io/db-benchmark/ which make it
            | look like it couldn't handle the 50GB case due to an out-
            | of-memory error. I see that the benchmark and the
            | versions used are about a year old, so maybe things have
            | changed a lot since then. Can you chime in on the current
            | story of running bigger DBs, like 1TB, on a machine with
            | just 32GB or so of RAM? Especially regarding data
            | mutations and DDL queries. Thanks!
        
             | 1egg0myegg0 wrote:
             | Yes, that benchmark result is quite old in Duck years! :-)
             | 
             | We actually run that benchmark as a part of our test suite
             | now, so I am certain that there is improvement from that
             | version.
             | 
             | The biggest DuckDB I've used so far was about 400 GB on a
             | machine with ~250 GB of RAM.
             | 
              | There is ongoing work, which we are treating as a high
              | priority, on handling larger-than-memory intermediate
              | results within a query. But we can already handle
              | larger-than-RAM data in many cases - we sometimes run
              | into issues today if you are joining two larger-than-
              | RAM tables together (depending on the join), or if you
              | are aggregating a larger-than-RAM table with really
              | high cardinality in one of the columns you are grouping
              | on.
             | 
             | Would you be open to testing out your use case and letting
             | us know how it goes? We always appreciate more test cases!
        
         | pholanda wrote:
          | Hey eis, you are correct. We do not support index storage
          | in our latest release. I am currently implementing this,
          | and it is at a fairly advanced stage, so it should be
          | featured in the next release.
          | 
          | This took a little while because we use a fairly modern
          | index structure with no literature describing how to
          | buffer-manage it.
        
           | eis wrote:
           | It's good to hear that persistent indexes are coming soon. I
           | saw it was on the roadmap but didn't know how far out this
           | feature was. Do you have an idea when that release could be
           | out?
           | 
            | BTW, do you have some kind of code/docs one can look at
            | regarding the index structure? I'm a part-time data
            | structure nerd :)
        
             | 1egg0myegg0 wrote:
              | Here is a paper (on the Adaptive Radix Tree) that was
              | used when building the DuckDB approach!
             | 
             | https://db.in.tum.de/~leis/papers/ART.pdf
             | 
             | There are some other papers and details about the
             | architecture here: https://duckdb.org/why_duckdb
        
       | flakiness wrote:
        | I love the attitude of prioritizing ergonomics over standards
        | compliance. And you'll see why SQL has never been really
        | portable across databases ;-)
        
         | aerzen wrote:
          | Well, it's not portable, but at least you don't have to
          | learn a whole new language every time you encounter a new
          | DBMS (unlike with MongoDB or InfluxDB).
        
         | db65edfc7996 wrote:
          | Maybe if SQL had been a better language at the start, there
          | would be more incentive to follow the spec.
        
           | throwawayboise wrote:
            | SQL is an amazingly good language for what it does. It
            | has been with us for coming up on five decades.
        
             | lijogdfljk wrote:
              | Minus everyone having a different version of it to fill
              | in the functionality their users were missing
        
           | zasdffaa wrote:
            | IIRC there was an ANSI/FIPS standard which was dropped
            | during the Bill Clinton administration, which is when
            | things started to diverge.
            | 
            | (Info from memory, may be wrong or a bit mangled)
        
       | wenc wrote:
       | This is fantastic. Column aliases are super helpful in reducing
       | verbose messiness.
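        | 
        | The alias-reuse feature from the article looks like this
        | (names invented):
        | 
        |     SELECT price * qty AS total
        |     FROM orders
        |     WHERE total > 100;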
       | 
       | DuckDB has all but replaced Pandas for my use cases. It's much
       | faster than Pandas even when working with Pandas data frames. I
       | "import duckdb as db" more than I "import pandas as pd" these
       | days.
       | 
       | The only thing I need now is a parallelized APPLY syntax in
       | DuckDB.
        
         | 1egg0myegg0 wrote:
          | Fugue has a DuckDB back end, and I believe they can use
          | Dask and DuckDB in combination for something similar to
          | what you are looking for! There is also a way to map Python
          | functions in DuckDB using the relational (dataframe-like)
          | API.
         | 
         | https://fugue-tutorials.readthedocs.io/tutorials/integration...
         | 
         | https://github.com/duckdb/duckdb/pull/1569
        
       | projektfu wrote:
        | I find the examples very confusing because they use names
        | that sound like rows or tables (jar_jar_binks, planets) as
        | fields.
        
         | 1egg0myegg0 wrote:
         | Ah, well, that was a risk that I took... Thank you for the
         | feedback though! The Star Wars puns were too hard to resist...
         | 
         | If you have a specific question, definitely post it here and I
         | will clarify!
        
           | petepete wrote:
            | I'd agree. It took me a couple of reads to make sense of
            | it.
            | 
            | Keep the puns; I just think that with a bit of adjustment
            | the examples would be easier to understand.
        
       | mattrighetti wrote:
        | `EXCLUDE`
        | 
        | Extremely useful. Is there a reason this wasn't implemented
        | in SQL in the first place? I often find myself writing very
        | long queries just to select basically all columns except two
        | or three of them.
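        | 
        | For reference, the DuckDB form from the article is along
        | these lines (column names invented):
        | 
        |     SELECT * EXCLUDE (created_at, updated_at)
        |     FROM my_table;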
        
         | heyda wrote:
          | Because columns can be added to tables in production
          | databases, so any time you use SELECT * you run the risk of
          | the number of columns changing and breaking anything you
          | wrote.
        
           | wruza wrote:
            | This seems reasonable on its own, but then you can also
            | add a compound index and forget to join on its second
            | part, or refactor a column into two and only collect one
            | of the values into an aggregation. This kind of spot
            | babysitting is just stupid. If you're anxious about query
            | integrity, get some tooling and check your SQL/DDL
            | against some higher-level schema.
            | 
            | Even if that turns out to be a constant source of trouble
            | worth _not_ having, then why can't SQL at least provide
            | column sets, so that queries could include, group, or
            | join on predefined sets of columns instead of repeating
            | tens of columns and/or expressions and/or aggregations
            | many times across a single query? You had
            | employees.bio_set = (name, dob); now you add `edu` to it
            | and it just works everywhere, because you think in sets
            | rather than in specific columns. Even group by bio_set
            | works. Heck, I bet most ORMs partially exist only to
            | generate SQL, because it's sometimes unbearable as is.
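            | 
            | A sketch of that idea, in entirely made-up syntax (no
            | dialect supports this today):
            | 
            |     create columnset bio_set on employees (name, dob, edu);
            |     select bio_set, count(*) from employees group by bio_set;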
        
           | justin_oaks wrote:
           | That's a problem with select * in general, not a problem with
           | using EXCLUDE with select *. So that still doesn't explain
           | why it's not in SQL to begin with.
        
             | munk-a wrote:
              | I've always viewed SELECT * as a convenience for schema
              | discovery and a huge bonus for subqueries - our shop
              | excludes its use at the top level in production due to
              | the danger of table definitions changing underneath...
              | but we happily allow subqueries to use SELECT * so long
              | as the column list is clearly defined before we leave
              | the database.
              | 
              | Worse, by far, than a column you didn't expect being
              | added is a column you did expect being _removed_.
              | Depending on how thorough your integration tests are
              | (and ideally they should be pretty thorough) you could
              | suddenly start getting strange array key access (or
              | object key not found) errors somewhere on the other
              | side of the codebase.
        
               | throwawayboise wrote:
               | Yeah I tend to use "select *" in interactive queries when
               | I'm working out what I want, but then write explicit
               | column names in anything going into production. This
               | helps with the column-being-removed case, as the query
               | will fail immediately selecting a nonexistent column,
               | whereas "select *" will not fail and the error will
               | happen somewhere else.
        
             | heyda wrote:
             | "A traditional SQL SELECT query requires that requested
             | columns be explicitly specified, with one notable
             | exception: the * wildcard. SELECT * allows SQL to return
             | all relevant columns. This adds tremendous flexibility,
             | especially when building queries on top of one another.
             | However, we are often interested in almost all columns. In
             | DuckDB, simply specify which columns to EXCLUDE:"
             | 
              | It appears the way this works is that it selects all
              | columns and then EXCLUDEs only the columns specified.
              | The reason this doesn't exist in standard SQL is that
              | it is a terrible idea. This is something that will
              | break at many companies with large technical debt if it
              | is ever used.
        
           | gmueckl wrote:
           | That's why good database wrappers support referencing columns
           | in result sets by column name. It's good practice.
        
       | wolf550e wrote:
       | That 750KB PNG can probably be a 50KB PNG. Even without resizing
       | it compresses to less than half its size.
       | 
       | https://duckdb.org/images/blog/duck_chewbacca.png
        
         | 1egg0myegg0 wrote:
         | Thanks! Can you tell that my SQL-fu is stronger than my HTML-
         | fu? :-) Much appreciated!
        
           | andai wrote:
           | In this case it should probably be a JPEG? (Unless it has a
           | transparent background _and_ the site responds to the user 's
           | dark-mode setting? :) Also, this image looks like it almost
           | certainly _was_ a JPEG, at some point!)
        
             | tracker1 wrote:
              | The color palette is pretty limited, so using something
              | with a specific/limited palette (say 48-64 colors in
              | this case) will probably give a better result than
              | JPEG. Also, optimizing for the display size would take
              | it further still. Alpha transparency support is another
              | bonus for PNG over JPEG.
        
       | getravi wrote:
       | I would go even further and say that "GROUP BY ALL" and "ORDER BY
       | ALL" should be implied if not provided in the query.
       | 
       | EDIT: Typo
        
       | foxbee wrote:
        | This is awesome. I'd love to chat about building an
        | integration with the low-code platform Budibase:
        | https://github.com/Budibase/budibase
        
       | parentheses wrote:
        | Though many of the queries don't make complete sense, the
        | mapping of queries to Star Wars is :chefkiss:
        
       | Timpy wrote:
        | Wow, this was definitely a pessimistic click for me. I was
        | thinking "trying to replace SQL? How stupid!" But it just
        | looks like SQL with all the stuff you wish SQL had, and some
        | more stuff you didn't even know you wanted.
        
       | zasdffaa wrote:
        | This looks a little odd:
        | 
        |     SELECT age, sum(civility) as total_civility
        |     FROM star_wars_universe
        |     ORDER BY ALL
        |     -- ORDER BY age, total_civility
        | 
        | there's no GROUP BY?
        | 
        | edit: (removed edit, I blew it, sorry)
        
         | 1egg0myegg0 wrote:
          | Good catch! Fixed! It needed a GROUP BY ALL in there.
        
       | _raoulcousins wrote:
       | I love love love DuckDB. When I can use DuckDB + pyarrow and not
       | import pandas, it makes my day.
        
       | pxtail wrote:
        | Wow, so much nice database-related news recently - it feels
        | like database week or something! :)
        
       | roncohen wrote:
       | Lots of great additions. I will just highlight two:
       | 
        |  _Column selection_:
       | 
        | When you have tons of columns, these become useful.
        | ClickHouse takes it to the next level and supports APPLY and
        | COLUMNS in addition to the EXCEPT and REPLACE that DuckDB
        | supports:
        | 
        | - APPLY: apply a function to a set of columns
        | 
        | - COLUMNS: select columns by matching a regular expression
        | (!)
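        | 
        | From memory of the ClickHouse docs, those look roughly like:
        | 
        |     SELECT * EXCEPT (id) APPLY (sum) FROM t;
        |     SELECT COLUMNS('^price_') FROM t;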
       | 
        | Details here:
        | https://clickhouse.com/docs/en/sql-reference/statements/sele...
       | 
        |  _Allow trailing commas_:
       | 
        | I can't count how many times I've run into a problem with a
        | trailing comma. There's a whole convention developed to
        | overcome this: the prefix-comma convention, where you'd
        | write:
        | 
        |     SELECT
        |         first_column
        |         ,second_column
        |         ,third_column
        | 
        | which lets you easily comment out a line without worrying
        | about trailing-comma errors. That's no longer necessary in
        | DuckDB.
       | Allowing for trailing commas should get included in the SQL spec.
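        | 
        | With DuckDB's trailing-comma support, the natural style works
        | and commenting out the last column is harmless:
        | 
        |     SELECT
        |         first_column,
        |         second_column,
        |         -- third_column,
        |     FROM some_table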
        
         | 1egg0myegg0 wrote:
          | Thank you for the feedback! I will check those ClickHouse
          | features out. I totally agree on the trailing commas, and I
          | use the commas-first syntax for that same reason! But maybe
          | not anymore... :-)
        
         | go_prodev wrote:
          | EXCEPT columns would get my vote for ANSI-standard SQL
          | adoption. So much time is spent selecting all but a few
          | columns.
        
         | gregmac wrote:
         | Yes, trailing commas should work everywhere!
         | 
          | JSON is the other one where it annoys me, but luckily I
          | rarely hand-write any JSON anymore (and there are semi-
          | solutions for this like JSON5).
          | 
          | In code I always add trailing commas to anything comma-
          | separated. It makes editing simpler (you can shuffle lines
          | without thinking about commas), and in a diff or blame,
          | adding an item doesn't show up as a change to the previous
          | line just for the new comma.
          | 
          | SQL is the one spot where this doesn't work, and it's a
          | constant foot-gun, as I often don't remember until I run
          | the query and get a syntax error.
        
           | skrtskrt wrote:
            | JSONC allows comments and trailing commas, but adoption
            | seems to be low.
            | 
            | VSCode uses it for configuration, but when I wanted to
            | use it in Python (to add context to source-controlled
            | Elasticsearch schemas) there were only a couple of old,
            | barely-maintained libraries for parsing it.
        
             | yunohn wrote:
             | > there were only a couple old barely-maintained libraries
             | for parsing.
             | 
             | Do they work, though? If it's a mostly stable standard,
             | doesn't seem like you'd need a frequently updated parser.
        
         | nicoburns wrote:
         | > Allowing for trailing commas should get included in the SQL
         | spec.
         | 
         | Yep! That would be my #1 request for SQL. Seems ridiculous that
         | it's not supported already.
        
           | _dark_matter_ wrote:
           | BigQuery also supports trailing commas!
        
         | skeeter2020 wrote:
          | You can do the same thing with your WHERE clause and ANDs
          | by always starting with WHERE 1=1.
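          | 
          | E.g. (conditions invented):
          | 
          |     SELECT *
          |     FROM t
          |     WHERE 1=1
          |         AND status = 'active'
          |         AND region = 'EU'
          |     -- any AND line can be commented out freely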
         | 
         | >> Allowing for trailing commas should get included in the SQL
         | spec.
         | 
          | There is no "SQL spec" per se; there's an ANSI
          | specification with decades of convention and provider-
          | specific customizations piled on top. This support for
          | trailing commas is the best you're going to get.
        
         | snidane wrote:
          | Allowing references to columns defined earlier in the same
          | query would make DuckDB competitive for data analytics.
          | Without that, one has to chain WITH statements for even the
          | tiniest operations.
          | 
          |     select 1 as x, x + 2 as y, y/x as z;
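          | 
          | versus the nested-select workaround you need today:
          | 
          |     select x, y, y/x as z
          |     from (select x, x + 2 as y
          |           from (select 1 as x) t1) t2;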
        
           | zasdffaa wrote:
            | (Nothing to do with DuckDB, but...) SQL is complex
            | enough, and allowing this (and acyclically, as mentioned
            | below) would do my $%^& nut implementing it.
            | 
            | But I know a user requirement when I hear one, so can you
            | give me a large, real example of where allowing this
            | would make things easier? That would be mega helpful, ta
        
             | wruza wrote:
              | _SQL is complex enough_
              | 
              | No, it is not. I mean, it is, but not in the parts
              | where that could be seen as useful and/or convenient.
              | [A]cyclic graph traversal etc. is one of the basic
              | tests in a modern interview at any CRUD studio. How
              | come it could do $%^& to any part of yours?
        
               | zasdffaa wrote:
               | > How come it could do $%^& to any part of yours?
               | 
                | Because just implementing the standard stuff nearly
                | did my $^&% nut. Also, I know about graphs & posets,
                | and it's potentially a little more complex than it
                | seems. The variables in
                | 
                |     select x * x as y, 1 as x
                | 
                | are meh, but what about
                | 
                |     select
                |         (select tbl.z from tbl where tbl.y = y) as subq,
                |         x * yy as y,
                |         xx + 1 as x,
                |         subq + yy as zzz
                |     from (select xx, yy from ...)
                | 
                | I just don't fancy supporting that.
        
               | wruza wrote:
                | _what about_
                | 
                | That's no different from the first snippet, if you
                | aren't parsing it with regexps, of course. The
                | resulting AST identifiers would simply refer not only
                | to column names but also to other defined
                | expressions. This is the case for both snippets. It's
                | either cyclic or not, and when not, it is easy to
                | substitute/CSE/etc. as usual. The complexity of these
                | expressions is irrelevant.
        
             | yread wrote:
              | For example:
              | 
              |     select id,
              |            count(...something complicated) as complicated_count
              |     from ....
              |     order by complicated_count
              | 
              | would help
        
               | jsmith99 wrote:
               | 'ORDER BY 2' would work here, but using the named column
               | is a lot nicer.
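                | 
                | I.e., ordering by position in the select list:
                | 
                |     select id, count(*) as complicated_count
                |     from t
                |     group by id
                |     order by 2  -- the 2nd select-list item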
        
               | tomjakubowski wrote:
               | Wow, TIL. Great tip for those random one-off queries you
               | have to bash out when investigating a problem.
        
           | 1egg0myegg0 wrote:
           | Yes, good thought! That is listed at the bottom of the
           | article as something we are looking at for the future.
        
           | flakiness wrote:
            | There is a bug for that, and it looks like someone is
            | even working on it:
            | https://github.com/duckdb/duckdb/issues/1547
        
             | karmakaze wrote:
              | There's also no need to restrict it to left-to-right
              | usage, as long as it's acyclic:
              | 
              |     select y-2 as x, 3 as y, y/x as z;
        
               | sonthonax wrote:
               | Would this be compiled into a graph of subqueries and
               | window statements?
        
               | karmakaze wrote:
                | I'm not following; the original could be written as
                | 
                |     select x + 2 as y, 1 as x, y/x as z;
                | 
                | with the same column values in a different order.
                | Order of the expressions shouldn't matter is all I
                | was saying.
        
           | gigatexal wrote:
           | Yep! Agreed!
        
         | IshKebab wrote:
         | Matching columns by regular expression sounds like a _terrible_
         | feature. Talk about bug-prone!
        
       | tosh wrote:
        | How does DuckDB compare to SQLite? (E.g., which workloads
        | are a good fit for which? Would it be a good idea to use
        | both?)
        | 
        | I found https://duckdb.org/why_duckdb but I'm sure someone
        | here can share some real-world lessons learned?
        
         | eatonphil wrote:
          | DuckDB: embedded OLAP; SQLite: embedded OLTP. For small
          | datasets (<1M rows, let's say) either would be similar in
          | performance.
          | 
          | I need to do the benchmarks to substantiate this, but it is
          | my intuition.
        
         | enjalot wrote:
          | One thing I love about DuckDB is that it supports Parquet
          | files, which means you can get great compression on the
          | data. Here's an example getting a 1-million-row CSV under
          | 50MB, with interactive querying in the browser:
          | https://observablehq.com/@observablehq/bandcamp-sales-data?c...
          | 
          | The other big thing is better native data types, especially
          | dates. With SQLite, if you want to work with time series
          | you need to do your own date/time casting.
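          | 
          | Querying a Parquet file in DuckDB is a one-liner (path and
          | columns hypothetical):
          | 
          |     SELECT artist, sum(amount_usd) AS total
          |     FROM 'bandcamp_sales.parquet'
          |     GROUP BY artist;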
        
           | 1egg0myegg0 wrote:
           | Yes, it is always difficult to use dates in SQLite... DuckDB
           | makes dates easier - like they should be!
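            | 
            | For example, date literals and interval arithmetic work
            | natively:
            | 
            |     SELECT DATE '2022-05-12' + INTERVAL 7 DAYS;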
        
         | 1egg0myegg0 wrote:
         | Excellent question! I'll jump in - I am a part of the DuckDB
         | team though, so if other users have thoughts it would be great
         | to get other perspectives as well.
         | 
          | First things first - we really like quite a lot about the
          | SQLite approach. DuckDB is similarly easy to install and is
          | built without dependencies, just like SQLite. It also runs
          | in the same process as your application, just like SQLite
          | does. SQLite is excellent as a transactional database -
          | lots of very specific inserts, updates, and deletes (called
          | OLTP workloads). DuckDB can also read directly out of
          | SQLite files, so you can mix and match them!
          | (https://github.com/duckdblabs/sqlitescanner)
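          | 
          | (With the scanner loaded, a query looks roughly like
          | `SELECT * FROM sqlite_scan('some.db', 'some_table');` -
          | file and table names hypothetical.)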
         | 
         | DuckDB is much faster than SQLite when doing analytical queries
         | (OLAP) like when calculating summaries or trends over time, or
         | joining large tables together. It can use all of your CPU cores
         | for sometimes ~100x speedup over SQLite.
         | 
          | DuckDB also has some enhancements with respect to data
          | transfer in and out. It can natively read Pandas, R, and
          | Julia dataframes, and it can read Parquet files directly as
          | well (meaning without inserting them first!).
         | 
         | Does that help? Happy to add more details!
        
           | OskarS wrote:
           | One of SQLite's most appealing aspects to me is using it as
           | an application file format, as described in this article:
           | https://www.sqlite.org/appfileformat.html
           | 
           | How does DuckDB compare in that aspect? Does it have the same
           | kind of guarantees of robustness, incorruptibility and
           | performance (especially reading/writing binary blobs) that
           | SQLite does?
           | 
           | In any case: DuckDB looks great, nice work! Good to have more
           | players in this space!
        
             | 1egg0myegg0 wrote:
              | It is a goal of ours to become a standard multi-table
              | storage format! However, today we are still in beta and
              | have made some breaking changes in the last few
              | releases. (Exporting from the old version, then
              | reimporting your DB into the new one, lets you
              | upgrade!) Those should happen less often as we move
              | forward (the storage format was genericized a bit and
              | is more resilient to future enhancements now), and
              | locking in our format and guaranteeing backwards
              | compatibility will happen when we go to 1.0!
        
           | 1egg0myegg0 wrote:
           | I should add that we can read/write Apache Arrow as well!
        
           | tosh wrote:
            | Thanks, it does help! I understand SQLite might be
            | better/ideal for OLTP (?), but would DuckDB also work for
            | use cases where I query for specific records (e.g. based
            | on primary key), or would I rather use SQLite for the
            | OLTP stuff and then read the SQLite file from DuckDB for
            | analytical workloads?
            | 
            | Basically I'm wondering: if I go all in on DuckDB instead
            | of SQLite, would I notice? Do I have to keep anything in
            | mind?
           | 
           | I know, probably difficult to answer without a concrete
           | example of data, schema, queries and so on.
           | 
           | The SQL query features in the article seem really neat. Kudos
           | @ shipping.
        
             | 1egg0myegg0 wrote:
              | Good questions! You are correct that it depends. We do
              | have indexes to help with point queries, but they are
              | not going to be quite as fast as SQLite's, because
              | DuckDB stores data in a columnar format. (Soon they
              | will be persistent - see the comments above!) That
              | columnar format is really great for scanning many
              | items, but not optimal for grabbing all of a single
              | row.
             | 
             | With DuckDB, bulk inserts are your friend and are actually
             | super fast.
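              | 
              | For example, a bulk load in a single statement (file
              | name hypothetical):
              | 
              |     INSERT INTO my_table
              |     SELECT * FROM read_csv_auto('data.csv');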
             | 
             | Definitely let us know what you find! Just open up a
             | discussion on Github if you'd like to share what you find
             | out: https://github.com/duckdb/duckdb/discussions
        
         | ergocoder wrote:
          | SQLite's SQL is severely limited.
          | 
          | If you want to do something a bit more complex, you will
          | have a bad time. Hello, WITH RECURSIVE.
        
         | IshKebab wrote:
          | I haven't used DuckDB yet, but if you aren't working on
          | huge datasets where the column/row distinction makes a
          | difference (you're probably not), the biggest differences
          | I've discovered are:
          | 
          | 1. SQLite has great GUI support and is really, really
          | widely supported.
          | 
          | 2. DuckDB is properly statically typed, with a much wider
          | range of types than SQLite, which is dynamically typed and
          | only _just_ added support for any kind of type checking at
          | all.
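          | 
          | For instance, DuckDB will enforce column types that SQLite
          | historically treated as free-form:
          | 
          |     CREATE TABLE t (ts TIMESTAMP, amount DECIMAL(18, 3));
          |     -- inserting 'not a timestamp' into ts errors in DuckDB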
        
           | 1egg0myegg0 wrote:
           | If you'd like to work with DuckDB in a SQL IDE/GUI, we
           | recommend DBeaver! It uses the DuckDB JDBC connector. A quick
           | how to guide is here:
           | https://duckdb.org/docs/guides/sql_editors/dbeaver
        
       | chrisjc wrote:
        | What are some potential long-term liabilities we might take
        | on by choosing to adopt DuckDB today?
        | 
        | Obviously there will be a desire to monetize this project, if
        | only to subsidize the cost of its development and
        | maintenance. I love everything I hear and see about this
        | project, but it makes me nervous to recommend it internally,
        | not only because it is at such an early stage, but also
        | because of any unforeseen costs and liabilities it might
        | introduce in the future.
        
         | 1egg0myegg0 wrote:
         | Let me see if I can assuage some of your concerns!
         | 
         | First off - DuckDB is MIT licensed, so you are welcome to use
         | and enhance it essentially however you please!
         | 
         | DuckDB Labs is a commercial entity that offers commercial
         | support and custom integrations. (https://duckdblabs.com/). If
         | the MIT DuckDB works for what you need, then you are all set no
         | matter what!
         | 
         | However, much of the IP for DuckDB is owned by a foundation, so
         | it is independent of that commercial entity.
         | (https://duckdb.org/foundation/)
         | 
         | Does that help? Happy to answer any other questions!
        
       | learndeeply wrote:
       | Since the DuckDB people are here, just want to say that what
       | you're doing is going to be a complete game-changer in the next
       | few years, much like SQLite changed the game. Thanks for making
       | it open source!
        
       ___________________________________________________________________
       (page generated 2022-05-12 23:00 UTC)