[HN Gopher] Xlite: Query Excel and Open Document spreadsheets as...
       ___________________________________________________________________
        
       Xlite: Query Excel and Open Document spreadsheets as SQLite virtual
       tables
        
       Author : thunderbong
       Score  : 201 points
       Date   : 2022-06-25 12:34 UTC (10 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | mwenge wrote:
       | https://dirtylittlesql.com and https://io10.dev also support
       | this.
        
         | _qua wrote:
         | Well this is the coolest thing I've seen all week.
        
         | olah_1 wrote:
         | I'm looking for a way to have a searchable/filterable CSV file
         | of 30k rows as a web app.
         | 
         | Some kind of fork of DirtyLittleSQL looks like it is the
         | solution!
        
       | bsilvereagle wrote:
       | One-liner for doing this for CSVs from a few days ago:
       | https://news.ycombinator.com/item?id=31824030
        
       | damagednoob wrote:
       | As another alternative, you can use in2csv[1] to convert excel
       | documents to csv and pipe them into the other tools of csvkit.
       | 
       | [1]:
       | https://csvkit.readthedocs.io/en/latest/tutorial/1_getting_s...
        
       | eatonphil wrote:
       | This is a cool project! But if you just want a simple CLI, you
       | can query Excel and ODS files with dsq [0]. It's the same end
       | result without needing to carry plugins around and with support
       | for many additional data formats. Plus a growing standard library
       | of functions that don't come built into SQLite such as best-
       | effort date parsing, URL parsing/extraction, statistical
       | aggregation functions, math functions, string and regex helpers,
       | hashing functions and so on [1].
       | 
       | An annoying thing about this extension-based style of file
       | support is needing to create a new table for every new file if
       | the schema is different. This is a limitation [2] of sqlite
       | unfortunately. dsq doesn't work this way so it doesn't have that
       | limit.
       | 
       | On the other hand, if you go this route you can more easily
       | combine with other extensions. That's not really possible with
       | dsq right now.
       | 
       | [0] https://github.com/multiprocessio/dsq
       | 
       | [1] https://github.com/multiprocessio/go-sqlite3-stdlib
       | 
       | [2] https://sqlite.org/forum/forumpost/ec944414fa
        
         | sitkack wrote:
         | I think when Wasm gets a little bit more mature, it will enable
         | the creation of more composable tools. One might be able to hot
         | create modules? Kinda ironic that this is how it is structured
         | given that arbitrary projections is what SQL does.
         | 
         | Is there a way to hook lower in the SQLite stack, and make it
         | think that your DS is something it already understands?
         | 
         | What about a function that returns a view that it dynamically
         | generates?
        
           | eatonphil wrote:
           | Sqlite does not allow table valued functions to return
           | results with varying schemas. The schema must be defined once
           | up front and then cannot change.
           | 
           | That forum post is a request I made to them to allow dynamic
           | columns but they don't seem interested so far.
        
             | sitkack wrote:
             | I get that as how it currently stands. But somewhere in the
             | sqlite codebase, everything is a projection.
             | 
             | Still thinking and reading through sqlite source ...
        
       | kbd wrote:
       | Do any of these tools take into account hidden Excel rows? It's a
       | constant frustration.
        
         | eatonphil wrote:
         | Dunno! But if you send me a sample I can try it out for dsq.
        
           | kbd wrote:
           | Sweet. Here's my open issue with VisiData about this:
           | https://github.com/saulpw/visidata/issues/1398
           | 
           | There's a sample file at the end of that thread.
        
             | no_identd wrote:
             | Do you have a workaround workflow in the meantime? I could
             | imagine a few non-Macro or Macro-light solutions to this.
             | Use a pivot table for example, the results of which I
             | believe get 'magiced' into their destination cells instead
             | of getting hidden, I however don't recall whether they
             | auto-inherit autofilters set on the source table and/or
             | source cell range.
             | 
             | How much organizational (not technical, we'll get to that)
             | control/authority/input/stakeholder role do you have on the
             | source document templates?
             | 
             | Also, you might want to check out
             | https://github.com/microsoft/advanced-formula-environment/
        
               | kbd wrote:
               | > Do you have a workaround workflow in the meantime?
               | 
               | Yeah in the VisiData ticket I link to a post[1] with some
               | code to handle hidden rows and columns in
               | Pandas+openpyxl.
               | 
               | [1] https://towardsdatascience.com/how-to-load-excel-
               | files-with-...
        
       | justsomeuser wrote:
       | How do virtual tables compare with regular tables for
       | performance?
       | 
       | I would assume SQLite has some optimisations for native tables
       | (rather than reading the data from another virtual table backed
       | file)?
        
         | lazypenguin wrote:
         | It's implementation dependent. You implement an interface and
         | it's up to the implementation to optimize the calls. The
         | interface is also rather simplistic so there's some
         | optimization that can't be done at the implementation level
         | since SQLite didn't provide all the information at the time of
         | the query.
        
       | Fnoord wrote:
       | rga (ripgrep all) is basically rg (ripgrep) for binaries.
        
       | Linda703 wrote:
        
       | pmeira wrote:
       | FreeXL is another alternative: https://www.gaia-
       | gis.it/fossil/freexl/index
        
       | don-code wrote:
       | In addition to just loading and querying data, spreadsheets are
       | great as database mocks in integration tests. I long ago used
       | XLSQL (note: there is a new XLSQL project unrelated to this one)
       | to mock an Oracle database:
       | https://sourceforge.net/projects/xlsql/.
        
       ___________________________________________________________________
       (page generated 2022-06-25 23:00 UTC)