[HN Gopher] Show HN: Google Drive to SQLite
       ___________________________________________________________________
        
       Show HN: Google Drive to SQLite
        
       Here's the repo: https://github.com/simonw/google-drive-to-sqlite
       The README is using a trick I'm increasingly leaning on: parts of
       that document - the --help output and the example database schema -
       are automatically generated using Cog:
       https://nedbatchelder.com/code/cog and
       https://til.simonwillison.net/python/cog-to-update-help-in-r...
        
       Author : simonw
       Score  : 195 points
       Date   : 2022-02-21 16:12 UTC (6 hours ago)
        
 (HTM) web link (simonwillison.net)
 (TXT) w3m dump (simonwillison.net)
        
       | [deleted]
        
       | nickcw wrote:
       | We discovered (in rclone) that Google are deprecating the auth
       | method you are using - the OOB auth here:
       | https://github.com/simonw/google-drive-to-sqlite/blob/121509...
       | 
       | Here is the google page:
       | https://developers.googleblog.com/2022/02/making-oauth-flows...
       | 
       | Here is the discussion on the rclone issue - there are links to
       | more stuff there: https://github.com/rclone/rclone/issues/6000
       | 
       | Luckily rclone already implements the oauth webserver for other
       | backends, but for tools like this it has suddenly made
       | authenticating with Google a whole lot harder.
        
         | simonw wrote:
         | Hah, wow thanks I hadn't seen that! Just my luck to implement
         | that the same week they announce they are deprecating it.
        
           | simonw wrote:
           | It's pretty hard to understand what I'm meant to do for a CLI
           | app instead here. Are they removing all versions of the "copy
           | and paste this code" auth flow?
           | 
           | Do I need to spin up a localhost web server on a port
           | instead? How am I supposed to do that sensibly on a remote
           | server that I'm SSHing into?
           | 
           | Also that announcement seemed to indicate that localhost:
           | redirects were going to be disabled too, wouldn't that make
           | the local web server option not work either?
           | 
           | I swear, obtaining an OAuth token to interact with my Google
           | account is the new hardest problem in computer science. Oh
           | for a GitHub-style personal access token!
        
             | mey wrote:
             | Considering the GCloud SDK CLI Auth uses the copy/paste
             | this code auth flow for certain scenarios, this will be
             | interesting.
        
               | aaaaaaaaata wrote:
               | They literally couldn't care less.
               | 
               | Either you're a sticky customer and you'll figure it out,
               | 
               | or you're a high value customer that they'll sick an
               | engineer or two on for a week to re-tool.
               | 
               | They do. not. care.
        
             | nickcw wrote:
             | My reading is that a localhost web server on a port is the
             | way to go.
             | 
             | This aligns with what nearly all the other cloud providers
             | do with oauth.
             | 
             | > How am I supposed to do that sensibly on a remote server
             | that I'm SSHing into?
             | 
             | If you want to see rclone's solution then check out
             | 
             | https://rclone.org/remote_setup/
             | 
             | It's a bit of a pain!
             | 
             | > Also that announcement seemed to indicate that localhost:
             | redirects were going to be disabled too, wouldn't that make
             | the local web server option not work either?
             | 
             | As you noted below that is only for Android, IOS and
             | ChromeOS. I hadn't noticed that before though...
             | 
             | > I swear, obtaining an OAuth token to interact with my
             | Google account is the new hardest problem in computer
             | science.
             | 
             | Ha ha!
             | 
             | I note that you can get "rclone authorize" to do it if you
             | want as you can set the scopes and the client id / client
             | secret.
        
             | simonw wrote:
             | I'm collecting notes on how to address this here:
             | https://github.com/simonw/google-drive-to-sqlite/issues/39
             | 
             | It looks like the localhost redirects are being disabled
             | for "iOS, Android and Chrome app OAuth client types" so
             | hopefully I can still use those by spinning up a localhost
             | web server after all.
        
       | samhw wrote:
       | > The README is using a trick I'm increasingly leaning on: parts
       | of that document - the --help output and the example database
       | schema - are automatically generated using Cog
       | 
       | How does this interact with IDEs? Has anyone written a VSCode
       | extension yet? I'd probably be reluctant to use something like
       | this without IDE support. (Though I may well be in the minority,
       | and I appreciate that for open source tools it's not necessarily
       | reasonable to expect people to dedicate so much of their time to
       | DX.)
        
         | _hl_ wrote:
         | Notable alternative for emacs users: Github supports rendering
         | .org files, so you can have a README.org using babel (just
         | remember to execute all code blocks before pushing).
        
         | simonw wrote:
         | To regenerate the markdown file I run this command in the
         | console:                   cog -r README.md
         | 
         | Presumably most IDEs have q mechanism that can run that command
         | automatically, either when a file changes or just
         | intermittently over time.
         | 
         | I use VS Code's markdown preview panel, which updates
         | automatically when I manually run the "cog -r" command.
        
           | ash wrote:
           | Nice! I didn't know about cog.
           | 
           | I've developed a simple "codeblocks" tool that to keep
           | README.md up-to-date. In your example, here is how to insert
           | the latest `--help` output:                   codeblocks
           | usage-auth README.md -- google-drive-to-sqlite auth --help
           | 
           | Where "usage-auth" is the block language in README.md:
           | Full `--help`:         ```usage-auth         Usage: ...
           | ```
           | 
           | Checking is also possible:                   codeblocks
           | --check usage-auth README.md -- google-drive-to-sqlite auth
           | --help
           | 
           | It's not as generic as cog, but does not require writing
           | codegen scripts inside Markdown file.
           | 
           | https://github.com/shamrin/codeblocks
        
         | whateveracct wrote:
         | hm I'm guessing..run the cog command in the VSCode terminal
        
       | NicoJuicy wrote:
       | > Pulling the metadata--file names, sizes, file types, file
       | owners, creation dates--into a SQLite database felt like a great
       | way to start understanding the size and scope of what had been
       | collected so far.
       | 
       | I would:
       | 
       | - sync the GDrive locally
       | 
       | - setup "search anything service" from voidtools
       | 
       | Voila. A super fast programmable index of the files.
       | 
       | If you want to dump it to SQLite, go ahead.
       | 
       | Ps. Applicable to Dotnet
       | 
       | Note: file owners wouldn't be known.
        
         | nijave wrote:
         | If you have a lots of files in Google Drive, this can take a
         | significant amount of time or you may not be able to sync it
         | locally. I push incremental backups & disk images without a
         | retention period to Google Drive (it's unlimited storage) so
         | there's quite a bit of data shoved in there
         | 
         | It's pretty easy to get rate limited so it could take days or
         | weeks to build an index (so having a tool that talks to the API
         | is generally more reliable)
        
         | simonw wrote:
         | In my case the Google Drive folder that was shared with me has
         | about half a TB in it, so going via the API seemed like a
         | better option!
        
           | NicoJuicy wrote:
           | Agreed :p
        
       | gwbas1c wrote:
       | FWIW: If you find that you need something like this, consider
       | using a different file storage product. Most likely, it's using
       | SQLite and the schema is useful.
       | 
       | I used to be the lead on Syncplicity, a desktop file
       | synchronization product. All of the local state was in SQLite.
       | Early versions of the desktop client kept the complete path to
       | every file in the associated row in SQLite. Later versions kept
       | the filename and foreign key to the directory row.
       | 
       | Bittorrent Sync (I forget its new name) also used to use a SQLite
       | database for local metadata.
       | 
       | And, finally: It's not that hard to write a tool to recursively
       | scan a folder and grab all the metadata of files. You can scan
       | 1000s of files in a matter of seconds.
        
       | anonymouse008 wrote:
       | TELL HN: More things like this please -- though I have no real
       | use for this tool directly, it's paradigms are super helpful in
       | thinking through other projects I'm working on.
       | 
       | Thanks simonw - neat stuff
        
         | [deleted]
        
         | mixcocam wrote:
         | Agreed. More uses of SQLite too!
         | 
         | I have been working with it to gather ecom sales data across
         | platforms for a couple of months. Such a powerful tool! I want
         | to do more with it!
        
       | rbosinger wrote:
       | I dived into Datasette a bit last year. Simon has a whole set of
       | tools around SQLite and data analysis with it. I haven't used any
       | of it for anything major but I love to see what he's got going
       | on. My Google Drive is out of control, I hope I can find the time
       | to try this one out.
        
       | tonymet wrote:
       | An Alfred workflow that queries this index would be fantastic.
       | Alternatively feed it into spotlight
        
         | bjtitus wrote:
         | There's a Google Drive workflow for Alfred which I've used
         | previously.
         | 
         | https://www.alfredforum.com/topic/17318-google-drive-%E2%80%...
        
       ___________________________________________________________________
       (page generated 2022-02-21 23:00 UTC)