[HN Gopher] Show HN: Google Drive to SQLite ___________________________________________________________________ Show HN: Google Drive to SQLite Here's the repo: https://github.com/simonw/google-drive-to-sqlite The README is using a trick I'm increasingly leaning on: parts of that document - the --help output and the example database schema - are automatically generated using Cog: https://nedbatchelder.com/code/cog and https://til.simonwillison.net/python/cog-to-update-help-in-r... Author : simonw Score : 195 points Date : 2022-02-21 16:12 UTC (6 hours ago) (HTM) web link (simonwillison.net) (TXT) w3m dump (simonwillison.net) | [deleted] | nickcw wrote: | We discovered (in rclone) that Google are deprecating the auth | method you are using - the OOB auth here: | https://github.com/simonw/google-drive-to-sqlite/blob/121509... | | Here is the google page: | https://developers.googleblog.com/2022/02/making-oauth-flows... | | Here is the discussion on the rclone issue - there are links to | more stuff there: https://github.com/rclone/rclone/issues/6000 | | Luckily rclone already implements the oauth webserver for other | backends, but for tools like this it has suddenly made | authenticating with Google a whole lot harder. | simonw wrote: | Hah, wow thanks I hadn't seen that! Just my luck to implement | that the same week they announce they are deprecating it. | simonw wrote: | It's pretty hard to understand what I'm meant to do for a CLI | app instead here. Are they removing all versions of the "copy | and paste this code" auth flow? | | Do I need to spin up a localhost web server on a port | instead? How am I supposed to do that sensibly on a remote | server that I'm SSHing into? | | Also that announcement seemed to indicate that localhost: | redirects were going to be disabled too, wouldn't that make | the local web server option not work either? | | I swear, obtaining an OAuth token to interact with my Google | account is the new hardest problem in computer science. Oh | for a GitHub-style personal access token! | mey wrote: | Considering the GCloud SDK CLI Auth uses the copy/paste | this code auth flow for certain scenarios, this will be | interesting. | aaaaaaaaata wrote: | They literally couldn't care less. | | Either you're a sticky customer and you'll figure it out, | | or you're a high value customer that they'll sick an | engineer or two on for a week to re-tool. | | They do. not. care. | nickcw wrote: | My reading is that a localhost web server on a port is the | way to go. | | This aligns with what nearly all the other cloud providers | do with oauth. | | > How am I supposed to do that sensibly on a remote server | that I'm SSHing into? | | If you want to see rclone's solution then check out | | https://rclone.org/remote_setup/ | | It's a bit of a pain! | | > Also that announcement seemed to indicate that localhost: | redirects were going to be disabled too, wouldn't that make | the local web server option not work either? | | As you noted below that is only for Android, IOS and | ChromeOS. I hadn't noticed that before though... | | > I swear, obtaining an OAuth token to interact with my | Google account is the new hardest problem in computer | science. | | Ha ha! | | I note that you can get "rclone authorize" to do it if you | want as you can set the scopes and the client id / client | secret. | simonw wrote: | I'm collecting notes on how to address this here: | https://github.com/simonw/google-drive-to-sqlite/issues/39 | | It looks like the localhost redirects are being disabled | for "iOS, Android and Chrome app OAuth client types" so | hopefully I can still use those by spinning up a localhost | web server after all. | samhw wrote: | > The README is using a trick I'm increasingly leaning on: parts | of that document - the --help output and the example database | schema - are automatically generated using Cog | | How does this interact with IDEs? Has anyone written a VSCode | extension yet? I'd probably be reluctant to use something like | this without IDE support. (Though I may well be in the minority, | and I appreciate that for open source tools it's not necessarily | reasonable to expect people to dedicate so much of their time to | DX.) | _hl_ wrote: | Notable alternative for emacs users: Github supports rendering | .org files, so you can have a README.org using babel (just | remember to execute all code blocks before pushing). | simonw wrote: | To regenerate the markdown file I run this command in the | console: cog -r README.md | | Presumably most IDEs have q mechanism that can run that command | automatically, either when a file changes or just | intermittently over time. | | I use VS Code's markdown preview panel, which updates | automatically when I manually run the "cog -r" command. | ash wrote: | Nice! I didn't know about cog. | | I've developed a simple "codeblocks" tool that to keep | README.md up-to-date. In your example, here is how to insert | the latest `--help` output: codeblocks | usage-auth README.md -- google-drive-to-sqlite auth --help | | Where "usage-auth" is the block language in README.md: | Full `--help`: ```usage-auth Usage: ... | ``` | | Checking is also possible: codeblocks | --check usage-auth README.md -- google-drive-to-sqlite auth | --help | | It's not as generic as cog, but does not require writing | codegen scripts inside Markdown file. | | https://github.com/shamrin/codeblocks | whateveracct wrote: | hm I'm guessing..run the cog command in the VSCode terminal | NicoJuicy wrote: | > Pulling the metadata--file names, sizes, file types, file | owners, creation dates--into a SQLite database felt like a great | way to start understanding the size and scope of what had been | collected so far. | | I would: | | - sync the GDrive locally | | - setup "search anything service" from voidtools | | Voila. A super fast programmable index of the files. | | If you want to dump it to SQLite, go ahead. | | Ps. Applicable to Dotnet | | Note: file owners wouldn't be known. | nijave wrote: | If you have a lots of files in Google Drive, this can take a | significant amount of time or you may not be able to sync it | locally. I push incremental backups & disk images without a | retention period to Google Drive (it's unlimited storage) so | there's quite a bit of data shoved in there | | It's pretty easy to get rate limited so it could take days or | weeks to build an index (so having a tool that talks to the API | is generally more reliable) | simonw wrote: | In my case the Google Drive folder that was shared with me has | about half a TB in it, so going via the API seemed like a | better option! | NicoJuicy wrote: | Agreed :p | gwbas1c wrote: | FWIW: If you find that you need something like this, consider | using a different file storage product. Most likely, it's using | SQLite and the schema is useful. | | I used to be the lead on Syncplicity, a desktop file | synchronization product. All of the local state was in SQLite. | Early versions of the desktop client kept the complete path to | every file in the associated row in SQLite. Later versions kept | the filename and foreign key to the directory row. | | Bittorrent Sync (I forget its new name) also used to use a SQLite | database for local metadata. | | And, finally: It's not that hard to write a tool to recursively | scan a folder and grab all the metadata of files. You can scan | 1000s of files in a matter of seconds. | anonymouse008 wrote: | TELL HN: More things like this please -- though I have no real | use for this tool directly, it's paradigms are super helpful in | thinking through other projects I'm working on. | | Thanks simonw - neat stuff | [deleted] | mixcocam wrote: | Agreed. More uses of SQLite too! | | I have been working with it to gather ecom sales data across | platforms for a couple of months. Such a powerful tool! I want | to do more with it! | rbosinger wrote: | I dived into Datasette a bit last year. Simon has a whole set of | tools around SQLite and data analysis with it. I haven't used any | of it for anything major but I love to see what he's got going | on. My Google Drive is out of control, I hope I can find the time | to try this one out. | tonymet wrote: | An Alfred workflow that queries this index would be fantastic. | Alternatively feed it into spotlight | bjtitus wrote: | There's a Google Drive workflow for Alfred which I've used | previously. | | https://www.alfredforum.com/topic/17318-google-drive-%E2%80%... ___________________________________________________________________ (page generated 2022-02-21 23:00 UTC)