[HN Gopher] Show HN: Checksum.sh verify every install script ___________________________________________________________________ The pattern of downloading and executing installation scripts without verifying them has bothered me for a while. I started messing around with a way to verify the checksum of scripts before I execute them. I've found it a really useful tool for installing things like Rust or Deno. It's written entirely as a shell script, and it's easy to read and understand what's happening. I hope it may be useful to someone else! Author : gavinuhma Score : 65 points Date : 2022-10-28 18:38 UTC (4 hours ago) (HTM) web link (checksum.sh) (TXT) w3m dump (checksum.sh) | dontbenebby wrote: | >The pattern of downloading and executing installation scripts | without verifying them has bothered me for a while. | | Thanks for sharing this work OP! I didn't see a license mentioned | -- did you intend this to go into the public domain? I like how | you set up a cool domain name and did some sick graphics, but I'm | not sure how I can _legally_ use your code in the future. | | That being said, I appreciate the work you put into this project. | | I'm not going to list off specific examples, but MANY open source | projects serve either PGP keys or hashes in the clear. Or they | serve just hashes over HTTPS and now you have a trust issue. | | Or, in one case, my favorite -- they had lovingly listed out the | MD5 sum for the program... but they served both that checksum, | and the code itself... over HTTPS. | | Now, to be fair, HTTPS _does_ provide an integrity check, so | there's a benefit beyond privacy or whatever but... this is a | RAMPANT problem in the open source community. | | I ran into it mostly when trying to find esoteric security tools | when I was attempting OSCP and interviewing around for | penetration testing roles.
| | I got the sense rapidly shifting from "I was so scared of the | CFAA I did an entire master's thesis on the design of censorship | circumvention tools" to "Oh gee, I used to be such a narcissist, | demanding a highfalutin salary when I couldn't even fire up | Metasploit to wipe a server." | | (The implication being that some folks abused their access when | my powers were weak, and now, in time for spooky season, it's | time to lean in to letting people take whatever drug they want if | they feel scared -- reality scares me too some days.) | gavinuhma wrote: | Good catch. Let me add a license | dontbenebby wrote: | Thanks, it wasn't meant in a gotcha way. | gavinuhma wrote: | I totally just forgot to add one. Added MIT just now. | Appreciate it! | orf wrote: | I feel like bash/sh should have this built in | dundarious wrote: | There are two big problems with the use of `echo $s` in | bash/POSIX sh: | | 1. Never use echo to output untrusted content as the first | argument | | Let's say `s='-e 1\n2'`, then `echo $s` will output: | | > 1 | | > 2 | | Instead of: | | > -e 1\n2 | | Always use printf if you want to start output with untrusted | content, e.g., `printf %s\\\n "$s"`. | | 2. Never use unquoted variable expansion when trying to exactly | reproduce contents of the variable | | Similarly, unquoted variable expansion re-tokenizes the contents | and will not preserve spaces appropriately. Say | `s='"a<space><space>b"'` (where each <space> is a literal ' ', HN | seems to be collapsing 2 spaces down to 1), then `echo $s` will | output: | | > "a<space>b" | | Instead of: | | > "a<space><space>b" | | You can get the latter with `echo "$s"` but use `printf %s\\\n | "$s"` to fix both issues. | | PS: If you fail to use quoted expansion with printf, for example | like so, `printf %s\\\n $s`, then you'll notice the problem right | away, as it will effectively turn that into `for i in $s ; do | printf %s\\\n "$i" ; done`.
That's actually a very useful feature | of printf if you know to use it. | | Edit: These problems exist for bash/POSIX sh at least. Perhaps | you're using a shell that works differently, like zsh, because | otherwise issue 2 would probably have led to some checksum fails | for you already. | googlryas wrote: | Great post, you are wise in the ways of the shell. Minutiae | like this are exactly why I stop writing shell scripts the | moment I start, and reach for python or some other sane | language. But, I can't help but respect when I see masters of | sh work their magic. | dundarious wrote: | Honestly, 90% of problems with scripts are people forgetting | to put double quotes around stuff. The other stuff doesn't | come up that much, and once you write a few decent scripts, | the other stuff is as easy as noticing someone wrote `open = | True` in Python, not realizing they've redefined a builtin | function, and the fix is just do `is_open = True`. | | So just put double quotes around all your variable expansions | unless you know you shouldn't -- 90% of scripts would be | "fixed" with just that. And don't bother putting curly braces | into the variable expansion unless you know you need to. | People tend to think `echo ${s}` is somehow better than `echo | $s` when it's exactly the same -- the curly braces are just a | way to allow you to, e.g., write `"${s}_"` as distinct from | `"${s_}"`. AFAIK in fish `${s}` is identical to `"$s"`, but | that's a different kettle of sh. | rnhmjoj wrote: | For more caveats like this one I recommend reading: | https://www.etalabs.net/sh_tricks.html | gavinuhma wrote: | This is awesome. Thank you! I've been through so many | iterations but it's been fun to improve | gavinuhma wrote: | Like this? https://github.com/gavinuhma/checksum.sh/pull/2 | dundarious wrote: | Missed the other `echo $s` piped into shasum. But I echo the | sentiment of another commenter that I'd rather rely on | `shasum --check` to give the OK or not.
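The word-splitting pitfall from dundarious's comments can be sketched as follows. This is only an illustration, assuming a POSIX sh or bash (zsh does not split unquoted expansions by default). The value of `s` and the `sha256` helper are made up for the demo, and the helper assumes that either `sha256sum` or `shasum` is installed.

```shell
#!/bin/sh
# Hypothetical demo value: two spaces between the letters.
s='a  b'

# Unquoted expansion: the shell splits $s into words and echo joins them
# with single spaces, so one of the two spaces is lost.
unquoted() { echo $s; }

# Quoted expansion through printf: the contents of $s are emitted
# unchanged, followed by exactly one newline.
quoted() { printf '%s\n' "$s"; }

# Helper (assumption: sha256sum or shasum exists on this machine).
# Prints only the hex digest of whatever arrives on stdin.
sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d ' ' -f 1
    else
        shasum -a 256 | cut -d ' ' -f 1
    fi
}

# The two digests differ, which is how an unquoted `echo $s` can make a
# correct download fail its checksum.
unquoted | sha256
quoted | sha256
```

Running the script prints two different digests for what looks like the same variable.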
| gavinuhma wrote: | Got it. Thanks. | | Re --check, I suppose the way to do that would be to | download the file to disk, which --check requires as far | as I can tell. So I could download the file to disk, | --check, and then remove it. I think most of these install | scripts are trying not to leave any artifacts around from | install, other than the resulting binary. | dundarious wrote: | You only need to create a temp file for the checksum | file, not the downloaded contents. In the below example, | no file exists on disk with the contents of `$s`. | | > $ s='1<space><space>2' | | > $ printf %s\\\n "$s" | shasum -a 256 > tmp.sum | | > $ printf %s\\\n "$s" | shasum --check tmp.sum | | > -: OK | | So you can just `printf '%s<space><space>-\n' "$c" > | tmp.sum` and check with `printf %s\\\n "$s" | shasum | --check --status tmp.sum || { echo "checksum failed" >&2 | ; exit 1 ; }` | | Having to create temp files is a wrinkle (could probably | avoid it by using process substitution if you want to | give up on POSIX sh), but so is writing bash scripts in | general. | gavinuhma wrote: | Solid! I couldn't figure this out which is why I stopped | using "--check". I'll take a look | yjftsjthsd-h wrote: | If I may pile on with a general suggestion for people writing | shell scripts: Use shellcheck. Always. It will catch these | things automatically for you:) | throwawaaarrgh wrote: | If we kept a mirrored or distributed decentralized network of | just cryptographic hashes, that might solve a huge number of | problems around distributing files securely. | ithkuil wrote: | Awesome. I made something similar in | https://github.com/mkmik/runck | | But I didn't buy a fancy domain name :-) | gavinuhma wrote: | Haha thanks! Honestly when I saw the domain was available it | motivated me to finish the project and share it | thewataccount wrote: | Serious question - What is the benefit of verifying a hash? Are | we really worried about file integrity? Why don't people use GPG?
| | The hash only verifies file integrity, and that the content of | the url doesn't switch the script later. But keep in mind in most | scenarios, an attacker would also just change the hash listed | too (they're usually on the same website). This only mitigates | one very specific attack. | | Why don't we use GPG here? That way we can verify ownership and | file integrity with at minimum TOFU, plus optional manual | verification? If we're going through the work of adding a wrapper | and all that, we may as well, no? | | This has the benefit that you only need to import the owner's | cert once, all future changes have the same cert. Where hashes | are obviously different every time, you have to trust the source | of the hash every time it changes. With GPG at the very least you | have TOFU with certs - and very best can have better assurance of | the initial download too. | | EDIT: Just want to clarify - I'm openly asking why the "developer | community" is going the direction of hashes for script | verification vs GPG signatures. | | I don't mean to diminish your project, your project looks fun, | and does make verifying hashes easier :) | [deleted] | tomrod wrote: | I'm not terribly deep in this space. What is the conceptual | difference of hash vs GPG sig? | atoav wrote: | A hash is the same when the values of the content are the | same. But when you get a new (maliciously hacked) install | script chances are that you won't have an old hash lying | around to check whether the script changed. Any attacker who | could swap the script could also swap the hash, unless it is | a different channel. | | With GPG the developer has a key pair (one private, one | public). They can then sign all their scripts with their | private key and publish the public one wherever. You can then | take that public key and verify that the script has been | indeed signed by the developer's private key.
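The signature flow atoav describes might look roughly like the sketch below. It assumes the developer's public key has already been imported into the local GPG keyring and that a detached signature file is published next to the script. The function name `run_if_signed` is hypothetical and is not part of checksum.sh.

```shell
#!/bin/sh
# run_if_signed SCRIPT SIG   (hypothetical helper, for illustration only)
# Runs SCRIPT only if SIG is a valid detached signature over it, made by
# some key that is already present in the local GPG keyring.
run_if_signed() {
    script=$1
    sig=$2
    # `gpg --verify SIG FILE` exits non-zero when the signature is bad or
    # the signing key is unknown.
    if gpg --batch --verify "$sig" "$script" >/dev/null 2>&1; then
        sh "$script"
    else
        echo "signature check failed: $script" >&2
        return 1
    fi
}
```

Usage would be something like downloading `install.sh` and `install.sh.sig` from the project's site (placeholder names) and calling `run_if_signed install.sh install.sh.sig`. Note that this accepts a good signature from any key in the keyring, so deciding which keys to import is still the trust-on-first-use step discussed above.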
| thewataccount wrote: | Admittedly the complexity is likely the main reason GPG isn't more | commonplace. | | This is the overview: | | Developer generates a private/public key they use for all of | their projects. | | You import their public key once - you can verify this from | their github, twitter, etc but that's optional. | | They can sign a file with their key. You can check this | signature against their public key. This will guarantee the | file was signed using that key and is unmodified. | | If someone hijacks the website after this point and signs the | new downloads with their own key - then you will be able to | see it's invalid. | | If you manually verify the key then you'll know your initial | download is valid - if you trust on first use then you at | least know all future files signed from that developer with | that cert are valid. | | They also are effectively a hash for file integrity. | | tl;dr - hashes tell you if a file is changed. Signatures tell | you if the file is changed, and who the person that made the | file is. | Jarwain wrote: | Hash essentially proves that the file you downloaded is the | same as the file that was uploaded. It tells you nothing | about who uploaded the file. An attacker could make you | download their own file, but then the hash of the file won't | match what's published (unless the attacker changes the | published hash). | | A GPG sig proves that the file was signed & uploaded by the | author, which de facto doubles as proof that it's the same | file. The idea here is that the author uploads their public | key, signs the package with their private key, and now | there's an association between the package and the author. An | attacker would have to obtain the author's private key, or | replace the public key with their own. Changing the public | key, however, is a big red flag.
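For comparison, the hash side of this can be sketched as a check against a pinned digest before anything executes. This is a rough illustration and not how checksum.sh itself is implemented. The names `check_then_run` and `sha256` are hypothetical, `mktemp` is assumed to be available (it is common but not required by POSIX), and either `sha256sum` or `shasum` is assumed to be installed.

```shell
#!/bin/sh
# Helper (assumption: sha256sum or shasum exists). Prints the hex digest
# of stdin.
sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d ' ' -f 1
    else
        shasum -a 256 | cut -d ' ' -f 1
    fi
}

# check_then_run EXPECTED_SHA256   (hypothetical helper)
# Reads a script on stdin, buffers it in a temp file so that nothing runs
# before the whole download has been hashed, and executes it only when
# the digest matches the pinned value.
check_then_run() {
    expected=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    actual=$(sha256 < "$tmp")
    if [ "$actual" = "$expected" ]; then
        sh "$tmp"
        status=$?
    else
        echo "checksum mismatch: got $actual" >&2
        status=1
    fi
    rm -f "$tmp"    # leave no artifact behind either way
    return "$status"
}
```

A call would pipe the download into it, for example `curl -fsSL https://example.com/install.sh | check_then_run <pinned-digest>` with a placeholder URL. As the comments above point out, this only helps if the pinned digest came from somewhere the attacker could not also change.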
| pvg wrote: | Because for all of its problems, Web PKI is a working, | practical, large scale system of verification and GPG isn't - | you don't get much by trying to replicate what your web browser | and CAs do for you but clunkier. | XCSme wrote: | > would also just change the hash listed too | | In my project I "host" the hash on a different medium, so in | order to compromise the file download the attacker would have | to compromise both the file hosting server and the hash hosting | medium (which in my case is GitHub). | | I also don't really display the hashes, as the download only | happens when the script is updated, so your current version of | the script will check the hash on GitHub vs the hash of the | file download from the file hosting server. | | EDIT: To be clear, this doesn't solve the problem with the | initial install and it is also not related to the Checksum.sh | script. | thewataccount wrote: | Interesting idea, | | Does the script get the new version url&expected hash from | the website alone? Or does it get the expected hash from the | website, then calculate the URL from github? | | Basically I'm wondering if that prevents just needing to | attack the website - if the url to download the update and | the expected hash are in the same place then it's still a | single point of failure. | XCSme wrote: | The latest file download URL is always the same /latest, | hosted on my server. | | The version number and latest file hash are also fixed | URLs, stored on GitHub. | | So for an update, the script checks GitHub for latest | version number, if newer it downloads the latest version | from my server, computes the hash and compares it to the | hash stored on the fixed GitHub URL before proceeding. | | I think there's no way to replace the file with a malicious | one that will be distributed to the users unless you get | access to both my server and the GitHub repository. | thewataccount wrote: | Yeah I think that should work. 
| | It does have the downside still that changes to the | website/github might break future updates in a way that | isn't (easily) verifiable. | | While this is a solution, personally I still like the idea | of GPG more since it'll work for any new files, works for | your new projects automagically, etc. | | But I think you did at least fix the future update | problem with auto-updates, which is a lot more work than | most people put into it so thank you for addressing the | issue! | koolba wrote: | Just remember that any script that fetches anything else remotely | would still pass the checksum as only the initial script is | checked. | ChadNauseam wrote: | Yep. As an example, rustup happens to be in this category as | the checksums for rustc, cargo, etc. aren't checked. | gavinuhma wrote: | It's really interesting. There should be a massive ledger of | checksums for software | jandrese wrote: | It's called apt. Or dnf. Or most any package manager. | Having a gigantic general list runs into the problem of how | do you update it and how do you verify the updates? | yjftsjthsd-h wrote: | You use GPG and trust the people publishing things, who | sign the artifact that you actually download. Which is | how every package manager I've seen works | internally, anyways. | jandrese wrote: | It's the age old root of trust problem. In practice the good | enough is that if it passes SSL/TLS authentication on the | official domain then we wouldn't be able to stop an injection | attack either way. Validating against the source is no good if | it is the source that is compromised. | | That's also kind of the issue with a lot of these shell | injection attacks. Sure someone could insert environment | variables or other shenanigans to take over your machine, but | if they have that much control over your shell there are | countless other ways they could also do it. Guarding against | this one particular case doesn't buy you much. | gavinuhma wrote: | Definitely.
Important to note. There is a long long supply | chain | neeh0 wrote: | I wrote hundreds of those checks in scripts, makefiles, CI and | whatever else. After I found Nix (and NixOS) it's ridiculous not | to use it. Use it. | gavinuhma wrote: | I hadn't heard of NixOS. Super cool | NovemberWhiskey wrote: | I don't know; what's the threat model here? | | If the script is deliberately malicious as originally published, | then the publisher will provide a valid checksum; so it doesn't | help. | | If the script source is subverted by an attacker, then it only | helps if the attacker doesn't also have the means to change the | published checksum too. | | If an attacker can modify the site which publishes the URL for | the script and the checksum, they can modify both at the same | time. | nerdponx wrote: | Why not use the -c option? Especially if you're using Bash or Zsh | which have "here-strings": | | checksum() { hash="$1"; file="$2"; sha256sum -c <<< "${hash}  ${file}"; } | | Or if you need to use a POSIX-ish shell: | | checksum() { hash="$1"; file="$2"; printf '%s  %s' "$hash" "$file" | sha256sum -c; } | | Of course you can add a `--binary` option (uses '%s *%s' instead | of '%s  %s'), options to use different hash functions, etc. | | I also think it's weird to use `alias` inside a function, instead | of just using a parameter to store the name of the program to | execute. | gavinuhma wrote: | Great point on alias, thanks. I think that was a relic of an | older iteration. | | I'll work through these suggestions. Appreciate it. Feel free | to send a PR if you want. | | For the here string I think that won't work because the file | isn't being saved locally, it's just being piped (so $2 is a | URL). I can't do the usual `shasum -c <<< | "132e320edb0027470bfd836af8dadf174e4fee00  install.sh"` which | takes a local filename but not the file content. As far as I | could tell anyway.
I'll try it some more ___________________________________________________________________ (page generated 2022-10-28 23:01 UTC)