[HN Gopher] "Exit Traps" Can Make Your Bash Scripts Way More Rob...
___________________________________________________________________
"Exit Traps" Can Make Your Bash Scripts Way More Robust and Reliable

Author : ekiauhce
Score  : 192 points
Date   : 2023-06-20 06:34 UTC (16 hours ago)

(HTM) web link (redsymbol.net)
(TXT) w3m dump (redsymbol.net)

| franknord23 wrote:
| I wish Bash had 'defer' like Go.
| c7DJTLrn wrote:
| https://cedwards.xyz/defer-for-shell/
|
| Enjoy. (blog post is mine)
| chasil wrote:
| I used an exit trap to kill an SSH agent that I am running, and I
| noticed that dash did not kill if the script was interrupted, but
| only if it ran to successful completion.
|
| I asked on the mailing list if this was expected behavior, and it
| turns out that POSIX only requires EXIT to run on a clean
| shutdown; to catch interruptions, add more signals.
|     trap 'eval $(ssh-agent -k)' EXIT INT ABRT KILL TERM
| lgsymons wrote:
| The signals EXIT HUP INT TERM cover everything I've run into
| (I'm actually using EXIT SIGHUP SIGINT SIGTERM but presumably
| it's equivalent).
|
| In basic terms for my purposes these respectively account for a
| clean exit, the terminal emulator being closed, ctrl-c, the
| kill command.
| diarrhea wrote:
| Harsh lesson, those five signal names are identical if one
| squints real good. Would have never known.
| hoherd wrote:
| `man 7 signal` on linux or just `man signal` on macOS will
| give you more information about the different signals, and
| shows what the different meanings of those are.
| chasil wrote:
| "kill -l" gives you a terse (but complete) list.
| simcop2387 wrote:
| You can't trap kill can you? doesn't that just go kill your
| process without any possibility of intervention or other
| actions? Also you probably want to handle HUP there too I would
| think (depending on what the script does)
| chasil wrote:
| That's what they put on the ticket, so that's what I'm using,
| but you're probably right.
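The signal lists discussed above can be sketched as a small script. This is an illustration under stated assumptions, not code from any commenter: the names cleanup, cleaned_up and tmpfile are hypothetical, mktemp is assumed to be available (it is common but not required by POSIX), and KILL is left out of the trap list because SIGKILL cannot be caught.

```shell
#!/bin/sh
# Illustrative sketch; cleanup, cleaned_up and tmpfile are hypothetical names.

cleaned_up=
tmpfile=$(mktemp) || exit 1

cleanup() {
    # Run at most once: the EXIT trap also runs when a signal
    # handler below calls exit.
    if [ -n "$cleaned_up" ]; then
        return 0
    fi
    cleaned_up=1
    rm -f "$tmpfile"
}

# EXIT covers normal termination and explicit exit calls.
trap cleanup EXIT
# Signal handlers clean up, then exit with the conventional
# 128 + signal number. KILL is not listed: SIGKILL cannot be caught.
trap 'cleanup; exit 129' HUP
trap 'cleanup; exit 130' INT
trap 'cleanup; exit 143' TERM

echo "working in $tmpfile"
```

The guard variable keeps the handler idempotent, so the same function can be registered for EXIT and for the individual signals without removing the file twice.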
| mike_hock wrote:
| SIGABRT is also not a normal termination signal. Seems out of
| place here.
| paulddraper wrote:
| SIGABRT would have to come from the process itself; IDK when
| if ever the shell would do that.
|
| And SIGKILL can't be handled, so that is indeed pointless.
| arp242 wrote:
| POSIX says that "setting a trap for SIGKILL or SIGSTOP
| produces undefined results", but for signals it describes
| SIGKILL as "Kill (cannot be caught or ignored)".
|
| I'm guessing this is some relic from 80s Unix systems where
| SIGKILL behaved different, or perhaps just an
| inconsistency/oversight.
| nerdponx wrote:
| I read that as undefined in terms of how the shell itself
| handles it, because the OS doesn't care.
| [deleted]
| js2 wrote:
| I think you want:
|     trap 'ssh-agent -k' EXIT INT TERM
|
| I don't see any reason for the eval as "ssh-agent -k" doesn't
| return anything useful you want the shell to evaluate.
| chasil wrote:
| That's not what the eval is for.
|
| The "ssh-agent -k" command will emit shell commands that the
| shell must then execute which will kill the agent daemon and
| unset the socket environment variable.
| leodag wrote:
| > The "ssh-agent -k" command will emit shell commands
|
| Does it really? I've executed it here and it just runs
| kill, doesn't emit any bash. Running just ssh-agent
| (without any args) does that though, which is what's
| probably causing the confusion.
| chasil wrote:
| I am on OpenBSD 7.2, and I see:
|     $ eval $(ssh-agent)
|     Agent pid 56785
|     $ ssh-agent -k
|     unset SSH_AUTH_SOCK;
|     unset SSH_AGENT_PID;
|     echo Agent pid 56785 killed;
|
| The correct processing of that output requires an eval.
|
| Did you have any other questions?
| js2 wrote:
| If all you care about is killing it, you don't need to eval
| the output. The output just unsets two environment
| variables which only matters in the current shell context.
|     $ ssh-agent
|     SSH_AUTH_SOCK=/var/folders/8p/_pwq997168s7vdwwdg_qr1j40000gn/T//ssh-DE0IoJfU5rrM/agent.15015; export SSH_AUTH_SOCK;
|     SSH_AGENT_PID=15016; export SSH_AGENT_PID;
|     echo Agent pid 15016;
|     $ SSH_AGENT_PID=15016; export SSH_AGENT_PID;
|     $ ssh-agent -k
|     unset SSH_AUTH_SOCK;
|     unset SSH_AGENT_PID;
|     echo Agent pid 15016 killed;
|
| That said, it doesn't hurt to eval it, so I overstated my
| case in my original comment.
| snapcaster wrote:
| Very cool! didn't know about these
| gscho wrote:
| Off topic but I really enjoy the lofi website design!
| arp242 wrote:
| An annoying thing about bash is that EXIT will _also_ run on
| SIGINT (^C), which most other shells won't (in my reading it's
| also not POSIX compliant, although the document is a bit vague).
| Some might argue this is a feature, but IMHO it's a bug -
| sometimes you really _don't_ want cleanup to happen so people
| can inspect the contents of temporary files for debugging.
| Because trap doesn't pass the signal information to the handler
| it's hard to not do cleanup on SIGINT, so it's certainly less
| flexible, and it's an annoying incompatibility between bash and
| any other shell.
|
| Also, zsh has a much nicer mechanism for the common case:
|     {
|         echo lol
|     } always {
|         # Ensure *all* temporary files are cleaned up.
|         nohup rm -rf / &
|     }
| jcotton42 wrote:
| > Because trap doesn't pass the signal information to the
| handler it's not hard not to do cleanup on SIGINT
|
| Did you mean "it's hard" instead of "it's not hard"?
| arp242 wrote:
| Oops, yes, thanks; seems a "not" got duplicated in editing -
| still within edit window.
| ggm wrote:
| Some temporary file remover. Lol indeed
| wkat4242 wrote:
| // Thinks about that time I typed rm -rf /<space>something by
| mistake.
|
| It took a few seconds before I thought... "Why does it take
| that long for only a handful of files?"
|
| I never did that again.
|
| Had my DOS filesystem mounted under Linux too (yes that long
| ago), and I spent a few days guessing the first letter of
| each deleted file with norton disk doctor or undeleter or
| something. That was fun (FAT16 filesystems overwrote the
| first letter of each filename to delete it)
|
| At least it wasn't a mistake I made at work on some
| production thing. Though there is a reason I make all the
| desktops on windows production servers bright red. One time I
| was tired and shut down "my laptop" forgetting I was still
| logged into a remote server 200km away..... :/ Of course the
| iLO wasn't hooked up but I was extremely happy to find that
| HP servers listen to wake on LAN even when they're off.
| Another one for the never again books :P
| arp242 wrote:
| "Keep non-temporary files intact" was not part of the design
| document.
| c5c3c9 wrote:
| [flagged]
| js2 wrote:
| > Because trap doesn't pass the signal information to the
| handler
|
| You can examine $? on entry to the trap function. On signals,
| it will be 128 + signal. i.e. on TERM (15) it will be 143. On
| INT (2) it will be 130.
|     #!/bin/bash
|     skip_exit=
|     on_exit() {
|       code=$?
|       if test $code == 130; then
|         skip_exit=1
|       fi
|       if test -n "$skip_exit"; then
|         return
|       fi
|       echo "Exiting with: $code"
|       return $code
|     }
|     trap on_exit INT EXIT
|     sleep 2
|     false
|
| With ctrl-c:
|     $ ./foo.sh
|     ^C
|
| After 2 seconds:
|     $ ./foo.sh
|     Exiting with: 1
|
| You can also setup separate handlers for each signal and use a
| sentinel:
|     $ cat foo.sh
|     #!/bin/bash
|     skip_exit=
|     on_int() {
|       echo int
|       skip_exit=1
|     }
|     on_exit() {
|       test -n "$skip_exit" && return
|       echo exit
|     }
|     trap on_int INT
|     trap on_exit EXIT
|     sleep 2
|
| With ctrl-c:
|     $ ./foo.sh
|     ^Cint
|
| After 2 seconds:
|     $ ./foo.sh
|     exit
| telotortium wrote:
| @redsymbol your site has a TLS certificate error. On Chrome I get
| NET::ERR_CERT_COMMON_NAME_INVALID because your certificate is
| from mobilewebup.com
|
| Otherwise a good article.
| I use the following code to enable
| passing the signal name to the trap handler, so that I can kill
| the Bash process with the correct signal name, which is best
| practice for Unix signal handling (EXIT would have to be handled
| specially in `sig_rekill`):
|     # Set trap for several signals and pass signal name to trap function.
|     # https://stackoverflow.com/a/2183063/207384
|     trap_with_arg() {
|         func="$1" ; shift
|         for sig ; do
|             trap "$func $sig" "$sig"
|         done
|     }
|     sig_rekill() {
|         # Kill whole process group.
|         trap "$1"; kill -"$1" -$$
|     }
|     # Catch signal and kill whole process group.
|     trap_with_arg sig_rekill HUP INT QUIT PIPE TERM
| [deleted]
| smcleod wrote:
| Yep, I use these all the time, they're very useful indeed.
| jeron wrote:
| I thought exit traps were just SPACs
| filereaper wrote:
| >The secret sauce is a pseudo-signal provided by bash, called
| EXIT, that you can trap; commands or functions trapped on it will
| execute when the script exits for any reason.
|
| "Secret Sauce", why is this secret at all.
|
| Nothing against the author who's helping the ecosystem here, but
| is there an authoritative guide on Bash that anyone can
| recommend?
|
| Hopefully something that's portable between Mac & Linux.
|
| The web is full of contradictory guides and shellcheck seems to
| be the last line of defense.
|
| - https://github.com/koalaman/shellcheck
| sigg3 wrote:
| Yes. I use them for cleanup in every non-trivial script I write.
| waselighis wrote:
| I wish there was a nicer shell scripting language that simply
| transpiled to Bash and would generate all this boilerplate code
| for me. There is https://batsh.org/ which has a nice syntax but
| it doesn't even support pipes or redirection, making it pretty
| worthless for shell scripting. I haven't found any other such
| scripting languages.
| paulddraper wrote:
| What's the difference between that and Go?
| burnished wrote:
| Go doesn't seem related at all?
| cvalka wrote:
| [flagged]
| usr1106 wrote:
| bash scripts have their use cases, many things are shorter and
| simpler than in Python. But coders should bother to learn how
| bash works and use shellcheck. Just guessing from how things
| work in another language typically leads to buggy code. Keeping
| a daemon always running is not a task for bash. systemd is
| typically much better at that (although something like
| exponential backoff in case of failure seems to be tricky)
| anaganisk wrote:
| A good read before dismissing http://n-gate.com/software/2017/
| arsome wrote:
| What's the gripe with Let's Encrypt? Certificate
| transparency?
| mttjj wrote:
| Can you expand on your first sentence with some reasons or
| justifications for stating this?
| ipnon wrote:
| I just learned about these through "pair" programming with
| ChatGPT. It is the quintessential ML-enhanced programming trick:
| Using some old, robust language feature I'm skilled enough to
| grok but never had the time to learn about through endless
| documentation spelunking.
|
| My opinion is that LLM pair programming is most or maybe only
| beneficial to already skilled programmers. ChatGPT can open the
| door for you, but it can't show you where the door is. I needed
| the experience to ask it for a Bash script that handles exit
| codes gracefully, which is not a question all junior programmers
| would be able to ask.
| abathur wrote:
| I like combining this with a bash implementation of an event API
| (https://github.com/bashup/events). This makes it easy/idiomatic,
| for example, to conditionally add cleanup as you go.
|
| Glossing over some complexity, but roughly:
|     add_cleanup(){
|       event on cleanup "$@"
|     }
|     trap "event emit 'cleanup'" HUP EXIT
|     start_postgres(){
|       add_cleanup stop_postgres
|       # actually start pg
|     }
|     start_apache(){
|       add_cleanup stop_apache
|       # actually start apache
|     }
|
| I wrote a little about some other places where I've used it in
| https://www.t-ravis.com/post/shell/neighborly_shell_with_bas...
| and https://t-ravis.com/post/nix/avoid_trap_clobbering_in_nix-sh...
| (though I make the best use of it in my private bootstrap
| and backup scripts...)
| e12e wrote:
| Thank you for sharing - if i understand the code, the queue is
| serialized into bash variable(s) (arrays)?
|
| I must admit I find the code somewhat painfully terse and hard
| to read.
|
| Still, interesting idea. I wonder if using a temporary
| SQLite/Berkeley DB/etc for queue might generalize the idea to a
| "Unix" event system - allowing other programs and scripts to
| use it for coordinating? (Like logger(1) does for logging)?
| phh wrote:
| This 100%.
|
| I'll complete with patterns I'm using for exit traps:
|
| - for temporary files I have a global array that lists files to
| remove (and for my use case umount them beforehand)
|
| - in the EC2 example, I add a line with just "bash", so I have an
| env with the container still running to debug what happened and I
| just need to close that shell to clear the allocated resources
| tommica wrote:
| This is very useful to know - thanks for sharing!
| rgrau wrote:
| I couldn't find a way to have more than one callback per signal,
| and created a system to have an array of callbacks:
|
| https://github.com/kidd/scripting-field-guide/blob/master/bo...
|
| A nice bonus is that it also keeps the return value of the last
| non-callback function, so your script behaves better when called
| from other scripts.
| ch33zer wrote:
| Should go without saying, but don't rely on this for anything
| critical.
| It's not guaranteed this will run, even on successful
| completion of the script. Simple example: power is cut between
| the last line of the script and before the trap runs. Just a
| heads up
| JohnMakin wrote:
| I like to use these in combination with set -e and report the
| error that happened to whatever is capturing stdout for logging.
|
| You can report the error code with $? at the start of your trap,
| IIRC.
___________________________________________________________________
(page generated 2023-06-20 23:00 UTC)