[HN Gopher] Hidden gems of moreutils
___________________________________________________________________
Hidden gems of moreutils

Author : jiripospisil
Score  : 153 points
Date   : 2023-12-30 12:00 UTC (10 hours ago)

(HTM) web link (jpospisil.com)
(TXT) w3m dump (jpospisil.com)

| throw0101b wrote:
| For _execsnoop_, people running systems with DTrace can find the
| same:
|
| * https://github.com/jorgev/dtrace-scripts/blob/master/execsno...
|
| On macOS Monterey+ you'll probably have to install the Kernel
| Debug Kit per:
|
| * https://developer.apple.com/forums/thread/692444
|
| The Linux variant was written by Brendan Gregg (who previously
| did a lot of work on Solaris, where DTrace was created):
|
| * https://github.com/brendangregg/perf-tools/blob/master/execs...
|
| * https://github.com/iovisor/bcc/blob/master/tools/execsnoop.p...
|
| bloopernova wrote:
| In case anyone was wondering, the moreutils tools:
|
|     chronic: runs a command quietly unless it fails
|     combine: combine the lines in two files using boolean operations
|     errno: look up errno names and descriptions
|     ifdata: get network interface info without parsing ifconfig output
|     ifne: run a program if the standard input is not empty
|     isutf8: check if a file or standard input is utf-8
|     lckdo: execute a program with a lock held
|     mispipe: pipe two commands, returning the exit status of the first
|     parallel: run multiple jobs at once
|     pee: tee standard input to pipes
|     sponge: soak up standard input and write to a file
|     ts: timestamp standard input
|     vidir: edit a directory in your text editor
|     vipe: insert a text editor into a pipe
|     zrun: automatically uncompress arguments to command
|
| from
| https://rentes.github.io/unix/utilities/2015/07/27/moreutils...
|
| Similarly, there's some lesser-known useful stuff in GNU
| Coreutils:
|
| https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_com...
|
|     paste: Merges lines of files
|     expand: Converts tabs to spaces
|     seq: prints a sequence of numbers
|     shuf: shuffles its input
|
| koolba wrote:
| seq is "less known"? I'd assume anyone familiar with shell
| scripting would know about it.
|
| It's a great starting point for entertaining children on a
| terminal:
|
|     seq 99 -1 0 | xargs printf '%s bottles of beer on the wall...\n'
|
| foobarian wrote:
| Bash kinda broke seq for me since you can just write {99..0}
|
| o11c wrote:
| Or if you need dynamic arguments, you probably want:
|
|     N=9; for ((i=0; i<=N; ++i)); do echo "$i"; done
|
| andrewshadura wrote:
| Why use this form of for when you can just use seq and it works
| in any shell including fish?
|
| o11c wrote:
| Because capturing the output of `seq` requires spawning a whole
| separate process (significant for small sequences) and shoving
| all the data into a single buffer (significant for large
| sequences) rather than working incrementally.
|
| bloopernova wrote:
| Less known maybe for most people who read HN, but you're right
| that a lot of shell scripting folks would know about it.
|
| agumonkey wrote:
| thanks for the tl;dr
|
| moreutils and similar pop up every year but they're still easy
| to forget.. they should be part of core distributions nowadays..
|
| jeffbee wrote:
| Confusingly, moreutils parallel is not "GNU parallel". Moreutils
| parallel is very simple, while the other parallel is very
| featureful. Linux distributions can deal with the conflict, but
| bad package managers like homebrew cannot.
|
| mlk wrote:
| "rename" shares the same fate, there are several
| implementations, completely different from each other
|
| atomicstack wrote:
| I use `ts` quite often in ad hoc logging/monitoring contexts.
| Because it uses strftime() under the hood, it has automatic
| support for '%F' and '%T', which are much easier to type than
| '%Y-%m-%d' and '%H:%M:%S'.
| Plus, it also has support for high-resolution times via '%.T',
| '%.s', and '%.S':
|
|     echo 'hello world' | ts '[%F %.T]'
|     [2023-12-30 16:25:40.463640] hello world
|
| c0l0 wrote:
| Assuming semi-recent _bash(1)_, you can also get away with
| something like
|
|     while read -r line; do printf '%(%F %T %s)T %s\n' "-1" "${line}"; done
|
| as the right-hand side/reader of a shell pipe for most of what
| _ts(1)_ offers. ("-1" makes the embedded _strftime(3)_ format
| string assume the current time as its input).
|
| smalu wrote:
| I recommend zmwangx/ets package, it is the modern version of ts.
| I'm using it in CI/CD pipeline in gitlab for debugging
| performance.
|
| loeg wrote:
| The 'logger' command can also be useful.
|
| Karellen wrote:
| I use `sponge` and `ts` (mentioned in the article) pretty
| regularly, and am really happy for them.
|
| I have used `isutf8` a fair amount in the past, but I find it
| mostly redundant these days (thankfully!)
|
| The other one that I don't use very often, but is absolutely
| invaluable when I do need it, is `ifne` - "Run command if the
| standard input is not empty". It's like `-r` from GNU `xargs`,
| but for any command in a pipeline.
|
| 1f60c wrote:
| One typo that's easy to make is:
|
|     sort file.txt | sponge > file.txt
|
| (i.e., using redirection rather than passing the path as an
| argument to sponge)
|
| This is wrong and will not work! I've been bitten by it before.
|
| 22c wrote:
| moreutils parallel can also come in handy for quick command
| parallelization (not to be confused with GNU parallel which
| serves a similar purpose but can be more complicated)
|
| croemer wrote:
| And GNU parallel is very aggressive about citations which I get
| but it's also too much
|
| ostensible wrote:
| I have switched to using xargs to parallelize things: it has a
| benefit of being part of posix, and is not annoying about
| citations like parallel.
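
A minimal sketch of the xargs approach mentioned above, assuming an
xargs that supports the -P flag (an extension found in GNU and BSD
xargs, so check your platform). The run_parallel function and the
squaring job are hypothetical stand-ins for real work.

```shell
#!/bin/sh
# Sketch: run up to 4 jobs at once with xargs.
# -P is an extension (GNU and BSD xargs); it is an assumption that
# your xargs has it.
# run_parallel is a hypothetical helper; squaring a number stands in
# for a real job.
run_parallel() {
    printf '%s\n' 1 2 3 4 5 |
        xargs -n 1 -P 4 sh -c 'echo "$(( $1 * $1 ))"' _
}

# Jobs may finish in any order, so sort before using the output.
result=$(run_parallel | sort -n | paste -s -d ' ' -)
echo "$result"
```

The trailing `_` fills `$0` for `sh -c`, so each input item arrives
as `$1`.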
|
| twic wrote:
| The parallelism isn't part of POSIX though (AFAIK), that's an
| extension by whoever wrote your xargs.
|
| If what you really mean is that it's already installed on every
| machine you use, fair enough. But it's not strictly portable in
| some standards-based sense.
|
| pie_flavor wrote:
| That they occupy the same namespace is always very annoying.
| Instead of just `brew upgrade` I must unlink and later link
| --overwrite parallel.
|
| croemer wrote:
| Not a moreutil, but I recently discovered `pv`, the pipe viewer
| and it's so useful. Like tqdm (Python progressbar library) but as
| a Unix utility. Just put it between two pipes and it'll display
| rate of bytes/lines
|
| Apparently it's neither a coreutil nor a moreutil.
|
| Here's an HN discussion from 2022:
| https://news.ycombinator.com/item?id=33244768
|
| chlorion wrote:
| I have also discovered that certain implementations of dd have a
| progress printing functionality that can be used for similar
| purposes. You can put a "dd status=progress" in a pipeline and
| it will print the amount and rate of data being piped!
|
| This dd option is not as nice as pipe viewer but it's handy for
| when pv isn't around for some reason.
|
| derefr wrote:
| Even if you don't pass this argument, you can poke most
| implementations of dd(1) with a certain POSIX signal, and
| they'll respond by printing a progress line.
|
| On Linux, this is SIGUSR1, and you have to use kill(1) to send
| it.
|
| On BSDs (incl. macOS), though, the signal dd listens for is
| instead called SIGINFO (which probably makes this make a lot
| more sense for why a process would have this response to it.)
| Shells/terminal emulators on these platforms emit SIGINFO when
| you type Ctrl+T into them!
|
| (For a lot more useful info about this behavior:
| https://stuff-things.net/2016/04/06/that-one-stupid-dd-trick...)
|
| Bonus fact not mentioned in the above article: dd used in the
| middle of a pipeline will still "hear" Ctrl+T and print
| progress, since signals generated by a shell (think: SIGINT from
| Ctrl+C) are propagated to _all_ processes in the process group
| started by the command. Test it yourself:
|
|     cat /dev/zero | dd count=10000000 bs=1024 | cat > /dev/null
|
| BlackLotus89 wrote:
| Yeah, bit me in the butt once on Mac OS: usr1 killed dd. When I
| want progress of dd and didn't define status I mostly use
| progress now. Also works with many other utils like cp, xz and
| all the usual suspects
|
| cycomanic wrote:
| When dd is mentioned one should always point to
| https://www.vidarholen.net/contents/blog/?p=479
|
| For like >99% of cases where people used dd they would have been
| better off using a different tool.
|
| derefr wrote:
| You can also use pv as you would use cat, e.g.
|
|     pv file.tar.gz.part1 file.tar.gz.part2 | tar -x -z
|
| Just like cat, pv used this way will stream out the
| concatenation of the passed-in file paths; but it will _also_
| add up the sizes of these files and use them to calculate the
| total stream size, i.e. the divisor for its displayed progress
| percentage.
|
| matrss wrote:
| > Like tqdm (Python progressbar library) but as a Unix utility.
|
| FYI: tqdm can be used in a shell pipeline as well. It's
| documented (at least) in their readme:
| https://github.com/tqdm/tqdm#module
|
| genman wrote:
| It is a really incredible anxiety reducing small tool when you
| have to transfer large files.
|
| LeoPanthera wrote:
| If you have a long running copy process running but forget to
| enable progress, you can use the "progress" utility to show the
| progress of something that is already running.
|
| It supports: cp mv dd tar bsdtar cat rsync scp grep fgrep egrep
| cut sort cksum md5sum sha1sum sha224sum sha256sum sha384sum
| sha512sum adb gzip gunzip bzip2 bunzip2 xz unxz lzma unlzma 7z
| 7za zip unzip zcat bzcat lzcat coreutils split gpg gcp gmv
|
| kyrofa wrote:
| Yeah I use chronic all the time for my cron jobs so they only
| email me if they fail and I can still print helpful output from
| them. Love moreutils.
|
| dig1 wrote:
| I just learned about vidir [1]. Emacs Dired [2] can rename &
| delete files by editing the buffer directly, and let's say I was
| thrilled when I saw someone replicated that behavior as a
| general Unix tool.
|
| [1] https://github.com/trapd00r/vidir
|
| [2]
| https://www.gnu.org/software/emacs/manual/html_node/emacs/Wd...
|
| gpvos wrote:
| I'd never heard of the :cq command in vim before. Seems useful,
| but in practice it's so unknown that things like editing the git
| commit message cannot rely on it and instead check whether the
| file has been changed. Also, reading its documentation, it
| probably would be better named :cqall .
|
| cassepipe wrote:
| I was wondering that too although I don't have access to vim
| right now. What's the punch line ? EDIT : The difference with
| :q! is the exit code !
|
| (Yes, and :wall is actually the :update command on all your
| buffers, that is, unlike :w, buffers are written only if there
| have been changes. Bad naming is the mother of all pedagogical
| pain)
|
| andrewshadura wrote:
| As far as I remember, it works with git commit just fine. It's
| also far from being unknown.
|
| gpvos wrote:
| Yeah, I should've written "cannot rely on that alone and _also_
| check". I've worked with vi, later vim, for 34 years and read
| about it here first; ddg'ing for it doesn't give many hits.
|
| manx wrote:
| I use that often for aborting the current commit or the current
| git interactive rebase
|
| kiprasmel wrote:
| it's v useful if you want to abort, e.g.
| when editing an interactive rebase & decide to not go thru w/
| it.
|
| cbarrick wrote:
| > `pee` [...] It runs the commands using popen, so it's actually
| passing them to /bin/sh -c (which on most systems is a symlink to
| Bash).
|
| Do not assume /bin/sh is Bash!!
|
| On Debian-based systems, including Ubuntu, Dash is the default
| non-interactive shell. Specifically, Dash does not support common
| Bashisms.
|
| (Not to mention Alpine, OpenWRT, FreeBSD, ...)
|
| This is a bit of a pet-peeve of mine. If you're dropping a
| reference to `/bin/sh -c` like the reader knows what that means,
| then you don't need to tell them that "it's a symlink to Bash."
| They know their own system better than you.
|
| 8organicbits wrote:
| Huh. I didn't realize that
|
| https://wiki.debian.org/Shell
|
| throwaway892238 wrote:
| Rule of thumb about shells:
|
|     You'll never know for sure what shell is the default, so
|     write your scripts to a minimum shell family, and encode the
|     name of that shell family in the shebang.
|
| The default shell, for either the entire system or an individual
| user, could be:
|
|     - A Bourne shell
|     - A C shell
|     - A POSIX shell
|     - A "modern" shell, like Bash, Zsh, Fish, Osh
|
| Use the following shebangs to call the class of shell you
| expect. Start with _/usr/bin/env_ to allow the system to locate
| the executable based on the current _PATH_ environment variable
| rather than a fixed filesystem path.
|
|     #!/usr/bin/env sh
|     # ^ should result in a Bourne-like shell, on modern systems,
|     #   but could also be a POSIX shell (like Ash), which is not
|     #   really backwards compatible
|
|     #!/usr/bin/env csh
|     # ^ should result in a C shell
|
|     #!/usr/bin/env bash
|     # ^ should result in a Bash shell, though the only version
|     #   you should expect is version 3.2.57, the last GPLv2
|     #   version
|
|     #!/usr/bin/env dash
|     # ^ you expect the system to have a specific shell. if your
|     #   code depends on Dash features, do this.
|     #   otherwise write your code using an earlier version of
|     #   the original shell family, such as Bourne or POSIX.
|
| If you need to pass a command-line argument to the shell command
| before interpreting your script, use the fixed filesystem path
| for the shell. Due to bugs in the way kernels execute scripts,
| all arguments after the initial shebang path may be sent as a
| single argument, which is probably not what you (or the shell
| command) expect. (e.g. use _#! /bin/bash --posix_ instead of
| _#! /usr/bin/env bash --posix_ as the latter may fail)
|
| Different shells have different implementation details. For
| example, Bash tends to skew toward POSIX conformance by default,
| even if it conflicts with traditional Bourne shell behavior.
| Dash may try to implement POSIX semantics, but also has its own
| specific behavior incompatible with POSIX.
|
| In order to get specific behavior (like POSIX emulation), you
| may need to add a flag like _--posix_, or set an environment
| variable like _POSIXLY_CORRECT_. For shells that support it, you
| can also detect the version of the shell running, and bail if
| the version isn't compatible with your script.
|
| Here are some references of the differences between shells:
|
|     - Comparison of command shells[1]
|     - Major differences between Bash and Bourne shell[2]
|     - Practical differences between Bash and Zsh[3]
|     - Fundamental differences between mainstream *NIX shells[4]
|
| [1] https://en.wikipedia.org/wiki/Comparison_of_command_shells
| [2] https://www.gnu.org/software/bash/manual/html_node/Major-Dif...
| [3] https://apple.stackexchange.com/questions/361870/what-are-th...
| [4] https://unix.stackexchange.com/questions/3320/what-are-the-f...
|
| throw0101b wrote:
| > _Specifically, Dash does not support common Bashisms._
|
| More importantly, Bash should not support Bashisms when called
| as /bin/sh (either).
|
| If you want to use Bashisms just invoke Bash.
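
To make the Bashism point concrete, here is a small sketch of two
common Bash-only constructs rewritten so they also run under a plain
/bin/sh such as Dash. The function names countdown and starts_with
are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch of POSIX-sh replacements for two common Bashisms.
# countdown and starts_with are hypothetical helper names.

# Bash: for i in {3..1}; do ...   (brace ranges are not POSIX)
# Portable: a while loop with POSIX arithmetic expansion.
countdown() {
    i=$1
    while [ "$i" -ge 1 ]; do
        printf '%s\n' "$i"
        i=$((i - 1))
    done
}

# Bash: [[ $string == prefix* ]]   ([[ ]] is not POSIX)
# Portable: pattern matching with case.
starts_with() {
    case $1 in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

countdown 3
starts_with "moreutils" "more" && echo "match"
```

Running the script under both `dash` and `bash` is a quick way to
catch constructs that only one of them accepts.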
|
| cassepipe wrote:
| > In Vim / Helix you do that with :cq
|
| Never heard of that before. I generally use :q! or ZQ
|
| Is there a difference ?
|
| mbwgh wrote:
| Yes, the exit code. See e.g. `:help cq` in vim. :q! and ZQ will
| yield exit code 0, which sometimes is not what you want if you
| want to ensure some task is properly aborted.
|
| cassepipe wrote:
| Thanks !
|
| katzgrau wrote:
| `pee` - no doubt the dev was delighted and amused
|
| opan wrote:
| vidir within ranger is really nice. vipe is also pretty cool.
| Mostly I use vipe for editing my clipboard contents and then
| sending the modified version back to the clipboard, or
| occasionally editing some text stream before sending it to my
| clipboard, such as some grep output I only want some of.
___________________________________________________________________
(page generated 2023-12-30 23:00 UTC)