[HN Gopher] Hidden gems of moreutils
       ___________________________________________________________________
        
       Hidden gems of moreutils
        
       Author : jiripospisil
       Score  : 153 points
       Date   : 2023-12-30 12:00 UTC (10 hours ago)
        
 (HTM) web link (jpospisil.com)
 (TXT) w3m dump (jpospisil.com)
        
       | throw0101b wrote:
       | For _execsnoop_ , people running systems with DTrace can find the
       | same:
       | 
       | * https://github.com/jorgev/dtrace-scripts/blob/master/execsno...
       | 
       | On macOS Monterey+ you'll probably have to install the Kernel
       | Debug Kit per:
       | 
       | * https://developer.apple.com/forums/thread/692444
       | 
       | The Linux variant was written by Brendan Gregg (who previously did
       | a lot of work on Solaris, where DTrace was created):
       | 
       | * https://github.com/brendangregg/perf-tools/blob/master/execs...
       | 
       | * https://github.com/iovisor/bcc/blob/master/tools/execsnoop.p...
        
       | bloopernova wrote:
       | In case anyone was wondering, the moreutils tools:
       |
       |     chronic: runs a command quietly unless it fails
       |     combine: combine the lines in two files using boolean operations
       |     errno: look up errno names and descriptions
       |     ifdata: get network interface info without parsing ifconfig output
       |     ifne: run a program if the standard input is not empty
       |     isutf8: check if a file or standard input is utf-8
       |     lckdo: execute a program with a lock held
       |     mispipe: pipe two commands, returning the exit status of the first
       |     parallel: run multiple jobs at once
       |     pee: tee standard input to pipes
       |     sponge: soak up standard input and write to a file
       |     ts: timestamp standard input
       |     vidir: edit a directory in your text editor
       |     vipe: insert a text editor into a pipe
       |     zrun: automatically uncompress arguments to command
       | 
       | from
       | https://rentes.github.io/unix/utilities/2015/07/27/moreutils...
       | 
       | Similarly, there's some lesser-known useful stuff in GNU
       | Coreutils:
       | 
       | https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_com...
       |
       |     paste: merges lines of files
       |     expand: converts tabs to spaces
       |     seq: prints a sequence of numbers
       |     shuf: shuffles its input
        
         | koolba wrote:
         | seq is "less known"? I'd assume anyone familiar with shell
         | scripting would know about it.
         | 
         | It's a great starting point for entertaining children on a
         | terminal:
         |
         |     seq 99 -1 0 | xargs printf '%s bottles of beer on the wall...\n'
        
           | foobarian wrote:
           | Bash kinda broke seq for me since you can just write {99..0}
        
             | o11c wrote:
             | Or if you need dynamic arguments, you probably want:
             | N=9; for ((i=0; i<=N; ++i)); do echo "$i"; done
        
               | andrewshadura wrote:
               | Why use this form of for when you can just use seq and it
               | works in any shell including fish?
        
               | o11c wrote:
               | Because capturing the output of `seq` requires spawning a
               | whole separate process (significant for small sequences)
               | and shoving all the data into a single buffer
               | (significant for large sequences) rather than working
               | incrementally.
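
A possible middle ground between the two positions above, sketched under the assumption that only POSIX sh features are available: a `while` loop with arithmetic expansion counts incrementally without spawning `seq` and without Bash's `((...))` syntax. It does not cover fish, which has its own syntax. The `count_to` name is made up for illustration.

```shell
# count_to is a hypothetical helper: prints 0..N, one number per line,
# using only POSIX sh constructs (no seq process, no Bash-only syntax).
count_to() {
    i=0
    while [ "$i" -le "$1" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

count_to 3
```

Because the loop emits each value as it goes, nothing has to be buffered up front, which is the property o11c is after.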
        
           | bloopernova wrote:
           | Less known maybe for most people who read HN, but you're
           | right that a lot of shell scripting folks would know about
           | it.
        
         | agumonkey wrote:
         | thanks for the tl;dr
         | 
         | moreutils and similar pops up every year but it's still easy to
         | forget.. they should be part of core distributions nowadays..
        
         | jeffbee wrote:
         | Confusingly, moreutils parallel is not "GNU parallel".
         | Moreutils parallel is very simple, while the other parallel is
         | very featureful. Linux distributions can deal with the
         | conflict, but bad package managers like homebrew cannot.
        
           | mlk wrote:
           | "rename" shares the same fate, there are several
           | implementations, completely different from each other
        
       | atomicstack wrote:
       | I use `ts` quite often in adhoc logging/monitoring contexts.
       | Because it uses strftime() under the hood, it has automatic
       | support for '%F' and '%T', which are much easier to type than
       | '%Y-%m-%d' and '%H:%M:%S'. Plus, it also has support for
       | high-resolution times via '%.T', '%.s', and '%.S':
       |
       |     echo 'hello world' | ts '[%F %.T]'
       |     [2023-12-30 16:25:40.463640] hello world
        
         | c0l0 wrote:
         | Assuming semi-recent _bash(1)_, you can also get away with
         | something like
         |
         |     while read -r line; do printf '%(%F %T %s)T %s\n' "-1" "${line}"; done
         | 
         | as the right-hand side/reader of a shell pipe for most of what
         | _ts(1)_ offers. ("-1" makes the embedded _strftime(3)_ format
         | string assume the current time as its input.)
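
Where neither `ts` nor a recent Bash is available, a slower fallback can be sketched in plain POSIX sh. This swaps the built-in `printf '%(...)T'` for one `date` call per line, so it is a different technique from the one above and only has one-second resolution. The `stamp_lines` name is hypothetical.

```shell
# stamp_lines is a hypothetical helper: prefixes each input line with the
# current time. It spawns one `date` process per line, so it is slower than
# ts or Bash's built-in printf '%(...)T'.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

echo 'hello world' | stamp_lines
```

The `IFS=` and `-r` on `read` keep leading whitespace and backslashes in the input intact.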
        
         | smalu wrote:
         | I recommend the zmwangx/ets package; it is the modern version of
         | ts. I'm using it in a CI/CD pipeline in GitLab for debugging
         | performance.
        
         | loeg wrote:
         | The 'logger' command can also be useful.
        
       | Karellen wrote:
       | I use `sponge` and `ts` (mentioned in the article) pretty
       | regularly, and am really happy for them.
       | 
       | I have used `isutf8` a fair amount in the past, but I find it
       | mostly redundant these days (thankfully!)
       | 
       | The other one that I don't use very often, but is absolutely
       | invaluable when I do need it, is `ifne` - "Run command if the
       | standard input is not empty". It's like `-r` from GNU `xargs`,
       | but for any command in a pipeline.
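
For readers without moreutils, the idea behind `ifne` can be approximated in POSIX sh. This is only a sketch: it buffers all of standard input in a temporary file before deciding, which the real tool may well avoid, and the `run_if_nonempty` name is invented here.

```shell
# run_if_nonempty is a hypothetical stand-in for ifne: it runs the given
# command with the captured input only if that input is not empty.
run_if_nonempty() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                 # soak up all of stdin first
    status=0
    if [ -s "$tmp" ]; then       # -s: file exists and has size > 0
        "$@" < "$tmp"
        status=$?
    fi
    rm -f "$tmp"
    return "$status"
}

printf 'a\nb\n' | run_if_nonempty wc -l
printf '' | run_if_nonempty echo "never printed"
```

The second pipeline prints nothing, because the command is skipped when no input arrives.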
        
       | 1f60c wrote:
       | One typo that's easy to make is:
       |
       |     sort file.txt | sponge > file.txt
       | 
       | (i.e., using redirection rather than passing the path as an
       | argument to sponge)
       | 
       | This is wrong and will not work! I've been bitten by it before.
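
The redirect form fails because the shell itself opens and truncates file.txt when it sets up `>`, typically before `sort` has read anything; `sponge` never gets a chance to delay the write. A rough sketch of the safe pattern without moreutils follows, assuming a scratch directory created just for the demo.

```shell
# Assumption: a scratch directory and file.txt exist only for this demo.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/file.txt"

# Finish writing the sorted copy before touching the original, then swap it in.
# With moreutils installed, `sort file.txt | sponge file.txt` is the short form.
tmp=$(mktemp "$workdir/sorted.XXXXXX")
sort "$workdir/file.txt" > "$tmp" && mv "$tmp" "$workdir/file.txt"

cat "$workdir/file.txt"
```

The `&&` matters: if `sort` fails, the original file is left untouched.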
        
       | 22c wrote:
       | moreutils parallel can also come in handy for quick command
       | parallelization (not to be confused with GNU parallel which
       | serves a similar purpose but can be more complicated)
        
         | croemer wrote:
         | And GNU parallel is very aggressive about citations which I get
         | but it's also too much
        
         | ostensible wrote:
         | I have switched to using xargs to parallelize things: it has a
         | benefit of being part of posix, and is not annoying about
         | citations like parallel.
        
           | twic wrote:
           | The parallelism isn't part of POSIX though (AFAIK), that's an
           | extension by whoever wrote your xargs.
           | 
           | If what you really mean is that it's already installed on
           | every machine you use, fair enough. But it's not strictly
           | portable in some standards-based sense.
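
To make the distinction concrete, here is a small sketch of the `-P` extension being discussed. It assumes an xargs that supports `-P` (GNU findutils and the BSD implementations do); a strictly POSIX xargs is not required to accept it.

```shell
# -P 2: run up to two jobs at once (an extension, not specified by POSIX).
# -n 1: pass one argument per invocation.
# Output order depends on scheduling, so sort it to get a stable listing.
printf '%s\n' 1 2 3 4 |
    xargs -n 1 -P 2 sh -c 'echo "job $1 done"' _ |
    sort
```

The `_` fills `$0` for the inner `sh -c`, so each number from the input arrives as `$1`.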
        
         | pie_flavor wrote:
         | That they occupy the same namespace is always very annoying.
         | Instead of just `brew upgrade` I must unlink and later link
         | --overwrite parallel.
        
       | croemer wrote:
       | Not a moreutil, but I recently discovered `pv`, the pipe viewer
       | and it's so useful. Like tqdm (Python progressbar library) but as
       | a Unix utility. Just put it between two pipes and it'll display
       | rate of bytes/lines
       | 
       | Apparently it's neither a coreutil nor a moreutil.
       | 
       | Here's an HN discussion from 2022:
       | https://news.ycombinator.com/item?id=33244768
        
         | chlorion wrote:
         | I have also discovered that certain implementations of dd have
         | a progress printing functionality that can be used for similar
         | purposes. You can put a "dd status=progress" in a pipeline and
         | it will print the amount and rate of data being piped!
         | 
         | This dd option is not as nice as pipe viewer but it's handy for
         | when pv isn't around for some reason.
        
           | derefr wrote:
           | Even if you don't pass this argument, you can poke most
           | implementations of dd(1) with a certain POSIX signal, and
           | they'll respond by printing a progress line.
           | 
           | On Linux, this is SIGUSR1, and you have to use kill(1) to
           | send it.
           | 
           | On BSDs (incl. macOS), though, the signal dd listens for is
           | instead called SIGINFO (a name that makes it much clearer why
           | a process would respond by printing status).
           | Shells/terminal emulators on these platforms emit SIGINFO
           | when you type Ctrl+T into them!
           | 
           | (For a lot more useful info about this behavior:
           | https://stuff-things.net/2016/04/06/that-one-stupid-dd-trick...)
           | 
           | Bonus fact not mentioned in the above article: dd used in the
           | middle of a pipeline will still "hear" Ctrl+T and print
           | progress, since signals generated by a shell (think: SIGINT
           | from Ctrl+C) are propagated to _all_ processes in the process
           | group started by the command. Test it yourself:
           | cat /dev/zero | dd count=10000000 bs=1024 | cat > /dev/null
        
             | BlackLotus89 wrote:
             | Yeah, that bit me in the butt once: on Mac OS, USR1 killed dd.
             | When I want progress from dd and didn't set status, I mostly
             | use progress now. It also works with many other utils like cp,
             | xz and all the usual suspects.
        
             | cycomanic wrote:
             | When dd is mentioned, one should always point to
             | https://www.vidarholen.net/contents/blog/?p=479
             | 
             | For like >99% of cases where people used dd, they would have
             | been better off using a different tool.
        
         | derefr wrote:
         | You can also use pv as you would use cat, e.g.
         | pv file.tar.gz.part1 file.tar.gz.part2 | tar -x -z
         | 
         | Just like cat, pv used this way will stream out the
         | concatenation of the passed-in file paths; but it will _also_
         | add up the sizes of these files and use them to calculate the
         | total stream size, i.e. the divisor for its displayed progress
         | percentage.
        
         | matrss wrote:
         | > Like tqdm (Python progressbar library) but as a Unix utility.
         | 
         | FYI: tqdm can be used in a shell pipeline as well. It's
         | documented (at least) in their readme:
         | https://github.com/tqdm/tqdm#module
        
         | genman wrote:
         | It is a really incredible anxiety reducing small tool when you
         | have to transfer large files.
        
         | LeoPanthera wrote:
         | If you have a long running copy process running but forget to
         | enable progress, you can use the "progress" utility to show the
         | progress of something that is already running.
         | 
         | It supports: cp mv dd tar bsdtar cat rsync scp grep fgrep egrep
         | cut sort cksum md5sum sha1sum sha224sum sha256sum sha384sum
         | sha512sum adb gzip gunzip bzip2 bunzip2 xz unxz lzma unlzma 7z
         | 7za zip unzip zcat bzcat lzcat coreutils split gpg gcp gmv
        
       | kyrofa wrote:
       | Yeah I use chronic all the time for my cron jobs so they only
       | email me if they fail and I can still print helpful output from
       | them. Love moreutils.
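
The behaviour described above can be approximated in POSIX sh for machines without moreutils. This is a sketch, not chronic itself: it merges stdout and stderr into one stream, which the real tool may handle differently, and the `quiet_unless_fail` name is hypothetical.

```shell
# quiet_unless_fail is a hypothetical stand-in for chronic: it hides a
# command's output when it succeeds and replays it when it fails.
quiet_unless_fail() {
    log=$(mktemp) || return 1
    "$@" > "$log" 2>&1           # capture both streams (merged in this sketch)
    status=$?
    if [ "$status" -ne 0 ]; then
        cat "$log"               # only a failure produces output
    fi
    rm -f "$log"
    return "$status"
}

quiet_unless_fail sh -c 'echo "all good"'
quiet_unless_fail sh -c 'echo "backup failed"; exit 3' || echo "exit status: $?"
```

In a crontab this means mail is sent only for failing runs, since cron mails whatever a job prints.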
        
       | dig1 wrote:
       | I just learned about vidir [1]. Emacs Dired [2] can rename &
       | delete files by editing the buffer directly, and let's say I was
       | thrilled when I saw someone replicated that behavior as a general
       | Unix tool.
       | 
       | [1] https://github.com/trapd00r/vidir
       | 
       | [2]
       | https://www.gnu.org/software/emacs/manual/html_node/emacs/Wd...
        
       | gpvos wrote:
       | I'd never heard of the :cq command in vim before. Seems useful,
       | but in practice it's so unknown that things like editing the git
       | commit message cannot rely on it and instead check whether the
       | file has been changed. Also, reading its documentation, it
       | probably would be better named :cqall .
        
         | cassepipe wrote:
         | I was wondering that too, although I don't have access to vim
         | right now. What's the punch line? EDIT: The difference from
         | :q! is the exit code!
         | 
         | (Yes, and :wall is actually the :update command on all your
         | buffers; that is, unlike :w, buffers are written only if there
         | have been changes. Bad naming is the mother of all pedagogical
         | pain.)
        
         | andrewshadura wrote:
         | As far as I remember, it works with git commit just fine. It's
         | also far from being unknown.
        
           | gpvos wrote:
           | Yeah, I should've written "cannot rely on that alone and
           | _also_ check". I've worked with vi, later vim, for 34 years
           | and read about it here first; ddg'ing for it doesn't give
           | many hits.
        
         | manx wrote:
         | I use that often for aborting the current commit or the current
         | git interactive rebase
        
         | kiprasmel wrote:
         | it's v useful if you want to abort, e.g. when editing an
         | interactive rebase & decide to not go thru w/ it.
        
       | cbarrick wrote:
       | > `pee` [...] It runs the commands using popen, so it's actually
       | passing them to /bin/sh -c (which on most systems is a symlink to
       | Bash).
       | 
       | Do not assume /bin/sh is Bash!!
       | 
       | On Debian-based systems, including Ubuntu, Dash is the default
       | non-interactive shell. Specifically, Dash does not support common
       | Bashisms.
       | 
       | (Not to mention Alpine, OpenWRT, FreeBSD, ...)
       | 
       | This is a bit of a pet-peeve of mine. If you're dropping a
       | reference to `/bin/sh -c` like the reader knows what that means,
       | then you don't need to tell them that "it's a symlink to Bash."
       | They know their own system better than you.
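
One way to sidestep the Dash/Bash question for commands handed to `/bin/sh -c` is to stick to POSIX constructs. As a small sketch, a substring test that Bash users often write with `[[ ... ]]` can be done with `case`, which works in Dash and Bash alike. The `contains` helper is a made-up name.

```shell
# contains is a hypothetical helper: succeeds if $2 occurs inside $1.
# It uses only `case`, so it does not depend on Bash's [[ ... ]].
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

if contains "hello world" "lo wo"; then
    echo "match"
fi
```

Quoting `"$2"` inside the pattern keeps any glob characters in the needle literal.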
        
         | 8organicbits wrote:
         | Huh. I didn't realize that
         | 
         | https://wiki.debian.org/Shell
        
           | throwaway892238 wrote:
           | Rule of thumb about shells:
           | 
           |     You'll never know for sure what shell is the default, so
           |     write your scripts to a minimum shell family, and encode
           |     the name of that shell family in the shebang.
           | 
           | The default shell, for either the entire system or an
           | individual user, could be:
           | 
           |     - A Bourne shell
           |     - A C shell
           |     - A POSIX shell
           |     - A "modern" shell, like Bash, Zsh, Fish, Osh
           | 
           | Use the following shebangs to call the class of shell you
           | expect. Start with _/usr/bin/env_ to allow the system to
           | locate the executable based on the current _PATH_ environment
           | variable rather than a fixed filesystem path.
           | 
           |     #!/usr/bin/env sh
           |     #              ^ should result in a Bourne-like shell, on modern systems, but could also be
           |     #                a POSIX shell (like Ash), which is not really backwards compatible
           | 
           |     #!/usr/bin/env csh
           |     #              ^ should result in a C shell
           | 
           |     #!/usr/bin/env bash
           |     #              ^ should result in a Bash shell, though the only version you should
           |     #                expect is version 3.2.57, the last GPLv2 version
           | 
           |     #!/usr/bin/env dash
           |     #              ^ you expect the system to have a specific shell. if your code depends
           |     #                on Dash features, do this. otherwise write your code using an earlier
           |     #                version of the original shell family, such as Bourne or POSIX.
           | 
           | If you need to pass a command-line argument to the shell
           | command before interpreting your script, use the fixed
           | filesystem path for the shell. Due to bugs in the way kernels
           | execute scripts, all arguments after the initial shebang path
           | may be sent as a single argument, which is probably not what
           | you (or the shell command) expect. (e.g. use _#! /bin/bash
           | --posix_ instead of _#! /usr/bin/env bash --posix_ as the
           | latter may fail)
           | 
           | Different shells have different implementation details. For
           | example, Bash tends to skew toward POSIX conformance by
           | default, even if it conflicts with traditional Bourne shell
           | behavior. Dash may try to implement POSIX semantics, but also
           | has its own specific behavior incompatible with POSIX.
           | 
           | In order to get specific behavior (like POSIX emulation), you
           | may need to add a flag like _--posix_, or set an environment
           | variable like _POSIXLY_CORRECT_. For shells that support it,
           | you can also detect the version of the shell running, and
           | bail if the version isn't compatible with your script.
           | 
           | Here are some references of the differences between shells:
           | 
           |     - Comparison of command shells[1]
           |     - Major differences between Bash and Bourne shell[2]
           |     - Practical differences between Bash and Zsh[3]
           |     - Fundamental differences between mainstream *NIX shells[4]
           | 
           | [1] https://en.wikipedia.org/wiki/Comparison_of_command_shells
           | [2] https://www.gnu.org/software/bash/manual/html_node/Major-Dif...
           | [3] https://apple.stackexchange.com/questions/361870/what-are-th...
           | [4] https://unix.stackexchange.com/questions/3320/what-are-the-f...
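
The "detect the version and bail" advice above can be sketched for Bash as follows. It relies on the `BASH_VERSION` variable that Bash sets, and uses only POSIX parameter expansion so the check itself also parses in other sh-like shells. The `require_bash_major` name is an illustration, not an existing command.

```shell
# require_bash_major is a hypothetical helper: succeeds only when running
# under Bash with a major version of at least $1.
require_bash_major() {
    [ -n "${BASH_VERSION:-}" ] || return 1   # not Bash at all
    major=${BASH_VERSION%%.*}                # "5.2.15(1)-release" -> "5"
    [ "$major" -ge "$1" ]
}

if require_bash_major 4; then
    echo "Bash 4+ features available"
else
    echo "falling back to POSIX-only code"
fi
```

A real script might print an error and exit in the `else` branch instead of falling back.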
        
         | throw0101b wrote:
         | > _Specifically, Dash does not support common Bashisms._
         | 
         | More importantly, Bash should not support Bashisms when called
         | as /bin/sh (either).
         | 
         | If you want to use Bashisms just invoke Bash.
        
       | cassepipe wrote:
       | > In Vim / Helix you do that with :cq
       | 
       | Never heard of that before. I generally use :q! or ZQ
       | 
       | Is there a difference ?
        
         | mbwgh wrote:
         | Yes, the exit code. See e.g. `:help cq` in vim. :q! and ZQ will
         | yield exit code 0, which sometimes is not what you want if you
         | want to ensure some task is properly aborted.
        
           | cassepipe wrote:
           | Thanks !
        
       | katzgrau wrote:
       | `pee` - no doubt the dev was delighted and amused
        
       | opan wrote:
       | vidir within ranger is really nice. vipe is also pretty cool.
       | Mostly I use vipe for editing my clipboard contents and then
       | sending the modified version back to the clipboard, or
       | occasionally editing some text stream before sending it to my
       | clipboard, such as some grep output I only want some of.
        
       ___________________________________________________________________
       (page generated 2023-12-30 23:00 UTC)