[HN Gopher] Show HN: Hck - a fast and flexible cut-like tool
       ___________________________________________________________________
        
       Show HN: Hck - a fast and flexible cut-like tool
        
       Author : totalperspectiv
       Score  : 101 points
       Date   : 2021-07-10 15:46 UTC (7 hours ago)
        
        
       | queuebert wrote:
       | Yay, no more piping multiple cuts when you have multiple
       | delimiters.
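
A minimal Python sketch of the idea in that comment, for illustration only: split each line once on a regular expression that covers every delimiter, then pick fields by index. The function name `pick_fields` and the sample line are assumptions made up for this example; it is not how hck itself is implemented.

```python
import re

def pick_fields(line, delimiter_pattern, indices):
    """Split `line` on a regex and return the requested 0-based fields.

    Indices past the end of the line are skipped rather than raising.
    """
    fields = re.split(delimiter_pattern, line)
    return [fields[i] for i in indices if i < len(fields)]

# One pattern covers both ':' and runs of whitespace, so there is no
# need to chain several single-delimiter cuts together.
print(pick_fields("root:x:0   /bin/bash", r":|\s+", [0, 3]))
# → ['root', '/bin/bash']
```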
        
       | lillesvin wrote:
       | I wrote something similar (but never really finished it), called
       | 'gut', in Go a few years back. Funny thing is, that I literally
       | never use it. I thought splitting on regexes and that stuff would
       | be super useful, but it turns out that I just use Perl one-liners
       | instead. And Perl is available on something like 99.99% of all
       | *nix machines, which my own 'cut'-substitute isn't.
       | 
       | Still a good exercise for me to write it, and I assume for OP
       | too.
        
         | mongol wrote:
         | A book "Minimal Perl" used to be referred to often in these
          | discussions but I never hear about it any more. It taught
          | these kinds of tricks for command line magic.
        
         | c54 wrote:
          | I've never used Perl, but I love concise bash 1-liner wizard
         | incantations. What are some examples of things it's handy for?
        
           | atsaloli wrote:
           | See https://catonmat.net/perl-one-liners-explained-part-one
           | 
           | And https://nostarch.com/perloneliners
        
         | totalperspectiv wrote:
          | It was indeed a great exercise! Part of the motivation for me
          | was also performance-oriented. I should add some Perl
          | one-liners to the benchmarks to see where they land as well. My
         | experience is that they are usually a bit slower than awk.
        
         | FractalHQ wrote:
         | What tool would you recommend to someone who is starting out
          | and wants to learn to write nifty scripts this day and age? I'm
         | currently studying bash but there are so many scripting
         | languages that I hear about and it's hard to know what to
         | invest time into.
        
           | andrewzah wrote:
           | For a lot of tasks, posix-compliant Bash scripts are more
           | than adequate. Use Perl, Python, or Ruby (your choice) if it
           | becomes more complex (especially with state). It's worth
           | considering ones that are installed by default on most linux
           | distros.
           | 
           | There's no reason to chase X script/lang of the month. Bash
           | etc are extremely well documented and there's a very good
           | chance someone already asked how to do something similar to
           | what you're doing on stackoverflow, etc.
        
           | fragmede wrote:
           | Invest time into what you need to get your job done. Easy
            | when summarized like that, but let's dig in.
           | 
           | First consider what systems you want your skills to be
           | applicable for.
           | 
           | Do you need tools that work on many random Linux machines
           | that you have little control over? Then go with the lowest
           | common denominator - bash, and various command line tools
           | (sed,awk,grep) included with every system, and get good with
           | the subset of command line options common on all of them -
           | most likely limited by the oldest system you need to work
           | with. (There are still Windows XP and Redhat 4 systems out in
           | the wild, if you're unlucky enough to have to work with
           | them.)
           | 
           | Do you need to work with OS X at all? I never learned to use
           | Apple's outdated versions of programs, instead I heavily
           | customized my laptop to have compatible versions of things
           | but this only works because there's 1 os x machine I ever
           | deal with.
           | 
           | Then it's about the right tool for the right job. Do you want
           | to process text? Awk will take you a _long_ way, but
            | ultimately, Perl is your friend. Do you want more structured
            | programming type things (aka objects/classes)?
           | Then Python is your friend. There's a certain mindset that
           | thinks that if everything is in one language things are
           | better, but that's a trap. With enough work, you can do the
            | same thing in any language, but each language is better than
            | others at some specific thing. (Working legacy code is one
            | thing that a language can be better at than others.)
           | 
            | These days, it's more important to learn what tools are
            | available and how to use them. Because you can just google
            | 'awk print second to last column', plug that into your
            | script, and continue working, there's less of a need to
            | truly grok awk's language (for example). (I mean, spend the
            | time to learn it once so it will come back to you the next
            | time you need to do something more custom with it.)
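
For the search used as an example in that comment, the usual awk answer is `awk '{ print $(NF-1) }'`. A rough Python counterpart is sketched below; the function name `second_to_last` is an assumption for this example, and returning `None` for short lines is a choice made here, not something awk does.

```python
def second_to_last(line):
    """Return the second-to-last whitespace-separated field, or None.

    Splitting with no argument collapses runs of spaces and tabs, which
    is close to awk's default field splitting.
    """
    fields = line.split()
    return fields[-2] if len(fields) >= 2 else None

print(second_to_last("alpha beta gamma delta"))
# → gamma
```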
        
             | JulianWasTaken wrote:
             | > instead I heavily customized my laptop to have compatible
             | versions of things but this only works because there's 1 os
             | x machine I ever deal with.
             | 
              | This is all good advice, but to be fair, "heavily
              | customized" these days is nearly:
              | 
              |     brew install awk coreutils findutils gnu-tar gnu-sed gnu-which gnu-time
        
       | toastal wrote:
       | Heck
        
         | valbaca wrote:
         | > hck is a shortening of hack, a rougher form of cut.
        
       | bilalhusain wrote:
       | It is interesting to note how it compares to "choose" (also in
       | Rust) in the benchmarks.
       | 
       | single character
       | 
       |     hck           1.494 +- 0.026s
       |     hck (no-mmap) 1.735 +- 0.004s
       |     choose        4.597 +- 0.016s
       | 
       | multi character
       | 
       |     hck           2.127 +- 0.004s
       |     hck (no-mmap) 2.467 +- 0.012s
       |     choose        3.266 +- 0.011s
       | 
       | The single pass optimization trick[1] seems to be helping a lot
       | in the single-character case.
       | 
       | Of course, doing away with a pass is supposed to give 2x, and I
       | am wondering whether the regex constraint led to this
       | "side-effect".
       | 
       | [1] fast mode -
       | https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
       | https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
        
       | visarga wrote:
       | <offtopic> I have implemented a `_split` command to split a line
       | by a separator and `_stat` command that does basically `sort |
       | uniq -c | sort -nr` counting elements and sorting by frequency.
       | Really useful operations for me.
       | 
       | When my one-liners become 2-3 lines long I need to switch to a
       | regular script, but I also log all my shell commands going back
       | years and have something a bit better than `history | grep word`
       | to search it.</>
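
The `sort | uniq -c | sort -nr` pipeline described in that comment can be approximated in a few lines of Python. This is only a sketch: the name `stat` is borrowed loosely from the comment's `_stat` and is not the commenter's actual code, and items with equal counts may come out in a different order than the shell pipeline would give.

```python
from collections import Counter

def stat(lines):
    """Count identical lines and return (line, count) pairs, most frequent first.

    Trailing newlines are stripped so the function works on an open file
    or on sys.stdin as well as on a plain list of strings.
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    return counts.most_common()

for item, count in stat(["b\n", "a\n", "b\n", "c\n", "b\n", "a\n"]):
    print(count, item)
# → 3 b
# → 2 a
# → 1 c
```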
        
       | rashil2000 wrote:
       | Love seeing these modern alternatives to coreutils! Ripgrep, fd,
       | hyperfine, bat, exa, bottom, gdu, wc, sd, hexyl...
       | 
       | Yet to find a GNU 'tr' alternative though
        
         | sieste wrote:
         | > Ripgrep, fd, hyperfine, bat, exa, bottom, gdu, wc, sd,
         | hexyl...
         | 
         | Thanks for that list! Is there any place where more of these
         | "modern alternatives to coreutils" are collected?
        
           | basetensucks wrote:
           | https://github.com/ibraheemdev/modern-unix is a pretty decent
           | list.
        
         | tyingq wrote:
         | Here's tr in Perl:
         | https://metacpan.org/dist/PerlPowerTools/source/bin/tr
        
         | kristopolous wrote:
         | What would you like it to do?
        
           | rashil2000 wrote:
           | It's not like anyone absolutely needs it, I was just
           | fascinated by the recent surge in faster and more cross-
           | platform utilities.
        
       | kitd wrote:
       | Nice work!
       | 
       | I don't know whether anyone here has used Rexx. The 'parse'
       | instruction in Rexx was incredibly powerful, breaking up text by
       | field/position/delimiter and assigning to variables all in one
       | line.
       | 
       | I've often wondered if there was a command-line equivalent. Awk
       | is great but you have to 'program' the parsing spec, rather than
       | declare it.
        
         | tyingq wrote:
         | Not declarative, but Perl can do something like that.
         | 
          | Delimiters/Regex:
          | 
          |     $ perl -ne '($name,$pass,$uid,$gid,$therest)=split(/:/);print "$name $gid\n"' /etc/passwd
          |     root 0
          |     daemon 1
          |     bin 2
          |     ...
          | 
          | Fixed width:
          | 
          |     $ printf "1234XY\n5678AB" | perl -ne '($f1,$f2)=unpack("a4 a2");print "$f2 $f1\n"'
          |     XY 1234
          |     AB 5678
         | 
         | I believe Rexx's parse is fancier still, but this is reasonably
         | close.
        
         | twic wrote:
         | > Awk is great but you have to 'program' the parsing spec,
         | rather than declare it.
         | 
         | You could probably turn a declarative spec into an awk program
         | with an awk program.
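
A small sketch of what that suggestion could look like, written in Python rather than awk to keep it short: a declarative spec (a delimiter plus the columns wanted) is turned into awk program text. The function name `awk_program` and the shape of the spec are assumptions for illustration, and the quoting is deliberately simplistic.

```python
def awk_program(delimiter, columns):
    """Build awk program text that prints the given 1-based columns.

    Note: awk treats a multi-character FS as a regular expression, so
    this sketch is only safe for simple single-character delimiters.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # Escape backslashes and double quotes for the awk string literal.
    escaped = delimiter.replace("\\", "\\\\").replace('"', '\\"')
    fields = ", ".join(f"${c}" for c in columns)
    return f'BEGIN {{ FS = "{escaped}" }} {{ print {fields} }}'

# Name and gid from a passwd-style line, as in the Perl example above.
print(awk_program(":", [1, 4]))
# → BEGIN { FS = ":" } { print $1, $4 }
```

The generated text can be passed to awk as its program argument.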
        
       ___________________________________________________________________
       (page generated 2021-07-10 23:00 UTC)