[HN Gopher] Show HN: Hck - a fast and flexible cut-like tool
___________________________________________________________________
Show HN: Hck - a fast and flexible cut-like tool

Author : totalperspectiv
Score  : 101 points
Date   : 2021-07-10 15:46 UTC (7 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| queuebert wrote:
| Yay, no more piping multiple cuts when you have multiple
| delimiters.
| lillesvin wrote:
| I wrote something similar (but never really finished it), called
| 'gut', in Go a few years back. Funny thing is that I literally
| never use it. I thought splitting on regexes and that stuff would
| be super useful, but it turns out that I just use Perl one-liners
| instead. And Perl is available on something like 99.99% of all
| *nix machines, which my own 'cut'-substitute isn't.
|
| Still a good exercise for me to write it, and I assume for OP
| too.
| mongol wrote:
| A book "Minimal Perl" used to be referred to often in these
| discussions but I never hear about it any more. It was teaching
| these kinds of tricks for command line magic.
| c54 wrote:
| I've never used Perl, but I love concise bash 1-liner wizard
| incantations. What are some examples of things it's handy for?
| atsaloli wrote:
| See https://catonmat.net/perl-one-liners-explained-part-one
|
| And https://nostarch.com/perloneliners
| totalperspectiv wrote:
| It was indeed a great exercise! Part of the motivation for me
| was also performance oriented. I should add some Perl
| one-liners to the benchmarks to see where they land as well. My
| experience is that they are usually a bit slower than awk.
| FractalHQ wrote:
| What tool would you recommend to someone who is starting out
| and wants to learn to write nifty scripts in this day and age?
| I'm currently studying bash but there are so many scripting
| languages that I hear about and it's hard to know what to
| invest time into.
| andrewzah wrote:
| For a lot of tasks, POSIX-compliant Bash scripts are more
| than adequate. Use Perl, Python, or Ruby (your choice) if it
| becomes more complex (especially with state). It's worth
| considering ones that are installed by default on most Linux
| distros.
|
| There's no reason to chase X script/lang of the month. Bash
| etc. are extremely well documented and there's a very good
| chance someone already asked how to do something similar to
| what you're doing on Stack Overflow, etc.
| fragmede wrote:
| Invest time into what you need to get your job done. Easy
| when summarized like that, but let's dig in.
|
| First consider what systems you want your skills to be
| applicable for.
|
| Do you need tools that work on many random Linux machines
| that you have little control over? Then go with the lowest
| common denominator - bash, and various command line tools
| (sed, awk, grep) included with every system, and get good with
| the subset of command line options common on all of them -
| most likely limited by the oldest system you need to work
| with. (There are still Windows XP and Red Hat 4 systems out in
| the wild, if you're unlucky enough to have to work with
| them.)
|
| Do you need to work with OS X at all? I never learned to use
| Apple's outdated versions of programs, instead I heavily
| customized my laptop to have compatible versions of things
| but this only works because there's 1 OS X machine I ever
| deal with.
|
| Then it's about the right tool for the right job. Do you want
| to process text? Awk will take you a _long_ way, but
| ultimately, Perl is your friend. Do you want more
| structured programming type things (aka objects/classes)?
| Then Python is your friend. There's a certain mindset that
| thinks that if everything is in one language things are
| better, but that's a trap. With enough work, you can do the
| same thing in any language, but each language is better than
| others at some specific thing.
| (Working legacy code is something that a language can be
| better at than others.)
|
| These days, it's more important to learn what tools are
| available and how to use them. Because you can just
| google 'awk print second to last column', plug that into
| your script, and continue working, there's less of a need to
| truly grok awk's language (for example). (I mean, spend the
| time to learn it once so it will come back to you the next
| time you need to do something more custom with it.)
| JulianWasTaken wrote:
| > instead I heavily customized my laptop to have compatible
| versions of things but this only works because there's 1 OS
| X machine I ever deal with.
|
| This is all good advice, but to be fair, "heavily
| customized" these days is nearly:
|
|     brew install awk coreutils findutils gnu-tar gnu-sed gnu-which gnu-time
| toastal wrote:
| Heck
| valbaca wrote:
| > hck is a shortening of hack, a rougher form of cut.
| bilalhusain wrote:
| It is interesting to note how it compares to "choose" (also in
| Rust) in the benchmarks.
|
| single character
|     hck              1.494 +- 0.026s
|     hck (no-mmap)    1.735 +- 0.004s
|     choose           4.597 +- 0.016s
|
| multi character
|     hck              2.127 +- 0.004s
|     hck (no-mmap)    2.467 +- 0.012s
|     choose           3.266 +- 0.011s
|
| The single pass optimization trick[1] seems to be helping a lot
| in the single character case.
|
| Of course, doing away with a pass is supposed to give 2x, and I
| am wondering whether the regex constraint led to this
| "side-effect".
|
| [1] fast mode -
| https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
| https://github.com/sstadick/hck/blob/master/src/lib/core.rs#...
| visarga wrote:
| <offtopic> I have implemented a `_split` command to split a line
| by a separator and a `_stat` command that basically does
| `sort | uniq -c | sort -nr`, counting elements and sorting by
| frequency. Really useful operations for me.
|
| When my one-liners become 2-3 lines long I need to switch to a
| regular script, but I also log all my shell commands going years
| back and have something a bit better than `history | grep word`
| to search it.</>
| rashil2000 wrote:
| Love seeing these modern alternatives to coreutils! Ripgrep, fd,
| hyperfine, bat, exa, bottom, gdu, wc, sd, hexyl...
|
| Yet to find a GNU 'tr' alternative though.
| sieste wrote:
| > Ripgrep, fd, hyperfine, bat, exa, bottom, gdu, wc, sd,
| hexyl...
|
| Thanks for that list! Is there any place where more of these
| "modern alternatives to coreutils" are collected?
| basetensucks wrote:
| https://github.com/ibraheemdev/modern-unix is a pretty decent
| list.
| tyingq wrote:
| Here's tr in Perl:
| https://metacpan.org/dist/PerlPowerTools/source/bin/tr
| kristopolous wrote:
| What would you like it to do?
| rashil2000 wrote:
| It's not like anyone absolutely needs it, I was just
| fascinated by the recent surge in faster and more
| cross-platform utilities.
| kitd wrote:
| Nice work!
|
| I don't know whether anyone here has used Rexx. The 'parse'
| instruction in Rexx was incredibly powerful, breaking up text by
| field/position/delimiter and assigning to variables all in one
| line.
|
| I've often wondered if there was a command-line equivalent. Awk
| is great but you have to 'program' the parsing spec, rather than
| declare it.
| tyingq wrote:
| Not declarative, but Perl can do something like that.
|
| Delimiters/Regex:
|
|     $ perl -ne '($name,$pass,$uid,$gid,$therest)=split(/:/);print "$name $gid\n"' /etc/passwd
|     root 0
|     daemon 1
|     bin 2
|     ...
|
| Fixed width:
|
|     $ printf "1234XY\n5678AB" | perl -ne '($f1,$f2)=unpack("a4 a2");print "$f2 $f1\n"'
|     XY 1234
|     AB 5678
|
| I believe Rexx's parse is fancier still, but this is reasonably
| close.
| twic wrote:
| > Awk is great but you have to 'program' the parsing spec,
| rather than declare it.
|
| You could probably turn a declarative spec into an awk program
| with an awk program.
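
The "declare it rather than program it" idea from the last few comments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not Rexx's parse instruction and not anything hck implements. The `{name}` and `{name:width}` template syntax and the helper names `compile_template` and `parse` are hypothetical, invented for this example. The template is compiled once into a regular expression, so the field layout is declared instead of hand-coded.

```python
import re

# Hypothetical template syntax (an assumption for this sketch):
#   {name}    a field that runs up to the next literal text
#   {name:4}  a fixed-width field of exactly 4 characters
_FIELD = re.compile(r"\{(\w+)(?::(\d+))?\}")


def compile_template(template):
    """Compile a declarative field template into a regex with named groups."""
    fields = list(_FIELD.finditer(template))
    pieces = []
    pos = 0
    for i, field in enumerate(fields):
        # Literal text between placeholders acts as the delimiter.
        pieces.append(re.escape(template[pos:field.start()]))
        name, width = field.group(1), field.group(2)
        if width:
            body = ".{%s}" % width      # fixed-width field
        elif i == len(fields) - 1 and field.end() == len(template):
            body = ".*"                 # last field takes the remainder
        else:
            body = ".*?"                # stop at the next delimiter
        pieces.append("(?P<%s>%s)" % (name, body))
        pos = field.end()
    pieces.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(pieces) + "$")


def parse(template, line):
    """Return a dict of field name -> text, or None if the line does not fit."""
    match = compile_template(template).match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Delimited fields, mirroring the /etc/passwd example in the thread.
rec = parse("{name}:{password}:{uid}:{gid}:{rest}",
            "root:x:0:0:root:/root:/bin/bash")
print(rec["name"], rec["gid"])   # prints: root 0

# Fixed-width fields, mirroring the unpack("a4 a2") example.
rec = parse("{f1:4}{f2:2}", "1234XY")
print(rec["f2"], rec["f1"])      # prints: XY 1234
```

Compiling to a regex keeps the sketch short; a real tool would likely want a dedicated splitter for speed, which is the kind of trade-off the benchmarks in the thread are about.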
___________________________________________________________________
(page generated 2021-07-10 23:00 UTC)