[HN Gopher] Understanding Awk ___________________________________________________________________ Understanding Awk Author : todsacerdoti Score : 332 points Date : 2021-09-30 15:27 UTC (7 hours ago) (HTM) web link (earthly.dev) (TXT) w3m dump (earthly.dev) | adamgordonbell wrote: | Thanks for sharing this. I'm the author. | | When I wrote my introduction to JQ someone mentioned JQ was | tricky but super-useful like AWK. I nodded along with this, but | actually, I had no idea how Awk worked. | | So I learned how it worked and wrote this up. It is a bit long, | but if you don't know Awk that well, or at all, I think it should | get the basics across to you by going step by step through | examining the book reviews for The Hunger Games trilogy. | | Let me know what you think. And also let me know if you have any | interesting Awk one-liners to share. | choffman wrote: | I really appreciate you writing this guide. As a long time | Linux user, I've always wanted to learn AWK, but it seemed too | daunting. Three minutes into your guide and I immediately saw | how I could use it in my day-to-day usage. | adamgordonbell wrote: | Thank you! It took me longer to write then I expected it | would. I was originally just going to do some small examples | of each idea. | | But once I got the idea of aggregating the book review data | from amazon I felt I had to see it through. | foobarian wrote: | The funny thing is, by and large my only use case for awk is to | print out whitespace delimited columns where the amount of | whitespace is variable. Surprisingly hard to do with other Unix | tools. | | Neat discussions around that sort of thing at least here: | https://news.ycombinator.com/item?id=23427479 | goohle wrote: | ls -l | tr -s ' ' | cut -d ' ' -f 5 | foobarian wrote: | Exactly! Exactly! And now fix it to work with tabs :-) | tyingq wrote: | And leading whitespace. 
Compare: | $ printf " one two three" | tr -s ' ' | cut -d ' ' -f 1 | $ printf " one two three" | awk '{print $1}' one | goohle wrote: | ps ax | sed 's/^\s\+//; s/\s\+/ /g;' | cut -d ' ' -f 4 | goohle wrote: | echo -e '1\t2\t3\t4\t5' | expand -t 1 | cut -d ' ' -f 3 | tyingq wrote: | The syntax isn't nearly as nice, but Perl can be handy if | you're doing something more after splitting into columns. And | it's usually already there / installed, like awk. For just | columns: $ printf "a b c d e\n1 2 3 4 5" | | perl -lanE 'say "$F[2] $F[4]"' c e 3 5 | adamgordonbell wrote: | It surprised me that AWK had dictionaries and no | declaration of vars that make it feel like a modern | scripting language even though it was written in the 70s. | | It turns out though that this is because Perl and later | Ruby were inspired by AWK and even support these line by | line processing idioms with BEGIN and END as well. | ruby -n -a -e 'puts "#{$F[0]} #{$F[1]}"' | ruby -ne | ' BEGIN { $words = Hash.new(0) } | $_.split(/[^a-zA-Z]+/).each { |word| | $words[word.downcase] += 1 } END { | ... | flandish wrote: | A long while ago I wrote up a little processor to determine | field lengths in a given file - I forgot the original reason. | ( https://github.com/sullivant/csvinfo ) | | However, I feel I really should have taken the time to learn | Awk better as it could probably be done there, and simply! | (It was a good excuse to tinker with Rust, but that's an | aside.) | tyingq wrote: | For some idea, a one liner to find the (last) longest | username and length in /etc/passwd: $ awk | -F: '{len=length($1);if(len>=max){max=len;user=$1}}END{print | user,max}' /etc/passwd | flandish wrote: | Thanks for that reply! It's good to work with an example. | genewitch wrote: | I'll mark this on my GitHub when I get back on a computer, | I take public datasets and make graphs and transforms and | reports. The big survey companies have weird data records | and having to write a parser is my least favorite part.
I | think other people who ingest my content don't appreciate | the effort, but that's a near universal feeling I think, | heh. | adamgordonbell wrote: | choose from your link does look nice for simple column | selection. echo -e "foo bar baz" | | choose -1 -2 | | vs awks echo -e "foo bar baz" | awk '{ | print $2, $3}' | | I love the effort people are putting into reinventing the | core unix tools. | | I think I'll stick with Awk for now though. | foobarian wrote: | The problem with new tools is | | $ choose | | bash: choose: command not found... | twic wrote: | If i don't use awk, i throw tr -s ' ' into the pipeline, and | then the delimiter is a single space, so you can just cut. | kevinwang wrote: | As someone who's never used awk before, I really enjoyed this | write-up and I think it was very well written! | mousepilot wrote: | chiming in, I had a feeling that the article and the comments | here would contain some jewels and both have exceeded | expectations. | nrclark wrote: | I'm always happy when I see posts that promote AWK. It's a very | underappreciated tool in my opinion. I was a Linux user for 20 | years before I got familiar with it. AWK is super powerful for | text processing, and I like that it's included in Busybox for use | on the embedded systems that I design. | | For any complex text processing, it's way better and more robust | than having a super long pipeline of a bunch of sed/grep. | | Most recently, I used awk in a script that parses /proc/mount to | grab the mountpoint of a partition, or print something different | if the partition isn't mounted. Doable with a bunch of sed/grep | and some shell logic? Definitely. But easier and cleaner in AWK, | and equally easy to inline in a shell-script. | throwaway894345 wrote: | I do a lot of work with structured data--json, yaml, etc. For | me, this is how I feel about jq. One of my favorite use-cases | is querying Kubernetes resources. 
E.g., `kubectl get secret | <secret-name> -o json | jq -r '.data | map_values(@base64d)'` | (fetch a secret and decode all of its values). | kbenson wrote: | I've never bothered to learn much AWK, but that's mostly | because Perl is my bread and butter language and has been for | 20 years, and focusing on knowledge of that seemed a better | investment (especially since with a few judicious flags, Perl | is a passable AWK replacement even for very small one liners). | | That said, if you just want to supplement your knowledge of | other shell tools and pull out something that can do some | obvious text munging, AWK has always looked attractive for the | task to me. | chasil wrote: | The problem is that awk is in POSIX, and perl is not. | | There are two common sources of awk for Windows, for example, | that drop one exe to provide the interpreter: | | http://unxutils.sourceforge.net/ | | https://frippery.org/busybox/ | | Perl simply wasn't designed to do that. | newaccount2021 wrote: | But perl is available by default in almost every free *nix, | and for most people, Windows isn't a requirement | Lio wrote: | Yep awk is lovely and well worth the time to learn. | | This is probably not important for embedded but doesn't a | pipeline of small scripts (which could be in awk) give you | better threading support? | | Xargs, GNU parallel or even make then scale that out really | quickly. | SavantIdiot wrote: | Came here to say this. Glad to see /bin getting respect. | | To anyone processing huge quantities of text and text files, | someone very likely had the same problem you faced back in the | 1980's and there's a Unix/GNU tool for it already. | dylan604 wrote: | I was introduced to *nix from processing very large text | files that text editors I was familiar with choked and died. | Someone showed me sed/awk/grep, and it took seconds to | process when other GUI editors couldn't open the file. Never | looked back. 
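The /proc/mounts use case nrclark describes earlier in this subthread can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample mount table, the `mountpoint_of` function name and the "not mounted" message are made up here, and on a real Linux system the input would come from /proc/mounts rather than a shell variable.

```shell
# Hypothetical sample in /proc/mounts layout:
# device, mountpoint, fstype, options, dump, pass.
sample_mounts='/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data ext4 rw,relatime 0 0'

# mountpoint_of DEVICE: print the mountpoint of DEVICE,
# or "not mounted" if DEVICE does not appear in the table.
mountpoint_of() {
    printf '%s\n' "$sample_mounts" | awk -v dev="$1" '
        $1 == dev { print $2; found = 1; exit }
        END { if (!found) print "not mounted" }
    '
}

mountpoint_of /dev/sdb1
mountpoint_of /dev/sdc1
```

The pattern `$1 == dev` plus an END block replaces the grep/sed chain and the surrounding shell conditional with one short program.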
| GekkePrutser wrote: | Not having to parse the output at all is even better. I really | like the way Powershell can pass structured data like this. | | I'm a huge Linux/Unix fan but sometimes a rethink really works | out. I hope Linux will get something similar. I know Powershell | is available for Linux but without an adapted userland there's | not much benefit | invisible wrote: | For some purposes, awk+xargs can replace hours of work to write | a tool to automate some process. It's my go-to for ops work | that I don't expect to live very long and just needs to | _happen_. | | Also, happy 1337 karma day :). | 5e92cb50239222b wrote: | > awk+xargs can replace hours of work | | Including machine hours of work. | | Wasn't there a famous story of replacing a Hadoop cluster | with an awk script (which was a couple orders of magnitude | faster)? | | Oh yes, there was: | https://news.ycombinator.com/item?id=17135841 | dapids wrote: | In fairness it's xargs that is providing the command | parallelization, not awk, but I agree both combined are a | good match. | genewitch wrote: | If one considers the idea of map reduce to be taking a set | of data and ending up with a subset that is relevant, I've | used tons of simple things to do that, and never Hadoop. | | I think parsing logs to find pain areas or potential | exploit/exfil is a map reduce job, for instance, and grep | or awk can manage that just fine. | freedomben wrote: | Nice article. Seems we went through a very similar progression! 
| :-D | | If anyone is interested in learning more, I built a conference | talk to teach awk, and a set of exercises also that has gotten | pretty positive feedback: | | Presentation: https://youtu.be/43BNFcOdBlY | | Exercises (for you to try): https://github.com/FreedomBen/awk- | hack-the-planet | | Exercises (me solving): https://youtu.be/4UGLsRYDfo8 | stevebmark wrote: | There are things I've come to dislike and avoid when programming | in general: | | - Avoid programming in strings (especially in Bash, where nested | quotes are full of pitfalls) | | - Avoid magic switches that change behavior (like -F) | | - Avoid terse or cryptic variable names (like $NF) | | - Avoid terse and magical syntax (sorry Perl, happy to leave you | behind me) | | - Avoid programs that are hard to read | | - Avoid programs that are difficult to debug while writing them | | - Avoid programs that ignore types | | For these reasons, I prefer to avoid awk for anything except the | most trivial of tasks. I think the prevalence of scripting | languages and the speed of execution and debugging today has made | awk not as necessary as it may have been in the 70s. And as to | the first point, I'm aware you can write awk scripts in files, | and I feel like if your script has gotten complex enough that you | need a file, you're creating something unmaintainable and | unreadable that would be better suited in a different programming | language. | | Edit: I should add this article is great and a good introduction | to awk, regardless of my personal taste for the tool. | throwaway38941 wrote: | I've been doing systems work for 20 years. Here's why most of | those things are actually good: | | - Strings are subtly complex, but strings are not variables. | You can assign a string, and later handle it as a variable, and | not deal with any of the specifics of string-iness. Likewise, | you can take a variable, and later treat it as a string (for | loosely or not-typed variables). 
| | - Magic switches are not magic, they are options. Virtually | every program takes options. Sometimes they impact a lot of | things, sometimes a little. Only the context determines how | much is "too much". | | - Terse/cryptic variables allow you to write complex | expressions in a compact form. This allows you to read more in | a small space, making it easier to reason about or form complex | expressions. Human languages are flush with these, as is | mathematics. But you have to balance the terse, cryptic and | magical with guilelessness, or it becomes a mess. | | - Terse and magical syntax is, again, a feature, not a bug. | Using magical syntax I can do in a few characters what would | take me many lines with a traditional language, and as we all | know, increased number of lines correlates to bugs, in addition | to simply making it harder to grok. | | - Types aren't ignored, but they may be very loosely enforced. | If you want to write a quick program to get something done, | typing is a curse. If you want to write a very thorough | program, typing is a blessing. In many cases, loosely or | untyped programs actually work _better_ than their typed | cousins, because they allow for more unexpected behaviors | without failing. Failing early and often may be a modern trend, | but... it literally means things fail more, and this is often | not desirable. | | Caveats: | | - Programs that are hard to read do indeed suck, and it takes | lots of experience to make some kinds of programs easier to | read. But that's not an indictment of the program, it's an | indictment of the person who wrote it. We don't indict English | when somebody writes a document that's impossible to | comprehend. | | - Interestingly, some of the more popular languages are the | worst to debug. 
Perl is probably one of the easiest languages | to debug, not inconsequently because of how good the | interpreter is at suggesting to the user what the actual | problem was and almost exactly how to fix it. | [deleted] | jrumbut wrote: | The thing that prevents awk from being a major part of my daily | routine is that it (amazingly) has poor CSV support. Consider | the following: | | col1,col2,col3 | | 1,2,3 | | 4,"hello, \"world\"",6 | | "7 buckets",,9 | | To get the usual awk experience with this very common file | format, exactly the type of thing you want to parse with awk, | you first need to install gawk, then use a big FPAT regex that | needs to be adjusted for any new CSV variant. | | I would love to see awk with "CSV mode", where it intelligently | handles formats like this if you just pass a flag. I think awk | would do well to differentiate itself with excellent 2d dataset | parsing functionality, but at least catchup up to the average | scripting language would be great. | | I'm half expecting someone to say "just pass -csv it does what | you want" and if so I'll be very excited. | nmz wrote: | You can just use https://github.com/Nomarian/Awk- | Batteries/blob/master/Units/... and use as so | awk -f ./ucsv.awk -e '{print $5}' | | Also this | | > 4,"hello, \"world\"",6 | | Is incorrect per https://tools.ietf.org/html/rfc4180 so you | should just fix it with a sed -i 's/\\\"/""/g' and then just | parse as normal. | | https://github.com/Nomarian/Awk-Batteries/wiki/Formats | sk5t wrote: | 'miller' and 'xsv' are pretty good tools for wrangling CSV. | (And regexp is kind of a terrible tool for it, too many edge | cases.) | jrumbut wrote: | Yeah, I don't want to have to write a CSV library each | time, that's what I'm trying to get at. | | I just end up using Python/Perl but I do have a soft spot | for awk so it would be cool if good support was built-in. | sk5t wrote: | Who's writing a library? 
Just use xsv or miller to | extract the bits you want from the CSV, change the | delimiter or escapes to something more convenient, etc., | then feed that to awk or other CSV-unaware text | processors. | jrumbut wrote: | I was agreeing with your point about regexes, that it's | good to avoid trying to deal with all the corner cases | yourself when you're just trying to write a small script. | sk5t wrote: | Ah, understood! CSV is funny, it seems like a more | trivial thing than it really is, and its human | readability sort of invites broken approaches in a way | that something like Parquet would not. | | XML is somewhere in the middle--I've seen some horrible | abuses of CDATA sections way back when--but at least | there are accepted ways to prove what's invalid. | nickcw wrote: | There is an answer to CSV mode a bit further down the page | | https://news.ycombinator.com/item?id=28708145 | | ...but if your files are CSV, there is a CSV extension for | gawk | @include "csv" | BEGIN { CSVMODE = 1 } | jrumbut wrote: | Well there you go, for the sake of my pride at least it's | an extension. | | It's funny, searches for awk CSV seem to yield a bunch of SO | questions where the answers are increasingly cumbersome | regexes instead of this extension. | | Of course, you can't count on this extension being widely | installed, but it's great for my own desktop. | nmz wrote: | that's because the extension only works in gawk. it's not | portable anywhere else. | [deleted] | m463 wrote: | I use awk for one-liners, no more. | | Looking at my command history, I mostly use awk to extract a | field like this: <something> | awk '{print $3}' | | (I know "cut" is supposed to do the same thing, but it was | never reliable for me - maybe tabs/spaces?) | likpok wrote: | Consider the input a b | | Awk will treat it as having two columns (by default), while | cut will treat each space as its own column.
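A minimal sketch of likpok's point, assuming the input line has two spaces between `a` and `b` (the sample line and the variable names are illustrative, not from the thread):

```shell
# Assumed sample line: "a", two spaces, "b".
line='a  b'

# awk's default splitting collapses runs of blanks, so it sees 2 fields.
awk_fields=$(printf '%s\n' "$line" | awk '{ print NF }')

# cut -d ' ' treats every single space as a delimiter:
# field 1 is "a", field 2 is empty, field 3 is "b".
cut_second=$(printf '%s\n' "$line" | cut -d ' ' -f 2)
cut_third=$(printf '%s\n' "$line" | cut -d ' ' -f 3)

echo "awk fields: $awk_fields; cut field 2: '$cut_second'; cut field 3: '$cut_third'"
```

This is why `tr -s ' '` tends to show up in front of `cut` elsewhere in the thread: it squeezes the repeated spaces so that cut's one-delimiter-per-field rule lines up with what awk does by default.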
| | Awk is also a little nicer for whitespace; cut makes | specifying the delimiter (with say "-d\ ") a little more | vexing. | chasil wrote: | Here is a GAWK program of mine that implements outgoing SMTP. | While not a one-liner, this is much shorter and less tedious | than trying to do it in C. $ cat | /bin/awkmail #!/bin/gawk -f BEGIN { | smtp="/inet/tcp/0/smtp.yourco.com/25"; ORS="\r\n"; | r=ARGV[1]; s=ARGV[2]; sbj=ARGV[3]; # /bin/awkmail to from | subj < in print "helo " ENVIRON["HOSTNAME"] | |& smtp; smtp |& getline j; print j print | "mail from: " s |& smtp; smtp |& getline | j; print j if(match(r, ",")) { | split(r, z, ",") for(y in z) { print "rcpt to: " | z[y] |& smtp; smtp |& getline j; print j } } | else { print "rcpt to: " r |& smtp; smtp |& | getline j; print j } print "data" | |& smtp; smtp |& getline j; print j print | "From: " s |& smtp; ARGV[2] = "" # | not a file print "To: " r | |& smtp; ARGV[1] = "" # not a file if(length(sbj)) | { print "Subject: " sbj |& smtp; ARGV[3] = "" } # not a | file print "" |& smtp | while(getline > 0) print |& smtp | print "." |& smtp; smtp |& | getline j; print j print "quit" | |& smtp; smtp |& getline j; print j close(smtp) | } # /inet/protocol/local-port/remote-host/remote-port | meltedcapacitor wrote: | Cheap fix: the space after MAIL FROM: and RCPT TO: is not | standard compliant. | m463 wrote: | IMHO, that's too big for awk, why not python? | | for example: #!/usr/bin/python | import smtplib from email.mime.text import MIMEText | msg = 'hi' subj='read this!' | smtp_server='mail.example.com' | smtp_from='me@example.com' | smtp_to='you@example.com' m = MIMEText(msg) | m['To'] = smtp_to m['From'] = smtp_from | m['Subject'] = subj s = | smtplib.SMTP(smtp_server) s.sendmail(smtp_from, | [smtp_to], m.as_string()) s.quit() | | of course, you seem to think in gawk so if that works for | you that's what you should continue doing! 
| | by the way, I hacked this example from another script which | attached a logfile: with | open(arg.logfile) as f: log_contents = f.read() | m = MIMEText(log_contents) | | you can also use: from email.mime.image | import MIMEImage from email.mime.text import | MIMEText from email.mime.multipart import | MIMEMultipart | | and then: m = MIMEMultipart() | m.attach(MIMEText('\n\n%s\n\n'%xkcd_img_title)) | m.attach(MIMEImage(xkcd_img)) | chousuke wrote: | Your script doesn't even do the same thing. You are | importing a library that implements SMTP, which is | missing the point. | | The AWK script doesn't need libraries, so it can actually | be useful in places where you have awk but not Python. | jrumbut wrote: | That's a beautiful use of the language, it reminds me of | some of the awk CGI efforts out there. | | For example: https://www.gnu.org/software/gawk/manual/gawki | net/html_node/... | ChuckMcM wrote: | I take it you LOVE ada :-) | | There is a lot of wisdom in the things you avoid, however I | would ask one question, "How often do you use it?" | | For me, the best systems are those that can be wordy and | prescriptive but as you get to know them you can use more short | hand so they "get out of the way" as it were. A good example of | that philosophy is keyboard short cuts. When I'm learning a | program I'm happy to pause and sling the mouse around to find | the thing I need in the labeled menu stack with an appropriate | name which also tells me what the keyboard short cut is for | that thing. Then as I get better I can just use the short cut | and my workflow gets faster. Once I've internalized the keymap | my flow is held up by how fast I can think, not by how fast I | can take my hand off the keyboard, move the mouse, click and | then put it back on the keyboard. | | Awk is one of those things that once you internalize what it | can do, you can use it for a lot of stuff, and you can do it | quickly. 
| ketanmaheshwari wrote: | One tip I have to make large-ish awk programs readable is to name | the columns in the BEGIN section. Then, you'd use $colname | instead of $1, $2, etc. For instance: | | BEGIN { item_type = 1; item_name = 2; price = 3; sale = 4 } # etc. | | Now, in place of $1, you'd say $item_type which significantly | improves overall readability of the code. | jayknight wrote: | I've also used this to address columns by name for files with | lots of columns that I'm too lazy to count: | https://unix.stackexchange.com/a/359699 | dredmorbius wrote: | You can also put a similar code block at the start of a general | processing entry. This applies to both flat (uniform record) | and hierarchical (multiple record-type) data. | | E.g.: | { | name = $1 | dob = $2 | grade = $3 | # ... | # Do stuff with name / dob / grade, etc. | } | | If the data are structured, so that there are multiple record | types (typically defined by prefix or some other regex) you can | put variable assignments within each block. | /^rectype1/ { var1 = $1; var2 = $2; ... } | /^rectype2/ { varA = $1; varB = $2; ... } | | I prefer to leave BEGIN blocks for defining constants or tables | and such. | ulucs wrote: | Nice tip, so basically like Excel with tables | dima55 wrote: | If you want to do that, use vnlog instead. You're 90% there | already. | | https://github.com/dkogan/vnlog/ | ufo wrote: | One thing that I would love to hear about is suggestions of how | to make my files/output more awk-friendly. | adamgordonbell wrote: | This isn't your question but if your files are CSV, there is a | CSV extension for gawk | @include "csv" | BEGIN { CSVMODE = 1 } | tejtm wrote: | Tab separated values all the things | buzzwords wrote: | Thanks for this tutorial and everyone else that posted some great | tips and links. I find myself needing to use awk once in a blue | moon and every time it eats a lot of my time. I hope I remember | your tutorial next time I need it.
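The column-naming tip at the top of this subthread can be sketched as follows, assuming a whitespace-separated input with the four columns named in that comment. The sample rows and the `revenue` variable are made up for illustration, and treating `sale` as a quantity sold is an assumption, since the comment does not say what the column holds.

```shell
# Made-up sample rows: item_type item_name price sale
rows='tool hammer 10 2
toy robot 5 3'

revenue=$(printf '%s\n' "$rows" | awk '
    BEGIN { item_type = 1; item_name = 2; price = 3; sale = 4 }
    # The variables hold column numbers, so $price is $3 and $sale is $4.
    { total += $price * $sale }
    END { print total }
')
echo "$revenue"
```

The trick works because `$` in awk is an operator applied to an expression, so `$price` means "the field whose number is stored in `price`".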
| jrochkind1 wrote: | This is a great model of how to do a tutorial. | 1vuio0pswjnm7 wrote: | It's common, as in the OP, to see awk recommended for something as | simple as extracting a column from tab or space-separated values. | IMO, it's quite a bit of typing to do on the fly at a command | prompt. Performance-wise, it could be significantly slower than | other utilities that are as ubiquitous as awk. | echo one two three|awk '{print $2}' | | Are there other ways to do this? Are they faster? | cat > awc | #!/bin/sh | test $# -eq 1||exit | exec tr \\40 \\11|exec cut -f "$1"|exec tr \\11 \\40 | ^D | echo one two three|awc 2 | | Test it on a file to see if it is faster than awk. | time awk '{print $2}' file | time awc 2 < file | fmakunbound wrote: | For those kinds of tasks I use Awk to process the data into a | SQLite database. Then I do the queries on that since it's easier | and more advanced things (grouping, having) are much easier | declaratively. | mongol wrote: | Yes! Another recent thread discussed best practice and | whether something like that exists. I believe this is a good | example. | iefbr14 wrote: | It's awksome :) | bright_day wrote: | kkkkkk | calvinmorrison wrote: | Can't recommend the gawk manual enough, and "The awk manual" | enough | | https://www.gnu.org/software/gawk/manual/gawk.pdf | | and | | http://www.cs.unibo.it/~sacerdot/doc/awk/nawkA4.pdf | | enough | chasil wrote: | The original language specification, written by the authors, is | now free online. Chapter 2 covers the whole language in a | little over 40 pages. | | https://archive.org/download/pdfy-MgN0H1joIoDVoIC7/The_AWK_P... | calvinmorrison wrote: | Have a copy on my bookshelf! Didn't have a PDF though, nice. | | The gawk one is useful if you're into some of the gnuism | specifics | dredmorbius wrote: | Severely underrated comment.
| | Having relied heavily on the (unofficial, non-GNU) gawk manpage | (it's quite good), I instantly started learning | very useful features reading the GNU docs. (I still need to | fully internalise those). Yes, the full manual is very much | better than the manpage. | | (Also recommend _The AWK Programming Language_ mentioned here, | though I'd suggest the GNU manual adds to that as well.) | corpMaverick wrote: | I find it amusing that AWK is coming back. I used it extensively | back in the day, but let it go when I picked up Perl 4 and then | Perl 5. So Perl is no longer king for unix scripting. It was | replaced by other languages; but it seems like there is a niche | that they were not able to fill since AWK is back. | xphos wrote: | This was one of the best awk tutorials I've read; it's very concise | and digestible. I sometimes use awk but the more complex things | get the more I feel like I cannot use it. This tutorial made me | feel otherwise | ChuckMcM wrote: | And if you learned awk(1) first, then when you saw perl for the | first time it immediately made sense to you as a 'super awk'. | abzug wrote: | That happened to me. AWK -> Perl -> Ruby. | theophrastus wrote: | At some point in every bioinformatics lecture I always manage | something akin to: "Learn awk! (or perl) You'll need it. Your | data will come from various disparate sources, and you need to | get them into some well-defined useful format from the get go." | cafard wrote: | Thanks! I had been putting it off, but after looking at the | article, I wrote a little but useful script with a line of awk in | it. | naikrovek wrote: | so this isn't related to the article so much, but to something | the article reminded me about: why do people use /usr/bin/env to | find a program rather than setting the PATH within the script to | a known-good value then using that to locate things?
| | The path that /usr/bin/env returns is (essentially) a global | variable that can change underneath you, right? I mean that just | screams "variable that may be changed by others" to me. | | I've never understood why /usr/bin/env exists. | dredmorbius wrote: | Portability. | | The /usr/bin/env trick will work on a wide range of systems, in | which even common utilities might have numerous locations: | /bin, /sbin, /usr/bin, /usr/local/bin, /opt, or others. If | you're writing scripts for portability and others, this has | value. | | That said, /usr/bin/env fails on Android/Termux AFAIU. | dmux wrote: | I've never gone further than thinking about it, but I've always | been curious as to how simple it would be to use Awk as an | interpreter for a really simple Tcl-like language: | set a 1 | set b 2 | define add (n,m) $n + $m | set result [add a b] | | I think it would be simple enough to come up with some Awk | pattern/actions to parse the above and execute the commands. | Stratoscope wrote: | I used to love Awk! I still do, even if I don't use it much any | more. | | Awk has a reputation for being hard to read (as noted in | stevebmark's comment), but when I was using it actively, I tried | to treat it as a serious programming language and write readable | programs in it. | | Several years ago I tracked down a couple of my old Awk programs | from around 1990 and posted them here: | | https://github.com/geary/awk | | SHANEY.AWK is an implementation of the infamous Mark V. Shaney: | | https://www.clear.rice.edu/comp200/09fall/textriff/sci_am_pa... | | This was probably the first program that made me really impressed | with Awk. People were writing rather complicated Shaney | implementations in C, and I thought, "this could be really simple | in Awk." And it was! | | LJPII.AWK is the Awk program I'm most proud of. This was in the | days when we had tiny screens and no multiple monitors and you | always printed out your code to read it.
In my circles we were also | fond of inserting "separator lines" between functions, in various | formats such as this one: | // - - - - - - - - - - - - - - - - - - | | So I wrote LJPII to print source code in "two up" format (two | pages side by side in landscape mode) on my LaserJet II. It also | converted the separator lines into graphical boxes, and tried to | avoid splitting a function across multiple pages. It wasted some | paper but made nicely readable printouts. | | I wish I still had some of my old printouts, but they are long | gone. One of these days I will have to see if I can update the | code to work with the LaserJet emulation in my Brother printer! | (It should mostly work, but I wrote this in the old Thompson Awk | for DOS, so there are a couple of non-standard things in it.) | | Looking at the code again, it's amusing to see some old Windows | Hungarian notation which was popular/notorious back then, for | example an "f" prefix for a boolean (flag) value, and "af" prefix | for an array of flags. | | Hungarian aside, I tried to make this code as readable as I | could. | | Random fun fact! Someone who used to be an avid Awk programmer is | Will Hearst (William Randolph Hearst III). It's been many years | since I talked with him, so no idea if he still does any Awk | programming. | whymarrh wrote: | "If you like this you might also like" | https://ferd.ca/awk-in-20-minutes.html | | I too am happy to see more Awk material in the world, once I | learned a bit about it I started reaching for it more and more. | MisterTea wrote: | > _Awk is a record processing tool_ | | Actually, AWK is a domain specific programming language. When you | start treating AWK as such then you can really gain an | appreciation for it. I too treated it as a dumb one liner | relegated to ingesting cryptic regexp one liners in shell | scripts. After reading the original AWK book it completely | changed my outlook on the language.
I had no idea you could | define functions or perform basic math so one could use it for | very basic tabular operations such as spreadsheets. AWK can even | be used as a standalone language outside of shell scripts by | writing a program, inserting a shebang on the first line calling | awk, and marking the file as executable. | adamgordonbell wrote: | Shebangs and more complex scripts are covered in the article. | | But yes, I agree that the original AWK book is really good. | After covering some basics and the language reference, it has | some fun projects that you can build with AWK. | EvanKelly wrote: | Lots of great AWK tutorials in here that are more in depth, but | I'll share another. I always go back to Brian Kernighan's | personal help file: | | https://www.cs.princeton.edu/courses/archive/spring19/cos333... | | Brian Kernighan has a knack for explaining languages very | precisely and elegantly. | cf100clunk wrote: | And for the flash card type of learners it is good to see the | "HANDY ONE-LINE SCRIPTS FOR AWK" page is still available. See | the links in the Credits section at the bottom for more great | reading: | | https://www.pement.org/awk/awk1line.txt | | That author also edited the "USEFUL ONE-LINE SCRIPTS FOR SED" | page: | | https://www.pement.org/sed/sed1line.txt | zabzonk wrote: | Well, this is OK I guess. But if you really want to learn Awk you | want the book "The AWK Programming Language", mostly written by | Brian Kernighan (he's the K in AWK and in K&R), and as usual for | all of his books, it's brilliant. | dang wrote: | Significant past threads. I had to leave a ton of submissions | out! Any others that are particularly good?
| | _Awk: The Power and Promise of a 40-Year-Old Language_ - | https://news.ycombinator.com/item?id=28441887 - Sept 2021 (118 | comments) | | _Awk is the coolest tool you don't know_ - | https://news.ycombinator.com/item?id=27039608 - May 2021 (20 | comments) | | _CGI with Awk on OpenBSD Httpd (2020)_ - | https://news.ycombinator.com/item?id=27037113 - May 2021 (22 | comments) | | _The State of the Awk_ - | https://news.ycombinator.com/item?id=25142867 - Nov 2020 (58 | comments) | | _Awk: `Begin { ` Part 1_ - | https://news.ycombinator.com/item?id=24940661 - Oct 2020 (106 | comments) | | _Show HN: Awk-JVM - A toy JVM in Awk_ - | https://news.ycombinator.com/item?id=23612910 - June 2020 (27 | comments) | | _Running Awk in parallel to process 256M records_ - | https://news.ycombinator.com/item?id=23394024 - June 2020 (101 | comments) | | _The State of the AWK_ - | https://news.ycombinator.com/item?id=23240800 - May 2020 (86 | comments) | | _Awk in 20 Minutes (2015)_ - | https://news.ycombinator.com/item?id=23048054 - May 2020 (126 | comments) | | _Show HN: An eBook with hundreds of GNU Awk one-liners_ - | https://news.ycombinator.com/item?id=22758217 - April 2020 (48 | comments) | | _Learn Awk by Example (2019)_ - | https://news.ycombinator.com/item?id=22455779 - March 2020 (29 | comments) | | _Awk As A Major Systems Programming Language, Revisited (2018)_ | - https://news.ycombinator.com/item?id=22304017 - Feb 2020 (80 | comments) | | _Why Learn Awk?
(2016)_ - | https://news.ycombinator.com/item?id=22108680 - Jan 2020 (235 | comments) | | _Learn Just a Little Awk (2010)_ - | https://news.ycombinator.com/item?id=21101478 - Sept 2019 (69 | comments) | | _Awk by Example_ - https://news.ycombinator.com/item?id=20308865 | - June 2019 (21 comments) | | _Removing duplicate lines from files keeping the original order | with Awk_ - https://news.ycombinator.com/item?id=20037366 - May | 2019 (154 comments) | | _GNU Awk 5.0_ - https://news.ycombinator.com/item?id=19671983 - | April 2019 (49 comments) | | _Learn just a little Awk (2010)_ - | https://news.ycombinator.com/item?id=17322412 - June 2018 (244 | comments) | | _The Awk Programming Language (1988) [pdf]_ - | https://news.ycombinator.com/item?id=17140934 - May 2018 (207 | comments) | | _Learn to use Awk with hundreds of examples_ - | https://news.ycombinator.com/item?id=15549318 - Oct 2017 (116 | comments) | | _Awk for multimedia_ - | https://news.ycombinator.com/item?id=15410259 - Oct 2017 (24 | comments) | | _Awk driven IoT_ - https://news.ycombinator.com/item?id=14735752 | - July 2017 (35 comments) | | _Skip grep, use awk_ - | https://news.ycombinator.com/item?id=14692233 - July 2017 (130 | comments) | | _Awk vs. Perl (2009)_ - | https://news.ycombinator.com/item?id=14647022 - June 2017 (71 | comments) | | _The Awk Programming Language (1988) [pdf]_ - | https://news.ycombinator.com/item?id=13451454 - Jan 2017 (103 | comments) | | _Show HN: 3D shooter in your terminal using raycasting in Awk_ - | https://news.ycombinator.com/item?id=10896901 - Jan 2016 (55 | comments) | | _Awk in 20 Minutes_ - | https://news.ycombinator.com/item?id=8893302 - Jan 2015 (85 | comments) | | _An Awk Primer_ - https://news.ycombinator.com/item?id=7961848 - | June 2014 (28 comments) | | _A Crash Course In Awk_ - | https://news.ycombinator.com/item?id=6578960 - Oct 2013 (37 | comments) | | _Why Awk for AI? 
(1997)_ - | https://news.ycombinator.com/item?id=5725291 - May 2013 (53 | comments) | | _Ask HN: Do people build websites in Awk?_ - | https://news.ycombinator.com/item?id=5041323 - Jan 2013 (12 | comments) | | _Why you should learn just a little Awk - A Tutorial by Example_ | - https://news.ycombinator.com/item?id=2932450 - Aug 2011 (76 | comments) | | _Announcing my first e-book "Awk One-Liners Explained"_ - | https://news.ycombinator.com/item?id=2674284 - June 2011 (24 | comments) | | _AWK-ward Ruby_ - https://news.ycombinator.com/item?id=2486231 - | April 2011 (31 comments) | | _Music with AWK_ - https://news.ycombinator.com/item?id=2294909 | - March 2011 (15 comments) | | _Exercise #1: Learning awk Basics_ - | https://news.ycombinator.com/item?id=2210085 - Feb 2011 (20 | comments) | | _Why you should learn at least a little bit of Awk_ - | https://news.ycombinator.com/item?id=1738688 - Sept 2010 (62 | comments) | | _Don't MAWK AWK - the fastest and most elegant big data munging | language_ - https://news.ycombinator.com/item?id=815529 - Sept | 2009 (22 comments) ___________________________________________________________________ (page generated 2021-09-30 23:00 UTC)
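A minimal sketch of MisterTea's point earlier in the thread: an Awk program can define its own functions, do basic arithmetic, and run as a standalone executable through a shebang line. The file name sum.awk, the avg() function, and the three sample rows are made up for illustration, and the shebang path is looked up with `command -v` because awk's location varies between systems.

```shell
# Hypothetical demo: sum.awk, avg() and the sample rows are invented for illustration.
tmp=$(mktemp -d)

# Write the shebang with whatever awk is first on PATH (its location varies by system).
printf '#!%s -f\n' "$(command -v awk)" > "$tmp/sum.awk"

# Append the program body; the quoted 'EOF' keeps the shell from expanding $2.
cat >> "$tmp/sum.awk" <<'EOF'
# User-defined function: average, guarding against division by zero.
function avg(sum, count) { return count ? sum / count : 0 }

# Runs for every input line: add up the second whitespace-separated column.
{ total += $2; n++ }

# Runs once after the last line.
END { printf "%d rows, total %d, average %.1f\n", n, total, avg(total, n) }
EOF

# Mark the file executable so it can be invoked directly, without typing "awk -f".
chmod +x "$tmp/sum.awk"

out=$(printf 'a 10\nb 20\nc 30\n' | "$tmp/sum.awk")
echo "$out"   # prints: 3 rows, total 60, average 20.0
```

On Linux, everything after the interpreter path on a shebang line is passed as a single argument, so `-f` is about all that fits there portably.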