tshocco - tomb - the crypto undertaker
 (HTM) git clone git://parazyd.org/tomb.git
       ---
       tshocco (16974B)
       ---
            1 #!/bin/sh
            2 # **shocco** is a quick-and-dirty, literate-programming-style documentation
            3 # generator written for and in __POSIX shell__. It borrows liberally from
            4 # [Docco][do], the original Q&D literate-programming-style doc generator.
            5 #
            6 # `shocco(1)` reads shell scripts and produces annotated source documentation
            7 # in HTML format. Comments are formatted with Markdown and presented
            8 # alongside syntax highlighted code so as to give an annotation effect. This
            9 # page is the result of running `shocco` against [its own source file][sh].
           10 #
           11 # shocco is built with `make(1)` and installs under `/usr/local` by default:
           12 #
           13 #     git clone git://github.com/rtomayko/shocco.git
           14 #     cd shocco
           15 #     make
           16 #     sudo make install
           17 #     # or just copy 'shocco' wherever you need it
           18 #
           19 # Once installed, the `shocco` program can be used to generate documentation
           20 # for a shell script:
           21 #
           22 #     shocco shocco.sh
           23 #
           24 # The generated HTML is written to `stdout`.
           25 #
           26 # [do]: http://jashkenas.github.com/docco/
           27 # [sh]: https://github.com/rtomayko/shocco/blob/master/shocco.sh#commit
           28 
           29 # Usage and Prerequisites
           30 # -----------------------
           31 
           32 # The most important line in any shell program.
           33 set -e
           34 
           35 # There are many ways to do usage messages in shell scripts.
           36 # This is my favorite: you write the usage message in a comment --
           37 # typically right after the shebang line -- *BUT*, use a special comment prefix
           38 # like `#/` so that it's easy to pull these lines out.
           39 #
           40 # This also illustrates one of shocco's corner features. Only comment lines
           41 # padded with a space are considered documentation. A `#` followed by any
           42 # other character is considered code.
           43 #
           44 #/ Usage: shocco [-t <title>] [<source>]
           45 #/ Create literate-programming-style documentation for shell scripts.
           46 #/
           47 #/ The shocco program reads a shell script from <source> and writes
           48 #/ generated documentation in HTML format to stdout. When <source> is
           49 #/ '-' or not specified, shocco reads from stdin.
           50 
           51 # This is the second part of the usage message technique: `grep` yourself
           52 # for the usage message comment prefix and then cut off the first few
           53 # characters so that everything lines up.
           54 expr -- "$*" : ".*--help" >/dev/null && {
           55     grep '^#/' <"$0" | cut -c4-
           56     exit 0
           57 }
           58 
           59 # A custom title may be specified with the `-t` option. We use the filename
           60 # as the title if none is given.
           61 test "$1" = '-t' && {
           62     title="$2"
           63     shift;shift
           64 }
           65 
           66 # Next argument should be the `<source>` file. Grab it, and use its basename
           67 # as the title if none was given with the `-t` option.
           68 file="$1"
           69 : ${title:=$(basename "$file")}
           70 
           71 # These are replaced with the full paths to real utilities by the
           72 # configure/make system.
           73 MARKDOWN='/usr/bin/markdown_py'
           74 PYGMENTIZE='/usr/bin/pygmentize'
           75 
           76 # On GNU systems, csplit doesn't elide empty files by default:
           77 CSPLITARGS=$( (csplit --version 2>/dev/null | grep -i gnu >/dev/null) && echo "--elide-empty-files" || true )
           78 
           79 # We're going to need a `markdown` command to run comments through. This can
           80 # be [Gruber's `Markdown.pl`][md] (included in the shocco distribution) or
           81 # [Discount][ds]'s super fast `markdown(1)` in C. Try to figure out if either is
           82 # available and then bail if we can't find anything.
           83 #
           84 # [md]: http://daringfireball.net/projects/markdown/
           85 # [ds]: http://www.pell.portland.or.us/~orc/Code/discount/
           86 command -v "$MARKDOWN" >/dev/null || {
           87     if command -v Markdown.pl >/dev/null
           88     then MARKDOWN='Markdown.pl'
           89     elif test -f "$(dirname "$0")/Markdown.pl"
           90     then MARKDOWN="perl $(dirname "$0")/Markdown.pl"
           91     else echo "$(basename $0): markdown command not found." 1>&2
           92          exit 1
           93     fi
           94 }
           95 
           96 # Check that [Pygments][py] is installed for syntax highlighting.
           97 #
           98 # This is a fairly hefty prerequisite. Eventually, I'd like to fall back
           99 # on a simple non-highlighting preformatter when Pygments isn't available. For
          100 # now, just bail out if we can't find the `pygmentize` program.
          101 #
          102 # [py]: http://pygments.org/
          103 command -v "$PYGMENTIZE" >/dev/null || {
          104     echo "$(basename $0): pygmentize command not found." 1>&2
          105     exit 1
          106 }
          107 
          108 # Work and Cleanup
          109 # ----------------
          110 
          111 # Make sure we have a `TMPDIR` set. The `:=` parameter expansion assigns
          112 # the value if `TMPDIR` is unset or null.
          113 : ${TMPDIR:=/tmp}
          114 
          115 # Create a temporary directory for doing work. Use `mktemp(1)` if
          116 # available; but, since `mktemp(1)` is not POSIX specified, fall back on naive
          117 # (and insecure) temp dir generation using the program's basename and pid.
          118 : ${WORK:=$(
          119       if command -v mktemp 1>/dev/null 2>&1
          120       then
          121           mktemp -d "$TMPDIR/$(basename $0).XXXXXXXXXX"
          122       else
          123           dir="$TMPDIR/$(basename $0).$$"
          124           mkdir "$dir"
          125           echo "$dir"
          126       fi
          127   )}
          128 
          129 # We want to be absolutely sure we're not going to do something stupid like
          130 # use `.` or `/` as a work dir. Better safe than sorry.
          131 { test -z "$WORK" || test "$WORK" = '/' || test "$WORK" = '.'; } && {
          132     echo "$(basename "$0"): could not create a temp work dir." 1>&2
          133     exit 1
          134 }
          135 
          136 # We're about to create a ton of shit under our `$WORK` directory. Register
          137 # an `EXIT` trap that cleans everything up. This guarantees we don't leave
          138 # anything hanging around unless we're killed with a `SIGKILL`.
          139 trap 'rm -rf "$WORK"' 0
          140 
          141 # Preformatting
          142 # -------------
          143 #
          144 # Start out by applying some light preformatting to the `<source>` file to
          145 # make the code and doc formatting phases a bit easier. The result of this
          146 # pipeline is written to a temp file under the `$WORK` directory so we can
          147 # take a few passes over it.
          148 
          149 # Get a pipeline going with the `<source>` data. We write a single blank
          150 # line at the end of the file to make sure we have an equal number of code/comment
          151 # pairs.
          152 
          153 # Folding.el support: turn {{{ folds }}} into titles -jrml
          154 (cat "$file" \
          155     | sed -e 's/^# {{{/# #/' -e 's/^# }}}.*/# --------------/' \
          156     | awk '
          157 /function.*\(\) {$/ { print "# ### " $2; print $0; next }
          158 /\(\) {$/ { print "# ### " $1; print $0; next }
          159 {print $0}' \
          160     && printf "\n\n# \n\n")         |
          161 
          162 # We want the shebang line and any code preceding the first comment to
          163 # appear as the first code block. This inverts the normal flow of things.
          164 # Usually, we have comment text followed by code; in this case, we have
          165 # code followed by comment text.
          166 #
          167 # Read the first code and docs headers and flip them so the first docs block
          168 # comes before the first code block.
          169 (
          170     lineno=0
          171     codebuf=;codehead=
          172     docsbuf=;docshead=
          173     while read -r line
          174     do
          175         # Issue a warning if the first line of the script is not a shebang
          176         # line. This can screw things up and wreck our attempt at
          177         # flip-flopping the two headings.
          178         lineno=$(( $lineno + 1 ))
          179         test $lineno = 1 && ! expr "$line" : "#!.*" >/dev/null &&
          180         echo "$(basename "$0"): $file:1 [warn] shebang! line missing." 1>&2
          181 
          182         # Accumulate comment lines into `$docsbuf` and code lines into
          183         # `$codebuf`. Only lines matching `/#(?: |$)/` are considered doc
          184         # lines.
          185         if expr "$line" : '# ' >/dev/null || test "$line" = "#"
          186         then docsbuf="$docsbuf$line
          187 "
          188         else codebuf="$codebuf$line
          189 "
          190         fi
          191 
          192         # If we have stuff in both `$docsbuf` and `$codebuf`, it means
          193         # we're at some kind of boundary. If `$codehead` isn't set, we're at
          194         # the first comment/doc line, so store the buffer to `$codehead` and
          195         # keep going. If `$codehead` *is* set, we've crossed into another code
          196         # block and are ready to output both blocks and then straight pipe
          197         # everything by `exec`'ing `cat`.
          198         if test -n "$docsbuf" -a -n "$codebuf"
          199         then
          200             if test -n "$codehead"
          201             then docshead="$docsbuf"
          202                  docsbuf=""
          203                  printf "%s" "$docshead"
          204                  printf "%s" "$codehead"
          205                  echo "$line"
          206                  exec cat
          207             else codehead="$codebuf"
          208                  codebuf=
          209             fi
          210         fi
          211     done
          212 
          213     # We made it to the end of the file without a single comment line, or
          214     # there was only a single comment block ending the file. Output our
          215     # docsbuf or a fake comment and then the codebuf or codehead.
          216     echo "${docsbuf:-#}"
          217     echo "${codebuf:-"$codehead"}"
          218 )                                            |
          219 
          220 # Remove comment leader text from all comment lines. Then prefix all
          221 # comment lines with "DOCS" and interpreted / code lines with "CODE".
          222 # The stream text might look like this after moving through the `sed`
          223 # filters:
          224 #
          225 #     CODE #!/bin/sh
          226 #     CODE #/ Usage: shocco <file>
          227 #     DOCS Docco for and in POSIX shell.
          228 #     CODE
          229 #     CODE PATH="/bin:/usr/bin"
          230 #     CODE
          231 #     DOCS Start by numbering all lines in the input file...
          232 #     ...
          233 #
          234 # Once we pass through `sed`, save this off in our work directory so
          235 # we can take a few passes over it.
          236 sed -n '
          237     s/^/:/
          238     s/^:[         ]\{0,\}# /DOCS /p
          239     s/^:[         ]\{0,\}#$/DOCS /p
          240     s/^:/CODE /p
          241 ' > "$WORK/raw"
          242 
          243 # Now that we've read and formatted our input file for further parsing,
          244 # change into the work directory. The program will finish up in there.
          245 cd "$WORK"
          246 
          247 # First Pass: Comment Formatting
          248 # ------------------------------
          249 
          250 # Start a pipeline going on our preformatted input.
          251 # Replace all CODE lines with entirely blank lines. We're not interested
          252 # in code right now, other than knowing where comments end and code begins
          253 # and where code ends and comments begin.
          254 sed 's/^CODE.*//' < raw                      |
          255 
          256 # Now squeeze multiple blank lines into a single blank line.
          257 #
          258 # __TODO:__ `cat -s` is not POSIX and doesn't squeeze lines on BSD. Use
          259 # the sed line squeezing code mentioned in the POSIX `cat(1)` manual page
          260 # instead.
          261 cat -s                                       |
          262 
          263 # At this point in the pipeline, our stream text looks something like this:
          264 #
          265 #     DOCS Now that we've read and formatted ...
          266 #     DOCS change into the work directory. The rest ...
          267 #     DOCS in there.
          268 #
          269 #     DOCS First Pass: Comment Formatting
          270 #     DOCS ------------------------------
          271 #
          272 # Blank lines represent code segments. We want to replace all blank lines
          273 # with a dividing marker and remove the "DOCS" prefix from docs lines.
          274 sed '
          275     s/^$/##### DIVIDER/
          276     s/^DOCS //'                              |
          277 
          278 # The current stream text is suitable for input to `markdown(1)`. It takes
          279 # our doc text with embedded `DIVIDER`s and outputs HTML.
          280 $MARKDOWN                                    |
          281 
          282 # Now this is where shit starts to get a little crazy. We use `csplit(1)` to
          283 # split the HTML into a bunch of individual files. The files are named
          284 # as `docs0000`, `docs0001`, `docs0002`, ... Each file includes a single
          285 # doc *section*. These files will sit here while we take a similar pass over
          286 # the source code.
          287 (
          288     csplit -sk                               \
          289            $CSPLITARGS                       \
          290            -f docs                           \
          291            -n 4                              \
          292            - '/<h5>DIVIDER<\/h5>/' '{9999}'  \
          293            2>/dev/null                      ||
          294     true
          295 )
          296 
          297 
          298 # Second Pass: Code Formatting
          299 # ----------------------------
          300 #
          301 # This is exactly like the first pass but we're focusing on code instead of
          302 # comments. We use the same basic technique to separate the two and isolate
          303 # the code blocks.
          304 
          305 # Get another pipeline going on our preformatted input file.
          306 # Replace DOCS lines with blank lines.
          307 sed 's/^DOCS.*//' < raw                     |
          308 
          309 # Squeeze multiple blank lines into a single blank line.
          310 cat -s                                      |
          311 
          312 # Replace blank lines with a `DIVIDER` marker and remove prefix
          313 # from `CODE` lines.
          314 sed '
          315     s/^$/# DIVIDER/
          316     s/^CODE //'                             |
          317 
          318 # Now pass the code through `pygmentize` for syntax highlighting. We tell it
          319 # the input is `sh` and that we want HTML output.
          320 $PYGMENTIZE -l sh -f html -O encoding=utf8  |
          321 
          322 # Post filter the pygments output to remove partial `<pre>` blocks. We add
          323 # these back in at each section when we build the output document.
          324 sed '
          325     s/<div class="highlight"><pre>//
          326     s/^<\/pre><\/div>//'                    |
          327 
          328 # Again with the `csplit(1)`. Each code section is written to a separate
          329 # file, this time with a `code` prefix (`code0000`, `code0001`, ...). There
          330 # should be the same number of `code` files as there are `docs` files.
          331 (
          332     DIVIDER='/<span class="c"># DIVIDER</span>/'
          333     csplit -sk                   \
          334            $CSPLITARGS           \
          335            -f code               \
          336            -n 4 -                \
          337            "$DIVIDER" '{9999}'   \
          338            2>/dev/null ||
          339     true
          340 )
          341 
          342 # At this point, we have separate files for each docs section and separate
          343 # files for each code section.
          344 
          345 # HTML Template
          346 # -------------
          347 
          348 # Create a function to apply the standard [Docco][do] HTML layout, using
          349 # [jashkenas][ja]'s gorgeous CSS for styles. Wrapping the layout in a function
          350 # lets us apply it elsewhere simply by piping in a body.
          351 #
          352 # [ja]: http://github.com/jashkenas/
          353 # [do]: http://jashkenas.github.com/docco/
          354 layout () {
          355     cat <<HTML
          356 <!DOCTYPE html>
          357 <html>
          358 <head>
          359     <meta http-equiv='content-type' content='text/html;charset=utf-8'>
          360     <title>$1</title>
          361     <link rel=stylesheet href="docco.css">
          362     <link rel=stylesheet href="style.css">
          363     <link rel=stylesheet href="public/stylesheets/normalize.css">
          364 </head>
          365 <body>
          366 <div id=container>
          367     <div id=background></div>
          368     <table cellspacing=10 cellpadding=10>
          369     <thead>
          370       <tr>
          371         <th class=docs><h1>$1</h1></th>
          372         <th class=code></th>
          373       </tr>
          374     </thead>
          375     <tbody>
          376         <tr><td class='docs'>$(cat)</td><td class='code'></td></tr>
          377     </tbody>
          378     </table>
          379 </div>
          380 </body>
          381 </html>
          382 HTML
          383 }
          384 
          385 # Recombining
          386 # -----------
          387 
          388 # Alright, we have separate files for each docs section and separate
          389 # files for each code section. We've defined a function to wrap the
          390 # results in the standard layout. All that's left to do now is put
          391 # everything back together.
          392 
          393 # Before starting the pipeline, decide the order in which to present the
          394 # files.  If `code0000` is empty, it should appear first so the remaining
          395 # files are presented `docs0000`, `code0001`, `docs0001`, and so on.  If
          396 # `code0000` is not empty, `docs0000` should appear first so the files
          397 # are presented `docs0000`, `code0000`, `docs0001`, `code0001` and so on.
          398 #
          399 # Ultimately, this means that if `code0000` is empty, the `-r` option
          400 # should not be provided with the final `-k` option group to `sort(1)` in
          401 # the pipeline below.
          402 if stat -c"%s" /dev/null >/dev/null 2>/dev/null ; then
          403     # GNU stat
          404     [ "$(stat -c"%s" "code0000")" = 0 ] && sortopt="" || sortopt="r"
          405 else
          406     # BSD stat
          407     [ "$(stat -f"%z" "code0000")" = 0 ] && sortopt="" || sortopt="r"
          408 fi
          409 
          410 # Start the pipeline with a simple list of the split-out temp filenames, one
          411 # file per line.
          412 ls -1 docs[0-9]* code[0-9]* 2>/dev/null      |
          413 
          414 # Now sort the list of files by the *number* first and then by the type. The
          415 # list will look something like this when `sort(1)` is done with it:
          416 #
          417 #     docs0000
          418 #     code0000
          419 #     docs0001
          420 #     code0001
          421 #     docs0002
          422 #     code0002
          423 #     ...
          424 #
          425 sort -n -k"1.5" -k"1.1$sortopt"              |
          426 
          427 # And if we pass those files to `cat(1)` in that order, it concatenates them
          428 # in exactly the way we need. `xargs(1)` reads from `stdin` and passes each
          429 # line of input as a separate argument to the program given.
          430 #
          431 # We could also have written this as:
          432 #
          433 #     cat $(ls -1 docs* code* | sort -n -k1.5 -k1.1r)
          434 #
          435 # I like to keep things to a simple flat pipeline when possible, hence the
          436 # `xargs` approach.
          437 xargs cat                                    |
          438 
          439 
          440 # Run a quick substitution on the embedded dividers to turn them into table
          441 # rows and cells. This also wraps each code block in a `<div class=highlight>`
          442 # so that the CSS kicks in properly.
          443 {
          444     DOCSDIVIDER='<h5>DIVIDER</h5>'
          445     DOCSREPLACE='</pre></div></td></tr><tr><td class=docs>'
          446     CODEDIVIDER='<span class="c"># DIVIDER</span>'
          447     CODEREPLACE='</td><td class=code><div class=highlight><pre>'
          448     sed "
          449         s@${DOCSDIVIDER}@${DOCSREPLACE}@
          450         s@${CODEDIVIDER}@${CODEREPLACE}@
          451     "
          452 }                                            |
          453 
          454 # Pipe our recombined HTML into the layout and let it write the result to
          455 # `stdout`.
          456 layout "$title"
          457 
          458 # More
          459 # ----
          460 #
          461 # **shocco** is the third tool in a growing family of quick-and-dirty,
          462 # literate-programming-style documentation generators:
          463 #
          464 #   * [Docco][do] - The original. Written in CoffeeScript and generates
          465 #     documentation for CoffeeScript, JavaScript, and Ruby.
          466 #   * [Rocco][ro] - A port of Docco to Ruby.
          467 #
          468 # If you like this sort of thing, you may also find Knuth's massive body of
          469 # work on literate programming interesting:
          470 #
          471 #   * [Knuth: Literate Programming][kn]
          472 #   * [Literate Programming on Wikipedia][wi]
          473 #
          474 # [ro]: http://rtomayko.github.com/rocco/
          475 # [do]: http://jashkenas.github.com/docco/
          476 # [kn]: http://www-cs-faculty.stanford.edu/~knuth/lp.html
          477 # [wi]: http://en.wikipedia.org/wiki/Literate_programming
          478 
          479 # Copyright (C) [Ryan Tomayko <tomayko.com/about>](http://tomayko.com/about)<br>
          480 # This is Free Software distributed under the MIT license.
          481 :
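
The DOCS/CODE tagging step from the Preformatting section of the script above can be tried in isolation. What follows is a minimal sketch, not part of shocco itself: the function name `tag_lines` and the sample input are hypothetical, and `[[:space:]]*` is assumed to be an acceptable stand-in for the literal space-and-tab bracket expression the script uses.

```sh
#!/bin/sh
# Sketch only: `tag_lines` is a hypothetical name, not a shocco function.
# It mirrors the script's sed program: mark every line with a leading ":",
# rewrite documentation comments to "DOCS ...", then tag whatever still
# carries the ":" marker as "CODE ...".
tag_lines () {
    sed -n '
        s/^/:/
        s/^:[[:space:]]*# /DOCS /p
        s/^:[[:space:]]*#$/DOCS /p
        s/^:/CODE /p
    '
}

# Assumed sample input: a shebang, a doc comment, code, and a usage comment.
printf '%s\n' '#!/bin/sh' '# Say hello.' 'echo hello' '#/ Usage: hello' |
tag_lines
```

Only `# Say hello.` comes out tagged `DOCS`. The shebang and the `#/` usage line are tagged `CODE` because their `#` is not followed by a space. The temporary `:` marker keeps the last substitution from re-tagging a line that was already rewritten to `DOCS`.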