TEXT JUNIOR

So I ended my previous post with an idea for plain-text formatting
that went something like this (to paraphrase myself):

    I want to write my content in an unobtrusive Markdown-like
    format, but I don't feel like maintaining a complicated text
    formatting engine.  Raw troff/groff can do the hard work, but is
    no fun to write.  But if I write a pre-processor for *roff, we
    can get the best of both worlds cheap!

Well, it turns out that the idea was viable.  I've got a working
solution.

The hardest part, by far, was figuring out which of the cryptic *roff
commands would accomplish what I wanted in the "ascii" output type.
At one point, I even found myself reading the man page macros
("an-old.tmac") and the C++ source of "grotty" (an output "device"
for groff which can produce ANSI terminal output as well as the
"ascii" and "utf8" output types).

I spent more time trying to figure out how to produce output as one
continuous "page" with no breaks or vertical padding (on the last
page, to make the "printer" completely feed out the whole sheet of
paper!) than on all of the other tasks combined.

Here's the answer to that one, by the way.  Just stick this at the
very end of your document:

    .pl 0

It tells groff that your page "length" should be 0 so that it won't
attempt to pad the last "page" with any additional vertical space.
But you can't put that line anywhere else in your document, or groff
will think that each line is a separate "page", which will cause it
to collapse the other vertical spacing in your document.

A simple formatting language
======================================================================

I've written a couple of homebrew text formats before (always with
HTML as the target output).  I've written line-based parsers and
tokenizing parsers for them.  It always ends up being harder than I'd
expected.

I knew I wanted this to be dirt-simple, so I made the syntax strictly
line-based, with on/off markers.
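
To make the "line-based, on/off" idea concrete, here is a minimal
sketch.  It is written in Python rather than Perl and it is not tjr's
actual source: the function name split_blocks and its marker argument
are made up for illustration.  The only rule it assumes is that a
line consisting of exactly the marker, and nothing else, switches
block mode on or off.

```python
def split_blocks(lines, marker):
    """Split lines into ("text" | "block", [lines]) chunks.

    Hypothetical helper, not part of tjr.  A line that is exactly
    `marker` toggles block mode; the marker line itself is dropped.
    """
    chunks = []
    in_block = False
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == marker:
            # Close the chunk we were collecting.  An empty block is
            # still emitted, but empty text between blocks is skipped.
            if current or in_block:
                chunks.append(("block" if in_block else "text", current))
            current = []
            in_block = not in_block
        else:
            current.append(line)
    # Anything left over (including an unterminated block) is kept.
    if current:
        chunks.append(("block" if in_block else "text", current))
    return chunks
```

Because the comparison is an exact match, a marker with trailing
spaces or extra characters is treated as ordinary text, which is the
kind of strictness that keeps a parser like this tiny.
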

For example, a code block begins and ends with exactly three
backticks (```) on a line.  Nothing more or less is allowed.
Example:

    ```
    if(foo){
        echo "Hello!";
    }
    ```

Block quotes begin and end with exactly three double-quotes (""").

Titles and headings must follow this precise format:

    # Title
    ## A Heading
    ### A Sub-Heading

All whitespace outside of code blocks is normalized (for example,
output paragraphs are separated by exactly one blank line).  Pretty
normal stuff, just very strict about the block syntax.

All output formatting is specified with groff commands, with the
exception of the heading lines, which I'm drawing with my
preprocessor because that was just so much easier than figuring out
how to do it programmatically with groff!

The tool
======================================================================

By yesterday, I had a test.groff document which produced the desired
text document output when I ran it through groff like this:

    $ groff -Tascii test.groff > output.txt

By this morning, I had a 20-line Perl script which did pretty much
everything I needed.  By the end of the day, it had grown to 92 lines,
and it seems to be able to handle whatever I throw at it.

[ASCII-art banner: "tjr"]

I named it Text Junior (tjr) to emphasize how small it is because I'm
giving all of the crappy work to groff.  :-)

This post was processed with tjr.  I'll put the source up on RGB
(this Gopher burrow) soon.

Until next time, happy hacking!