Tiny tools for great writers
==================================

Unix has traditionally had a quite tight relationship with electronic text production and editing. In fact, most of early Unix development at Bell Labs was funded under the promise that those long-bearded folks were developing a new text-processing system for the AT&T patent office. So when the operating system started circulating inside and outside the Labs, well, people expected it to have powerful text production tools bundled in.

And the auld craftsmen of Murray Hill managed to do the needed magic, and stuffed the early Unix releases with lots of professional tools for writers. Most of them have survived until the current era. We will briefly look at some of them here, in particular dict(1), spell(1), diction(1) and style(1). Future phlogs from The Dwarven Blacksmith will most probably feature other tools from that pack.

The first tool is dict(1). This is a client for the DICT protocol (RFC 2229 [1]), which lets you query a remote or local dictionary server for word definitions:

  $ dict dict://dict.org/d:GNU:wn
  1 definition found

  From WordNet (r) 3.0 (2006) [wn]:

    gnu
        n 1: large African antelope having a head with horns like an ox and a long tufted tail [syn: {gnu}, {wildebeest}]
  $

In this case we have asked dict(1) to contact the dictionary server at dict.org, and to ask for all the existing definitions of the word `GNU` in the dictionary `wn` (WordNet). If you want to query all the existing dictionaries, just remove the `:wn` from the query.

Although querying remote dictionary servers is possible, for small installations it makes sense to have a DICT server running locally.
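For the curious, here is roughly what a client has to do with a dict:// URL like the one shown earlier before it talks to the server. This is a minimal Python sketch, not how dict(1) is actually implemented: the names define_command and default_db are made up for illustration, only the simple /d:word[:database] form is handled, and falling back to "*" (all databases) when no dictionary is given is an assumption based on the behaviour described for the dict(1) query. Port 2628 is the one RFC 2229 assigns to DICT.

```python
from urllib.parse import urlsplit, unquote

DEFAULT_PORT = 2628  # port assigned to the DICT protocol in RFC 2229


def define_command(url, default_db="*"):
    """Turn a dict://host/d:word[:database] URL into (host, port, command).

    default_db="*" (search all databases) is an assumption that mirrors
    what happens when the ':wn' part of the query is left out.
    """
    parts = urlsplit(url)
    if parts.scheme != "dict":
        raise ValueError("not a dict:// URL")
    # The path looks like "/d:GNU:wn": query type, word, optional database.
    fields = unquote(parts.path.lstrip("/")).split(":")
    if fields[0] != "d" or len(fields) < 2 or not fields[1]:
        raise ValueError("only /d:<word>[:<database>] queries are handled")
    word = fields[1]
    database = fields[2] if len(fields) > 2 and fields[2] else default_db
    # DICT commands are plain text lines terminated by CRLF.
    command = 'DEFINE {} "{}"\r\n'.format(database, word)
    return parts.hostname, parts.port or DEFAULT_PORT, command
```

Calling define_command("dict://dict.org/d:GNU:wn") gives back the host dict.org, the port 2628 and the line DEFINE wn "GNU". A real client would then open a TCP connection to that host and port, send the line, and read the reply.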
With a local server running, you just need to type:

  $ dict -d jargon PDP-11
  1 definition found

  From The Jargon File (version 4.4.7, 29 Dec 2003) [jargon]:

    PDP-11
       Possibly the single most successful minicomputer design in history, a favorite of hackers for many years, and the first major Unix machine, The first PDP-11s (the 11/15 and 11/20) shipped in 1970 from {DEC}; the last (11/93 and 11/94) in 1990. Along the way, the 11 gave birth to the {VAX}, strongly influenced the design of microprocessors such as the Motorola 6800 and Intel 386, and left a permanent imprint on the C language (which has an odd preference for octal embedded in its syntax because of the way PDP-11 machine instructions were formatted). There is a history site.
  $

Notice that the option '-d' lets you choose a specific dictionary. In this case we chose the Jargon File [2].

The second one is spell(1), which survives today in at least two separate incarnations, namely ispell(1) and aspell(1). These tools take a file as input, look for spelling mistakes, and propose corrections using a system-wide wordlist. To spell-check a file you could use either:

  $ ispell textfile.txt

or:

  $ aspell -c textfile.txt

Needless to say, these are quite useful for a life on the terminal, and several editors have interfaces to one or both of them. Just refer to their man pages for more info.

Another cool pair of tools for writers is diction(1) and style(1). The former scans a text and flags wordy or commonly misused words and phrases. Let's run it on the current phlog:

  $ diction -s 20190129_texttools.txt
  20190129_texttools.txt:5: In fact, [most -> Do not use as substitute for "almost."] of early unix development at Bell Labs was funded under the impression that those long-bearded [folks -> Avoid using "folks", when writing formally, to refer to your family or friends.] were developing a new text-processing [system -> Frequently used without need.] for the AT&T patent office.
  20190129_texttools.txt:8: [So -> (do not use as intensifier)] when the operating [system -> Frequently used without need.] started circulating inside and outside the Labs, well, [people -> Do not use with numbers or as substitute for "public".] [expected -> Use "expect" for simple predictions and "anticipate" for more complex actions in advance of an event.] it to have [powerful -> Overused, especially in computer industry press releases.] text production tools bundled in.

  20190129_texttools.txt:15: This is a client for the DICT protocol (RFC 2229 [1]) [which -> (use "that" if clause is restrictive)] allows to query a remote or local dictionary server to obtain words definitions:

  .....

  20 phrases in 22 sentences found.
  $

diction(1) marks all the 'suspect' words and phrases using brackets []. The option '-s' adds suggestions for alternative wordings, where available.

Finally, style(1) is a tool to compute readability statistics on a text file. For instance:

  $ style 20190129_texttools.txt
  readability grades:
          Kincaid: 8.5
          ARI: 9.4
          Coleman-Liau: 9.2
          Flesch Index: 68.2/100 (plain English)
          Fog Index: 11.7
          Lix: 39.1 = school year 6
          SMOG-Grading: 10.6
  sentence info:
          1910 characters
          424 words, average length 4.50 characters = 1.41 syllables
          22 sentences, average length 19.3 words
          50% (11) short sentences (at most 14 words)
          18% (4) long sentences (at least 29 words)
          7 paragraphs, average length 3.1 sentences
          0% (0) questions
          31% (7) passive sentences
          longest sent 54 wds at sent 16; shortest sent 5 wds at sent 17
  word usage:
          verb types:
          to be (8) auxiliary (2)
          types as % of total:
          conjunctions 4% (19) pronouns 4% (17) prepositions 12% (49)
          normalisations 2% (9)
  sentence beginnings:
          pronoun (2) interrogative pronoun (0) article (2)
          subordinating conjunction (1) conjunction (1) preposition (6)
  $

Now, I am not a linguist, but most of these grades estimate how much schooling a reader needs, so a higher value means a harder text. The Flesch Index runs the other way: the higher the value, the more readable your text is.
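To make the readability numbers a bit less magic, here is a small Python sketch of two of them, using the textbook Flesch Reading Ease and Flesch-Kincaid grade formulas. I am assuming that style(1) computes its "Flesch Index" and "Kincaid" values along these lines; the function names are mine, and the inputs are the rounded figures from the style(1) report above (424 words, 22 sentences, 1.41 syllables per word).

```python
def flesch_reading_ease(words, sentences, syllables_per_word):
    """Flesch Reading Ease: a higher score means easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * syllables_per_word


def flesch_kincaid_grade(words, sentences, syllables_per_word):
    """Flesch-Kincaid grade level: a higher score means harder text."""
    return 0.39 * (words / sentences) + 11.8 * syllables_per_word - 15.59


# Figures taken from the style(1) report shown in the phlog.
ease = flesch_reading_ease(424, 22, 1.41)
grade = flesch_kincaid_grade(424, 22, 1.41)
print("Flesch Index ~ {:.1f}, Kincaid ~ {:.1f}".format(ease, grade))
```

Both results land within a few tenths of the 68.2 and 8.5 that style(1) reports, and the small gap is what you would expect from feeding in a rounded syllable average. Notice the signs: longer sentences and longer words push the Flesch score down and the Kincaid grade up.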
Each measure also has a different range, so it's probably better to have a look at the corresponding man page.

[1] gopher://gopher.rbfh.de/0/RFC/rfc2229.txt
[2] gopher://orion.ka10.de/0/books/Jargon.2003

-+-+-+-

spell(1) appeared in Unix-v7 (January 1979)
diction(1) and style(1) were part of the Unix Documenter's WorkBench (DWB)
dict(1) was written by Rik Faith in the early 1990s