----------------------------------------
Using ptx to generate one-time pads
March 15th, 2018
----------------------------------------

I was working my way through coreutils [0] recently when I came
across ptx.

  $ apropos ptx
  ptx (1)      - produce a permuted index of file contents

What the hell does that mean? I know...

  $ man ptx
  PTX(1)                 User Commands                 PTX(1)

  NAME
         ptx - produce a permuted index of file contents

  SYNOPSIS
         ptx [OPTION]... [INPUT]...   (without -G)
         ptx -G [OPTION]... [INPUT [OUTPUT]]

  DESCRIPTION
         Output a permuted index, including context, of the
         words in the input files.

         With no FILE, or when FILE is -, read standard input.

         Mandatory arguments to long options are mandatory for
         short options too.
         ...

Oh that totally clears it... nope. Still no clue.

So I asked on Mastodon and a few people had suggestions. In
particular, someone shot me over to a blog post [1] which tries
to clear up what a 'permuted index' even is. And that's the key.
So check this out:

A while back, before we had badass search engines and hyperlinked
doom shenanigans, manually finding the reference to a word in a
document SUUUUUUUUCKED. So they made this index in the back that
listed all the key terms alphabetically in the middle column of a
page. To the left of each term it would list whatever part of the
sentence led up to it. To the right they'd list the sentence
fragment that followed the term. Finally, the page number. With
that you could jump to the page and eye-ball search it yourself.

It's been around since System V and it's pretty much useless,
right? Well, foxy, I think I came up with a fun hobby use-case.

Pick a book with a publicly available canonical plain-text
source. Oh, I dunno, head over to Project Gutenberg [2] or
something and wrestle yourself up some Joyce (or ILLEGAL GERMAN
NOVELS!!!!! [3]). We're gonna shove that badboy into ptx like a
champ. Here we go...
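
Quick aside before the real thing: here's a toy version of that
keyword-in-the-middle layout, so the shape of a permuted index is
clear. It's a minimal sketch in sh and awk, not how ptx does it.
The kwic function name and the 20-column left margin are made up
for the example, and it leaves out the page number column.

```shell
# kwic: read text on stdin and print one line per word:
# left context (right-aligned), the keyword, then right context.
# Hypothetical helper for illustration; this is not ptx's algorithm.
kwic() {
    awk '{
        for (i = 1; i <= NF; i++) {
            left = ""; right = ""
            # everything on the line before the keyword
            for (j = 1; j < i; j++)
                left = left (j > 1 ? " " : "") $j
            # everything on the line after the keyword
            for (j = i + 1; j <= NF; j++)
                right = right (j > i + 1 ? " " : "") $j
            # lowercase sort key first, then a tab, then the display line
            printf "%s\t%20s  %s  %s\n", tolower($i), left, $i, right
        }
    }' | LC_ALL=C sort | cut -f2-   # sort by keyword, then drop the key
}
```

Feed it a line with printf 'the quick brown fox\n' | kwic and every
word gets its own row, sorted by keyword, with its context on
either side.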

  $ curl https://www.gutenberg.org/files/4300/4300-0.txt > ulysses.txt
  $ ptx ulysses.txt
  SCREEN EXPLODES WITH TEXT FOR SEVERAL MINUTES!!!!!

That's not how that works. Back to the manpage! Hmmm...

  ...assumes latin-1 charset...
  ...ignore case, perhaps...
  ...[.?!][]\"')}]*\\($\\|\t\\| \\)[ \t\n]*...
  ...Emacs next-error, grumble...
  ...-w, width, aha...

ROFF! NO FUCKING WAY! One of the output formats for ptx is
freaking roff! Synchronicity, baby! [4]

Let's try something a little smaller.

  $ curl http://www.gutenberg.org/cache/epub/1065/pg1065.txt > theraven.txt
  $ ptx -O -f -w 66 theraven.txt > theraven-index.txt

That sorta works. Ugh, but I'm getting tired. Here's the plan for
what's next:

- Figure out how to format this stuff so I can awk it.
- awk so that the text key and one more word to the right are the
  output. Two words with a space between, that's it.
- sort unique that bad-boy by each column in turn so the words in
  both columns are unique.
- Use whatever words are in your primary list to write a plain
  text message. If your source document is large enough that's
  virtually any word you'd like to use.
- Use awk to replace your words with the one to the right via a
  lookup file.
- Send the secret message to a friend.

The knowledge of which book is your cypher is all that's
necessary to repeat the process in reverse. Huzzah for secret
codes.

If I get some time this weekend I'll look at writing a script to
automate this for you. Provide a book and a message and indicate
whether to encode or decode. Oh what fun that would be for some
private crypto.

Thinking you could do this in perl? Wanna show me up? Put your
illogical collection of special characters where your mouth is,
buddy!

(TXT) [0] GNU Core Utilities
(HTM) [1] Reading a Permuted Index
(DIR) [2] Project Gutenberg on Gopher
(HTM) [3] Project Gutenberg Blocks Access to Germany
(TXT) [4] dbucklin - Formatting for Gopher with GNU troff
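
A rough sketch of the plan above, in sh and awk. Big assumptions:
the function names (make_pairs, translate, encode, decode) are
made up for this sketch, and it cheats by skipping ptx entirely.
Instead of parsing ptx output it pulls "word, next word" pairs
straight out of the book with tr and awk, which is my guess at
what the ptx-then-awk steps would boil down to. Treat it as a toy,
not as real crypto.

```shell
# make_pairs BOOK: print "word next-word" pairs, lowercased.
# Each word appears at most once per column, so the lookup can be
# run in either direction. (Hypothetical helper, not ptx-based.)
make_pairs() {
    tr -cs 'A-Za-z' '\n' < "$1" | tr 'A-Z' 'a-z' |
    awk 'NF {
        if (prev != "" && prev != $1 && !(prev in left) && !($1 in right)) {
            left[prev] = 1; right[$1] = 1
            print prev, $1
        }
        prev = $1
    }'
}

# translate PAIRS FROM TO: read a message on stdin and replace every
# word found in column FROM of the PAIRS file with the word in
# column TO. Words not in the lookup pass through unchanged.
translate() {
    awk -v from="$2" -v to="$3" '
        NR == FNR { map[$from] = $to; next }   # first file: load lookup
        {
            for (i = 1; i <= NF; i++)
                if ($i in map) $i = map[$i]
            print
        }
    ' "$1" -
}

# Encoding swaps left-column words for right-column words; decoding
# is the same lookup reversed. The round trip only works if every
# word of the message comes from the left column.
encode() { translate "$1" 1 2; }
decode() { translate "$1" 2 1; }
```

Usage would look something like this, with both ends building
pairs.txt from the same book:

  $ make_pairs theraven.txt > pairs.txt
  $ echo 'your message here' | encode pairs.txt > secret.txt
  $ decode pairs.txt < secret.txt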