GOPHER 2.0 - MARKUP

This is post 3 of 4 (?) in which I talk about one of my favorite
subjects: linked documents and lightweight markup languages.

Ratfactor's Apologia
=================================================================

I knew that the title "Gopher 2.0" would be a little contentious.
It's certainly more attention-grabbing than "Ideas for a New
Content Delivery Protocol Heavily Inspired by Gopher". (Though
now that I see it, the last "PHIG" part does make me smile.)

Part of me wishes I could have thought of a better name for these
posts... *but* part of me doesn't. I don't mind stirring the pot
to see where everybody stands on the upgrade vs. clean break
issue.

I was already leaning hard towards the "clean break" camp because
I love retrocomputing and I want old machines and old software to
keep working as-is for as long as possible. Now I'm completely
convinced. :-)

Also, this reminds me somewhat tangentially of Cunningham's Law
[0]

    "The best way to get the right answer on the Internet is
    not to ask a question, it's to post the wrong answer."

Where in this case, Gopher 2.0 is the "wrong" title for the
"right" (for me) content.

Why markup - encoding
=================================================================

Far more than the protocol, this is where I start to really get
excited. I loooooove "plain text" documents.

But. There's no such thing.

If you said "plain text" far enough back in time, I wouldn't know
if you were talking about something encoded in EBCDIC or ASCII.
If you say it now, I don't know if you're talking about 7-bit
ASCII or 8-bit ISO 8859-1 (Latin-1) or a multi-byte Unicode
encoding or something else entirely!

And let's not even speak of line endings ("\r\n" vs "\n").
Please. Let's not speak of it. The wounds are still too fresh.

So do I *even need* to mention that UTF-8 would be required for
any next-gen document format?

Okay: UTF-8 is required. That's a position I'm happy to defend.

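To make that requirement a little more concrete, here's a minimal
Python sketch of what "UTF-8 is required" could mean for a
client: decode strictly and refuse anything else. The function
name decode_document is made up for illustration, and folding
"\r\n" into "\n" is an assumption for the example, not something
this post settles.

```python
def decode_document(raw: bytes) -> str:
    """Decode a fetched document, insisting on UTF-8."""
    # Strict decoding: bytes that are not valid UTF-8 (for example
    # Latin-1 text) raise UnicodeDecodeError instead of being guessed at.
    text = raw.decode("utf-8")
    # Assumption: treat "\r\n" and a lone "\r" as "\n" so the rest of
    # the client only ever sees one kind of line ending.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    print(decode_document("caf\u00e9\r\nok\n".encode("utf-8")))
```

A client built this way never has to guess at an encoding: a
document either decodes or it is rejected.
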
Why markup - hypertext
=================================================================

I am deeply invested in the concept of hypertext. [1]

I've experimented with Wikis and HTML content generators to a
degree that may not even be healthy. :-)

One of my favorite tools is the lightweight VimWiki plugin for
Vim, which allows me to quickly create, edit, arrange, and
navigate text documents within my editor. (And yes, I'm aware of
and jealous of Emacs and Org Mode.)

For VimWiki (or any hypertext document system) to work, it needs
to have a way to link directly to other documents.

HTML does this with anchors:

    <a href="...">Wigglers</a>

VimWiki does this with links:

    [[wigglers|Wigglers]]

Gopher does this with "Directory Entities" (but only in directory
listings):

    0Wigglers[TAB]/docs/wigglers[TAB]example.com[TAB]70

And informally, many folks have adopted this presumably
Markdown-inspired "reference-style" link pattern for Gopher
content:

    Wigglers [1]
    ...
    [1] gopher://example.com/0/docs/wigglers

What I like about the Gopher directory entity style is that it
enforces (or at least strongly suggests) a one-line-each linear
list of links. What I don't like about them is typing and reading
them.

I like the "reference-style" links for the same reasons. I also
like that the path is completely visible and not replaced with
alternate text. What I don't like about them is that they are not
actually part of Gopher.

Why markup - text wrapping
=================================================================

One of the biggest problems I have with viewing Gopher content is
that it doesn't display well on different sized screens.

Isn't it painfully ironic that something as *simple* as *text* is
so hard to format for a cell phone screen vs an old 80-column
terminal vs a widescreen desktop monitor?

This is one thing that HTML gets 100% correct: by default, all
text reflows to fit the container.

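As a rough illustration of what "reflow" means for plain text,
here's a small Python sketch. It treats the writer's hard line
breaks as ordinary spaces and re-wraps to whatever width the
reader has. The reflow helper and the sample widths are made up
for the example; textwrap is Python's standard library module.

```python
import textwrap


def reflow(paragraph: str, width: int) -> str:
    """Re-wrap one paragraph to fit a display `width` columns wide."""
    # split() with no arguments drops every run of whitespace,
    # including the newlines the writer typed; join() puts single
    # spaces back so the paragraph becomes one long line.
    joined = " ".join(paragraph.split())
    # fill() breaks that line at word boundaries, keeping each
    # output line within `width` columns when the words allow it.
    return textwrap.fill(joined, width=width)


if __name__ == "__main__":
    text = "One of the biggest problems I have\nwith viewing Gopher content"
    for width in (20, 40, 80):  # phone-ish, narrow, wide
        print(reflow(text, width))
        print("-" * width)
```
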
The problem is that we can't just remove all of the line ending
characters from our documents and hope for the best: we'd lose
source code formatting, ASCII art, and all the other little
details that make "plain" text so wonderful to view!

So somehow you have to specify, "here is paragraph text - please
make this look right for my readers," but also, "here is a cool
Figlet logo or a diagram made out of | + - characters, don't
touch this!"

Markup perspectives
=================================================================

Like HTTP comes with HTML (or vice versa), I believe a next-gen
rodent-based protocol for content should specify the format of
that content to the degree that we can link to other documents
and identify, minimally, how to display that content.

But, again, this format needs to balance the concerns of the
three perspectives I used to look at the protocol: developers,
content creators, and end-users. Let's look at each of those now:

1. Developers

In my mind, a good format specification is unambiguous, simple,
and flexible. In the spirit of Gopher, I suggest a format that is
as *easy to parse as possible*.

Therefore, I currently favor an extremely limited line-based
syntax. I'll show examples later.

2. Content creators

As a content creator, I want the syntax to get out of my way and
let me type as rapidly as I can compose my thoughts.

I want the flexibility to be able to accomplish any (reasonable)
thing I can think of doing with plain text, but not have to
memorize a huge set of rules.

I feel like developer and content creator perspectives don't have
to be at odds so long as both agree on *utter simplicity* as a
core tenet.

3. End-users

As an end-user, I want to be able to view content so that it is
formatted as nicely for my screen as possible; I want to be able
to view a document on my phone, printed out on paper, or jacked
into my cyberdeck on the neon rooftop of a megacorp in the
pouring rain.

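To put the developer perspective in concrete terms, here's a
minimal Python sketch of how little code a line-based format
might need. It assumes the syntax from the example further down
("#" headings, blank-line paragraphs, ``` and """ blocks, links
on a line of their own) and leaves out lists entirely. The names
parse, is_link, and the block labels are made up for
illustration; this is not how tjr works.

```python
FENCES = {"```": "preformatted", '"""': "quote"}


def is_link(line):
    # Assumption: a link line is a single token that starts with
    # "link:" or looks like a URL ("scheme://...").
    return " " not in line and (line.startswith("link:") or "://" in line)


def parse(text):
    """Return a list of (kind, content) blocks, reading line by line."""
    blocks, para, body = [], [], []
    fence = None  # the triplet that opened the current block, if any

    def flush():
        # Close the paragraph being collected; a client can reflow it.
        if para:
            blocks.append(("paragraph", " ".join(para)))
            para.clear()

    for line in text.splitlines():
        s = line.strip()
        if fence is not None:
            if s == fence:  # the matching triplet closes the block
                if fence == "```":
                    content = "\n".join(body)  # keep layout as typed
                else:
                    content = " ".join(b.strip() for b in body)
                blocks.append((FENCES[fence], content))
                fence, body = None, []
            else:
                body.append(line)
        elif s in FENCES:
            flush()
            fence = s
        elif s == "":
            flush()  # a blank line ends a paragraph
        elif s.startswith("#"):
            flush()
            level = len(s) - len(s.lstrip("#"))
            blocks.append(("heading%d" % level, s.lstrip("#").strip()))
        elif is_link(s):
            flush()
            blocks.append(("link", s))
        else:
            para.append(s)
    flush()
    return blocks


if __name__ == "__main__":
    for block in parse("# Title\n\nHello\nworld.\n\nlink:/docs/wigglers\n"):
        print(block)
```

Every decision is made from the current line plus one piece of
state (whether a block is open), so there is no lookahead and no
nesting to track.
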
About the markup example
=================================================================

I'm already dogfooding [2] a prototype of this syntax and have
been using it since mentioning a tool I created called Text
Junior (tjr) in a post back in April. [3]

(By the way, piping through groff hasn't been quite the panacea
I'd hoped it would be, but I'm otherwise pretty happy with the
little tool and the syntax. I've been noodling with a replacement
written in AWK/gawk.)

I've borrowed things I like from existing syntaxes such as
Markdown, AsciiDoc, and various wikis. I honestly can't keep them
all straight anymore.

The common feature here is that all formatting is "line-based":
paragraphs are separated by blank lines. Headings are on lines
that start with one or more "#" characters. Other blocks start
and end with lines containing nothing but symmetrical triplets of
characters that are easy to type on the keyboard and are
hopefully easy to remember (because of certain existing
conventions).

Unambiguity, ease of typing, and ease of parsing are the primary
goals (in that order).

Markup example
=================================================================

Enough talk, let's see an example:

# Example Document

Hello. Here is a paragraph of text. It reflows as needed to fit
the desired output width.

I like the idea of enforcing that links be on a line of their
own. I'm not super sure about the exact syntax. For reasons I'll
get into in Part 4 (the client), I want to support relative
document links. So here's something to look at:

link:/docs/wigglers
link://example.com/danglers
http://example.com/
telnet://example.com:23

The first two are for *this* new imaginary protocol. The last two
are for *other* protocols in URL form. Also note that there is no
"display" text for the links. I like the idea that the end-user
is completely aware of where they're going when they follow a
link.

Now a "preformatted" or "code" block:

```
example(){                      +------------------------+
    print("Hello world!");      | Code or art goes here  |
}                               +------------------------+
```

I consider these to be nice-to-have formatting items:

"""
A block quote will stand out from the paragraph text. It will
also flow and wrap like a paragraph.
"""

It's also hard to make a good document without this ability:

1. Ordered and unordered lists are always nice to have
2. I'm not certain how necessary it is to support nested
   lists. I guess it would be nice.

The end.

Example rendering
=================================================================

Here's an example rendering as it might appear if your screen
just happened to match the width of this document. :-) Of course,
you have to use your imagination to visualize how links might be
highlighted and such:

EXAMPLE DOCUMENT

Hello. Here is a paragraph of text. It reflows as needed to fit
the desired output width.

I like the idea of enforcing that links be on a line of their
own. I'm not super sure about the exact syntax. For reasons I'll
get into in Part 4 (the client), I want to support relative
document links. So here's something to look at:

link:/docs/wigglers
link://example.com/danglers
http://example.com/
telnet://example.com:23

The first two are for *this* new imaginary protocol. The last two
are for *other* protocols in URL form. Also note that there is no
"display" text for the links. I like the idea that the end-user
is completely aware of where they're going when they follow a
link.

Now a "preformatted" or "code" block:

example(){                      +------------------------+
    print("Hello world!");      | Code or art goes here  |
}                               +------------------------+

I consider these to be nice-to-have formatting items:

    A block quote will stand out from the paragraph text. It
    will also flow and wrap like a paragraph.

It's also hard to make a good document without this ability:

  1. Ordered and unordered lists are always nice to have
  2. I'm not certain how necessary it is to support nested
     lists. I guess it would be nice.

The end.

Okay, that's it
=================================================================

(I had to fake the ordered list because I don't actually have
that working in tjr yet.)

Again, the client will be given leeway to display the document in
whatever way makes it most enjoyable for the end-user.

By the way, I could also see an argument being made for
standardizing *strong* and _emphasized_ text. But, making
unambiguous rules for these that cover all corner cases is
extremely hard. Also, it breaks the "line-based" nature of the
formatting so far.

There are lots of other little details and persuasive arguments I
could try to pack in here, but I think this post has gone on long
enough.

Actually, the next part, The Client, is where I'm *most excited*.

Thanks for reading thus far!

Well, almost done
=================================================================

Oh, one more thing: I've read all of the feedback (I could find)
so far and taken it all to heart. I'm happy to see the passion,
and *no hard feelings* if I've rubbed some folks the wrong way
with all of this.

I wanted to specifically acknowledge what gallowsgryph wrote
about a next-gen protocol: [4]

    "And I have a name suggestion for it: /Meerkat/. The
    burrowing Savannah dweller that have large families...
    Much like pubnix groups, if you think about it."

I *love* that suggestion. "Meerkat Protocol." "MML - Meerkat
Markup Language." That could work.

What other rodents and small mammals could be pressed into
service? Shrews, moles, voles, mice, rats, hedgehogs, hamsters,
lemmings... Ha! This is fun.

       ***      ****
     *******  ********
    *******love*******
     ****ratfactor***
       ************
         ********
           ****
            **

See you in cyberspace, Gophers!

[0] https://en.wikipedia.org/wiki/Ward_Cunningham#Cunningham's_Law
[1] https://en.wikipedia.org/wiki/Hypertext
[2] https://en.wikipedia.org/wiki/Eating_your_own_dog_food
[3] gopher://sdf.org/0/users/ratfactor/phlog/2019-04-21-text-junior
[4] gopher://sdf.org/0/users/gallowsgryph/phlog/2019-06-07_gopher2_part2.txt