GOPHER 2.0 - MARKUP

This  is post 3 of 4 (?) in which I talk about one of my favorite
subjects: linked documents and  lightweight markup languages.



Ratfactor's Apologia
=================================================================

I knew that the title "Gopher 2.0" would be a little contentious.

It's certainly more attention-grabbing than "Ideas for a New Con-
tent Delivery Protocol Heavily Inspired by Gopher".  (Though  now
that I see it, the last "PHIG" part does make me smile.)

Part of me wishes I could have thought of a better name for these
posts...  *but* part of me doesn't.

I don't mind stirring the pot to see where  everybody  stands  on
the  upgrade  vs.  clean break issue.  I was already leaning hard
towards the "clean break" camp because I love retrocomputing  and
I want old machines and old software to keep working as-is for as
long as possible.

Now I'm completely convinced. :-)

Also, this reminds me somewhat tangentially of  Cunningham's  Law
[0]:

      "The best way to get the right answer on the Internet
      is not to ask a question, it's to post the wrong  an-
      swer."

Where  in  this  case,  Gopher  2.0  is the "wrong" title for the
"right" (for me) content.



Why markup - encoding
=================================================================

Far  more  than the protocol, this is where I start to really get
excited.

I loooooove "plain text" documents.

But.

There's no such thing.

If you said "plain text" far enough back in time, I wouldn't know
if you were talking about something encoded in EBCDIC or ASCII.

If  you  say  it  now, I don't know if you're talking about 7-bit
ASCII or 8-bit ISO 8859-1 (Latin-1) or a multi-byte  Unicode  en-
coding or something else entirely!

And let's not even speak of line endings ("\r\n" vs "\n").

Please. Let's not speak of it.  The wounds are still too fresh.

So  do  I *even need* to mention that UTF-8 would be required for
any next-gen document format?

Okay: UTF-8 is required.  That's a position I'm happy to defend.
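
As a rough illustration (not part of any spec), here is a  Python
sketch  of what "UTF-8 is required" could look like in a client.
The helper name "read_document" is made up, and both  the  strict
decoding and the line-ending handling are assumptions:

```python
# Hypothetical helper: strict decoding and accepting both CRLF
# and LF line endings are assumptions, not rules from the post.
def read_document(raw: bytes) -> list[str]:
    text = raw.decode("utf-8")         # UnicodeDecodeError if not UTF-8
    text = text.replace("\r\n", "\n")  # normalize CRLF to LF
    return text.split("\n")


if __name__ == "__main__":
    print(read_document("caf\u00e9\r\nna\u00efve".encode("utf-8")))
    try:
        read_document(b"caf\xe9")      # Latin-1 bytes, not valid UTF-8
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)
```

A stricter client could refuse "\r\n" outright;  accepting  both
is just the forgiving choice made in this sketch.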



Why markup - hypertext
=================================================================

I am deeply invested in the concept of hypertext. [1]

I've experimented with Wikis and HTML content generators to a de-
gree that may not even be healthy. :-)

One of my favorite tools is the lightweight  VimWiki  plugin  for
Vim,  which allows me to quickly create, edit, arrange, and navi-
gate text documents within my editor.  (And yes, I'm aware of and
jealous of Emacs and Org Mode.)

For  VimWiki (or any hypertext document system) to work, it needs
to have a way to link directly to other documents.

HTML does this with anchors:

  <a href="wigglers">Wigglers</a>

VimWiki does this with links:

  [[wigglers|Wigglers]]

Gopher does this with "Directory Entities" (but only in directory
listings):

  0Wigglers[TAB]/docs/wigglers[TAB]example.com[TAB]70
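
For comparison, here is a small Python sketch of  what  a  client
has  to  do  with such a line.  RFC 1436 lays a menu line out as a
one-character type, the display  string,  and  then  tab-separated
selector,   host,   and  port.   The  names  "MenuItem"  and
"parse_menu_line" are made up for this illustration:

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # single character, e.g. "0" = text, "1" = menu
    display: str    # text shown to the user
    selector: str   # string sent to the server to fetch the item
    host: str
    port: int


def parse_menu_line(line: str) -> MenuItem:
    """Split one RFC 1436 menu line into its fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError(f"not a menu line: {line!r}")
    head, selector, host, port = fields[:4]
    return MenuItem(head[0], head[1:], selector, host, int(port))


if __name__ == "__main__":
    line = "0Wigglers\t/docs/wigglers\texample.com\t70\r\n"
    print(parse_menu_line(line))
```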

And informally, many folks have adopted this presumably Markdown-
inspired "reference-style" link pattern for Gopher content:

  Wigglers [1]
     ...
  [1] gopher://example.com/0/docs/wigglers

What I like about the Gopher directory entity style  is  that  it
enforces  (or  at least strongly suggests) a one-line-each linear
list of links.  What I don't like about them is typing and  read-
ing them.

I  like the "reference-style" links for the same reasons.  I also
like that the path is completely visible and  not  replaced  with
alternate  text.   What  I don't like about them is that they are
not actually part of Gopher.
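
Because  the  reference  style is only a convention, a client that
wants to follow such links has to guess.  A minimal Python  sketch
of  that  guesswork,  assuming  a reference is a line holding "[n]"
followed by a URL (the  pattern  and  the  name  "find_references"
are invented here):

```python
import re

# Assumed convention: a footnote line is "[n]" followed by a URL.
# This is a heuristic; nothing in Gopher defines it.
REF_LINE = re.compile(r"^\s*\[(\d+)\]\s+(\S+://\S+)\s*$")


def find_references(text: str) -> dict[int, str]:
    """Map each footnote number to the URL on its reference line."""
    refs = {}
    for line in text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs[int(match.group(1))] = match.group(2)
    return refs


if __name__ == "__main__":
    sample = "Wigglers [1]\n   ...\n[1] gopher://example.com/0/docs/wigglers\n"
    print(find_references(sample))
```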



Why markup - text wrapping
=================================================================

One of the biggest problems I have with viewing Gopher content is
that it doesn't display well on different sized screens.

Isn't it painfully ironic that something as *simple* as *text* is
so  hard  to  format  for a cell phone screen vs an old 80-column
terminal vs a widescreen desktop monitor?

This is one thing that HTML gets 100% correct:  by  default,  all
text reflows to fit the container.

The  problem  is that we can't just remove all of the line ending
characters from our documents and hope for the  best:  we'd  lose
source  code  formatting, ASCII art, and all the other little de-
tails that make "plain" text so wonderful to view!

So somehow you have to specify, "here is paragraph text  - please
make  this  look right for my readers," but also, "here is a cool
Figlet logo or a diagram made out of  |  +  -  characters,  don't
touch this!"
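
A small Python sketch of that split, using the standard  textwrap
module.   The  sample strings and the "reflow" helper are invented
for illustration:

```python
import textwrap

paragraph = (
    "Here is a paragraph of text.\n"
    "It reflows as needed to fit the desired output width.\n"
)
art = (
    "+------+\n"
    "| logo |\n"
    "+------+\n"
)


def reflow(text: str, width: int) -> str:
    """Join the source lines and re-wrap them to `width` columns."""
    return textwrap.fill(" ".join(text.split()), width=width)


for width in (30, 60):
    print(reflow(paragraph, width))
    print(art, end="")  # preformatted text is passed through as-is
    print()
```

The paragraph changes shape with the width; the art never does.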



Markup perspectives
=================================================================

Like HTTP comes with HTML (or vice versa), I believe  a  next-gen
rodent-based  protocol  for  content should specify the format of
that content to the degree that we can  link  to  other documents
and identify, minimally, how to display that content.

But,  again,  this  format  needs  to balance the concerns of the
three perspectives I used to look at  the  protocol:  developers,
content creators, and end-users.

Let's look at each of those now:

1. Developers

In  my  mind, a good format specification is unambiguous, simple,
and flexible.  In the spirit of Gopher, I suggest a  format  that
is as *easy to parse as possible*.

Therefore, I currently favor an extremely limited line-based syn-
tax.  I'll show examples later.

2. Content creators

As a content creator, I want the syntax to get out of my way  and
let me type as rapidly as I can compose my thoughts.

I  want the flexibility to be able to accomplish any (reasonable)
thing I can think of doing with plain text, but not have to memo-
rize a huge set of rules.

I feel like developer and content creator perspectives don't have
to be at odds so long as both agree on *utter  simplicity*  as  a
core tenet.

3. End-users

As  an  end-user, I want to be able to view content so that it is
formatted as nicely for my screen as possible; I want to be  able
to  view  a document on my phone, printed out on paper, or jacked
into my cyberdeck on the neon rooftop of a megacorp in the  pour-
ing rain.



About the markup example
=================================================================

I'm already dogfooding [2] a prototype of this syntax.  I've been
using it since April, when I posted about a tool I created called
Text Junior (tjr). [3]

(By the way, piping through groff hasn't been quite  the  panacea
I'd  hoped  it  would be, but I'm otherwise pretty happy with the
little tool and the syntax. I've been noodling with a replacement
written in AWK/gawk.)

I've  borrowed things I like from existing syntaxes such as Mark-
down, AsciiDoc, and various wikis.  I honestly  can't  keep  them
all straight anymore.

The  common  feature here is that all formatting is "line-based":
paragraphs are separated by blank lines.  Headings are  on  lines
that  start  with one or more "#" characters.  Other blocks start
and end with lines containing nothing but symmetrical triplets of
characters that are easy to type on the keyboard and are hopeful-
ly easy to remember (because of certain existing conventions).

Unambiguity, ease of typing, and ease of parsing are the  primary
goals (in that order).
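
To show how little parsing such a line-based syntax  needs,  here
is  a  Python sketch.  It is not tjr; the function name, the block
names, and the treatment of an unterminated block are all assump-
tions,  and  links  and  lists are left as ordinary paragraph lines:

```python
FENCES = {"```": "pre", '"""': "quote"}  # triplet line -> block kind


def parse_blocks(lines):
    """Return a list of (kind, lines) tuples for a document."""
    blocks = []
    para = []     # paragraph lines collected so far
    fence = None  # triplet that opened the current fenced block
    fenced = []   # lines collected inside that block

    def flush_paragraph():
        if para:
            blocks.append(("paragraph", para.copy()))
            para.clear()

    for line in lines:
        stripped = line.strip()
        if fence is not None:
            if stripped == fence:       # matching closing triplet
                blocks.append((FENCES[fence], fenced))
                fence, fenced = None, []
            else:
                fenced.append(line)     # keep the line verbatim
        elif stripped in FENCES:        # opening triplet
            flush_paragraph()
            fence = stripped
        elif not stripped:              # blank line ends a paragraph
            flush_paragraph()
        elif stripped.startswith("#"):  # heading: one or more "#"
            flush_paragraph()
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped[level:].strip()
            blocks.append((f"heading{level}", [title]))
        else:
            para.append(stripped)
    flush_paragraph()
    if fence is not None:               # unterminated block: keep it
        blocks.append((FENCES[fence], fenced))
    return blocks


if __name__ == "__main__":
    doc = ["# Title", "", "One", "two.", "", "```", "  a |", "```"]
    for block in parse_blocks(doc):
        print(block)
```

Every decision is made by looking at one whole line, which is the
property the "line-based" rule is meant to buy.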



Markup example
=================================================================

Enough talk, let's see an example:

   # Example Document

   Hello.
   Here is a paragraph of text.
   It reflows as needed to fit the desired output width.

   I like the idea of enforcing that links be on a line of their own.
   I'm not super sure about the exact syntax.
   For reasons I'll get into in Part 4 (the client), I want to support relative document links.
   So here's something to look at:

   link:/docs/wigglers
   link://example.com/danglers
   http://example.com/
   telnet://example.com:23

   The first two are for *this* new imaginary protocol. The last two are for *other* protocols in URL form.
   Also note that there is no "display" text for the links.
   I like the idea that the end-user is completely aware of where they're going when they follow a link.

   Now a "preformatted" or "code" block:

   ```
   example(){                 +------------------------+
     print("Hello world!");   |  Code or art goes here |
   }                          +------------------------+
   ```

   I consider these to be nice-to-have formatting items:

   """
   A block quote will stand out from the paragraph text.
   It will also flow and wrap like paragraph.
   """

   It's also hard to make a good document without this ability:

   1. Ordered and unordered lists are always nice to have
   2. I'm not certain how necessary it is to support nested lists. I guess it would be nice.

   The end.


Example rendering
=================================================================

Here's  an  example  rendering  as it might appear if your screen
just happened to match the width of this document. :-) Of course,
you  have to use your imagination to visualize how links might be
highlighted and such:

                        EXAMPLE DOCUMENT

Hello.  Here is a paragraph of text.  It reflows as needed to fit
the desired output width.

I  like  the  idea  of enforcing that links be on a line of their
own.  I'm not super sure about the  exact  syntax.   For  reasons
I'll  get into in Part 4 (the client), I want to support relative
document links.  So here's something to look at:

  link:/docs/wigglers
  link://example.com/danglers
  http://example.com/
  telnet://example.com:23

The first two are for *this* new imaginary protocol. The last two
are  for  *other* protocols in URL form.  Also note that there is
no "display" text for the links.  I like the idea that  the  end-
user  is completely aware of where they're going when they follow
a link.

Now a "preformatted" or "code" block:

  example(){                 +------------------------+
    print("Hello world!");   |  Code or art goes here |
  }                          +------------------------+

I consider these to be nice-to-have formatting items:

      A block quote will stand out from the paragraph text.
      It will also flow and wrap like paragraph.

It's also hard to make a good document without this ability:

    1. Ordered and unordered lists are always nice to have
    2. I'm not certain how necessary it is to support nested
       lists. I guess it would be nice.

The end.
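
The four link lines in the example could be told apart with  very
little  code.   The  rules  in  this Python sketch are guesses made
from the example alone, since the exact link syntax is  not  set-
tled, and "classify_link" is a made-up name:

```python
import re

# Anything shaped like "scheme://rest" with no spaces.
URL = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://\S+$")


def classify_link(line: str):
    """Return (kind, target) for a link line, else None.

    Assumed rules, inferred from the example only:
      link:/path        -> ("relative", "/path")
      link://host/path  -> ("absolute", "host/path")
      scheme://...      -> ("external", the whole URL)
    """
    line = line.strip()
    if line.startswith("link://"):
        return ("absolute", line[len("link://"):])
    if line.startswith("link:"):
        return ("relative", line[len("link:"):])
    if URL.match(line):
        return ("external", line)
    return None


if __name__ == "__main__":
    for sample in (
        "link:/docs/wigglers",
        "link://example.com/danglers",
        "http://example.com/",
        "telnet://example.com:23",
        "Hello.",
    ):
        print(sample, "->", classify_link(sample))
```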



Okay, that's it
=================================================================

(I had to fake the ordered list because  I  don't  actually  have
that working in tjr yet.)

Again, the client will be given leeway to display the document in
whatever way makes it most enjoyable for the end-user.

By the way, I could also see an argument being made for standard-
izing  *strong*  and  _emphasized_ text.  But, making unambiguous
rules  for these that cover all corner cases is  extremely  hard.
Also, it breaks the "line-based" nature of the formatting so far.

There are lots of other little details and persuasive arguments I
could try to pack in here, but I think this post has gone on long
enough.

Actually, the next part, The Client, is where I'm *most excited*.
Thanks for reading thus far!



Well, almost done
=================================================================

Oh,  one more thing: I've read all of the feedback (I could find)
so far and taken it all to heart.  I'm happy to see  the  passion
and  *no  hard  feelings* if I've rubbed some folks the wrong way
with all of this.

I wanted to  specifically  acknowledge  what  gallowsgryph  wrote
about a next-gen protocol: [4]

      "And  I have a name suggestion for it: /Meerkat/. The
      burrowing Savannah  dweller  that  have  large  fami-
      lies...  Much  like pubnix groups, if you think about
      it."

I *love* that suggestion.  "Meerkat Protocol."   "MML  -  Meerkat
Markup Language."  That could work.

What  other  rodents and small mammals could be pressed into ser-
vice?  Shrews, moles, voles,  mice,  rats,  hedgehogs,  hamsters,
lemmings...

Ha! This is fun.

         ***     ****
       ******* ********
      *******love*******
       ****ratfactor***
         ************
           ********
             ****
              **

See you in cyberspace, Gophers!

  [0] https://en.wikipedia.org/wiki/Ward_Cunningham#Cunningham's_Law
  [1] https://en.wikipedia.org/wiki/Hypertext
  [2] https://en.wikipedia.org/wiki/Eating_your_own_dog_food
  [3] gopher://sdf.org/0/users/ratfactor/phlog/2019-04-21-text-junior
  [4] gopher://sdf.org/0/users/gallowsgryph/phlog/2019-06-07_gopher2_part2.txt