gopher-extension.md - gopher-protocol - Gopher Protocol Extension Project
            1 Gopher Extension
            2 ================
            3 
            4 # Goals of this document
            5 
            6 The intention is to not make radical changes to the RFC1436 standard.
            7 
            8 This document also describes the commonly used extensions to the
            9 Gopher RFC and some clarifications to the wording of the RFC.
           10 
           11 Since the publication of the RFC1436 standard around March 1993 there
           12 have been developments, such as the adoption of the UTF-8
           13 text-encoding and the use of SSL and later TLS encryption.
           14 
           15 The recommendations can therefore be seen as guidelines or
           16 "SHOULD".
           17 
           18 
           19 # Added Types
           20 
           21 Types can be added; this doesn't violate the RFC specification,
           22 section 3.8: "Characters '0' through 'Z' are reserved.".
           23 
           24 These are types that are commonly used.
           25 
           26 * The 'h' type: HTML file. With the "URL:" prefix in the selector it points to
           27   a URL, see the historical mail conversation (embedded and referenced below).
           28 * The 'i' type, informational message: display as text.
           29   i Some message <TAB> empty selector <TAB> server <TAB> port <CR><LF>
           30   The server and port should be included for compatibility.
           31 * As mentioned in the original Gopher RFC, for other types:
           32   Is it primarily a text file? Use the 0 type.
           33   Is it an unknown or binary file? Use the 9 type and a file extension.
           34 * Use the image (I) type for png, jpg etc. Make sure to set the file extension.
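
A minimal sketch of how a client might split such a directory line into its
fields and recognise a "URL:" selector. The names `Item`, `parse_line` and
`url_target` are hypothetical and not part of any existing client; the sketch
follows the archived mail below in looking only at the selector prefix.

```python
from typing import NamedTuple, Optional


class Item(NamedTuple):
    type: str      # single type character, e.g. '0', '1', 'i', 'h'
    name: str      # display string (the "username" field)
    selector: str
    host: str
    port: str


def parse_line(line: str) -> Optional[Item]:
    """Parse one directory line (CR LF already removed); None if malformed."""
    if not line:
        return None
    fields = line[1:].split("\t")
    if len(fields) < 4:
        return None
    return Item(line[0], fields[0], fields[1], fields[2], fields[3])


def url_target(item: Item) -> Optional[str]:
    """Return the URL if the selector starts with "URL:", else None."""
    if item.selector.startswith("URL:"):
        return item.selector[len("URL:"):]
    return None
```

An 'i' line parsed this way has an empty selector, while the server and port
are still present, as recommended above.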
           35 
           36 
           37 # Using the proper type for Text file or binary
           38 
           39 * Sometimes the question comes up whether PDF or XML is binary. If the
           40   file is readable as text it is a text file, otherwise it is binary.
           41 
           42   For example PDF would be using the binary 9 type and a .pdf file
           43   extension.  XML would be a text 0 type.
           44 
           45   Type 0 is for files which are pure text and can be displayed in a text
           46   editor.
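
One way a server or publishing tool could apply this rule is sketched below.
It is only a heuristic under stated assumptions: "readable as text" is taken
to mean "contains no NUL byte and decodes as UTF-8", and `guess_type` is a
hypothetical helper name.

```python
def guess_type(data: bytes) -> str:
    """Return '0' if the data looks like text, '9' (binary) otherwise."""
    # Assumption: a NUL byte never appears in a plain text file.
    if b"\x00" in data:
        return "9"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "9"
    return "0"
```

A PDF that happens to contain only ASCII bytes would be misjudged by this
check, so a file extension list may still be the more practical choice.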
           47 
           48 
           49 # Text Encoding
           50 
           51 * The Notes section in the Gopher RFC mentions Latin1 encoding.
           52 
           53   Recommendation: Use UTF-8 or ASCII-only for the Gopher
           54   username/title field.  A client may want to display the other fields,
           55   so be polite and use UTF-8 or ASCII there as well if possible.
           56 
           57   Reason: UTF-8 is a simple text-encoding and commonly used these days.
           58 
           59   People who use Latin1 eat children.
           60 
           61 
           62 # Accessibility
           63 
           64 * Printable characters and line width, from the Gopher RFC standard:
           65 
           66         It is *highly* recommended that the User_Name field contain only
           67         printable characters, since many different clients will be using
           68         it.  However if eight bit characters are used, the characters
           69         should conform with the ISO Latin1 Character Set.  The length of
           70         the user-displayable line should be less than 70 characters; longer
           71         lines may not fit across some screens.
           72 
           73 New recommendations:
           74 * Don't use more than 79 columns of UTF-8 encoded displayed "username" text.
           75 * Try to reduce the amount of ASCII art which can contain non-printable
           76   characters. Think of the blind or tools used to parse actual textual content.
           77 
           78 Reason: A clarification of the term characters is needed.
           79 
           80 
           81 * "The selector string should be no longer than 255 characters."
           82 
           83 Recommendation: use no longer than 255 bytes.
           84 
           85 Reasons for this are:
           86 * A clarification of the term "characters" is needed. Characters could
           87   nowadays be interpreted as Unicode characters or column size of Unicode
           88   characters instead of bytes.
           89 * Clients can simply use a static buffer to fit 255 bytes.
           90 * Although Gopher does not have to map to a filesystem, filesystems typically
           91   limit file names to around 255 bytes as well.
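
The difference between bytes, characters and columns can be illustrated with
a short sketch. The helper names are hypothetical, and the column count is an
approximation based on the Unicode East Asian Width property; terminals may
disagree for some characters.

```python
import unicodedata

MAX_SELECTOR_BYTES = 255   # recommendation above: bytes, not characters
MAX_NAME_COLUMNS = 79      # recommendation above: displayed columns


def selector_ok(selector: str) -> bool:
    """Check the selector length in UTF-8 encoded bytes."""
    return len(selector.encode("utf-8")) <= MAX_SELECTOR_BYTES


def display_columns(text: str) -> int:
    """Approximate the number of terminal columns the text occupies."""
    cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols
```

A selector of 128 two-byte characters is only 128 characters long but 256
bytes, so it would not fit a 255 byte buffer.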
           92 
           93 
           94 * From section 3.5:
           95 
           96         If a client does not understand what a, say, type 'B' item (not a core
           97         item) is, then it may simply ignore the item in the directory
           98         listing; the user never even has to see it.  Alternatively, the item
           99         could be displayed as an unknown type.
          100 
          101 Recommendation: For clients, do not silently ignore an item, but display it
          102 as an unknown type.
          103 Reason: Define a recommendation for consistent behaviour in clients.
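
A client could implement this with a small lookup that falls back to a
visible marker. This is a sketch only; the labels and the name `type_label`
are made up for illustration, and the table lists just a few of the types.

```python
# A few item types and a short label for each (labels are an assumption).
KNOWN_TYPES = {
    "0": "TXT",
    "1": "DIR",
    "7": "QRY",
    "9": "BIN",
    "I": "IMG",
    "h": "HTM",
}


def type_label(item_type: str) -> str:
    """Return a label for the item type; unknown types stay visible."""
    name = KNOWN_TYPES.get(item_type)
    if name is None:
        return "[?" + item_type + "]"
    return "[" + name + "]"
```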
          104 
          105 
          106 # Server and client handling of text file types
          107   
          108 The RFC defines:
          109   
          110         Textfile Entity
          111         
          112         TextFile  ::= {TextBlock} Lastline
          113 
          114 and:
          115 
          116         Note:  Lines beginning with periods must be prepended with an extra
          117                period to ensure that the transmission is not terminated early.
          118                The client should strip extra periods at the beginning of the line.
          119   
          120 and:
          121 
          122         Note:  The client should be prepared for the server closing the
          123                connection without sending the Lastline.  This allows the
          124                client to use fingerd servers.
          125   
          126 From section 4:
          127   
          128         (b) The well-tempered server ought to send "text" (unless a file
          129         must be transferred as raw binary).  Should this text include
          130         tabs, formfeeds, frufru?  Probably not, but rude servers will
          131         probably send them anyway.  Publishers of documents should be
          132         given simple tools (filters) that will alert them if there are any
          133         funny characters in the documents they wish to publish, and give
          134         them the opportunity to strip the questionable characters out; the
          135         publisher may well refuse.
          136         
          137         (c) The well-tempered client should do something reasonable with
          138         funny characters received in text; filter them out, leave them in,
          139         whatever.
          140   
          141 We think the above description is too vague and can be made simpler.
          142   
          143 Recommendation: handle retrieving text file types the same as binary types.
          144 Clients do not handle the Lastline pattern (".\r\n") specially in this case;
          145 it is part of the data.
          146 Servers do no preprocessing on the TextFile data.
          147   
          148 Reason: Simplify the implementation of handling text types. Make the behaviour
          149 of text output consistent for clients.
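
On the client side the recommendation amounts to reading until the server
closes the connection, for text and binary alike. A minimal sketch, where
`read_until_close` and `fetch` are hypothetical names and the 30 second
timeout is an arbitrary choice:

```python
import socket


def read_until_close(sock: socket.socket) -> bytes:
    """Read everything until the peer closes; no Lastline or dot handling."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host: str, selector: str, port: int = 70) -> bytes:
    """Send the selector and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=30) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        return read_until_close(sock)
```

Any leading periods and a trailing ".\r\n" stay in the returned bytes
untouched.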
          150   
          151   
          152 # The 'h' type: extract from the file references/h_type.txt
          153           
          154         Below is an archived conversation about the Gopher 'h' type:
          155           
          156         Received: with LISTAR (v1.0.0; list gopher); Tue, 12 Feb 2002 14:19:47 -0500 (EST)
          157         Return-Path: <jgoerzen@complete.org>
          158         Delivered-To: gopher@complete.org
          159         To: gopher@complete.org
          160         Subject: [gopher] Links to URL
          161         From: John Goerzen <jgoerzen@complete.org>
          162         Date: 12 Feb 2002 14:19:46 -0500
          163         Content-type: text/plain; charset=us-ascii
          164         Content-Transfer-Encoding: 8bit
          165         
          166         I think it is best to start small with modifications to the protocol.
          167         Therefore, I propose the following:
          168         
          169         Method to link to URLs from Gopherspace
          170         ---------------------------------------
          171         
          172         1. Protocol issues
          173         
          174         Links to URLs from a gopher directory shall be defined as follows:
          175         
          176          Type -- the appropriate character corresponding to the type of the
          177          document on the remote end; h if HTML.
          178         
          179          Path -- the full URL, preceeded by "URL:".  For instance:
          180                  URL:http://www.complete.org/
          181         
          182          Host, Port -- pointing back to the gopher server that provided
          183          the directory for compatibility reasons.
          184         
          185          Name -- as usual for a Gopher directory entry.
          186         
          187         2. Conforming client requirements
          188         
          189         A client adhering to this specification will, when it sees a Gopher
          190         selector with a path starting with URL:, interpret the path as a URL.
          191         It will ignore the host and port components of the Gopher selector,
          192         using those components from the URL instead (if applicable).
          193         
          194         3. Conforming server requirements
          195         
          196         A server with Gopher URL support will not, in most cases, need to take
          197         extra steps to provide this support beyond those outlined in
          198         Compatibility below.  Servers not implementing those steps outlined in
          199         Compatibility will be deemed to be not in compliance.
          200         
          201         4. Authoring compliance
          202         
          203         The use of URL: selectors should be avoided wherever possible.  In
          204         particular, it should be avoided when pre-existing gopher facilities
          205         exist for the type of content linked.  The following URL types are
          206         explicitly prohibited by this specification:
          207         
          208           gopher
          209           telnet
          210           tn3270
          211         
          212         Authors should avoid links to any document not of HTML type whenever
          213         possible.  Linking to non-HTML documents will break compatibility with
          214         Gopher browsers that do not implement this specification.  The ranks
          215         of these browsers include most Web browsers, so that is a significant
          216         audience.
          217         
          218         5. Compatibility
          219         
          220         Links to HTML pages may be accomodated even for non-comforming
          221         browsers by providing additional capabilities in the server.
          222         
          223         When a non-conforming browser is instructed to follow a link to a URL,
          224         it will contact the Gopher server that provided the menu (since these
          225         are specified per section 1).
          226         
          227         When a conforming Gopher server receives a request whose path begins
          228         with URL:, it will write out a HTML document that will send the
          229         non-compliant browser to the appropriate place.  One such conforming
          230         document is:
          231         
          232           <HTML>
          233           <HEAD>
          234           <META HTTP-EQUIV="refresh" content="2;URL=http://www.acm.org/classics/">
          235           </HEAD>
          236           <BODY>
          237           You are following a link from gopher to a web site.  You will be
          238           automatically taken to the web site shortly.  If you do not get sent
          239           there, please click
          240           <A HREF="http://www.acm.org/classics/">here</A> to go to the web site.
          241           <P>
          242           The URL linked is:
          243           <P>
          244           <A HREF="http://www.acm.org/classics/">http://www.acm.org/classics/</A>
          245           <P>
          246           Thanks for using gopher!
          247           </BODY>
          248           </HTML>
          249         
          250         This document may be any desired by the server authors, but must
          251         adhere to these requirements:
          252          * It must provide a refresh of a duration of 10 seconds or less
          253          * It must not use IMG tags, frames, or have any reference whatsoever
          254            to content outside that particular file -- other than the link
          255            to the real destination.
          256          * It must not use JavaScript.
          257          * It must adhere to the W3C HTML 3.2 standard.
          258         
          259         When a non-conforming Gopher client finds a reference to a HTML file
          260         (type h), it will open up the file via Gopher (getting the redirect
          261         document) but using a web browser.  The web browser will then be
          262         redirected to the actual link destination.  Conforming clients will
          263         follow the link directly.
          264         
          265         END
          266         
          267 
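
The compatibility step from the mail above could look roughly like this on
the server side. It is a sketch, not the code of any existing server:
`redirect_page` is a hypothetical name, the wording of the page is shortened,
and the URL is HTML-escaped as an extra precaution that the mail does not
mention.

```python
import html
from typing import Optional

TEMPLATE = """<HTML>
<HEAD>
<META HTTP-EQUIV="refresh" content="{delay};URL={url}">
</HEAD>
<BODY>
You are following a link from gopher to a web site.
<P>
<A HREF="{url}">{url}</A>
</BODY>
</HTML>
"""


def redirect_page(selector: str, delay: int = 2) -> Optional[str]:
    """Return a redirect document for a "URL:" selector, else None."""
    if not selector.startswith("URL:"):
        return None
    # The mail requires a refresh of 10 seconds or less.
    if not 0 <= delay <= 10:
        raise ValueError("delay must be between 0 and 10 seconds")
    url = html.escape(selector[len("URL:"):], quote=True)
    return TEMPLATE.format(delay=delay, url=url)
```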
          268 # TLS support
          269   
          270 From: 2020-06-07  Gopher TLS prototype in geomyidae by 20h at
          271 <gophers://bitreich.org/0/usr/20h/phlog/2020-06-07T18-28-23-863932.md>:
          272           
          273         # 2020-06-07 18:28:23.863932 UTC (+0000)
          274         
          275         Gopher TLS prototype in geomyidae
          276           
          277         We are  happy and proud  to announce, that there  is now a  prototype of
          278         gopher tls in geomyidae
          279           
          280                         git://bitreich.org/geomyidae
          281           
          282         How does it work?
          283         
          284         When a  client tries to  connect via TLS, the  first byte of  the packet
          285         will be 0x16 or 22 decimal, which  is forbidden as a selector in Gopher.
          286         This  gives the  server a  hint to  start TLS.  Old servers  will simply
          287         reject such a connection attempt.
          288         
          289         For now clic supports  TLS. We are working on hurl  TLS support. And for
          290         sacc it is on its way.
          291         
          292                 git://bitreich.org/clic
          293                 git://bitreich.org/sacc
          294                 git://codemadness.org/hurl
          295         
          296         Hopefully further support will come to other clients.
          297         
          298         If you do not have anything at hand, here are some commandline clients:
          299         
          300         Plain old Gopher:
          301         
          302                 printf "/\r\n" | nc bitreich.org 70
          303         
          304         And with TLS:
          305         
          306                 printf "/\r\n" | socat openssl-connect:bitreich.org:70,verify=0 -
          307         
          308         Have fun using TLS on gopher!
          309         
          310         
          311         All patches and recommendations are welcome.
          312           
          313           
          314         Sincerely yours,
          315         
          316         20h
          317         Senior Security Manager (SSM)
          318 
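
The detection described in the announcement can be sketched as a peek at the
first byte of a new connection. This is an illustration only, not the
geomyidae implementation; `wants_tls` is a hypothetical name.

```python
import socket

TLS_HANDSHAKE = 0x16  # first byte of a TLS handshake record (22 decimal)


def wants_tls(conn: socket.socket) -> bool:
    """Peek at the first byte without consuming it."""
    first = conn.recv(1, socket.MSG_PEEK)
    return len(first) == 1 and first[0] == TLS_HANDSHAKE
```

Because the byte is only peeked, the server can afterwards hand the
untouched connection either to its TLS layer or to the plain Gopher request
handler.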
          319 
          320 # Gopher TLS URI
          321 
          322 A gopher TLS URI is the same as the Gopher URI described in RFC4266,
          323 except the protocol scheme is gophers://.
          324 
          325 When the client using the Gopher protocol does not support TLS it can
          326 simply use a plain gopher:// connection.
          327 
          328 
          329 # Gopher TLS downgrades
          330 
          331 A client COULD implement the following logic:
          332 
          333 When a user uses gophers:// the client should use TLS and not downgrade
          334 automatically to a plain connection. The client COULD also show a
          335 _clear_ message if the TLS connection is not accepted and offer a
          336 manual downgrade option to plain-text.
          337 
          338 When further selectors of the same host and port are accessed the client should use
          339 TLS automatically as well.
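
A sketch of how a client might parse both URI schemes and remember which
hosts were reached over TLS. `Target`, `parse_gopher_uri` and `TlsPolicy` are
hypothetical names; the default port 70 and the default item type "1" follow
RFC4266, and fragments are ignored here.

```python
from typing import NamedTuple, Set, Tuple
from urllib.parse import unquote, urlsplit


class Target(NamedTuple):
    host: str
    port: int
    item_type: str
    selector: str
    tls: bool


def parse_gopher_uri(uri: str) -> Target:
    parts = urlsplit(uri)
    if parts.scheme not in ("gopher", "gophers") or not parts.hostname:
        raise ValueError("not a gopher URI: " + uri)
    # urlsplit separates a "?" query; put it back, it belongs to the selector.
    raw = parts.path + ("?" + parts.query if parts.query else "")
    path = unquote(raw)
    # The first character after "/" is the item type, the rest the selector.
    item_type = path[1] if len(path) > 1 else "1"
    return Target(parts.hostname, parts.port or 70, item_type, path[2:],
                  parts.scheme == "gophers")


class TlsPolicy:
    """Remember host and port pairs that were accessed with gophers://."""

    def __init__(self) -> None:
        self._tls_hosts: Set[Tuple[str, int]] = set()

    def remember(self, target: Target) -> None:
        if target.tls:
            self._tls_hosts.add((target.host, target.port))

    def use_tls(self, target: Target) -> bool:
        # Never downgrade silently: a known TLS host keeps using TLS.
        return target.tls or (target.host, target.port) in self._tls_hosts
```

A manual downgrade, if offered, would be an explicit user action outside of
this policy object.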
          340 
          341 
          342 # Gopher+ compatibility
          343 
          344 Gopher+ allows adding more TAB-separated fields to the output.  To
          345 stay compatible with Gopher+, a plain Gopher implementation can simply
          346 accept the line but ignore these additional fields.
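
In practice this can be as small as keeping the first four TAB-separated
fields of a line. A sketch, with `core_fields` as a hypothetical name:

```python
from typing import Optional, Tuple


def core_fields(line: str) -> Optional[Tuple[str, ...]]:
    """Keep type+name, selector, host and port; drop any extra fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4:
        return None  # not a complete directory line
    return tuple(fields[:4])
```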
          347 
          348 
          349 # Other references:
          350 
          351 * RFC1436 - The Internet Gopher Protocol
          352   <https://www.rfc-editor.org/rfc/rfc1436.txt>
          353   or see the file references/rfc1436.txt
          354 
          355 * RFC4266 - The gopher URI Scheme
          356   <https://www.rfc-editor.org/rfc/rfc4266.txt>
          357   or see the file references/rfc4266.txt
          358 
          359 * Gopher+:
          360   <https://github.com/gopher-protocol/gopher-plus/blob/main/gopherplus.txt>
          361   or references/gopherplus.txt
          362 
          363 * geomyidae Gopher server:
          364   <git://bitreich.org/geomyidae>
          365 
          366 * Helper tool to validate gopher and DirEntities:
          367   <git://bitreich.org/gopher-validator>
          368 
          369 References in this repository: <gopher://bitreich.org/1/scm/gopher-protocol>