Get repository to current brcon2022 state. - gopher-protocol - Gopher Protocol Extension Project

git clone git://bitreich.org/gopher-protocol git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/gopher-protocol

commit f7bb7959972ce0d459cb2a7b478250f908b3728d
Author: Christoph Lohmann <20h@r-36.net>
Date:   Sat, 6 Aug 2022 14:06:48 +0200

Get repository to current brcon2022 state.

Diffstat:
  A LICENSE                   |   12 ++++++++++++
  A README.md                 |   29 +++++++++++++++++++++++++++++
  A TODO.md                   |    7 +++++++
  A references/gopherplus.txt | 1267 ++++++++++++++++++++++++++++++
  A references/rfc1436.txt    |  906 +++++++++++++++++++++++++++++++
  A references/rfc4266.txt    |  339 +++++++++++++++++++++++++++++++

6 files changed, 2560 insertions(+), 0 deletions(-)
---
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,12 @@
+/*
+ * ----------------------------------------------------------------------------
+ * "THE FRIKANDEL-WARE LICENSE":
+ * The authors of this repository wrote this file. As long as you retain this
+ * notice you can do whatever you want with this stuff. If we meet some day,
+ * and you think this stuff is worth it, you can buy me a frikandel in return.
+ * ----------------------------------------------------------------------------
+ *
+ * Authors:
+ * Hiltjo Posthuma <hiltjo@codemadness.org>
+ * Christoph Lohmann <20h@r-36.net>
+ */
diff --git a/README.md b/README.md
@@ -0,0 +1,29 @@
+# Gopher Protocol
+
+## Introduction
+
+Gopher is a really old protocol. It has much history to preserve. Over
+time and development the usage changed from the original RFC 1436 (see
+rfc1436.txt) to usage now (2018). We now use UTF-8, we have other media
+format, file type detection is more advanced, the web has grown and
+brought use URIs and gopher+ has failed.
+
+The goal of this repository is to gather and document recommendations
+for how to use gopher today so errors of the past are not redone.
+
+We do not plan to write a new RFC, since the recommendations are rather
+small and do not require a big redefinition of the basic gopher
+protocol.
+
+## Recommendations
+
+See the gopher-extension.md for the current state.
+
+## License
+
+See LICENSE file.
+
+## Have Fun
+
+We all should have fun!
+
diff --git a/TODO.md b/TODO.md
@@ -0,0 +1,7 @@
+## TODO
+
+* define more types such as 'd', 's', 'a'?
+ * TODO: add all historical item types found
+* define robots.txt? (probably out of scope for this document)?
+ * search for mailing list post about robots.txt of 20h
+
diff --git a/references/gopherplus.txt b/references/gopherplus.txt
@@ -0,0 +1,1267 @@
+
+Gopher+
+
+
+upward compatible enhancements to
+the Internet Gopher protocol
+
+
+
+Farhad Anklesaria, Paul Lindner, Mark P. McCahill,
+Daniel Torrey, David Johnson, Bob Alberti
+
+Microcomputer and Workstation Networks Center /
+Computer and Information Systems
+University of Minnesota
+
+July 30, 1993
+
+
+
+gopher+ n. 1. Hardier strains of mammals of the
+family Geomyidae. 2. (Amer. colloq.) Native or
+inhabitant of Minnesota, the Gopher state, in full
+winter regalia (see PARKA). 3. (Amer. colloq.)
+Executive secretary. 4. (computer tech.) Software
+following a simple protocol for burrowing through a
+TCP/IP internet, made more powerful by simple
+enhancements (see CREEPING FEATURISM).
+
+
+Abstract
+
+The internet Gopher protocol was designed for
+distributed document search and retrieval. The
+documents "The internet Gopher protocol: a
+distributed document search and retrieval protocol"
+and internet RFC 1436 describe the basic protocol and
+has an overview of how to implement new client and
+server applications. This document describes a set of
+enhancements to the syntax, semantics and
+functionality of the original Gopher protocol.
+
+
+Distribution of this document is unlimited. Please
+send comments to the Gopher development team:
+<gopher@boombox.micro.umn.edu>. Implementation of
+the mechanisms described here is encouraged.
+
+
+
+1. Introduction
+
+The Internet Gopher protocol was designed primarily to
+act as a distributed document delivery system. It
+has enjoyed increasing popularity, and is being used
+for purposes that were not visualized when the
+protocol was first outlined. The rest of this
+document describes the Gopher+ enhancements in a non-
+rigorous but easily read and understood way. There
+is a short BNF-like section at the end for exact
+syntax descriptions. Throughout the document, "F"
+stands for the ASCII TAB character. There is an
+implicit carriage return and linefeed at the ends of
+lines; these will only be explicitly mentioned where
+necessary to avoid confusion. To understand this
+document, you really must be familiar with the basic
+Gopher protocol.
+
+
+Servers and clients understanding the Gopher+
+extensions will transmit extra information at the ends
+of list and request lines. Old, basic gopher clients
+ignore such information. New Gopher+ aware servers
+continue to work at their old level with unenhanced
+clients. The extra information that can be
+communicated by Gopher+ clients may be used to summon
+new capabilities to bridge the most keenly felt
+shortcomings of the venerable old Gopher.
+
+
+
+
+2. How does Gopher+ work?
+
+Gopher+ enhancements rely on transmitting an "extra"
+tab delimited fields beyond what regular (old) Gopher
+servers and clients now use. If most existing (old)
+clients were to encounter extra stuff beyond the
+"port" field in a list (directory), most would ignore
+it. Gopher+ servers will return item descriptions in
+this form:
+
+
+1Display stringFselector stringFhostFportFextra
+stuff<CRLF>
+
+
+If an existing (old) client has problems with
+additional information beyond the port, it should not
+take much more than a simple tweak to have it discard
+unneeded stuff.
+
+
+
+
+2.1 Advisory issued to client maintainers.
+
+If it does not do this already, your existing client
+should be modified as soon as possible to ignore
+extra fields beyond what it expects to find. This
+will ensure thatyour clients does not break when it
+encounters Gopher+ servers in gopherspace.
+
+
+All the regular Gopher protocol info remains intact
+except for:
+
+
+(1) Instead of just a CRLF after the port field in
+any item of a list (directory) there may be an
+optional TAB followed by extra stuff as noted above
+(explanation to follow).
+
+
+
+(2) In the original Gopher protocol, there was
+provision for a date-time descriptor (sec 3.6) to be
+sent after the selector (for use by autoindexer
+beasts). As far as we know, while the descriptor is
+implemented in the Mac server, it is not in any other
+server and no clients or daemons use it. This is a
+good time to withdraw this feature. The basic gopher
+protocol has been revised for the final time and will
+be frozen.
+
+
+
+
+
+
+2.2 Gopher+ item lists.
+
+Gopher servers that can utilize the Gopher+
+enhancements will send some additional stuff
+(frequently the character "+") after the port field
+describing any list item. eg:
+
+
+1Some old directoryFfoo selectorFhost1Fport1
+
+1Some new directoryFbar selectorFhost1Fport1F+
+
+0Some file or otherFmoo selectorFhost2Fport2F+
+
+
+The first line is the regular old gopher item
+description. The second line is new Gopher+ item
+description. The third line is a Gopher+ description
+of a document. Old gopher clients can request the
+latter two items using old format gopher selector
+strings and retrieve the items. New, Gopher+ savvy
+clients will notice the trailing + and know that they
+can do extra things with these kinds of items.
+
+
+
+
+
+2.3 Gopher+ data transfer.
+
+If a client sends out a Gopher+ type request to a
+server (by tagging on a tab and a "+" to the
+request):
+
+
+ bar selectorF+
+
+
+The server may return the response in one of three
+ways; examples below:
+
+
+ +5340<CRLF><data>
+
+
+
+ +-1<CRLF><data><CRLF>.<CRLF>
+
+
+
+ +-2<CRLF><data>
+
+
+The first response means: I am going to send exactly
+5340 bytes at you and they will begin right after this
+line. The second response means: I have no idea how
+many bytes I have to send (or I am lazy), but I will
+send a period on a line by itself when I am done.
+The third means: I really have no idea how many
+bytes I have to send, and what's more, they COULD
+contain the <CRLF>.<CRLF> pattern, so just read until
+I close the connection.
+
+
+The first character of a response to a Gopher+ query
+denotes success (+) or failure (-). Following that is
+a token to be interpreted as a decimal number. If the
+number is >= 0, it describes the length of the
+dataBlock. If = -1, it means the data is period
+terminated. If = -2, it means the data ends when the
+connection closes.
+
+
+The server may return an error also, as in:
+
+
+--1<CRLF><data><CRLF>.<CRLF>
+
+
+The (short!) error message will be in ASCII text in
+the data part. The first token on the first line of
+the error text (data) contains an error-code (an
+integer). It is recommended that the first line also
+contain the e-mail address of the administrator of
+the server (in angle brackets). Both the error-code
+and the email address may easily be extracted by the
+client. Subsequent lines contain a short error
+message that may be displayed to the user. Basic error
+codes are:
+
+
+ 1 Item is not available.
+
+ 2 Try again later ("eg. My load is too high
+right now.")
+
+ 3 Item has moved. Following the error-code is
+the gopher descriptor
+
+ of where it now lives.
+
+
+More error codes may be defined as the need arises.
+
+
+
+This should be obvious: if the client sends out an
+"old" Gopher kind of request:
+
+
+
+ bar selector
+
+
+
+the server will know that it is talking to an old
+client and will respond in the old way. This means
+that old gopher clients can still access information
+on Gopher+ servers.
+
+
+
+
+2.4 Gopher+ client requests.
+
+
+Clients can send requests to retrieve the contents of
+an item in this form:
+
+
+
+selectorstringF+[representation][FdataFlag]<CRLF>[dat
+ablock]
+
+
+If dataFlag is '0', or nonexistent, then the client
+will not send any data besides the selector string.
+If the dataFlag is '1' then a block of data will
+follow in the same format as Section 2.3. The client
+can send a large amount of data to the server in the
+dataBlock. Representations or alternative views of an
+item's contents may be discovered by interrogating the
+server about the item's attribute information; this is
+explained below.
+
+
+Note that in the original Gopher protocol, a query
+submitted to an index server might have a selector
+string followed by a TAB and the words for which the
+index server was being asked to search. In Gopher+,
+the extra TAB and Gopher+ information follow the words
+for which the server is being asked to search. Gopher+
+client have to be smart enough to know that in the
+case of a type 7 item (an index server) they append
+the Gopher+ information after the words being searched
+for.
+
+
+
+2.5 Gopher+ Item Attribute Information.
+
+
+The most basic enhancement of Gopher+ items is the
+ability to associate information about an item such
+as size, alternative views, the administrator, an
+abstract, etc. with the item. To get Attribute
+Information, a client can send out a request to the
+gopher server that looks like this:
+
+
+ selector stringF!<CRLF>
+
+
+(think of "!" as an upside-down i for "information").
+To the server this means "Instead of returning the
+contents of the item, return the item's Attribute
+Information". The Attribute Information is returned as
+an ASCII text stream containing blocks of
+information.For example, a server might return:
+
+
+ +INFO: 0Some file or otherFmoo
+selectorFhost2Fport2F+
+
+ +ADMIN:
+
+ Admin: Frodo Gophermeister <fng@bogus.edu>
+
+ Mod-Date: Wed Jul 28 17:02:01 1993
+<19930728170201>
+
+ +VIEWS:
+
+ Text/plain: <10k>
+
+ application/postscript: <100k>
+
+ Text/plain De_DE: <15k>
+
+ application/MacWriteII: <45K>
+
+ +ABSTRACT:
+
+ This is a short (but multi-line) abstract about
+the
+
+ item. Two or three lines ought to be enough
+
+
+The beginning of a block of information is denoted by
+a "+" in column 1 of a line. Another way to think of
+it is: the name of each block begins with a + and the
+rest of the name cannot contain a +. Each line of
+information within a block begins with a space so
+that it is easy to locate the beginning of a block.
+
+
+There can be multiple blocks of information about an
+item, but the first block must be the one-line +INFO
+block containing the keyword +INFO followed by the
+gopher item description. This is done to make it easy
+to associate informational attributes with the gopher
+items to which they refer (see section 2.7 for some
+good reasons for doing this). The very first line of
+Attribute Information for an item contains a one-line
++INFO block containing the gopher descriptor for the
+item. All Gopher+ servers must return an "+INFO"
+block for all items listed by the server. Also
+present may be an +ADMIN block that can be many lines
+long. The server must also send an +ADMIN block when
+asked to send all the item's attributes (as in the
+example above). The +ADMIN block must contain at
+least an Admin attribute and Mod-Date attributes,
+though there may be many other administrative items
+also present in the +ADMIN block. The Admin (the
+administrator of the item) and Date (the date of the
+item's last modification) attributes are required to
+be returned by the server, and other optional
+attributes may be returned as well. In this example,
+there are two pieces of information within the +ADMIN
+block (Admin and Mod-Date). The Admin attribute must
+contain the e-mail address of an administrator inside
+angle brackets. The Admin line might also contain the
+administrator's name and phone number. The Date line
+must contain the modification date in angle brackets.
+The format of the date is <YYYYMMDDhhmmss> where YYYY
+is year, MM is month, DD is day, hh is hours, mm is
+minutes, and ss is seconds.
+
+
+The third block in the example is the +VIEWS block.
+This block lists different formats in which the
+document can be retrieved. The first format listed is
+what the server believes to be the preferred format.
+A gopher client might display the list of possible
+view labels of the item to the user and let the user
+select the view they prefer. Alternatively, a smart
+client might look at the content of the labels and
+preferentially retrieve Postscript views of items.
+Note that the view labels are structured. View labels
+specify a Content-Type (application/Postscript,
+Text/plain, etc.), an optional language (En_US, De_DE,
+etc.) and an optional size. Note that the View labels
+for content type use the MIME content types to specify
+names of the variious views. The optional language
+descriptors use the ISO-639 codes for representing
+languages to name the language. Smart clients might
+want to translate these rather cryptic codes into
+something mere mortals can read and understand.
+
+
+The client software can pick off the size of each
+view IF there are any angle brackets on the line.
+There might not be a size that the server cares to
+tell you about. Also this might NOT be the exact size
+that the server will wind up delivering to you if you
+ask for it... but it should be reasonably close. This
+information makes it possible for clever clients to
+select views based on size, data representation, or
+language. See section 2.6 for how alternate
+representations (views) are retrieved.
+
+
+The next block in the example is an (optional)
++ABSTRACT. Here the block consists of lines of text
+that might be displayed to the user.
+
+
+Other blocks of information can defined and added as
+the need arises. For instance, a Neuromancer-esque 3-D
+cyberspace attribute might be accommodated by
+including a 3D-ICON block (with an image to display
+in 3-space) and a 3D-COORDINATE block (with y,x, and
+z coordinates). More immediate needs can also
+addressed by defining other information blocks. For
+instance, a SCRIPT block would be a natural place to
+put information for scripting telnet sessions.
+Information blocks give us an extensible way of
+adding attributes (or what Macintosh programmers call
+resources) to gopher items.
+
+
+Some of the really cool ideas we have for information
+attributes may require sending large amounts of data,
+some of which may not be easily represented as ASCII
+text, but the idea of the attributes information is
+that it is a relatively compact list of attributes.
+These somewhat conflicting desires can be reconciled
+by allowing references to gopher items in an
+attribute. For example, an +ABSTRACT block might be
+returned this way:
+
+
+ +ABSTRACT: 0long abstractFselectorFhost2Fport2F+
+
+
+In this example, the abstract is a document that
+resides on a gopher server. By allowing references to
+to gopher items, we can also accommodate data that
+must be sent in an 8-bit clear stream by using the
+Gopher+ methods for retrieving binary data.
+
+
+If both a reference to an attribute and an explicit
+value for the attribute are present in an attribute
+list, the preferred version is the explicit value. In
+the example below, the preferred version is "the
+short abstract goes here".
+
+
+ +ABSTRACT: 0long abstractFselectorFhost2Fport2F+
+
+ the short abstract goes here
+
+
+Note that if you want to have several views of (for
+example) an +ABSTRACT this is possible by using a
+reference to a item residing on a gopher server
+because the item can have its own attributes.
+
+
+Attributes names are case sensitive (easier to match
+and more of them). There is no need to "preregister"
+all possible attributes since we cannot anticipate
+all possible future needs. However it would be
+reasonable to maintain a registry for implementors
+and administrators so duplication can be avoided.
+Server implementors or administrators can request that
+new attributes be included in the attribute registry.
+
+
+Dream on: What gets us excited are alternate
+representations for directory lists. Sure, the
+standard representation for a gopher directory list
+is known to us all. But isn't hypertext (in a WWW
+sense) an alternate kind of directory list? We also
+envisioned a "geographical view" (GView?) mapping
+servers onto a map of the world (throw up a gif
+picture and then overlay dots based on latitude and
+longitude or xy coordinates). OK. Dream off.
+
+
+ Note that interested parties outside gopherspace have
+long and complex wish-lists for "attributes" that all
+well-dressed Internet citizens should have. We don't
+want to comment on the use or value of these laundry-
+lists. Suffice it to say that nothing precludes
+server administrators from including whatever
+attributes they see fit to include. Certainly IAFA
+blocks are desirable, bearing UDIs, URL's or whatever
+else is desired. The gopher community will probably
+arrive at a list of "recommended" attributes that
+server administrators should try to support. Because
+not every server administrator sees advantage to
+cluttering Attribute Info files with information
+their primary users will never need, it does not seem
+fair to "force" folks to include them; most will
+just ignore the harsh protocol guideline and the
+value of the protocol will be diminished. We want to
+mandate as little as we possibly can.
+
+
+
+
+
+2.6 Using Attribute Info: Alternate
+representations (+VIEWS).
+
+
+The user may locate a document and wonder if there are
+representations of it besides, say, the standard Text.
+Using the appropriate client incantation (Option
+Double-Click? or whatever) the user indicates a wish
+to see what's available. The client retrieves the
+Attribute Information, displays the list of views to
+the user in some kind of scrolling list dialog. User
+selects a line and client now requests the document
+in say, Postscript representation:
+
+
+ the selectorF+application/Postscript
+
+
+
+Smart clients are not precluded from doing things like
+"Always get Postscript if you can" or "Always get
+Postscript if that is less than 700K in size." etc.
+And the "smarter" you make it, the hairier your
+client will become - unless you are a user interface
+wizard of awesome proportions. While the example
+above is of fetching a document's postscript view,
+there is nothing precluding having different views
+for directories. In the dream sequence earlier, we
+imagined a geographic view of a directory. For a
+client to fetch that view, it would say this:
+
+
+ the selectorF+GView
+
+
+
+2.7 Getting attributes for all items in a
+directory in one transaction.
+
+
+Heavyweight/clever/special-purpose clients may want to
+know all the attributes of items in a given directory
+in one transaction. The "$" command is used to
+request all the attributes of a directory at once.
+For instance, a client might sent the request:
+
+
+ selector stringF$
+
+
+ and the server might return this:
+
+
+ +INFO: 0Salmon dogsFsome selectorFhost2Fport2F+
+
+ +ADMIN:
+
+ Admin: Frodo Gophermeister <fng@bogus.edu>
+
+ Mod-Date: August 15, 1992 <19920815185503>
+
+ +VIEWS:
+
+ Text/plain: <10k>
+
+ application/Postscript De_DE: <100k>
+
+ +ABSTRACT:
+
+ A great recipe for making salmon
+
+ +INFO: 0Hot pupsFother selectorFhost3Fport3F+
+
+ +ADMIN:
+
+ Admin: Bilbo Gophernovice <bng@bogus.edu>
+
+ Date: <19910101080003>
+
+
+In this example, the server returned the attribute
+lists for two items because there were only two items
+in the directory.. The client software can easily
+separate the attributes for the items since each
+attribute list starts with "+INFO". It is also easy
+for the client to use the "$" command to get
+directory listings since the gopher item descriptor is
+on the +INFO line for each item.
+
+
+Note that the $ command is the only way to find the
+administrator of a remote link. To get the full
+attribute information for a link on another machine
+may require asking the master machine for the item
+information. It is possible to append which
+attributes you are interested in retrieving after the
+$, eg:
+
+
+ some directory selectorF$+VIEWS
+
+or
+
+ other directory selectorF$+VIEWS+ABSTRACT
+
+
+
+The $ command makes it possible for a client that does
+not mind burning bandwidth to get attribute
+information for all items as the user navigates
+gopherspace. Clients using 2400 bps SLIP links will
+probably not use this method... but clients on
+Ethernet may not mind. This command may also be useful
+for building smart indexes of items in gopherspace.
+Note that the specific requested attributes are only
+suggestions to the server that the client would like
+less than a full set of attributes. The server may
+choose to ignore the request (if it is not capable of
+extracting the required attributes) and return the
+client the full set anyway. Other caveats: even if
+the attributes requested are not available, the
+server WILL NOT return an error, but will send
+whatever IS available. It is the client's
+responsibility inspect the returned attributes.
+
+
+Analogous to use of the $ command, the ! command can
+also be used to request certain attribute blocks.
+
+
+
+
+2.8 Gopher+ Interactive Query items.
+
+
+The principle here is based on Roland Schemer's "Q/q"
+type ideas. We're calling it the Interactive Query
+enhancements...
+
+
+The server may list items that have a "?" following
+the port field:
+
+
+ 0A fileFfile selectorFhostFportF?
+
+ 1A directoryFdir selectorFhostFportF?
+
+
+Now the fact that there's something after the port
+field means that these are Gopher+ items. Old clients
+will still be able to show such items in lists, but
+if they simply send the old style plain selector
+string to retrieve them, the server will respond with
+an old style error telling them to get an updated
+client. New clients will know that before getting one
+of these items, it will be necessary to retrieve
+questions from the server, have the user answer them,
+and then feed the answers back to the server along
+with the selector. The questions to be asked of the
+user are retrieved from the server by looking at the
++ASK attribute in the item's attribute information.
+
+
+
+
+When the user selects a query item, the client quickly
+connects to the server and requests the Attribute
+Information for the item. Then the client extracts
+the information in the +ASK attribute block. Here's
+an example:
+
+
+ +INFO: 0inquisitiveFmoo moo
+selectorFhost2Fport2F+
+
+ +ADMIN
+
+ Admin: Frank Gophermeister <fng@bogus.edu>
+
+ Mod-Date: August 15, 1992 <19920815185503>
+
+ +ASK:
+
+ Ask: How many volts?
+
+ Choose: Deliver electric shock to administrator
+now?FYesFNot!
+
+
+
+
+
+The client will use all lines in the order they appear
+in the +ASK attribute block. The content will be
+presented to the user as questions or prompts or
+dialogs or something like that.
+
+
+The "Ask" presents the user with a question, supplies
+a default text answer if it exists and allows the
+user to enter a one-line responce.
+
+
+The "AskP" presents the user with a question, and
+bullets out the responce typed in by the user so that
+someone watching over the user's sholder cannot read
+the responce.
+
+
+The "AskL" presents the user with a question, and
+ideally should allo the user to enter several lines of
+responce.
+
+
+The "AskF" requests the user for a new local filename,
+presumably for stashing the response returned by the
+server. It may supply a default filename.
+
+
+The "Select" presents the user with a set of options
+from which the use can select one or many. This is
+equivalent to Macintosh check boxes.
+
+
+The "Choose" presents the user with a few short
+choices only one of which may be selected at a time.
+This is equivalent to Macintosh radio buttons.
+
+
+The "ChooseF" requests that the user select an
+existing local file, presumably for sending to the
+server. On some systems, the client writer or
+administrator might want to restrict the selection of
+such files to the current directory (ie. not allow
+paths in the filename to prevent sending things like
+password files).
+
+
+The n responses harvested from the user are sent on to
+the server as the first n lines in the dataBlock.
+There can only be one file sent, and it will be the
+remainder of the dataBlock if any. If there is an
+AskL the responce is returned with a count of the
+number of lines entered by the user on a line by
+itself, followed by the lines entered by the user.
+
+
+Gopher was originally designed as an essentially
+anonymous document retrieval protocol to facilitate
+easy access to information rather than limited
+access. Various kinds of restrictive mechanisms have
+been implemented at the server end (for example,
+access restriction by source IP address); however if
+you have sensitive information, we emphasize that
+putting it under a Gopher's nose is not a good idea.
+
+
+
+
+The folks with a hirsute tendency will have noticed
+that all these interactions are static rather than
+truly dynamic and interactive. In other words, the
+server cannot ask different questions in response to
+different answers. +ASK does not constitute a
+scripting language by any means.
+
+
+To do "true" scripting, we have to do one of two
+things
+
+
+1. Write a full language parser/interpreter into
+clients. The server loads a whole script into the
+client's brain, and the client "runs" it. This
+rather grossly violates the spirit of simplicity in
+cross-platform gopher implementation. However, when
+and if a standard scripting language is adopted,
+there will be room for it in a SCRIPT attribute block.
+
+
+2. Client enters a complex back-and-forth transaction
+with the server. This requires the server, client, or
+both to save rather a lot of state. NOPE! Server
+saving state means holding open a connection or
+(worse) the server retaining tokens between
+connections. Client saving state means the server
+has an even worse job to do.
+
+
+As Opus the Penguin would say: a Hairball.
+
+
+
+2.9 Gopher+ Pictures, Sounds, Movies.
+
+
+A lot of folks need ability to retrieve and display
+pictures, but there is no real consensus on ONE format
+for these pictures. We don't want to define a type
+character for every oddball picture type. Gopher+
+handles Pictures, Movies, and Sounds by defining
+three item types: ":" for bitmap images, ";" for
+movies, and "<" for sounds (originally I, M, and S
+were suggested, but they were informally in use in
+other ways; the only thing magic about ":", ";", and
+"<", is that they are the first characters after '9')
+
+
+Note that there is NO default format for Pictures,
+Movies and Sounds; the specific format of the image,
+movie, or sound must be gleaned from the +VIEWS
+information for the item (eg. Gif, PICT, TIFF, etc.).
+
+
+
+
+Appendix I
+
+
+Required attributes and suggested attributes.
+
+
+
+A1.0 The +INFO attribute block
+
+
+The +INFO atttribute block is sent whenever an item's
+attributes are requested. It is required that the
+Attribute Information list for an item must contain a
+one-line +INFO attribute, and the +INFO attribute
+must contain the gopher+ descriptor for the item.
+
+
+ +INFO: 1Nice stuffF/selectorFhostFportF+
+
+
+
+A2.0 The +ADMIN attribute
+
+
+ A Gopher+ server is required to have an +ADMIN block
+for every item and the +ADMIN block must contain
+Admin and a Mod-Date lines:
+
+
+ +ADMIN:
+
+ Admin: [comments] <administrator e-mail address>
+
+ Mod-Date: [comments] <YYYYMMDDhhmmss>
+
+
+In addition to the required lines, we recommend that
+the +ADMIN attribute of items returned by a full-text
+search engine contain a SCORE attribute. The SCORE
+attribute should contain the relevance ranking (an
+integer) of the item.
+
+
+ Score: relevance-ranking
+
+
+We recommend that the +ADMIN attribute of a full-text
+search engine contain a Score-Range attribute. This
+attribute is used to specify the range of values
+taken on by the relevance ranking scores of items
+returned by the search engine. The Score-Range makes
+it possible to normalize scores from different search
+engine technologies. The first number is the lower
+bound, the second number is the upper bound.
+
+
+ Score-range: lower-bound upper-bound
+
+
+We also recommend that the +ADMIN attribute for the
+root of the server (i.e. what you get back when you
+ask for the attributes of the item with the empty
+selector string) also contain these fields:
+
+
+ Site: the name of the site
+
+ Org: organization or group owning the site
+
+ Loc: city, state, country
+
+ Geog: latitude longitude
+
+ TZ: timezone as gmt-offset
+
+
+Other useful attributes might include:
+
+
+ Provider: who provided this item
+
+ Author: who wrote this item
+
+ Creation-Date: when it was born <YYYYMMDDhhmmss>
+
+ Expiration-Date: when it expires <YYYYMMDDhhmmss>
+
+
+
+A3.0 The +VIEWS attribute
+
+
+The +VIEWS attribute is used to specify alternative
+representations of an item. The form of the +VIEWS
+attribute is:
+
+
+ +VIEWS: [gopher descriptor]
+
+ Content-Type[ viewLanguage]: [<56K>]
+
+ Content-Type[ viewLanguage]: [<93K>]
+
+ Content-Type[ viewLanguage]: [<77K>]
+
+
+Some values for Content-Type are
+
+
+ Text/plain, application/Postscript, image/Gif,
+image/jpeg,
+
+
+Content Types are defined by the Internet Assigned
+Numbers Authority (IANA). To register a new content
+type send e-mail to IANA@isi.edu. For a
+comprehensive list, consult the most up-to-date MIME
+Request for Comments (RFC). A list of currently
+defined views may be retrieved by anonymous ftp from
+isi.edu in the directory
+
+
+/pub/in-notes/MIME/mime-types
+
+
+All gopher servers must support the Text/plain view
+for readable documents and the application/gopher-
+menu view (the basic Gopher+ directory list) for
+directories. These are the views that must be
+returned by default. If all a server supports is the
+default views, then it may omit the +VIEWS attribute
+block (although we suggest that it not do so).
+
+
+The viewLanguage is defined as a concatanation of two
+ISO standard values, the ISO 639 language code and
+the ISO-3166 country code.
+ + +Some values for viewLanguage are: + + + En_US, De_DE, Es_ES, Se_SE + + + +A4.0 The +ABSTRACT attribute + + +The +ABSTRACT attribute is used to specify a short +abstract for the item. The form of the +ABSTRACT +attribute is: + + + +ABSTRACT: [gopher reference] + + A line of text<CRLF> + + another line of text<CRLF> + + still another line of text.<CRLF> + + +We recommend that a description of the sorts of +information at the site, a postal address, a phone +number, and the administrator name for the site be +included in the +ABSTRACT attribute for the server +root (i.e. what you get when you ask for the +attribute list of the server with no selector +string). + + + + + +Appendix II + + +Paul's NQBNF (Not Quite BNF) for the Gopher+ +Enhancements. + + + +Note: This is modified BNF (as used by the Pascal +people) with a few English modifiers thrown in. +Stuff enclosed in '{}' can be repeated zero or more +times. Stuff in '[]' denotes a set of items. The '- +' operator denotes set subtraction. + + +This section is not quite solid yet. Please send us +information on any errors you might notice. + + +Directory Entity + + +CR-LF ::= Carriage Return Character followed by +Line Feed character. + +Tab ::= ASCII Tab character + +NUL ::= ASCII NUL character + +PLUS ::= ASCII '+' character + +LEFT ::= ASCII '<' character + +RIGHT ::= ASCII '>' character + +OCTET ::= $00 -> $ff + +UNASCII ::= OCTET - [Tab CR-LF NUL] + +UNASCIINOPLUS ::= UNASCII - [PLUS] + +UNASCIINOANGLE ::= UNASCII - [LEFT, RIGHT] + +Lastline ::= '.'CR-LF + +TextBlock ::= Block of ASCII text not containing +Lastline pattern. + +Type ::= UNASCII + +DisplayString ::= {UNASCII} + +Selector ::= {UNASCII} + +Otherflds ::= {UNASCII + TAB} + +Host ::= {{UNASCII - ['.']} '.'} {UNASCII - +['.']} + + + +Note: This is a Fully Qualified Domain Name as defined +in RFC 830. (e.g. gopher.micro.umn.edu) Hosts that +have a CR-LF TAB or NUL in their name get what they +deserve. 
+ + +Digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' +| '7' | '8' | '9' + +DigitSeq ::= digit {digit}. + +Port ::= DigitSeq. + + +Note: Port corresponds to the TCP Port Number; its +value should be in the range [0..65535]; port 70 is +officially assigned to gopher. + + + +Bool ::= '0' | '1' + +G+Field ::= '+' | '?' + + +Success ::= '+' | '-'. + +Transfer ::= DigitSeq | '-1' | '-2' + +DataHead ::= Success Transfer CR-LF + +DataBlock ::= DataHead {OCTET} + + + +G1DirEntity ::= Type DisplayString Tab Selector Tab +Host Tab Port Tab Otherflds CR-LF + +G+DirEntity ::= Type DisplayString Tab Selector Tab +Host Tab Port Tab G+Field + + CR-LF + + + +Notes: + + It is *highly* recommended that the DisplayString +field contain only printable characters, since many +different clients will be using it. However, if eight-bit +characters are used, the characters should +conform with the ISO-Latin1 Character Set. The length +of the User displayable line should be less than 70 +Characters; longer lines may not fit across some +screens. Warning! The Selector string can be longer +than 255 characters. + + +Menu Entity + +Menu ::= DataHead {G+DirEntity}. + + +Continues ::= Bool + +Representation ::= 'Text' | 'List' | 'Postscript' | +'MacWriteII' | 'RTF' |{UNASCII} + + + +Retrieving a document/menu/etc.: + + +C: Opens Connection + +S: Accepts Connection + +C: Sends Selector String Tab '+' Representation [Tab +DataFlag] + +C: [Optional] Client sends a DataBlock depending on +value of DataFlag. + +S: Sends DataBlock + +Connection is closed by either client or server +(typically server). + + +Spaceline ::= ' ' {UNASCII} CR-LF + +Blockname ::= '+' {UNASCIINOPLUS} + +Attrblock ::= Blockname ' ' [G+Direntry] CR-LF +{Spaceline} + +Attrval ::= SPACE {UNASCII} CR-LF + +E-Mail ::= {UNASCII} s.t.
it conforms to RFC 822 + +Adminval ::= ' Admin:' {UNASCII} '<' E-Mailaddr +'>' CR-LF + +Dateval ::= ' Mod-Date:' {UNASCII} '<' +YYYYMMDDhhmmss '>' CR-LF + +AdminReq ::= AdminVal Dateval + +Infoblock ::= '+INFO: ' G+Direntry CR-LF + +AdminBlock ::= '+ADMIN: ' {G+Direntry} CR-LF +AdminReq {Attrval} + +Language ::= 'English' | 'French' | 'German' | +{UNASCII} + +ViewVal ::= ' ' Representation [' ' Language] ":" +ASCIINOANGLE '<' + + Size 'k>' CR-LF + +ViewBlock ::= '+VIEWS: ' {G+Direntry} CR-LF +{ViewVal} + +AttrBlocks ::= InfoBlock ViewBlock {AttrBlock} + + + +Retrieving item Information. + + + +For a non-index server (non-type 7 items) + + +C: Opens Connection + +S: Accepts Connection + +C: Sends Selector String Tab '!' {BlockName}CR-LF + +S: Sends DataBlock with data in AttrBlocks format. + +Connection is closed by either client or server +(typically server). + + + +For an index server (type 7 items) the client asks the +search engine to do a search for nothing + +(i.e. the client does not provide any words to search +for) and appends a TAB and a "!" after the +null-search: + + +C: Opens Connection + +S: Accepts Connection + +C: Sends Selector String Tab Tab '!' {BlockName}CR-LF + +S: Sends DataBlock with data in AttrBlocks format. + +Connection is closed by either client or server +(typically server). + + +Attributes ::= {AttrBlocks} + + + +Retrieving all Item Information entries for a +directory. + + +C: Opens Connection + +S: Accepts Connection + +C: Sends Selector String Tab '$'{BlockName} CR-LF + +S: Sends DataBlock with data in Attributes format. + +Connection is closed by either client or server +(typically server). + + +. (DIR) diff --git a/references/rfc1436.txt b/references/rfc1436.txt @@ -0,0 +1,906 @@ + + + + + + +Network Working Group F. Anklesaria +Request for Comments: 1436 M. McCahill + P. Lindner + D. Johnson + D. Torrey + B.
Alberti + University of Minnesota + March 1993 + + + The Internet Gopher Protocol + (a distributed document search and retrieval protocol) + +Status of this Memo + + This memo provides information for the Internet community. It does + not specify an Internet standard. Distribution of this memo is + unlimited. + +Abstract + + The Internet Gopher protocol is designed for distributed document + search and retrieval. This document describes the protocol, lists + some of the implementations currently available, and has an overview + of how to implement new client and server applications. This + document is adapted from the basic Internet Gopher protocol document + first issued by the Microcomputer Center at the University of + Minnesota in 1991. + +Introduction + + gopher n. 1. Any of various short tailed, burrowing mammals of the + family Geomyidae, of North America. 2. (Amer. colloq.) Native or + inhabitant of Minnesota: the Gopher State. 3. (Amer. colloq.) One + who runs errands, does odd-jobs, fetches or delivers documents for + office staff. 4. (computer tech.) software following a simple + protocol for burrowing through a TCP/IP internet. + + The Internet Gopher protocol and software follow a client-server + model. This protocol assumes a reliable data stream; TCP is assumed. + Gopher servers should listen on port 70 (port 70 is assigned to + Internet Gopher by IANA). Documents reside on many autonomous + servers on the Internet. Users run client software on their desktop + systems, connecting to a server and sending the server a selector (a + line of text, which may be empty) via a TCP connection at a well- + known port. The server responds with a block of text terminated by a + period on a line by itself and closes the connection. No state is + retained by the server. 
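The transaction described in the paragraph above (connect, send one selector line, read until the server closes the connection) can be sketched in Python roughly as follows. This is an illustrative sketch and not part of RFC 1436: the names `fetch` and `strip_terminator` are made up for this example, and encoding the selector as UTF-8 is an assumption (the RFC itself only discusses ASCII and Latin-1).

```python
import socket


def fetch(host: str, selector: str = "", port: int = 70, timeout: float = 10.0) -> bytes:
    """Send one selector line and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The request is the selector (which may be empty) followed by CR LF.
        # UTF-8 is an assumption of this sketch, not something RFC 1436 specifies.
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def strip_terminator(raw: bytes) -> list[bytes]:
    """Split a text response into lines, dropping the final '.' line if present."""
    lines = raw.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # empty piece left over after the last CR LF
    if lines and lines[-1] == b".":
        lines.pop()  # a period on a line by itself ends the response
    return lines
```

Because no state is kept between transactions, each call to `fetch` opens a fresh connection; `strip_terminator` also tolerates a response that ends without the closing period.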
+ + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 1] + +RFC 1436 Gopher March 1993 + + + While documents (and services) reside on many servers, Gopher client + software presents users with a hierarchy of items and directories + much like a file system. The Gopher interface is designed to + resemble a file system since a file system is a good model for + organizing documents and services; the user sees what amounts to one + big networked information system containing primarily document items, + directory items, and search items (the latter allowing searches for + documents across subsets of the information base). + + Servers return either directory lists or documents. Each item in a + directory is identified by a type (the kind of object the item is), + user-visible name (used to browse and select from listings), an + opaque selector string (typically containing a pathname used by the + destination host to locate the desired object), a host name (which + host to contact to obtain this item), and an IP port number (the port + at which the server process listens for connections). The user only + sees the user-visible name. The client software can locate and + retrieve any item by the trio of selector, hostname, and port. + + To use a search item, the client submits a query to a special kind of + Gopher server: a search server. In this case, the client sends the + selector string (if any) and the list of words to be matched. The + response yields "virtual directory listings" that contain items + matching the search criteria. + + Gopher servers and clients exist for all popular platforms. Because + the protocol is so sparse and simple, writing servers or clients is + quick and straightforward. + +1. Introduction + + The Internet Gopher protocol is designed primarily to act as a + distributed document delivery system. 
While documents (and services) + reside on many servers, Gopher client software presents users with a + hierarchy of items and directories much like a file system. In fact, + the Gopher interface is designed to resemble a file system since a + file system is a good model for locating documents and services. Why + model a campus-wide information system after a file system? Several + reasons: + + (a) A hierarchical arrangement of information is familiar to many + users. Hierarchical directories containing items (such as + documents, servers, and subdirectories) are widely used in + electronic bulletin boards and other campus-wide information + systems. People who access a campus-wide information server will + expect some sort of hierarchical organization to the information + presented. + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 2] + +RFC 1436 Gopher March 1993 + + + (b) A file-system style hierarchy can be expressed in a simple + syntax. The syntax used for the internet Gopher protocol is + easily understandable, and was designed to make debugging servers + and clients easy. You can use Telnet to simulate an internet + Gopher client's requests and observe the responses from a server. + Special purpose software tools are not required. By keeping the + syntax of the pseudo-file system client/server protocol simple, we + can also achieve better performance for a very common user + activity: browsing through the directory hierarchy. + + (c) Since Gopher originated in a University setting, one of the + goals was for departments to have the option of publishing + information from their inexpensive desktop machines, and since + much of the information can be presented as simple text files + arranged in directories, a protocol modeled after a file system + has immediate utility. 
Because there can be a direct mapping from + the file system on the user's desktop machine to the directory + structure published via the Gopher protocol, the problem of + writing server software for slow desktop systems is minimized. + + (d) A file system metaphor is extensible. By giving a "type" + attribute to items in the pseudo-file system, it is possible to + accommodate documents other than simple text documents. Complex + database services can be handled as a separate type of item. A + file-system metaphor does not rule out search or database-style + queries for access to documents. A search-server type is also + defined in this pseudo-file system. Such servers return "virtual + directories" or list of documents matching user specified + criteria. + +2. The internet Gopher Model + + A detailed BNF rendering of the internet Gopher syntax is available + in the appendix...but a close reading of the appendix may not be + necessary to understand the internet Gopher protocol. + + In essence, the Gopher protocol consists of a client connecting to a + server and sending the server a selector (a line of text, which may + be empty) via a TCP connection. The server responds with a block of + text terminated with a period on a line by itself, and closes the + connection. No state is retained by the server between transactions + with a client. The simple nature of the protocol stems from the need + to implement servers and clients for the slow, smaller desktop + computers (1 MB Macs and DOS machines), quickly, and efficiently. + + Below is a simple example of a client/server interaction; more + complex interactions are dealt with later. Assume that a "well- + known" Gopher server (this may be duplicated, details are discussed + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 3] + +RFC 1436 Gopher March 1993 + + + later) listens at a well known port for the campus (much like a + domain-name server). 
The only configuration information the client + software retains is this server's name and port number (in this + example that machine is rawBits.micro.umn.edu and the port 70). In + the example below the F character denotes the TAB character. + + Client: {Opens connection to rawBits.micro.umn.edu at port 70} + + Server: {Accepts connection but says nothing} + + Client: <CR><LF> {Sends an empty line: Meaning "list what you have"} + + Server: {Sends a series of lines, each ending with CR LF} + 0About internet GopherFStuff:About usFrawBits.micro.umn.eduF70 + 1Around University of MinnesotaFZ,5692,AUMFunderdog.micro.umn.eduF70 + 1Microcomputer News & PricesFPrices/Fpserver.bookstore.umn.eduF70 + 1Courses, Schedules, CalendarsFFevents.ais.umn.eduF9120 + 1Student-Staff DirectoriesFFuinfo.ais.umn.eduF70 + 1Departmental PublicationsFStuff:DP:FrawBits.micro.umn.eduF70 + {.....etc.....} + . {Period on a line by itself} + {Server closes connection} + + + The first character on each line tells whether the line describes a + document, directory, or search service (characters '0', '1', '7'; + there are a handful more of these characters described later). The + succeeding characters up to the tab form a user display string to be + shown to the user for use in selecting this document (or directory) + for retrieval. The first character of the line is really defining + the type of item described on this line. In nearly every case, the + Gopher client software will give the users some sort of idea about + what type of item this is (by displaying an icon, a short text tag, + or the like). + + The characters following the tab, up to the next tab form a selector + string that the client software must send to the server to retrieve + the document (or directory listing). The selector string should mean + nothing to the client software; it should never be modified by the + client. 
In practice, the selector string is often a pathname or + other file selector used by the server to locate the item desired. + The next two tab delimited fields denote the domain-name of the host + that has this document (or directory), and the port at which to + connect. If there are yet other tab delimited fields, the basic + Gopher client should ignore them. A CR LF denotes the end of the + item. + + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 4] + +RFC 1436 Gopher March 1993 + + + In the example, line 1 describes a document the user will see as + "About internet Gopher". To retrieve this document, the client + software must send the retrieval string: "Stuff:About us" to + rawBits.micro.umn.edu at port 70. If the client does this, the + server will respond with the contents of the document, terminated by + a period on a line by itself. A client might present the user with a + view of the world something like the following list of items: + + + About Internet Gopher + Around the University of Minnesota... + Microcomputer News & Prices... + Courses, Schedules, Calendars... + Student-Staff Directories... + Departmental Publications... + + + + In this case, directories are displayed with an ellipsis and files + are displayed without any. However, depending on the platform the + client is written for and the author's taste, item types could be + denoted by other text tags or by icons. For example, the UNIX + curses-based client displays directories with a slash (/) following + the name; Macintosh clients display directories alongside an icon of + a folder. + + The user does not know or care that the items up for selection may + reside on many different machines anywhere on the Internet. + + Suppose the user selects the line "Microcomputer News & Prices...". + This appears to be a directory, and so the user expects to see + contents of the directory upon request that it be fetched. 
The + following lines illustrate the ensuing client-server interaction: + + + Client: (Connects to pserver.bookstore.umn.edu at port 70) + Server: (Accepts connection but says nothing) + Client: Prices/ (Sends the magic string terminated by CRLF) + Server: (Sends a series of lines, each ending with CR LF) + 0About PricesFPrices/AboutusFpserver.bookstore.umn.eduF70 + 0Macintosh PricesFPrices/MacFpserver.bookstore.umn.eduF70 + 0IBM PricesFPrices/IckFpserver.bookstore.umn.eduF70 + 0Printer & Peripheral PricesFPrices/PPPFpserver.bookstore.umn.eduF70 + (.....etc.....) + . (Period on a line by itself) + (Server closes connection) + + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 5] + +RFC 1436 Gopher March 1993 + + +3. More details + +3.1 Locating services + + Documents (or other services that may be viewed ultimately as + documents, such as a student-staff phonebook) are linked to the + machine they are on by the trio of selector string, machine domain- + name, and IP port. It is assumed that there will be one well-known + top-level or root server for an institution or campus. The + information on this server may be duplicated by one or more other + servers to avoid a single point of failure and to spread the load + over several servers. Departments that wish to put up their own + departmental servers need to register the machine name and port with + the administrators of the top-level Gopher server, much the same way + as they register a machine name with the campus domain-name server. + An entry which points to the departmental server will then be made at + the top level server. This ensures that users will be able to + navigate their way down what amounts to a virtual hierarchical file + system with a well known root to any campus server if they desire. 
+ + Note that there is no requirement that a department register + secondary servers with the central top-level server; they may just + place a link to the secondary servers in their own primary servers. + They may indeed place links to any servers they desire in their own + server, thus creating a customized view of the Gopher information + universe; links can of course point back at the top-level server. + The virtual (networked) file system is therefore an arbitrary graph + structure and not necessarily a rooted tree. The top-level node is + merely one convenient, well-known point of entry. A set of Gopher + servers linked in this manner may function as a campus-wide + information system. + + Servers may of course point links at servers other than secondary servers. + Indeed servers may point at other servers offering useful services + anywhere on the internet. Viewed in this manner, Gopher can be seen + as an Internet-wide information system. + +3.2 Server portability and naming + + It is recommended that all registered servers have alias names + (domain name system CNAME) that are used by Gopher clients to locate + them. Links to these servers should use these alias names rather + than the primary names. If information needs to be moved from one + machine to another, a simple change of domain name system alias + (CNAME) allows this to occur without any reconfiguration of clients + in the field. In short, the domain name system may be used to re-map + a server to a new address. There is nothing to prevent secondary + servers or services from running on otherwise named servers or ports + other than 70; however, these should be reachable via a primary + server.
+ +3.3 Contacting server administrators + + It is recommended that every server administrator have a document + called something like: "About Bogus University's Gopher server" as + the first item in their server's top-level directory. This + document should contain a short description of what the server holds, as + well as name, address, phone, and an e-mail address of the person who + administers the server. This provides a way for users to get word to + the administrator of a server that has inaccurate information or is + not running correctly. It is also recommended that administrators + place the date of last update in files for which such information + matters to the users. + +3.4 Modular addition of services + + The first character of each line in a server-supplied directory + listing indicates whether the item is a file (character '0'), a + directory (character '1'), or a search (character '7'). This is the + base set of item types in the Gopher protocol. It is desirable for + clients to be able to use different services and speak different + protocols (simple ones such as finger; others such as CSO phonebook + service, or Telnet, or X.500 directory service) as needs dictate. + CSO phonebook service is a client/server phonebook system typically + used at Universities to publish names, e-mail addresses, and so on. + The CSO phonebook software was developed at the University of + Illinois and is also sometimes referred to as ph or qi. For example, + if a server-supplied directory listing marks a certain item with type + character '2', then it means that to use this item, the client must + speak the CSO protocol. This removes the need to be able to + anticipate all future needs and hard-wire them in the basic Internet + Gopher protocol; it keeps the basic protocol extremely simple. In + spite of this simplicity, the scheme has the capability to expand and + change with the times by adding an agreed-upon type-character for a + new service.
This also allows the client implementations to evolve + in a modular fashion, simply by dropping in a module (or launching a + new process) for some new service. The servers for the new service + of course have to know nothing about Internet Gopher; they can just + be off-the-shelf CSO, X.500, or other servers. We do not, however, + encourage arbitrary or machine-specific proliferation of service + types in the basic Gopher protocol. + + On the other hand, subsets of other document retrieval schemes may be + mapped onto the Gopher protocol by means of "gateway-servers". + Examples of such servers include Gopher-to-FTP gateways, + Gopher-to-archie gateways, Gopher-to-WAIS gateways, etc. There are a number of + advantages of such mechanisms. First, a relatively powerful server + machine inherits both the intelligence and work, rather than the more + modest, inexpensive desktop system that typically runs client + software or basic server software. Equally important, clients do not + have to be modified to take advantage of a new resource. + +3.5 Building clients + + A client simply sends the retrieval string to a server if it wants to + retrieve a document or view the contents of a directory. Of course, + each host may have pointers to other hosts, resulting in a "graph" + (not necessarily a rooted tree) of hosts. The client software may + save (or rather "stack") the locations that it has visited in search + of a document. The user could therefore back out of the current + location by unwinding the stack. Alternatively, a client with + multiple-window capability might just be able to display more than + one directory or document at the same time. + + A smart client could cache the contents of visited directories + (rather than just the directory's item descriptor), thus avoiding + network transactions if the information has been previously + retrieved.
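The location stack and the directory cache described in section 3.5 could be combined along the following lines. This is only a sketch under stated assumptions: the class name `Navigator`, its methods, and the injected `fetch` callable (taking host, port and selector and returning the raw response) are hypothetical names chosen for this example and do not come from RFC 1436.

```python
from typing import Callable, Dict, List, Tuple

# A location is the trio that identifies any Gopher item.
Location = Tuple[str, int, str]  # (host, port, selector)


class Navigator:
    """Keeps a stack of visited locations and caches what was retrieved."""

    def __init__(self, fetch: Callable[[str, int, str], bytes]) -> None:
        self._fetch = fetch                      # hypothetical network function
        self._history: List[Location] = []       # the "stack" of visited locations
        self._cache: Dict[Location, bytes] = {}  # contents already retrieved

    def visit(self, host: str, port: int, selector: str) -> bytes:
        loc = (host, port, selector)
        if loc not in self._cache:
            # Only go to the network for locations not seen before.
            self._cache[loc] = self._fetch(host, port, selector)
        self._history.append(loc)
        return self._cache[loc]

    def back(self) -> bytes:
        """Unwind one step and return the cached contents of the previous location."""
        if len(self._history) < 2:
            raise IndexError("no earlier location to go back to")
        self._history.pop()
        return self._cache[self._history[-1]]
```

A real client would probably also bound the cache size and expire stale entries; both are left out here to keep the sketch short.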
+ + If a client does not understand what, say, a type 'B' item (not a core + item) is, then it may simply ignore the item in the directory + listing; the user never even has to see it. Alternatively, the item + could be displayed as an unknown type. + + Top-level or primary servers for a campus are likely to get more + traffic than secondary servers, and it would be less tolerable for + such primary servers to be down for any length of time. So it makes sense + to "clone" such important servers and construct clients that can + randomly choose between two such equivalent primary servers when they + first connect (to balance server load), moving to one if the other + seems to be down. In fact, smart client implementations do this + clone-server load balancing. Alternatively, it may make sense to + have the domain name system return the IP address of one of a set of + redundant servers to load balance between redundant sets of + important servers. + +3.6 Building ordinary internet Gopher servers + + The retrieval string sent to the server might be a path to a file or + directory. It might be the name of a script, an application or even + a query that generates the document or directory returned. The basic + server uses the string it gets up to but not including a CR-LF or a + TAB, whichever comes first. + + All intelligence is carried by the server implementation rather than + the protocol. What you build into more exotic servers is up to you. + Server implementations may grow as needs dictate and time allows. + +3.7 Special purpose servers + + There are two special server types (beyond the normal Gopher server) + also discussed below: + + 1. A server directory listing can point at a CSO nameserver (the + server returns a type character of '2') to allow a campus + student-staff phonebook lookup service.
This may show up on the + user's list of choices, perhaps preceded by the icon of a + phone-book. If this item is selected, the client software must resort + to a pure CSO nameserver protocol when it connects to the + appropriate host. + + 2. A server can also point at a "search server" (returns a first + character of '7'). Such servers may implement campus network (or + subnet) wide searching capability. The most common search servers + maintain full-text indexes on the contents of text documents held + by some subset of Gopher servers. Such a "full-text search + server" responds to client requests with a list of all documents + that contain one or more words (the search criteria). The client + sends the server the selector string, a tab, and the search string + (words to search for). If the selector string is empty, the client + merely sends the search string. The server returns the equivalent + of a directory listing for documents matching the search criteria. + Spaces between words are usually implied Boolean ANDs (although in + different implementations or search types, this may not + necessarily be true). + + The CSO addition exists for historical reasons: at the time of design, + the campus phone-book servers at the University of Minnesota used the + CSO protocol and it seemed simplest to just engulf them. The + index-server is, however, very much a Gopher in spirit, albeit with a slight + twist in the meaning of the selector-string. Index servers are a + natural place to incorporate gateways to WAIS and WHOIS services. + +3.7.1 Building CSO-servers + + A CSO Nameserver implementation for UNIX and associated documentation + is available by anonymous ftp from uxa.cso.uiuc.edu. We do not + anticipate implementing it on other machines.
+ +3.7.2 Building full-text search servers + + A full-text search server is a special-purpose server that knows + about the Gopher scheme for retrieving documents. These servers + maintain a full-text index of the contents of plain text documents on + Gopher servers in some specified domain. A Gopher full-text search + server was implemented using several NeXTstations because it was easy + to take advantage of the full-text index/search engine built into the + NeXT system software. A search server for generic UNIX systems, based + on the public domain WAIS search engine, is also available and is + currently an optional part of the UNIX gopher server. In addition, + at least one implementation of the gopher server incorporates a + gateway to WAIS servers by presenting the WAIS servers to gopherspace + as full-text search servers. The gopher<->WAIS gateway server does + the work of translating from gopher protocol to WAIS so unmodified + gopher clients can access WAIS servers via the gateway server. + + By using several index servers (rather than a monolithic index + server) indexes may be searched in parallel (although the client + software is not aware of this). While maintaining full-text indexes + of documents distributed over many machines may seem a daunting task, + the task can be broken into smaller pieces (update only a portion of + the indexes, search several partial indexes in parallel) so that it + is manageable. By spreading this task over several small, cheap (and + fast) workstations it is possible to take advantage of fine-grain + parallelism. Again, the client software is not aware of this. Client + software only needs to know that it can send a search string to an + index server and will receive a list of documents that contain the + words in the search string.
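From the client's side, the request sent to a search server is a single line. The sketch below follows the wording of section 3.7 (selector, a tab, then the search string; only the search string when the selector is empty). The function name `build_search_request` is hypothetical, and the UTF-8 encoding is an assumption of this example rather than something RFC 1436 specifies.

```python
def build_search_request(selector: str, words: str) -> bytes:
    """Build the one-line request a client sends to a type 7 (search) item."""
    for field in (selector, words):
        # TAB, CR and LF delimit fields and lines, so they cannot appear inside one.
        if any(ch in field for ch in "\t\r\n"):
            raise ValueError("selector and search string must not contain TAB, CR or LF")
    if selector:
        line = selector + "\t" + words
    else:
        # Per the text of section 3.7, an empty selector means only the
        # search string is sent.
        line = words
    # Encoding choice is an assumption of this sketch.
    return line.encode("utf-8") + b"\r\n"
```

The reply is then read like any directory listing, since the server returns the equivalent of a menu for the matching documents.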
+ +3.8 Item type characters + + The client software decides what items are available by looking at + the first character of each line in a directory listing. Augmenting + this list can extend the protocol. A list of defined item-type + characters follows: + + 0 Item is a file + 1 Item is a directory + 2 Item is a CSO phone-book server + 3 Error + 4 Item is a BinHexed Macintosh file. + 5 Item is DOS binary archive of some sort. + Client must read until the TCP connection closes. Beware. + 6 Item is a UNIX uuencoded file. + 7 Item is an Index-Search server. + 8 Item points to a text-based telnet session. + 9 Item is a binary file! + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 10] + +RFC 1436 Gopher March 1993 + + + Client must read until the TCP connection closes. Beware. + + Item is a redundant server + T Item points to a text-based tn3270 session. + g Item is a GIF format graphics file. + I Item is some kind of image file. Client decides how to display. + + Characters '0' through 'Z' are reserved. Local experiments should + use other characters. Machine-specific extensions are not + encouraged. Note that for type 5 or type 9 the client must be + prepared to read until the connection closes. There will be no + period at the end of the file; the contents of these files are binary + and the client must decide what to do with them based perhaps on the + .xxx extension. + +3.9 User display strings and server selector strings + + User display strings are intended to be displayed on a line on a + typical screen for a user's viewing pleasure. While many screens can + accommodate 80 character lines, some space is needed to display a tag + of some sort to tell the user what sort of item this is. Because of + this, the user display string should be kept under 70 characters in + length. Clients may truncate to a length convenient to them. + +4. 
Simplicity is intentional + + As far as possible we desire any new features to be carried as new + protocols that will be hidden behind new document-types. The + internet Gopher philosophy is: + + (a) Intelligence is held by the server. Clients have the option + of being able to access new document types (different, other types + of servers) by simply recognizing the document-type character. + Further intelligence to be borne by the protocol should be + minimized. + + (b) The well-tempered server ought to send "text" (unless a file + must be transferred as raw binary). Should this text include + tabs, formfeeds, frufru? Probably not, but rude servers will + probably send them anyway. Publishers of documents should be + given simple tools (filters) that will alert them if there are any + funny characters in the documents they wish to publish, and give + them the opportunity to strip the questionable characters out; the + publisher may well refuse. + + (c) The well-tempered client should do something reasonable with + funny characters received in text; filter them out, leave them in, + whatever. + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 11] + +RFC 1436 Gopher March 1993 + + +Appendix + + Paul's NQBNF (Not Quite BNF) for the Gopher Protocol. + + Note: This is modified BNF (as used by the Pascal people) with a few + English modifiers thrown in. Stuff enclosed in '{}' can be + repeated zero or more times. Stuff in '[]' denotes a set of + items. The '-' operator denotes set subtraction. + + +Directory Entity + +CR-LF ::= ASCII Carriage Return Character followed by Line Feed + character. + +Tab ::= ASCII Tab character. + +NUL ::= ASCII NUL character. + +UNASCII ::= ASCII - [Tab CR-LF NUL]. + +Lastline ::= '.'CR-LF. + +TextBlock ::= Block of ASCII text not containing Lastline pattern. + +Type ::= UNASCII. + +RedType ::= '+'. + +User_Name ::= {UNASCII}. + +Selector ::= {UNASCII}. + +Host ::= {{UNASCII - ['.']} '.'} {UNASCII - ['.']}. 
+ +Note: This is a Fully Qualified Domain Name as defined in RFC 1034. + (e.g., gopher.micro.umn.edu) Hosts that have a CR-LF + TAB or NUL in their name get what they deserve. + +Digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' . + +DigitSeq ::= digit {digit}. + +Port ::= DigitSeq. + +Note: Port corresponds the the TCP Port Number, its value should + be in the range [0..65535]; port 70 is officially assigned + to gopher. + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 12] + +RFC 1436 Gopher March 1993 + + +DirEntity ::= Type User_Name Tab Selector Tab Host Tab Port CR-LF + {RedType User_Name Tab Selector Tab Host Tab Port CR-LF} + + + +Notes: + + It is *highly* recommended that the User_Name field contain only + printable characters, since many different clients will be using + it. However if eight bit characters are used, the characters + should conform with the ISO Latin1 Character Set. The length of + the User displayable line should be less than 70 Characters; longer + lines may not fit across some screens. + + The Selector string should be no longer than 255 characters. + + +Menu Entity + +Menu ::= {DirEntity} Lastline. + + +Menu Transaction (Type 1 item) + +C: Opens Connection +S: Accepts Connection +C: Sends Selector String +S: Sends Menu Entity + + Connection is closed by either client or server (typically server). + + +Textfile Entity + +TextFile ::= {TextBlock} Lastline + +Note: Lines beginning with periods must be prepended with an extra + period to ensure that the transmission is not terminated early. + The client should strip extra periods at the beginning of the line. + + +TextFile Transaction (Type 0 item) + +C: Opens Connection. +S: Accepts connection +C: Sends Selector String. +S: Sends TextFile Entity. + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 13] + +RFC 1436 Gopher March 1993 + + + Connection is closed by either client or server (typically server). 
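The DirEntity grammar and the TextFile period rule above can be sketched in a few lines of client code. This is a minimal illustration, not part of RFC 1436: the names `DirEntity`, `parse_menu` and `unstuff_text` are hypothetical, and decoding as Latin-1 is an assumption based on the RFC's note about eight bit characters. Malformed menu lines are skipped, and a missing Lastline is tolerated.

```python
from typing import List, NamedTuple


class DirEntity(NamedTuple):
    """One menu line: Type User_Name Tab Selector Tab Host Tab Port (hypothetical name)."""
    item_type: str
    user_name: str
    selector: str
    host: str
    port: int


def parse_menu(raw: bytes) -> List[DirEntity]:
    """Parse a Menu Entity, stopping at the Lastline ('.' CR-LF) if one is present."""
    entries = []
    # Assumption: Latin-1, which maps every byte to exactly one character.
    for line in raw.decode("latin-1").split("\r\n"):
        if line == ".":  # Lastline terminates the menu
            break
        fields = line[1:].split("\t")
        # Skip empty or malformed lines instead of guessing at their meaning.
        if len(fields) < 4 or not (fields[3].isascii() and fields[3].isdigit()):
            continue
        entries.append(
            DirEntity(line[0], fields[0], fields[1], fields[2], int(fields[3]))
        )
    return entries


def unstuff_text(raw: bytes) -> str:
    """Decode a TextFile Entity: stop at the Lastline, strip one leading period."""
    lines = []
    for line in raw.decode("latin-1").split("\r\n"):
        if line == ".":  # Lastline terminates the text
            break
        if line.startswith("."):  # the server prepended an extra period
            line = line[1:]
        lines.append(line)
    return "\n".join(lines)
```

Both functions expect the complete reply, that is, everything read from the connection until it closes. Redundant-server lines (type '+') are returned like any other entry, so associating them with the preceding primary entry is left to the caller.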
+ +Note: The client should be prepared for the server closing the + connection without sending the Lastline. This allows the + client to use fingerd servers. + + +Full-Text Search Transaction (Type 7 item) + +Word ::= {UNASCII - ' '} +BoolOp ::= 'and' | 'or' | 'not' | SPACE +SearchStr ::= Word {{SPACE BoolOp} SPACE Word} + +C: Opens Connection. +C: Sends Selector String, Tab, Search String. +S: Sends Menu Entity. + +Note: In absence of 'and', 'or', or 'not' operators, a SPACE is + regarded as an implied 'and' operator. Expression is evaluated + left to right. Further, not all search engines or search + gateways currently implemented have the boolean operators + implemented. + +Binary file Transaction (Type 9 or 5 item) + +C: Opens Connection. +S: Accepts connection +C: Sends Selector String. +S: Sends a binary file and closes connection when done. + + +Syntactic Meaning for Directory Entities + + +The client should interpret the type field as follows: + +0 The item is a TextFile Entity. + Client should use a TextFile Transaction. + +1 The item is a Menu Entity. + Client should use a Menu Transaction. + +2 The information applies to a CSO phone book entity. + Client should talk CSO protocol. + +3 Signals an error condition. + +4 Item is a Macintosh file encoded in BINHEX format + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 14] + +RFC 1436 Gopher March 1993 + + +5 Item is PC-DOS binary file of some sort. Client gets to decide. + +6 Item is a uuencoded file. + +7 The information applies to a Index Server. + Client should use a FullText Search transaction. + +8 The information applies to a Telnet session. + Connect to given host at given port. The name to login as at this + host is in the selector string. + +9 Item is a binary file. Client must decide what to do with it. + ++ The information applies to a duplicated server. The information + contained within is a duplicate of the primary server. 
The primary + server is defined as the last DirEntity that is has a non-plus + "Type" field. The client should use the transaction as defined by + the primary server Type field. + +g Item is a GIF graphic file. + +I Item is some kind of image file. Client gets to decide. + +T The information applies to a tn3270 based telnet session. + Connect to given host at given port. The name to login as at this + host is in the selector string. + +Security Considerations + + Security issues are not discussed in this memo. + +Authors' Addresses + + Farhad Anklesaria + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: fxa@boombox.micro.umn.edu + + + + + + + + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & Alberti [Page 15] + +RFC 1436 Gopher March 1993 + + + Mark McCahill + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: mpm@boombox.micro.umn.edu + + + Paul Lindner + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: lindner@boombox.micro.umn.edu + + + David Johnson + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: dmj@boombox.micro.umn.edu + + + Daniel Torrey + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: daniel@boombox.micro.umn.edu + + + Bob Alberti + Computer and Information Services, University of Minnesota + Room 152 Shepherd Labs + 100 Union Street SE + Minneapolis, MN 55455 + + Phone: (612) 625 1300 + EMail: alberti@boombox.micro.umn.edu + + + +Anklesari, McCahill, Lindner, Johnson, Torrey & 
Alberti [Page 16] + + + + + + + + (DIR) diff --git a/references/rfc4266.txt b/references/rfc4266.txt @@ -0,0 +1,339 @@ + + + + + + +Network Working Group P. Hoffman +Request for Comments: 4266 VPN Consortium +Obsoletes: 1738 November 2005 +Category: Standards Track + + + The gopher URI Scheme + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2005). + +Abstract + + This document specifies the gopher Uniform Resource Identifier (URI) + scheme that was originally specified in RFC 1738. The purpose of + this document is to allow RFC 1738 to be made obsolete while keeping + the information about the scheme on standards track. + +1. Introduction + + URIs were previously defined in RFC 2396 [RFC2396], which was updated + by RFC 3986 [RFC3986]. Those documents also specify how to define + schemes for URIs. + + The first definition for many URI schemes appeared in RFC 1738 + [RFC1738]. Because that document has been made obsolete, this + document copies the gopher URI scheme from it to allow that material + to remain on standards track. + +2. Scheme Definition + + The gopher URL scheme is used to designate Internet resources + accessible using the Gopher protocol. + + The base Gopher protocol is described in RFC 1436 [RFC1436] and + supports items and collections of items (directories). The Gopher+ + protocol is a set of upward-compatible extensions to the base Gopher + protocol and is described in [Gopher+]. 
Gopher+ supports associating + + + + +Hoffman Standards Track [Page 1] + +RFC 4266 The gopher URI Scheme November 2005 + + + arbitrary sets of attributes and alternate data representations with + Gopher items. Gopher URLs accommodate both Gopher and Gopher+ items + and item attributes. + + Historical note: The Gopher protocol was widely implemented in the + early 1990s, but few Gopher servers are in use today. + +2.1. Gopher URL Syntax + + A Gopher URL takes the form: + + gopher://<host>:<port>/<gopher-path> + + where <gopher-path> is one of: + + <gophertype><selector> + <gophertype><selector>%09<search> + <gophertype><selector>%09<search>%09<gopher+_string> + + If :<port> is omitted, the port defaults to 70. <gophertype> is a + single-character field to denote the Gopher type of the resource to + which the URL refers. The entire <gopher-path> may also be empty, in + which case the delimiting "/" is also optional and the <gophertype> + defaults to "1". + + <selector> is the Gopher selector string. In the Gopher protocol, + Gopher selector strings are a sequence of octets that may contain any + octets except 09 hexadecimal (US-ASCII HT or tab), 0A hexadecimal + (US-ASCII character LF), and 0D (US-ASCII character CR). + + Gopher clients specify which item to retrieve by sending the Gopher + selector string to a Gopher server. + + Within the <gopher-path>, no characters are reserved. + + Note that some Gopher <selector> strings begin with a copy of the + <gophertype> character, in which case that character will occur twice + consecutively. The Gopher selector string may be an empty string; + this is how Gopher clients refer to the top-level directory on a + Gopher server. + +2.2. Specifying URLs for Gopher Search Engines + + If the URL refers to a search to be submitted to a Gopher search + engine, the selector is followed by an encoded tab (%09) and the + search string. 
To submit a search to a Gopher search engine, the + Gopher client sends the <selector> string (after decoding), a tab, + and the search string to the Gopher server. + + + +Hoffman Standards Track [Page 2] + +RFC 4266 The gopher URI Scheme November 2005 + + +2.3. URL Syntax for Gopher+ Items + + Historical note: Gopher+ was uncommon even when Gopher was popular. + + URLs for Gopher+ items have a second encoded tab (%09) and a Gopher+ + string. Note that in this case, the %09<search> string must be + supplied, although the <search> element may be the empty string. + + The <gopher+_string> is used to represent information required for + retrieval of the Gopher+ item. Gopher+ items may have alternate + views and arbitrary sets of attributes, and they may have electronic + forms associated with them. + + To retrieve the data associated with a Gopher+ URL, a client will + connect to the server and send the Gopher selector, followed by a tab + and the search string (which may be empty), followed by a tab and the + Gopher+ commands. + +2.4. Default Gopher+ Data Representation + + When a Gopher server returns a directory listing to a client, the + Gopher+ items are tagged with either a "+" (denoting Gopher+ items) + or a "?" (denoting Gopher+ items that have a +ASK form associated + with them). A Gopher URL with a Gopher+ string consisting of only a + "+" refers to the default view (data representation) of the item, and + a Gopher+ string containing only a "?" refers to an item with a + Gopher electronic form associated with it. + +2.5. Gopher+ Items with Electronic Forms + + Gopher+ items that have a +ASK associated with them (i.e., Gopher+ + items tagged with a "?") require the client to fetch the item's +ASK + attribute to get the form definition, and then ask the user to fill + out the form and return the user's responses along with the selector + string to retrieve the item. Gopher+ clients know how to do this but + depend on the "?" 
tag in the Gopher+ item description to know when to + handle this case. The "?" is used in the Gopher+ string to be + consistent with Gopher+ protocol's use of this symbol. + +2.6. Gopher+ Item Attribute Collections + + To refer to the Gopher+ attributes of an item, the Gopher URL's + Gopher+ string consists of "!" or "$". "!" refers to all of a Gopher+ + item's attributes. "$" refers to all the item attributes for all + items in a Gopher directory. + + + + + + +Hoffman Standards Track [Page 3] + +RFC 4266 The gopher URI Scheme November 2005 + + +2.7. Referring to Specific Gopher+ Attributes + + To refer to specific attributes, the URL's gopher+_string is + "!<attribute_name>" or "$<attribute_name>". For example, to refer to + the attribute containing the abstract of an item, the gopher+_string + would be "!+ABSTRACT". + + To refer to several attributes, the gopher+_string consists of the + attribute names separated by coded spaces. For example, + "!+ABSTRACT% 20+SMELL" refers to the +ABSTRACT and +SMELL attributes + of an item. + +2.8. URL Syntax for Gopher+ Alternate Views + + Gopher+ allows for optional alternate data representations (alternate + views) of items. To retrieve a Gopher+ alternate view, a Gopher+ + client sends the appropriate view and language identifier (found in + the item's +VIEW attribute). To refer to a specific Gopher+ + alternate view, the URL's Gopher+ string would be in the form: + + +<view_name>%20<language_name> + + For example, a Gopher+ string of "+application/postscript%20Es_ES" + refers to the Spanish language postscript alternate view of a Gopher+ + item. + +2.9. URL Syntax for Gopher+ Electronic Forms + + The gopher+_string for a URL that refers to an item referenced by a + Gopher+ electronic form (an ASK block) filled out with specific + values is a coded version of what the client sends to the server. 
+ The gopher+_string is of the form: + + +%091%0D%0A+-1%0D%0A<ask_item1_value>%0D%0A + <ask_item2_value>%0D%0A.%0D%0A + + To retrieve this item, the Gopher client sends the following text to + the Gopher server. + + <a_gopher_selector><tab>+<tab>1<cr><lf> + +-1<cr><lf> + <ask_item1_value><cr><lf> + <ask_item2_value><cr><lf> + .<cr><lf> + + + + + + + +Hoffman Standards Track [Page 4] + +RFC 4266 The gopher URI Scheme November 2005 + + +3. Security Considerations + + There are many security considerations for URI schemes discussed in + [RFC3986]. The Gopher protocol uses passwords in the clear for + authentication, and offers no privacy, both of which are considered + extremely unsafe in current practice. + +4. Informative References + + [Gopher+] Anklesaria, F., et al., "Gopher+: Upward compatible + enhancements to the Internet Gopher protocol", University + of Minnesota, July 1993, <ftp://boombox.micro.umn.edu/pub/ + gopher/gopher_protocol/Gopher+/Gopher+.txt> + + [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform + Resource Locators (URL)", RFC 1738, December 1994. + + [RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifiers (URI): Generic Syntax", RFC 2396, + August 1998. + + [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, January 2005. + + [RFC1436] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., + Torrey, D., and B. Albert, "The Internet Gopher Protocol + (a distributed document search and retrieval protocol)", + RFC 1436, March 1993. + +Author's Address + + Paul Hoffman + VPN Consortium + 127 Segre Place + Santa Cruz, CA 95060 + US + + EMail: paul.hoffman@vpnc.org + + + + + + + + + + + + +Hoffman Standards Track [Page 5] + +RFC 4266 The gopher URI Scheme November 2005 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2005). 
+ + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at ietf- + ipr@ietf.org. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. 
+ + + + + + + +Hoffman Standards Track [Page 6] +
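The Gopher URL syntax of RFC 4266 sections 2.1 to 2.3 above can be sketched as a small parser. This is an illustration under stated assumptions, not a reference implementation: the names `GopherURL`, `parse_gopher_url` and `request_bytes` are hypothetical, percent-decoding uses UTF-8 (the RFC treats selectors as raw octets), and the request is encoded as Latin-1 for simplicity.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlsplit


class GopherURL(NamedTuple):
    """Decoded parts of a gopher URL (hypothetical container)."""
    host: str
    port: int
    gophertype: str
    selector: str
    search: Optional[str]
    gopher_plus: Optional[str]


def parse_gopher_url(url: str) -> GopherURL:
    prefix = "gopher://"
    if not url.lower().startswith(prefix):
        raise ValueError("not a gopher URL")
    # Split by hand so that a literal '?' in the gopher-path is not
    # mistaken for a query-string delimiter.
    authority, _, gopher_path = url[len(prefix):].partition("/")
    parts = urlsplit(prefix + authority)
    host = parts.hostname or ""
    port = parts.port if parts.port is not None else 70  # default port
    if not gopher_path:
        # Empty gopher-path: the gophertype defaults to "1", empty selector.
        return GopherURL(host, port, "1", "", None, None)
    # Selectors cannot contain a tab, so decoding first and then splitting
    # on the decoded %09 is unambiguous.
    decoded = unquote(gopher_path)
    fields = decoded[1:].split("\t", 2)
    search = fields[1] if len(fields) > 1 else None
    gopher_plus = fields[2] if len(fields) > 2 else None
    return GopherURL(host, port, decoded[0], fields[0], search, gopher_plus)


def request_bytes(u: GopherURL) -> bytes:
    """Build the line a client sends: selector [Tab search [Tab gopher+]] CR-LF."""
    parts = [u.selector]
    if u.search is not None:
        parts.append(u.search)
    if u.gopher_plus is not None:
        parts.append(u.gopher_plus)
    # Assumption: Latin-1 on the wire; characters outside it raise an error.
    return ("\t".join(parts) + "\r\n").encode("latin-1")
```

For example, `parse_gopher_url("gopher://gopher.example.org:7070/7/search%09foo%20bar")` yields gophertype "7", selector "/search" and search string "foo bar". The host name here is a placeholder.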