From: gopher-bounce@complete.org
Date: Sun Aug 10 23:06:55 2008
Subject: [gopher] Re: Gopherness

So gopher as it stands needs internationalization? I'm having trouble
following a lot of these emails but I'm trying =-)

Matt

On Sun, Aug 10, 2008 at 3:34 PM, Nuno J. Silva <nunojsilva@ist.utl.pt> wrote:
> JumpJet Mailbox <jumpjetinfo@yahoo.com>
> writes:
>
>> --- On Mon, 8/4/08, Nuno J. Silva
>> <nunojsilva@ist.utl.pt> wrote:
>>>
>>> "Jay Nemrow" <jnemrow@quix.us> writes:
>>>
>>>> On Mon, Aug 4, 2008 at 10:19 AM, Kyevan
>>>> <kyevan@sinedev.org> wrote:
>>>>
>>>>> What about older clients, though? Modern clients will probably
>>>>> handle UTF-8 at least well enough to not explode, but older clients
>>>>> might not. Generally, it seems safest to stick to the subset that
>>>>> is ASCII when reasonable, only using UTF-8 or such when it's
>>>>> actually needed. ... is a perfectly readable replacement for
>>>>> U+2026, even if it's not "typographically correct." On the other
>>>>> hand, if you're trying to post a text in, say, a mix of Arabic and
>>>>> Klingon, go right ahead and use UTF-8.
>>>
>>> There are also these iso* charsets which just use 8 bits to encode the
>>> text, not allowing a greater collection of characters, and using
>>> those you wouldn't be able to mix charsets.
> <snip/>
>>> On the other hand, even if the choice was utf8 (so the documents would
>>> be ASCII or utf8), I'd keep iso* support, just in case (therefore my
>>> question is 'should we use the same sort of character encoding when
>>> publishing non-english documents? if yes, which one?' and not 'what
>>> should a client support?').
>>>
>>> What's the actual scenario? Is there any client which crashes due to
>>> utf8? Which clients are not able to render it correctly? And what
>>> about iso* charsets support?
>>
>> How would we print a Gopher-retrieved text document on, for example,
>> an older (or mini-mainframe) computer which only uses a Daisy Wheel
>> Printer or Teletype Printer (which ONLY supports ASCII characters)?
>
> If the documents (in any of the mentioned encodings) have non-ASCII
> characters, the behaviour is undefined (e.g., if the machine ignores the
> 8th bit, other characters will be rendered instead of the desired ones).
>
> But there's nothing we can do about that, except writing some script to
> replace the existing non-ASCII characters with some ASCII description.
>
> Avoiding the use of non-ASCII characters is, of course, a good
> idea. But, if there's some document in a non-western language, or a
> language which requires another alphabet, it's impossible to use ASCII
> in that situation.
>
> <snip/>
>
> --
> Nuno J. Silva (aka njsg)
> LEIC student at Instituto Superior Técnico
> Lisbon, Portugal
> Homepage: http://njsg.no.sapo.pt/
> Gopherspace: gopher://sdf-eu.org/11/users/njsg
> Registered Linux User #402207 - http://counter.li.org
>
> -=-=-
> Ooh, mommy, mommy, what I have now doesn't work in this extremely
> unlikely circumstance, so I'll just throw it away and write something
> completely new.
> -- Linus Torvalds
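
For anyone trying to picture the two cases discussed in the quoted text, here is
a rough sketch in Python 3 (standard library only). It is not anything posted to
the list: the function names strip_eighth_bit and to_ascii_description are
made up for illustration, and the choice to drop accents before falling back to
a bracketed Unicode name is just one possible way to produce an "ASCII
description". The first function imitates a device that ignores the 8th bit; the
second is one guess at what a replacement script might look like.

```python
import unicodedata


def strip_eighth_bit(data: bytes) -> str:
    """Imitate a 7-bit device that ignores the high bit of every byte."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def to_ascii_description(text: str) -> str:
    """Replace each non-ASCII character with an ASCII stand-in.

    Assumption: accents are dropped where a decomposition exists;
    anything else becomes its Unicode name in square brackets.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # NFKD splits 'é' into 'e' + combining accent, and U+2026 into '...'.
        decomposed = unicodedata.normalize("NFKD", ch)
        base = "".join(c for c in decomposed if ord(c) < 128)
        if base:
            out.append(base)
        else:
            # No ASCII part at all: fall back to the character name,
            # or to the code point if the character is unnamed.
            name = unicodedata.name(ch, f"U+{ord(ch):04X}")
            out.append(f"[{name}]")
    return "".join(out)


if __name__ == "__main__":
    latin1 = "Técnico".encode("latin-1")
    print(strip_eighth_bit(latin1))          # a wrong letter replaces the 'é'
    print(to_ascii_description("Técnico"))   # accent dropped
    print(to_ascii_description("\u2026"))    # ellipsis becomes three dots
    print(to_ascii_description("\u0628"))    # Arabic letter, described by name
```

The sketch only handles text that has already been decoded; a real script would
first need to know which charset (UTF-8 or one of the iso* ones) the document
was published in, which is exactly the open question in this thread.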