The vi/ex Editor, Part 7: A Little "R" and "r": The Fine Points
       of those Replacement Commands
       
         There's more to R than to r
         Quoting in Characters
         Readers Ask
       
       Tommy Spratlin & Thai-Nghia Dinh writes:
       
       Next Time Around
       
       This installment of our Vi/Ex tutorial series is a diversion from
       the subjects I promised at the end of the previous part -- the
       change is my fault, and yet it is necessary.  When I blithely
       suggested last time that the R command is just like the familiar r
       command, except for a few differences I mentioned, I was leading
       you astray.  
       
       There are several differences that can cause problems in certain
       uses unless you understand those differences.  And you won't really
       comprehend the greatest of those differences until you know about
       metacharacters in insert mode.  But as an encouragement to follow
       all this, consider that almost all of what I say here about the R
       command also is valid with all the other commands that put you into
       text insertion mode:
       
           a A i I o O c s :a :i etcetera.
       
       There's more to R than to r
       
       The r command replaces whatever character is presently under the
       cursor, so there must be some character under the cursor for it to
       replace -- otherwise it just gives you an error beep.  Not so with
       R.
       
       You can give the R command  on an empty line; whatever you type
       after that, up to the next escape character, will take the place of
       that empty line just as though you had typed past the end of an
       existing line after giving an R command.  (I was going to say
       "just as though you had given an a command", but I'm now very
       leary of making comparisons that are incomplete without paragraphs
       of explanations.)  You can even start entering text into a
       brand-new file via the R command.  
       
       The factor above can be useful in various situations; I only have
       space to mention one.  At times I want to type new characters to
       replace blank spaces in a place where some of the lines are empty.
       These do not have any blanks; no characters at all.  But I do not
       have to look at each line before I start typing on it, to see
       whether I should use an R or an a command, because R will work in
       either case.  
       
       The R command is more forgiving of your typing errors, too.
       Whatever character you type after an r is final. If you
       accidentally typed the wrong character, you can only put back what
       was there by typing a u command, if the mistake was the last
       editing command you typed, or put in the replacement you had in
       mind by returning the cursor to the spot and running another, more
       careful, r command.  
       
       But if you mistype during an R command, you can backspace over the
       error with the backspace key.  Then you can type in the character
       (or characters; you can back up multiple spaces by repeating the
       backspace key) you should have typed.  And if you simply typed too
       far, you'll be glad to know that backspacing doesn't just remove
       the incorrect characters, it restores the characters that were
       there, either right away or as soon as you hit the escape key.  You
       can even backspace over everything you've typed during this R
       command before you type escape, because the editor does not object
       to a replacement string length of zero.  
       
       One caveat here, though, lest my clarification turn out to need a
       clarification of its own.  With either of these commands it is
       possible to break a line, just by typing the return key as a
       replacement character, and with the R command this linebreaking can
       be done either while actually replacing characters or when typing
       on beyond the end of the existing line.  With almost all versions
       of the editor, it is not possible to backspace over an inserted
       linebreak, even while you are still in R insertion mode.  
       
       The most important difference, though, is the handling of
       metacharacters.  Yes, text insertion utilizes metacharacters too,
       quite apart from the ones that the replacement patterns in
       :substitute commands use.  The r command recognizes hardly any of
       these metacharacters, and quoting those in as literal characters is
       very simple.  The R command, though, recognizes almost all of them,
       and quoting characters in with R is rather complicated.  
       
       Quoting in Characters
       
       The phrase "quoting in" is standard terminology, but it is rather
       misleading in the editor.  Unlike Unix shells, the editor does not
       use any of the ASCII quotation marks: ` ' " (backquote, single and
       double quote) to quote characters into a file.  Instead, it uses
       the backslash ("\") and control-V ("^V"); the latter is what
       you send when you press the V key while holding the CONTROL or CTRL
       key down.  In either case, you quote a character in by typing the
       quoting character just prior to the character you want to quote in.
       So if @ is your line kill character, and you want to put that
       character in the text you are typing in, you would have to type
       either \@ or ^V@ to get it there.  And if you want several
       consecutive characters quoted in, you must quote each of them
       individually.  That is, if you want to put @@@ into a line, you
       must type either ^V@^V@^V@ or \@\@\@ to put that string there.  
       
       But \ and ^V are not always interchangeable.  In many cases either
       will work; but sometimes you must choose the right one.  Which one
       to use depends both on what character you want to quote in and
       whether you're using the r or R command.  
       
       One obvious use for quoting is to insert a character that normally
       erases part or all of what you've just typed in.  The ASCII
       backspace character, control-H, must be quoted in, and so must your
       own line-kill character (@ in the example above) and your own erase
       character if it is not control-H.  With the r command you quote in
       any of these with a backslash; when using R you may quote any of
       these in using either backslash or control-V.  
       
       A pause here, to answer a question that might be in the minds of
       people who know a little about Unix internals.  Ordinarily it is
       the asynchronous serial terminal line (or TTY) driver that
       recognizes the erase and line-kill characters and edits the input
       line accordingly without including these characters in the final
       result.  Then, how can one enter these same input-line characters
       into the edit buffer if they don't get past the TTY driver?
       Because Vi/Ex places the TTY driver into a special "raw" mode that
       ignores the line-editing characters passing them on to the editor.
       Otherwise you would not be able to quote these characters in.
       Also, the editor is set up to discover your erase and line-kill
       characters by querying your personal environment, and then
       interpret these characters as the line driver would have. A nifty
       feature -- but unfortunately, the editor has no way to let the user
       turn this feature off.  
       
       The editor's creators came up with a curious method for repeating
       short text insertions, where the text to go in is always the same
       but any outgoing text varies.  They decided that when you are in
       screen mode, and have just gone into typing-in-text submode, and
       make Control-@ ("^@") the first character you type in, then the
       editor should insert the last piece of text you had previously
       inserted (if it was not more than 128 characters long) and take you
       back to command mode. Unfortunately, they never made this work as
       promised.  
       
       In actuality, ^@ operates anywhere in a text insertion, not just in
       the first character position.  What a ^@ does there depends on the
       situation. If your last c d y command, or one of their variants
       such as s D etcetera, removed or copied a full line of text or
       parts of two or more lines, or if you haven't run one of those
       commands in your current editing session, then typing ^@ is just a
       nuisance.  It will take you out of text input submode and probably
       move the cursor back a few characters from where the input ended.  
       
       But if you have done at least one c d y command or a variant, and
       if the very last one you did removed or copied only a part of a
       single line of text, then surprise!  Typing a ^@ in this case will
       do three things: 
       
       Unless you typed it at the first character position on a line, it
       will move the cursor back one character.  This will move over the
       last character you typed in if you've typed any, or over one
       existing character if you type ^@ as the first character of your
       insertion, but will not erase the character it passes over.  
       
       Just to the left of the new cursor position, the editor will insert
       the text that was removed or copied by your last c d y command or
       variant.  (If you went into text-insertion submode via a c command
       or a variant of it, the text you just took out is what will be put
       back in.)  
       
       Finally, the text insertion will automatically end and you will be
       back in command submode, with the cursor positioned at the start of
       the last simple word that was inserted by the ^@ metacharacter.  
       
       Quoting a ^@ into your text isn't possible, because the editor
       reserves that character for internal use and will not accept it as
       itself in any file you may edit.  Not that there would be any
       reason to put ^@ in a file anyway: it is the ASCII character NUL, a
       padding character that is routinely inserted in data streams by
       device drivers, and just as routinely stripped at the receiving
       end, so any ^@ characters you might add would be lost in the
       shuffle.  But when you are using the R command, or any other
       command that lets you insert an indefinite amount of text, you can
       quote a ^@ anyway by preceding it with a ^V.  The result will be to
       quote ^[Pb into your file at that point; this being the command
       string the editor issues to perform the odd operation I've detailed
       above.  
       
       Those of you who are skillful with the editor may wonder why the ^@
       insertion operates only when your last text extraction was a
       fragment of one line.  After all, the P command by itself inserts
       the contents of the unnamed buffer, and that buffer holds whatever
       was extracted last, be it half a line or a hundred lines, doesn't
       it? The answer lies in one of the editor's undocumented features.
       When you give a command to insert text, even the r command that
       only inserts a single character, the editor simultaneously flushes
       the unnamed buffer and leaves it empty -- if and only if that
       buffer contained more than a fragment of one line.  So, when you
       entered the text insertion mode from which ^@ operates, you emptied
       the unnamed buffer unless there was only a fragment of one line in
       it.  
       
       At times you may want to use the beautify option to the set
       command.  This tells the editor to throw away most, but not all,
       control characters you may try to type in -- the exceptions usually
       are the tab (^I), newline (^J), and form feed (^L) -- in order to
       keep you from inadvertently putting in invisible control characters
       that will be hard to detect later.  This option is normally off,
       but you can type :se bf to turn it on.  
       
       But even when you want most control characters thrown out, there
       will be occasions when one must go in.  This is not possible using
       a r command.  The usual r technique of backslashing will usually
       bite back in this case -- the editor will interpret the control
       character by acting on its control meaning rather than inserting it
       in the text.  Using R, though, you can insert most control
       characters by preceding each with ^V.  
       
       Even this may not be enough.  Some systems are set up so that when
       certain control characters are typed in, even though preceded by
       ^V, the system acts on them as control characters before the editor
       ever sees them.  To get around this problem, many implementations
       of the editor, especially older ones, interpret an ordinary
       character typed right after a ^V as a control character.  That is,
       on these systems, typing ^VF or ^Vf while running an R command
       inserts a ^F in the file, just as typing ^V^F would on systems that
       don't have this challenge.  
       
       Readers Ask
       
       Here are the latest questions, and my solutions, from inquiring
       readers with problems you might face someday.
       
       Tommy Spratlin writes:
       
       Hi Walter, 
       
       In moving files from Windows machines to UNIX, some of our users do
       binary transfers which result in ^M characters in the ASCII files.
       Usually they occur at the ends of individual lines and I do: 
       
         :1,$ s/^M//g
       
       where ^M is generated by ^V^M and everything works fine to delete
       these characters.  I now have a new problem: I found a file with ^M
       characters embedded in it, but the file is one long line.  I need
       to replace them with Vi's line-end character to split this long
       line into multiple lines.  But I can't because it's the same as
       pressing the ENTER or RETURN key in the middle of the substitution
       command.  How can I replace the superfluous carriage return?  We
       have several files like this and it's causing problems viewing them
       with Web browsers.  
       
       I tried substituting a newline with the character code and the
       octal code unsuccessfully, and tried the ^M as a last unsuccessful
       resort.
       
       Things aren't as complicated as you make them seem, Tommy.  First
       of all, Web browsers generally ignore carriage-return and/or
       linefeed characters while formatting text for display.  If your
       browser is choking on these all-one-line files, it is probably
       because the lines are too long for your browser, or for some other
       cause not related to embedded ^M characters.  
       
       Now, as you have deduced, the difference between Microsoft and Unix
       text file formats is that Microsoft operating systems seem to favor
       carriage-return followed by linefeed (^J) as the line separator,
       while Unix systems use linefeed alone.  
       
       As you've discovered, you cannot directly quote a ^J into any
       editor command.  And yet, you put a ^J into your file every time
       you hit return during text entry, although the return key on most
       terminals sends a ^M character.  That's the trick; the substitute
       command regards a ^M in the input pattern as a signal to insert a
       ^J and discard the ^M.  So you only need to get that ^M into the
       replacement pattern by typing in your command line like this: 
       
         :1,$ s/^V^M/^V^M/g
       
       You just have to overlook the appearance of futility in this
       command line, as though it were going to replace each ^M with
       itself.  That first ^M is in the outgoing pattern, so it matches a
       real ^M. The second, in the replacement pattern, calls for a ^J as
       I explained above.  
       
       However, these all-one-line files may be too long for the Vi
       editor, which cannot handle lines much more than a thousand
       characters long in most common implementations, with shorter limits
       in older versions. The editor will truncate lines that exceed the
       limit, with only a minimal and rather cryptic warning.  In such
       cases, use the tr utility to replace the ^M characters (which is a
       very straightforward job with that tool), before you bring the file
       into the Vi editor.  
       
       You may wonder then, how you would use the substitute command to
       put ^M characters into your file.  The answer is to backslash the
       quoted-in ^M. To add a ^M at the end of every line in your file, so
       as to conform it to Microsoft practice, type this command: 
       
         :%s/$/\^V^M
       
       (Note that it is important to type the \ first, then the ^V,
       followed by the ^M.) The ^V puts the immediately-following ^M into
       the command line, and the backslash tells the command that this ^M
       is to be considered a real one, not a metacharacter for ^J. In
       fact, these are the general principles for quoting characters
       almost everywhere except in typing-in-text mode: 
       
       Precede a character by ^V to keep that character from being
       interpreted as a metacharacter at the moment you type it.  In this case,
       you don't want typing ^M to immediately end the substitution command.
       
       Precede a character by a backslash to keep that character from acting
       as a metacharacter later, when what you've typed is interpreted by the
       editor -- for example, when what you have typed in is run as a command,
       or interpreted as a search pattern.  This command uses a backslash to
       keep the command from inserting ^J instead of ^M at the time it executes.
       
       When you must use both, as in this case, type the backslash before
       you type the ^V. (If you think that this backslash would then
       affect the immediately following ^V rather than the later ^M,
       remember that the ^V is not there when the backslash takes effect.
       The ^V disappears as soon as it tells the editor to insert the ^M
       in the command instead of taking the ^M as the signal to end the
       command.)  
       
       Finally, you can replace linefeed characters with something else
       via line mode commands, but you must use two commands and only one
       of them is the substitute command.  Suppose you need to change a
       short file's format from a number of lines to the format Tommy
       encountered: a single line with ^M separators.  That is, replace
       each ^J (except the last) with a ^M. (This had better be a fairly
       short file, because even newer versions of the editor can't handle
       any lines longer than 1024 characters.)  
       
       Start by using a command similar to the one above to put ^M at the
       end of every line except the last.  (Since these ^M characters are
       to separate lines, there's no use for one at the end of the last
       line.) Then use this command: 
       
         :%j!
       
       to join all the lines into one.  The "j" in this command line is
       the shortest abbreviation for the line mode join command, and the
       "!" switch at the end of it tells the command not to insert blank
       space between the lines it joins.  
       
       Thai-Nghia Dinh writes:
       
       Hi,
       
       I have a question (rather simple, really) but no one seem able to
       know the answer.  Not even the help desk (with all the Vi gurus
       :)). I'm hoping you can help me with it.  
       
       I have a text file of unknown length.  Each line of the file can be
       very short or very long (from 3 characters up to 1000 characters).  
       
       Within this file, I'm trying to locate (search) the nth occurrence
       of a word.  
       
       Here are a few things I've tried: 
       
         The simple solution would be (from visual command mode): a
         /foobar command followed by the n command typed n-1 times.  But
         what if n is large, say 200 or greater?)  
       
         :1,$ global /^/ /foobar/ (and its variations) Nothing useful...  
       
       Can you suggest a better way?  
       
       Yes, although it involves a slightly tricky procedure.  Consider
       the following command string: 
       
         :$|/\<foobar\>/s//QQQ
       
       The first command in this string takes us to the last line of our
       file and -- incidentally -- displays it on our screen, which is not
       important here.  The second command searches forward for a line
       containing "foobar" as a word, and starting from the last line the
       search must wrap around and find the first instance in the file.
       Then that second command replaces the word "foobar" with "QQQ",
       leaving the cursor at the point where the substitution was made.  
       
       Now let us make an addition to the start of this command string: 
       
         :1,199g/^/$|/\<foobar\>/s//QQQ
       
       This revised string repeats the procedure 199 times; each time the
       first instance of "foobar" remaining in the file is the one
       replaced. So we end up sitting on the "QQQ" string that replaced
       the 199th instance of "foobar"; simply typing n will bring us to
       the 200th instance.  And if we move off that 200th instance for any
       reason, going to the top of the file and searching for "foobar"
       will bring us right back to it, because the first 199 are now gone.  
       
       When we are finished with that 200th "foobar", this command: 
       
         :%s/QQQ/foobar/g
       
       will change those 199 "QQQ" strings back to "foobar".  Of course,
       if there is any chance that "QQQ" might occur in the document as
       itself, we can choose another dummy string.  
       
       And while I'm at it, I've got another question.  
       
       How do I delete all lines beginning with a certain string, say,
       !@#$ (or foobar for that matter). And a related question: how to
       delete lines containing the word foobar (anywhere within the line)?  
       
       The first command line following will solve your first problem, and
       the second will solve your second: 
       
         :g/^foobar/d
         :g/\<foobar\>/d
       
       Next Time Around
       
       To make room to answer two readers' questions, I had to skip
       presenting three great Vi tools -- autoindent, abbreviate, and map!
       -- and the effect their metacharacters have in text-insertion mode.
       They'll be first up in the next part of this tutorial.  
       
       More answers to reader questions are coming, too.  I have queries
       to answer about the semicolon address separator and about yanking
       within macros -- and if a few more significant problems arrive
       here, I'll try to fit them in, too.  
       
       And this time you won't have to wait and wait for the next tutorial
       part.  As I write this paragraph, I'm already in the middle of
       creating the next part, so you should see it within a month after
       this part appears online.  
       
 (DIR) Part 8: Indent, Like a Typewriter
 (DIR) Back to the index