[HN Gopher] Write Your Own Terminal
       ___________________________________________________________________
        
       Write Your Own Terminal
        
       Author : ingve
       Score  : 120 points
       Date   : 2023-11-10 08:15 UTC (14 hours ago)
        
 (HTM) web link (flak.tedunangst.com)
 (TXT) w3m dump (flak.tedunangst.com)
        
       | Instantnoodl wrote:
       | I wrote a small terminal emulator a while ago to have a portable
       | terminal for my terminal based game. It's very specific but I had
       | great fun with it.
       | 
       | https://github.com/bigjk/crt
        
         | clemailacct1 wrote:
         | That terminal and even the associated game look incredible!
        
         | sotix wrote:
         | This is awesome! Excited to see what you do with the project.
         | Definitely keep us updated.
        
       | billconan wrote:
       | I wanted to write a terminal emulator. The biggest hurdle is
       | understanding the escape sequences. All the documentation seems
       | unreadable, including the documents mentioned in the post and
       | those often referenced in projects, like
       | https://vt100.net/emu/dec_ansi_parser
       | 
       | The second difficulty is handling reflow. A real terminal can't
       | resize its screen, but an emulator can. How do you implement
       | that correctly with cursor movements?
       | 
       | The third difficulty is handling font fallback and rendering
       | emojis and other combining glyphs correctly.
        
         | SoftTalker wrote:
         | Some real terminals had a few different row/column modes, e.g.
         | 80x24 or 132x40 (IIRC). I don't recall if text was reflowed
         | when switching.
        
           | jws wrote:
           | The early DEC terminals did not reflow on switching.
        
           | wyan wrote:
           | Usually the screen was cleared when switching modes, so no
           | text reflow.
        
         | dilap wrote:
         | Haven't tried, but I'd guess ChatGPT (especially 4) could help
         | a ton with this.
        
         | jws wrote:
         | I've written ANSI terminal handling before. The original VT100
         | paper manual that came with the terminal is a nice start,
         | because the complexity hadn't really happened yet and people
         | used to write useful documentation. With that as a starting
         | point for understanding it isn't hard to extend into handling
         | the full spec.
         | 
         | That diagram you linked is actually quite nice, but visually
         | intimidating. If you think of it as a handful of regular
         | expressions exploded into a state diagram that helps. For
         | instance, the entire left half of the diagram is just the CSI
         | code acceptor, see "CSI (Control Sequence Introducer)
         | sequences" on the "ANSI escape code" wikipedia page.
         | 
         | You can write a regular expression to match a CSI and carry on.
         | This is 2023 and you aren't using an 8080 with 3k of RAM.
         | (probably) The only tiny trick is that you have to handle the
         | "incomplete trailing regex" and wait for more data to arrive
         | and try again.
         | 
         | As for handling reflow. I wouldn't call that an
         | "implementation" problem. I'd call it a "specification"
         | problem. I'd approach it by seeing what Apple's Terminal
         | program does, write that down, and call that my specification.
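A minimal sketch of the regex approach jws describes, including the
"incomplete trailing regex" case. The byte ranges follow the CSI
structure on the Wikipedia page (parameter bytes 0x30-0x3F, intermediate
bytes 0x20-0x2F, one final byte 0x40-0x7E). CsiScanner and its event
tuples are hypothetical names for illustration, not taken from any
existing terminal.

```python
import re

# A complete CSI sequence: ESC [ params intermediates final.
CSI = re.compile(rb"\x1b\[([0-?]*)([ -/]*)([@-~])")
# A CSI prefix that reaches the end of the buffer without a final byte.
PARTIAL = re.compile(rb"\x1b(\[[0-?]*[ -/]*)?\Z")


class CsiScanner:
    """Hypothetical helper: splits a byte stream into text and CSI events."""

    def __init__(self):
        self.pending = b""  # unfinished sequence kept from the last feed()

    def feed(self, data):
        buf = self.pending + data
        events = []
        pos = 0
        while pos < len(buf):
            esc = buf.find(b"\x1b", pos)
            if esc == -1:
                # No escape left: the rest is plain text.
                events.append(("text", buf[pos:]))
                pos = len(buf)
                break
            if esc > pos:
                events.append(("text", buf[pos:esc]))
            match = CSI.match(buf, esc)
            if match:
                events.append(("csi",) + match.groups())
                pos = match.end()
            elif PARTIAL.match(buf, esc):
                # Incomplete sequence: wait for more data and try again.
                pos = esc
                break
            else:
                # Not a CSI sequence. This sketch just reports the ESC byte;
                # a real parser would handle the other escape types here.
                events.append(("other", buf[esc:esc + 1]))
                pos = esc + 1
        self.pending = buf[pos:]
        return events
```

Feeding b"hi\x1b[3" and then b"1mred" yields the text "hi" from the
first call and the complete CSI plus the text "red" from the second,
regardless of where the read boundary fell.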
        
         | alpaca128 wrote:
         | I always used Wikipedia's article on ANSI escape sequences. A
         | few details could be explained a bit better but overall I found
         | it useful. The diagram you linked is probably a more complete
         | and compact overview of all possible combinations, but I don't
         | find it very intuitive either.
        
           | sureglymop wrote:
           | I think it lacks some important escape sequences. E.g. how do
           | programs like vim and tmux switch to another buffer and then
           | restore the buffer? I vaguely know about it but never saw an
           | actually complete documentation.
        
             | vidarh wrote:
             | The diagram is complete. It shows the collection of the raw
             | sequences, which includes a bunch of parameters that you
             | then need to process separately to determine what to
             | actually do.
             | 
             | Switching to and from the alternate screen is \e[?47h and
             | \e[?47l. "\e[?" introduces a DEC private mode sequence. The
             | number selects which setting to switch on or off. The "h"
             | and "l" determine whether you're setting or clearing that
             | setting, respectively.
             | 
             | The parsing of those is handled by the escape, csi entry,
             | and csi param boxes in the diagram.
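A minimal sketch of acting on the \e[?47h and \e[?47l sequences vidarh
mentions, once the raw sequence has been collected. parse_dec_private
and the Screens class are hypothetical names; only mode 47 is handled
here, and xterm's ctlseqs document lists further alternate-screen
variants (1047, 1049) that this sketch ignores.

```python
import re

# ESC [ ? <numbers separated by ;> followed by h (set) or l (reset).
DEC_PRIVATE = re.compile(r"\x1b\[\?([0-9;]*)([hl])")


def parse_dec_private(seq):
    """Return (mode_numbers, enabled), or None if seq is not a private mode."""
    match = DEC_PRIVATE.fullmatch(seq)
    if match is None:
        return None
    modes = [int(part) for part in match.group(1).split(";") if part]
    return modes, match.group(2) == "h"


class Screens:
    """Hypothetical model of a terminal with a main and an alternate screen."""

    def __init__(self):
        self.main = []       # lines of the normal screen
        self.alternate = []  # lines of the alternate screen
        self.active = self.main

    def apply(self, seq):
        parsed = parse_dec_private(seq)
        if parsed is None:
            return  # some other sequence; not handled in this sketch
        modes, enabled = parsed
        if 47 in modes:
            self.active = self.alternate if enabled else self.main
```

The main screen's contents are never touched while the alternate screen
is active, which is what lets a full-screen program appear to "restore"
the previous buffer when it exits.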
        
         | vidarh wrote:
         | Ignore all of this, and start simple.
         | 
         | You can get _something_ going with just the most rudimentary
         | escape handling just by spitting what programs write to your
         | terminal to debug output in the terminal you run your new
         | terminal from, and add a proper parser a bit later.
         | 
         | You can totally ignore reflow. It'll look ugly. It doesn't
         | matter. When running full screen applications you need to
         | handle width/height reporting, that's all.
         | 
         | Font fallback and nice font handling is a detail to worry about
         | well down the line. There are libraries that can do a lot of
         | the lifting for you depending on language/platform. Just pick a
         | font with reasonable coverage and worry about the rest later.
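A minimal sketch of the width/height reporting vidarh mentions, assuming
a Unix-like host where the terminal talks to its child program through a
pty. On such systems the size is stored on the pty and read back by
applications with the TIOCGWINSZ ioctl. The helper names below are
hypothetical.

```python
import fcntl
import pty
import struct
import termios


def pack_winsize(rows, cols):
    """struct winsize: rows, columns, then pixel width/height (unused here)."""
    return struct.pack("HHHH", rows, cols, 0, 0)


def set_winsize(fd, rows, cols):
    """Record the terminal size on the pty referred to by fd."""
    fcntl.ioctl(fd, termios.TIOCSWINSZ, pack_winsize(rows, cols))


def get_winsize(fd):
    """Read the size back, the way a full-screen application would."""
    raw = fcntl.ioctl(fd, termios.TIOCGWINSZ, pack_winsize(0, 0))
    rows, cols, _, _ = struct.unpack("HHHH", raw)
    return rows, cols


def open_pty_with_size(rows, cols):
    """Open a pty pair and set its size before starting a child on it."""
    master, slave = pty.openpty()
    set_winsize(master, rows, cols)
    return master, slave
```

A terminal emulator would call set_winsize on the master side again
whenever its window is resized, so that programs running on the slave
side see the new dimensions.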
        
         | blueflow wrote:
         | Ignore all that ANSI stuff, implement escape sequences in
         | whatever way you find easiest, and then write a terminfo file
         | for them so applications know how to use your terminal.
        
       | er4hn wrote:
       | Mitchell Hashimoto, HashiCorp's longest-serving IC, has been
       | working on his own terminal emulator as a side project:
       | https://mitchellh.com/ghostty . It's been interesting to read
       | through his logs and see how it develops along with the gnarly
       | bugs he gets to work through.
        
         | mtlynch wrote:
         | Agreed! I started reading not understanding anything about
         | terminal emulators, and it's been interesting following his
         | progress.
         | 
         | > _Mitchell Hashimoto, HashiCorp's longest-serving IC_
         | 
         | Small correction: I don't think this is right. He only became
         | an IC two years ago.[0]
         | 
         | [0] https://www.hashicorp.com/blog/mitchell-s-new-role-at-hashic...
        
       | keithwinstein wrote:
       | FWIW, I wouldn't try to parse escape sequences directly from the
       | input bytestream -- it's easy to end up with annoying edge cases.
       | :-/ In my experience you'll thank yourself if you can separate
       | the logic into something like:
       | 
       | - First step (for a UTF-8-input terminal) is interpreting the
       | input bytestream as UTF-8 and "lexing" into a stream of Unicode
       | Scalar Values
       | (https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf#P.12... ;
       | https://github.com/mobile-shell/mosh/blob/master/src/termina...).
       | 
       | - Second step is "parsing" the scalar values by running them
       | through the DEC parser/state machine. This is independent of the
       | escape sequences (https://vt100.net/emu/dec_ansi_parser ;
       | https://github.com/mobile-shell/mosh/blob/master/src/termina...).
       | 
       | - And then the third step is for the terminal to execute the
       | dispatch/execute/etc. actions coming from the parser, which is
       | where the escape sequences and control chars get implemented
       | (https://www.vt100.net/docs/vt220-rm/ ;
       | https://invisible-island.net/xterm/ctlseqs/ctlseqs.html ;
       | https://github.com/mobile-shell/mosh/blob/master/src/termina...).
       | 
       | Without this separation, it's easier to end up with bugs where,
       | e.g., a UTF-8 sequence or an ANSI escape sequence is treated
       | differently if it's split between read() calls
       | (https://bugs.chromium.org/p/chromium/issues/detail?id=212702),
       | or invalid input isn't correctly recovered from, etc.
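A minimal sketch of the first step in keithwinstein's list: turning the
input bytestream into Unicode scalar values in a way that does not
depend on where read() happened to split the data. It uses Python's
standard incremental UTF-8 decoder; Utf8Lexer is a hypothetical name,
and replacing invalid bytes with U+FFFD is an assumption about the
desired recovery policy, not something taken from mosh.

```python
import codecs


class Utf8Lexer:
    """Hypothetical first stage: bytes in, Unicode scalar values out."""

    def __init__(self):
        # The incremental decoder buffers a partial multi-byte sequence
        # between calls instead of failing on it.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(self, chunk):
        """Decode one read() worth of bytes into a list of code points."""
        return [ord(ch) for ch in self._decoder.decode(chunk)]
```

Feeding the same bytes in one call or one byte at a time produces the
same sequence of scalar values, so the later parser stage never sees a
difference caused by read boundaries.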
        
       | azinman2 wrote:
       | The comments here don't seem to reflect what I think is the most
       | interesting point here: quick loops of satisfaction. So much of
       | programming often takes forever to get any real utility or see
       | progress. That can really be depressing, especially for a side
       | project. That's what I love about cooking or sewing; you quickly
       | see the process come together. I wish programming was like that
       | more often.
        
         | vidarh wrote:
         | Yeah, I'm using my own terminal, and my own editor. The
         | terminal also relies on a font-engine I have heavily modified
         | (I converted the original from C to Ruby). So I "control" the
         | whole pipeline from the editor to the actual pixels, and on one
         | hand it has all kinds of quirks I wouldn't wish on someone
         | else, on the other hand they all have bits and pieces that are
         | custom-written to fit exactly what I want, and which features
         | get implemented is decided almost entirely based on which
         | little change feels like it'll immediately improve my life
         | right now (and I'm not joking - I spend enough time in front of
         | my terminal that fixing small aspects of the terminal or my
         | editor does feel like it is making an actual improvement in my
         | happiness).
        
       | norir wrote:
       | > It's also possible to write a terminal in a terminal, like
       | tmux, but I'd save this for my second attempt. It's very helpful
       | to have a place to dump logging info that's not also the screen
       | we're writing to.
       | 
       | I don't fully understand what the author is saying here or
       | precisely what they mean by writing a terminal in a terminal.
       | From my perspective though, it is easier to write a hosted
       | terminal that runs inside of an existing terminal. Writing the
       | full thing from scratch is a much harder problem. A terminal has
       | many subproblems that are best attacked separately in my opinion.
       | 
       | At its heart, a terminal reads formatted text from standard input
       | and writes formatted text to standard output. It is essentially a
       | REPL. So the first step is to write a (R)ead function. Then you
       | pass the result of read to the (E)valuate function which will
       | process the input and finally pass it in to the (P)rint function.
       | If you start with a hosted terminal, the read and print functions
       | can be modeled with POSIX read and write so you can devote most
       | of your time to the evaluate function.
       | 
       | Once you have a good evaluate function and the terminal works as
       | you like in the hosted environment, then it makes sense to go
       | back and write new implementations of read and write that target
       | a new host environment. This is when it makes sense to switch to
       | Qt or OpenGL: when you have already implemented the core logic of
       | the terminal and want better io performance. But it also might
       | make sense to target HTML/JS for maximum portability. You can
       | either reuse or rewrite the backend that was used in the
       | bootstrap terminal depending on how it was written. Even if you
       | are changing languages, the rewrite should be much easier than
       | the initial implementation since you already know what
       | functionality is necessary and how to do it.
       | 
       | If you start with Qt or OpenGL, you might never even get to a
       | useful terminal because you get so bogged down in the incidental
       | details.
       | 
       | What I am describing is essentially quite similar to
       | bootstrapping a new programming language. The initial
       | implementation should be done in the most convenient
       | language/environment possible for the author. People commonly
       | make the mistake of implementing a bootstrap compiler in a low
       | level language, which is almost always a premature optimization
       | and forces you to take on accidental complexity (such as memory
       | management) that is secondary to your primary goals. Remember
       | Fred Brooks's advice to plan to throw the first implementation
       | away. It is so much easier to do something that you have already
       | done before than something new.
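A minimal sketch of the read/evaluate/print split norir describes, for a
hosted terminal that uses POSIX-style read and write on file
descriptors. The Screen class, linefeed, evaluate and run are all
hypothetical names, and the evaluate step here only understands
printable characters, carriage return and line feed. Escape sequences
are left out entirely.

```python
import codecs
import os


class Screen:
    """Hypothetical fixed-size character grid with a cursor."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.cells = [[" "] * cols for _ in range(rows)]
        self.row = self.col = 0

    def lines(self):
        return ["".join(row) for row in self.cells]


def linefeed(screen):
    """Move down one row, scrolling the grid when the bottom is reached."""
    screen.row += 1
    if screen.row == screen.rows:
        screen.cells.pop(0)
        screen.cells.append([" "] * screen.cols)
        screen.row = screen.rows - 1


def evaluate(screen, text):
    """(E)valuate: apply decoded input to the screen model."""
    for ch in text:
        if ch == "\r":
            screen.col = 0
        elif ch == "\n":
            linefeed(screen)
        elif ch.isprintable():
            if screen.col == screen.cols:  # wrap at the right margin
                screen.col = 0
                linefeed(screen)
            screen.cells[screen.row][screen.col] = ch
            screen.col += 1


def run(in_fd=0, out_fd=1, rows=24, cols=80):
    """(R)ead bytes, evaluate them, then (P)rint the whole screen."""
    screen = Screen(rows, cols)
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    while True:
        data = os.read(in_fd, 4096)
        if not data:
            break
        evaluate(screen, decoder.decode(data))
        os.write(out_fd, ("\n".join(screen.lines()) + "\n").encode("utf-8"))
```

Because evaluate only touches the Screen object, it can be tested
without any I/O, and run can later be swapped for a front end that
draws the same Screen with a graphics toolkit.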
        
         | vidarh wrote:
         | The entire backend rendering to raw X11 calls for my personal
         | terminals is ~160 lines of Ruby, and that includes support for
         | oddities like double width/double height, and optimizations you
         | can drop at first like scroll up/down (as opposed to taking the
         | slow approach of redrawing, which is enough for a first
         | approximation). You need very little to do the bare minimum
         | graphical output.
        
           | norir wrote:
           | Exactly. So start there and build up.
        
       | winstonrc wrote:
       | I built a fake terminal on my website[0]. I've been planning on
       | building an actual one that is compiled to WASM, but it was fun
       | building the little features such as a memory of entered commands
       | that can be navigated by pressing up and down with the arrow
       | keys. This looks like a great resource for me to take it to the
       | next level. Are there any concerns I should be aware of if I were
       | to deploy a working terminal on a website?
       | 
       | [0] https://www.winstoncooke.com/terminal
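A minimal sketch of the command memory winstonrc describes, navigated
with the up and down arrow keys. The History class and its method names
are hypothetical and are not taken from the linked site.

```python
class History:
    """Hypothetical command history navigated with up/down."""

    def __init__(self):
        self.entries = []
        # index == len(entries) means "a new, empty input line".
        self.index = 0

    def add(self, command):
        """Store an entered command and reset navigation to the end."""
        if command:
            self.entries.append(command)
        self.index = len(self.entries)

    def up(self):
        """Step back to an older command, stopping at the oldest."""
        if self.index > 0:
            self.index -= 1
        return self.entries[self.index] if self.entries else ""

    def down(self):
        """Step forward to a newer command, ending on an empty line."""
        if self.index < len(self.entries):
            self.index += 1
        if self.index < len(self.entries):
            return self.entries[self.index]
        return ""
```

The key handler would call up() or down() on the matching arrow key and
replace the current input line with the returned string.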
        
       | c-smile wrote:
       | Just in case, Sciter has a built-in <terminal> element that can
       | be used for various purposes.
       | 
       | Escape codes are supported, see:
       | https://sciter.com/wp-content/uploads/2022/10/terminal.png
       | 
       | Docs/API:
       | https://docs.sciter.com/docs/behaviors/behavior-terminal
        
       ___________________________________________________________________
       (page generated 2023-11-10 23:00 UTC)