[HN Gopher] Write Your Own Terminal
___________________________________________________________________
Write Your Own Terminal

Author : ingve
Score  : 120 points
Date   : 2023-11-10 08:15 UTC (14 hours ago)

(HTM) web link (flak.tedunangst.com)
(TXT) w3m dump (flak.tedunangst.com)

| Instantnoodl wrote:
| I wrote a small terminal emulator a while ago to have a portable
| terminal for my terminal based game. It's very specific but I had
| great fun with it.
|
| https://github.com/bigjk/crt

| clemailacct1 wrote:
| That terminal and even the associated game look incredible!

| sotix wrote:
| This is awesome! Excited to see what you do with the project.
| Definitely keep us updated.

| billconan wrote:
| I wanted to write a terminal emulator, the biggest hurdle is
| understanding the escape sequences. all documents seem to be
| unreadable, including those mentioned in the post, and those
| often referenced in projects, like
| https://vt100.net/emu/dec_ansi_parser
|
| the second difficulty is handling reflow. a real terminal can't
| resize its screen, but an emulator can. how to implement that
| correctly with cursor movements?
|
| the third difficulty is handling font fallback and rendering
| emojis and other combinatory glyphs correctly.

| SoftTalker wrote:
| Some real terminals had a few different row/column modes, e.g.
| 80x24 or 132x40 (IIRC). I don't recall if text was reflowed
| when switching.

| jws wrote:
| The early DEC terminals did not reflow on switching.

| wyan wrote:
| Usually the screen was cleared when switching modes, so no
| text reflow.

| dilap wrote:
| haven't tried, but i'd guess chatgpt (especially 4) could help
| a ton w/ this.

| jws wrote:
| I've written ANSI terminal handling before. The original VT100
| paper manual that came with the terminal is a nice start,
| because the complexity hadn't really happened yet and people
| used to write useful documentation.
| With that as a starting
| point for understanding it isn't hard to extend into handling
| the full spec.
|
| That diagram you linked is actually quite nice, but visually
| intimidating. If you think of it as a handful of regular
| expressions exploded into a state diagram that helps. For
| instance, the entire left half of the diagram is just the CSI
| code acceptor, see "CSI (Control Sequence Introducer)
| sequences" on the "ANSI escape code" Wikipedia page.
|
| You can write a regular expression to match a CSI and carry on.
| This is 2023 and you aren't using an 8080 with 3k of RAM.
| (probably) The only tiny trick is that you have to handle the
| "incomplete trailing regex" and wait for more data to arrive
| and try again.
|
| As for handling reflow, I wouldn't call that an
| "implementation" problem. I'd call it a "specification"
| problem. I'd approach it by seeing what Apple's Terminal
| program does, write that down, and call that my specification.

| alpaca128 wrote:
| I always used Wikipedia's article on ANSI escape sequences. A
| few details could be explained a bit better but overall I found
| it useful. The diagram you linked is probably a more complete
| and compact overview of all possible combinations, but I don't
| find it very intuitive either.

| sureglymop wrote:
| I think it lacks some important escape sequences. E.g. how do
| programs like vim and tmux switch to another buffer and then
| restore the buffer? I vaguely know about it but never saw an
| actually complete documentation.

| vidarh wrote:
| The diagram is complete. It shows the collection of the raw
| sequences, which includes a bunch of parameters that you
| then need to process separately to determine what to
| actually do.
|
| The sequences to switch to/from the alternate screen mode are
| \e[?47h and \e[?47l. "\e[?" is the DEC private mode prefix,
| which introduces DEC private mode codes. The number specifies
| a range of settings to switch on or off.
| The "h" and "l" determine if you're
| setting or clearing the setting respectively.
|
| The parsing of those is handled by the escape, csi entry,
| and csi param boxes in the diagram.

| vidarh wrote:
| Ignore all of this, and start simple.
|
| You can get _something_ going with just the most rudimentary
| escape handling just by spitting what programs write to your
| terminal to debug output in the terminal you run your new
| terminal from, and add a proper parser a bit later.
|
| You can totally ignore reflow. It'll look ugly. It doesn't
| matter. When running full screen applications you need to
| handle width/height reporting, that's all.
|
| Font fallback and nice font handling is a detail to worry about
| well down the line. There are libraries that can do a lot of
| the lifting for you depending on language/platform. Just pick a
| font with reasonable coverage and worry about the rest later.

| blueflow wrote:
| Ignore all that ANSI stuff, implement escape sequences like you
| think it's easy to implement and then write a terminfo file for
| that so applications know how to use it.

| er4hn wrote:
| Mitchell Hashimoto, HashiCorp's longest serving IC, has been
| working on his own terminal emulator as a side project:
| https://mitchellh.com/ghostty . It's been interesting to read
| through his logs and see how it develops along with the gnarly
| bugs he gets to work through.

| mtlynch wrote:
| Agreed! I started reading not understanding anything about
| terminal emulators, and it's been interesting following his
| progress.
|
| > _Mitchell Hashimoto, HashiCorp's longest serving IC_
|
| Small correction: I don't think this is right. He only became
| an IC two years ago.[0]
|
| [0] https://www.hashicorp.com/blog/mitchell-s-new-role-at-hashic...

| keithwinstein wrote:
| FWIW, I wouldn't try to parse escape sequences directly from the
| input bytestream -- it's easy to end up with annoying edge cases.
| :-/ In my experience you'll thank yourself if you can separate
| the logic into something like:
|
| - First step (for a UTF-8-input terminal) is interpreting the
| input bytestream as UTF-8 and "lexing" into a stream of Unicode
| Scalar Values
| (https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf#P.12...
| ; https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| - Second step is "parsing" the scalar values by running them
| through the DEC parser/state machine. This is independent of the
| escape sequences (https://vt100.net/emu/dec_ansi_parser ;
| https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| - And then the third step is for the terminal to execute the
| dispatch/execute/etc. actions coming from the parser, which is
| where the escape sequences and control chars get implemented
| (https://www.vt100.net/docs/vt220-rm/ ;
| https://invisible-island.net/xterm/ctlseqs/ctlseqs.html ;
| https://github.com/mobile-shell/mosh/blob/master/src/termina...).
|
| Without this separation, it's easier to end up with bugs where,
| e.g., a UTF-8 sequence or an ANSI escape sequence is treated
| differently if it's split between read() calls
| (https://bugs.chromium.org/p/chromium/issues/detail?id=212702),
| or invalid input isn't correctly recovered-from, etc.

| azinman2 wrote:
| The comments here don't seem to reflect what I think is the most
| interesting point here: quick loops of satisfaction. So much of
| programming often takes forever to get any real utility or see
| progress. That can really be depressing, especially for a side
| project. That's what I love about cooking or sewing; you quickly
| see the process come together. I wish programming was like that
| more often.

| vidarh wrote:
| Yeah, I'm using my own terminal, and my own editor. The
| terminal also relies on a font-engine I have heavily modified
| (I converted the original from C to Ruby).
| So I "control" the
| whole pipeline from the editor to the actual pixels, and on one
| hand it has all kinds of quirks I wouldn't wish on someone
| else, on the other hand they all have bits and pieces that are
| custom-written to fit exactly what I want, and which features
| get implemented is decided almost entirely based on which
| little change feels like it'll immediately improve my life
| right now (and I'm not joking - I spend enough time in front of
| my terminal that fixing small aspects of the terminal or my
| editor does feel like it is making an actual improvement in my
| happiness).

| norir wrote:
| > It's also possible to write a terminal in a terminal, like
| tmux, but I'd save this for my second attempt. It's very helpful
| to have a place to dump logging info that's not also the screen
| we're writing to.
|
| I don't fully understand what the author is saying here or
| precisely what they mean by writing a terminal in a terminal.
| From my perspective though, it is easier to write a hosted
| terminal that runs inside of an existing terminal. Writing the
| full thing from scratch is a much harder problem. A terminal has
| many subproblems that are best attacked separately in my opinion.
|
| At its heart, a terminal reads formatted text from standard input
| and writes formatted text to standard output. It is essentially a
| REPL. So the first step is to write a (R)ead function. Then you
| pass the result of read to the (E)valuate function which will
| process the input and finally pass it in to the (P)rint function.
| If you start with a hosted terminal, the read and print functions
| can be modeled with POSIX read and write so you can devote most
| of your time to the evaluate function.
|
| Once you have a good evaluate function and the terminal works as
| you like in the hosted environment, then it makes sense to go
| back and write new implementations of read and write that target
| a new host environment.
| This is when it makes sense to switch to
| Qt or OpenGL: when you have already implemented the core logic of
| the terminal and want better I/O performance. But it also might
| make sense to target HTML/JS for maximum portability. You can
| either reuse or rewrite the backend that was used in the
| bootstrap terminal depending on how it was written. Even if you
| are changing languages, the rewrite should be much easier than
| the initial implementation since you already know what
| functionality is necessary and how to do it.
|
| If you start with Qt or OpenGL, you might never even get to a
| useful terminal because you get so bogged down in the incidental
| details.
|
| What I am describing is essentially quite similar to
| bootstrapping a new programming language. The initial
| implementation should be done in the most convenient
| language/environment possible for the author. People commonly
| make the mistake of implementing a bootstrap compiler in a low
| level language, which is almost always a premature optimization
| and forces you to take on accidental complexity (such as memory
| management) that is secondary to your primary goals. Remember
| Fred Brooks's advice to plan to throw the first implementation
| away. It is so much easier to do something that you have already
| done before than something new.

| vidarh wrote:
| The entire backend rendering to raw X11 calls for my personal
| terminals is ~160 lines of Ruby, and that includes support for
| oddities like double width/double height, and optimizations you
| can drop at first like scroll up/down (as opposed to taking the
| slow approach of redrawing, which is enough for a first
| approximation). You need very little to do the bare minimum
| graphical output.

| norir wrote:
| Exactly. So start there and build up.

| winstonrc wrote:
| I built a fake terminal on my website[0].
| I've been planning on
| building an actual one that is compiled to WASM, but it was fun
| building the little features such as a memory of entered commands
| that can be navigated by pressing up and down with the arrow
| keys. This looks like a great resource for me to take it to the
| next level. Are there any concerns I should be aware of if I were
| to deploy a working terminal on a website?
|
| [0] https://www.winstoncooke.com/terminal

| c-smile wrote:
| Just in case, Sciter has a built-in <terminal> element that can
| be used for various purposes.
|
| Escape codes are supported, see:
| https://sciter.com/wp-content/uploads/2022/10/terminal.png
|
| Docs/API:
| https://docs.sciter.com/docs/behaviors/behavior-terminal
___________________________________________________________________
(page generated 2023-11-10 23:00 UTC)