[HN Gopher] Learn x86-64 assembly by writing a GUI from scratch
___________________________________________________________________
Learn x86-64 assembly by writing a GUI from scratch

Author : thunderbong
Score  : 366 points
Date   : 2023-06-01 16:11 UTC (6 hours ago)

(HTM) web link (gaultier.github.io)
(TXT) w3m dump (gaultier.github.io)

| samsquire wrote:
| This is awesome, thanks for submitting and thanks to the author.
|
| * I would like to understand the assembly used for exception handling. Does anybody know how exceptions work at an assembly level? (I am interested in algebraic effects)
|
| * Need to create a closure in assembly.
|
| * I have some assembly ported to GNU assembly based on a blog post whose website is down that executes coroutines.
| toast0 wrote:
| > * I would like to understand the assembly used for exception handling. Does anybody know how exceptions work at an assembly level? (I am interested in algebraic effects)
|
| Assembly doesn't really have a concept of exceptions. System defined exceptions and handlers exist, like if you're on x86 and run in protected mode, you can get a processor exception if you access a memory address that's not mapped for the type of access you do; that functions more or less like an interrupt; if you're running in an operating system, the operating system will handle that in some way, and maybe pass that information to your program in some way (or maybe just kill your program), but again, that'll be defined by the system you're on, and we can't talk much generally. On some systems you can get an exception for math errors (divide by zero, overflow, etc), on others you have to test for them, some systems will generate an exception for unaligned data access, some won't, etc.
|
| > * Need to create a closure in assembly.
|
| Again, this isn't really an assembly concept. You've got to define what a closure means to you, and then build that however you like.
| In my mind, a closure is more or less a function plus a list of variables. In assembly, I'd model that as the address of a function that takes several addresses as parameters, but passing parameters is up to you. If you're calling your own functions, you don't need to follow any particular convention on parameter passing; it just needs to make sense to you, and be written in a way that does what you mean: the computer will do what you told it to, which isn't always what you meant.
| sfink wrote:
| "Awesome description of rocks. I would like to understand the rocks used for nuclear reactors."
| samsquire wrote:
| This is an amusing characterisation.
|
| I would like to know how high level concepts map to assembly so I can understand how to compile to it.
|
| I feel low level assembly gives so much freedom to decide how to do things.
|
| I should probably get better at writing assembly so that I have inspiration on how to solve the high level things. But it's generations of technical ideas, solutions, implementation details and understanding I have to go through. I would like to understand exception handling to implement algebraic effects.
|
| I also think structs are extremely useful and that it's amazing that sum types were invented.
| saulpw wrote:
| I would recommend writing a simple Forth "interpreter". Assembly is the easiest language to write a Forth interpreter/compiler in, it's not that difficult (on the order of 10 hours to get something working your first time, and 50-100 hours to implement some of the more subtle concepts), and it will blow your mind.
| Findecanor wrote:
| Both of those topics are rabbit holes to fall down into and discover a whole lot. There is not _one_ way to do either, and there are different conventions for different platforms, languages and compilers.
|
| I'd suggest starting with the paper "Aspects of implementing CLU" from 1978 that covers both CLU's early type of exception handling and Iterators, which are a form of closures. To find out how modern C++-style exception handling is done, read "Itanium C++ ABI" (yes, _Itanium_!), which most of the Unix world used as a template for x86-64 and AArch64 later. Then look up "Zero overhead deterministic exceptions" for a proposal for C++ that didn't get picked.
| justinhj wrote:
| Great to see a CLU mention here. There are a number of interesting papers and documents floating around but it's rarely mentioned, presumably because it was always a research language and only used by a handful of people in industry. The parameterized type system has features only recently rediscovered in Rust and in C++23.
| [deleted]
| steppi wrote:
| This is a really cool little example. I've been teaching myself assembly recently and have found _Learn to Program with Assembly_ (2021) [0] by Jonathan Bartlett to be really valuable. I had initially looked through his freely available book _Programming From the Ground Up_ (2003) [1], which covers x86 assembly, and ended up buying the updated book after finding the old one to be well written but out of date. I've been programming in C for a long time and it's been very cool to dig a little deeper and understand better what's really going on under the hood.
|
| [0] https://www.bartlettpublishing.com/site/books/learn-to-progr...
|
| [1] https://download-mirror.savannah.gnu.org/releases/pgubook/Pr...
| lost_tourist wrote:
| If you're going to learn assembly for the first time I would say start with arm-64 assembly first. The architecture is much more refined and the assembler much more pleasurable to code with, with fewer footguns and less complication, unless you are doing only the most basic of programs.
| Croftengea wrote:
| TL;DR: the article explains how to open a new window in X11 and print "Hello, world" in assembly. The asm code to achieve this is 618 lines long.
| ripe wrote:
| Thank you for summarizing! This adds a lot of color to the headline. I was imagining a framebuffer-based GUI.
| asveikau wrote:
| > I will be using the Linux system call values, but 'porting' this program to, say, FreeBSD, would only require to change those values
|
| Is that true? I remember ~20 years ago I was looking at the i386 syscall ABIs (since amd64 wasn't big then), and there, Linux syscalls passed arguments by register and FreeBSD passed them on the stack. Maybe for amd64, FreeBSD switched to pass by register on Intel, but I wouldn't assume a syscall ABI is such a quick and simple substitution.
| bitshiffed wrote:
| For amd64 they both use the same registers to pass arguments.
|
| But, the BSD syscalls use the carry flag to indicate error, rather than the returned value of rax being negative. If your syscalls always succeed, and never return values within what would be a negative range as a signed value, then the code would run; but that's not exactly "portable".
| titzer wrote:
| This is great! I'd like to write code to interface X11 without going through libx11 but I've not gotten around to reading the documentation around its binary format. This is a good start!
| eschneider wrote:
| You don't need assembly to do that. It's just another network app. :) Check out Adrian Nye's "X Protocol Reference Manual" to see how to talk X.
| titzer wrote:
| Sure, but any working starting point is a worthwhile read.
| toast0 wrote:
| If you want to be closer to X without reading the protocol documentation, you might look into xcb; it's much less abstraction than xlib.
| [deleted]
| jagged-chisel wrote:
| Seeing the headline, one could be scared away thinking this is bare metal from scratch. It is not.
|
| The app is an X11 client and will run under an OS, meaning you'll learn to make system calls and other library calls to get things on the screen. Very educational, and not scary-deep.
| voidz7 wrote:
| Does this tutorial work on macOS?
| nmstoker wrote:
| There are some pointers on that if you skim the tutorial.
| jiffygist wrote:
| Some useful GUI program examples for WinAPI
|
| https://www.davidgrantham.com/
| zerkten wrote:
| Writing Win32 programs in assembly was a niche in the late 90s. This post inspired me to do some googling for a project I was familiar with back then and discovered the author has brought it back to life at https://github.com/ThomasJaeger/VisualMASM.
| SeenNotHeard wrote:
| One of the great video games of the late 90s, Rollercoaster Tycoon, was coded in assembly. Even back then, that was considered a feat.
| FartyMcFarter wrote:
| If I remember correctly there were websites with tutorials naming this style of programming "win32asm". This is the one I remember:
|
| http://www.afturgurluk.net/documents/Info/Win32ASM/Iczelion%...
| maherbeg wrote:
| Oh man, this brings me back to writing a hot key based application launcher in assembly for Windows to learn assembly and the various tools for compiling and building things. Good times!
| wudangmonk wrote:
| Being self-taught, I decided: what better way to learn programming than starting with the basics? Assembly was my first language. I could read and program in it, so I considered that I knew the language.
|
| It wasn't until I created a simple 8086 emulator, where you take the raw machine code instructions and translate those into not only the assembly instructions but actually emulate what those instructions do, that I finally felt like I REALLY knew assembly.
|
| My suggestion to others that want to learn assembly is to skip any assembly books.
| Using whatever language you want, first start with a translator from machine code into assembly instructions, and then do an emulator. You only need to implement a small subset of the instructions; check out godbolt and translate some simple programs to know which instructions you need to implement.
|
| Other than that all you really need is the 8086 manual, it has all the information there. I also found this site useful when implementing the flags https://yassinebridi.github.io/asm-docs/8086_instruction_set.... This takes less time than finishing a book and you learn a LOT more.
|
| The goal is not to program in assembly at all but to truly understand the cost of everything and what you can expect from your hardware.
| JohnFen wrote:
| I did something very similar. Assembly was not my first language (it was my 4th), but I decided to learn it by writing a compiler and linker in it.
|
| In for a penny, in for a pound.
|
| > The goal is not to program in assembly at all but to truly understand the cost of everything and what you can expect from your hardware.
|
| Entirely this. Also, to help you understand more deeply how computers really work.
|
| That said, being able to program in assembly is still of great use to me. I do it to this day, usually on ARM processors -- not entire programs anymore, but critical parts.
| nerpderp82 wrote:
| Becoming skilled at GDB and knowing how to generate an assembly listing from your tooling are key skills that really help with understanding.
|
| I learned assembly by reading the assembly listings from the C compiler. It is extremely interesting to be able to internalize how high level constructs are compiled and optimized.
| BlackLotus89 wrote:
| There was a reverse engineering guide that I quite liked that introduced you to assembly by first writing C examples, compiling them and then analyzing the disassembled output.
|
| It was quite a long guide, but I would recommend it to anyone starting out. I don't have it in my bookmarks it seems, but I will try to update my comment tomorrow when/if I find it.
|
| Edit: damn I guess it was https://beginners.re/ before it became pay-walled. Web archive still has copies of the book, but if you like it you should consider buying it even if it means signing up for Patreon m-( I still got a few versions of the book somewhere as well. Have to dive in again to see if it is as good as I remember
| circuit10 wrote:
| https://godbolt.org/ is great for this
| fuzztester wrote:
| Two older assembly language programming books that I had checked out earlier, and thought were good, are ones by Randall Hyde and Paul Carter.
|
| Both were for 32 bit assembly, not 64 bit, IIRC.
|
| Paul Carter was a professor or lecturer at a US college.
|
| I think his book was available online.
| fuzztester wrote:
| >I think his book was available online.
|
| http://pacman128.github.io/pcasm/#
|
| Scroll down the page for the PDF book.
| gigel82 wrote:
| The title is confusing. What is a GUI from scratch? A bootloader / mini kernel with framebuffer? A win32 application?
|
| Should probably be something like "Writing a Linux X11 application in assembly".
| ndesaulniers wrote:
| Writing a Linux application in _Intel_ x86 assembler syntax...smh. You do not know de wey
| qayxc wrote:
| It's a matter of personal taste. Some people (including myself) simply like Intel syntax better. As a sidenote, I find it quite fitting since Linux - unlike Unix - was "born" on Intel hardware after all :)
| freedomben wrote:
| What sort of jobs are there these days that use assembly? Is anybody still using it directly?
|
| These are pretty non-specific, but these are areas I know about already for others who may have the same question as me:
|
| 1. Compiler development
|
| 2.
| Security research (malware analysis/reverse engineering) - although not much if any writing assembly, just reading
|
| 3. Kernel development - again mostly just reading assembly, not writing it. Bulk of code written in C (or potentially, as a very recent development, Rust)
|
| 4. Driver development - mostly C but some devices can involve assembly
| TheLoafOfBread wrote:
| 5. Emulators - You will be trying to understand every instruction as deeply as possible.
| panxyh wrote:
| High end malware development.
| nvy wrote:
| Could you elaborate, or provide a link as a jumping off point for someone who wants to learn more about this topic?
| lost_tourist wrote:
| It's the difference between being a script kiddie and an actual hacker/cracker. Any web search will turn up thousands of links on hardware hacking at all levels.
| nvy wrote:
| That's not really what I'm asking, though. Parent claimed "high-level malware development" happens in ASM, but as far as I know a good chunk of sophisticated malware (Stuxnet, WannaCry, etc.) is written in plain ol' C or C++, so I categorically disagree that the differentiator between "script kiddie" and "leet haxor" is in whether or not someone writes assembly.
|
| But I'm interested in reading about malware written in assembly and was hoping for a diving board into that particular pool.
| bufo wrote:
| Deep learning work when optimizing inference.
| Hackbraten wrote:
| Some software packages written in assembly during the 70s and 80s are still in production today, and may be difficult and expensive to replace. I did some contract work for a steel plant in 2018. The primary control system for the plant was written in assembly. They were in the middle of doing a full rewrite, but in the meantime, they had to do maintenance and bugfixing for the in-production system in assembly.
| hu3 wrote:
| Example of assembly in Go source code:
|
| https://github.com/golang/go/blob/master/src/crypto/md5/md5b...
| sgt wrote:
| Interestingly, I believe Go enforces the mnemonics to be UPPER CASE.
| z3t4 wrote:
| For programming CPUs that cost less than $1. Like sensors. For low power usage. Or small form factor.
| PartiallyTyped wrote:
| Hypervisor work also involves assembly.
| slt2021 wrote:
| game engine development
| nerpderp82 wrote:
| Reading stack traces and low level tracing logs.
| sfink wrote:
| Anything where you get crash reports back from the field. It is very valuable to be able to read assembly code and map registers to their purpose, and then perhaps back to the source code that generated the assembly. Debuginfo will sometimes give you some of that, but is unreliable, incomplete, and can be hard to match up to the stripped binary you're looking at. Recognizing values that are likely to be stack vs uninitialized or poisoned vs corrupted vs nullptr or offsets to nullptr... it can turn a crash report from absolutely cryptic into something that gives you the lead you need.
|
| (Also, if you are dealing with something with mass deployment, it's good to recognize the single-bit flips that are hallmarks of bad RAM. But don't assume too much; bit flips are also the sign of bit flag manipulations.)
| bryanlarsen wrote:
| Also bootloader development will usually require some assembly.
| retrac wrote:
| Small embedded systems. There are microcontrollers that cost like 3 cents in bulk. 8-bit machines with a few kilobytes of PROM and perhaps just 64 bytes of RAM. While such machines often do have C compilers (of a sort) for them, old-school optimization techniques sometimes come into play.
| lost_tourist wrote:
| I used to enjoy that stuff, but these days if it seems like a job requires any significant assembly, I just turn it down.
| I hate worrying about every single byte of memory; it takes all the fun out for me. But I do know those who love figuring out a tough problem and always having to be efficient with every bit and byte.
| aidos wrote:
| There are tough problems at every layer of the stack. Granted, the problems look very different, but they're no less challenging. I think that is one of the great things about being a software developer - wherever you look, there are interesting things to explore. I studied assembly some 20+ years ago and have barely seen it since, though I've worked on a lot of complex technical problems since then.
| jcranmer wrote:
| Any sort of performance engineering will likely require competence with assembly, although direct programming in assembly may be relatively rare in such roles.
| duped wrote:
| Writing it from scratch is not nearly as common as reading it and understanding it. I think pretty much every systems programmer will have to stare at disassembly output from time to time.
| junon wrote:
| > mostly just reading assembly, not writing it
|
| Not always the case. You're not writing it _all the time_ but you still have to write it. For example the trampoline I use to jump from the boot stage to the kernel entry point is common-mapped between the two memory spaces and performs the switch inside of it, and then calls the kernel. That's all in assembly.
| eschneider wrote:
| Board bring up usually needs a bit of assembly. Certainly needs some reading knowledge of assembly.
| fuzztester wrote:
| Developing hardware diagnostic utilities can be another area.
|
| The kinds of utilities that come built into ROM, or that you run from a CD or USB drive, where you test memory and disk by writing different bit patterns to them, reading them back, and checking if they match, probing the hardware, processor and peripherals, etc.
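The write, read back, compare loop described in the comment above can be sketched roughly as follows. This is an illustrative Python model rather than real diagnostic firmware: `pattern_test` and `StuckBitMemory` are hypothetical names, a `bytearray` stands in for physical RAM, and the four patterns are just a common choice, not something the thread specifies.

```python
def pattern_test(mem, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, and collect mismatches.

    Returns a list of (address, expected, actual) tuples.
    """
    failures = []
    for pattern in patterns:
        # Fill the whole region first, then verify, so that a write to one
        # cell that disturbs another cell would also be noticed.
        for addr in range(len(mem)):
            mem[addr] = pattern
        for addr in range(len(mem)):
            got = mem[addr]
            if got != pattern:
                failures.append((addr, pattern, got))
    return failures


class StuckBitMemory:
    """Hypothetical faulty RAM: one bit of one cell always reads back as 1."""

    def __init__(self, size, bad_addr, bad_bit):
        self._cells = bytearray(size)
        self._bad_addr = bad_addr
        self._mask = 1 << bad_bit

    def __len__(self):
        return len(self._cells)

    def __setitem__(self, addr, value):
        self._cells[addr] = value

    def __getitem__(self, addr):
        value = self._cells[addr]
        if addr == self._bad_addr:
            value |= self._mask  # the stuck bit is forced high on every read
        return value


good = pattern_test(bytearray(64))
bad = pattern_test(StuckBitMemory(64, bad_addr=10, bad_bit=3))
```

Complementary patterns such as 0x55 and 0xAA are used so that every bit is exercised in both states; a stuck-high bit only shows up under patterns that try to store a 0 there.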
| zxexz wrote:
| You'd be surprised how often knowing assembly can come in useful - I certainly never expected it. I work in the healthcare sector, which is infamous for having tons of legacy software. At least a couple times a year I end up finding it useful to load some ancient binary into radare2 or Ghidra for debugging, extracting data, or just adding a jmp to avoid a problematic syscall. I'm by no means an assembly expert, but know enough to get the job done.
| jandrese wrote:
| I'd guess there are more jobs that use assembly than jobs where you write the X server protocol directly to the socket.
| steppi wrote:
| Another example is writing hand optimized matrix and vector operation routines tailored to specific hardware for BLAS libraries [0].
|
| [0] https://en.m.wikipedia.org/wiki/Basic_Linear_Algebra_Subprog...
| KeplerBoy wrote:
| Is this really still a thing?
|
| Do people go further than using intrinsics for let's say AVX?
| retrac wrote:
| Sure. You'll see it very often in codec implementations. From rav1e, a fast AV1 encoder mostly written in Rust: https://github.com/xiph/rav1e/tree/master/src/x86
|
| Portions of the algorithm have been translated into assembly for ARM and x86. Shaving even a couple percent off something like motion compensation search will add up to meaningful gains. See also the current reference implementation of JPEG: https://github.com/libjpeg-turbo/libjpeg-turbo/tree/main/sim...
| mikebenfield wrote:
| FWIW I've found that compilers' code generation around intrinsics is often suboptimal in pretty obvious ways, moving data around needlessly, so I resort to assembly. For me this has just been for hobby side projects, but I'm sure people doing it for stuff that matters run into the same issue.
| steppi wrote:
| Yeah. I'm going to be helping to work on expanding CI for OpenBLAS and have been diving into this stuff lately.
| See the discussion in this closed OpenBLAS issue gh-1968 [0] for instance. OpenBLAS's Skylake kernels do rely heavily on intrinsics [1] for compilers that support them, but there's a wide range of architectures to support, and when hand-tuned assembly kernels work better, those are what get used. For example, [2].
|
| [0] https://github.com/xianyi/OpenBLAS/issues/1968
|
| [1] https://github.com/xianyi/OpenBLAS/blob/develop/kernel/x86_6...
|
| [2] https://github.com/xianyi/OpenBLAS/blob/23693f09a26ffd8b60eb...
| zerkten wrote:
| According to friends, reading is still fairly prevalent for Windows and other products at Microsoft. Kind of a requirement to succeed in jobs with a C/C++ product where you might only have memory dumps to debug. It's also expected to some extent if you are a performance guru in some areas.
| satiric wrote:
| Is there a practical reason to do this? I don't mean that disparagingly; it's a cool project and I can see its value. I'm just wondering if there's also a practical reason you might do something like this rather than just using Qt or HTML/CSS or whatever.
| pavlov wrote:
| As a curiosity, it's worth mentioning there have been entire GUIs written in assembly. Probably the last commercially released one was GEOS a.k.a. GeoWorks Ensemble. It was a small and efficient GUI environment for x86 PCs, briefly somewhat popular as a Windows alternative around 1990.
|
| Steve Yegge worked there and tells an interesting story. 15 million lines of hand-written x86 assembly!
|
| http://steve-yegge.blogspot.com/2008/05/dynamic-languages-st...
|
| _"OK: I went to the University of Washington and [then] I got hired by this company called Geoworks, doing assembly-language programming, and I did it for five years. To us, the Geoworkers, we wrote a whole operating system, the libraries, drivers, apps, you know: a desktop operating system in assembly. 8086 assembly! It wasn't even good assembly!
| We had four registers! [Plus the] si [register] if you counted, you know, if you counted 386, right? It was horrible._
|
| _"I mean, actually we kind of liked it. It was Object-Oriented Assembly. It's amazing what you can talk yourself into liking, which is the real irony of all this. And to us, C++ was the ultimate in Roman decadence. I mean, it was equivalent to going and vomiting so you could eat more. They had IF! We had jump CX zero! Right? They had "Objects". Well we did too, but I mean they had syntax for it, right? I mean it was all just such weeniness. And we knew that we could outperform any compiler out there because at the time, we could!_
|
| _"So what happened? Well, they went bankrupt. Why? Now I'm probably disagreeing - I know for a fact that I'm disagreeing with every Geoworker out there. I'm the only one that holds this belief. But it's because we wrote fifteen million lines of 8086 assembly language. We had really good tools, world class tools: trust me, you need 'em. But at some point, man..._
|
| _"The problem is, picture an ant walking across your garage floor, trying to make a straight line of it. It ain't gonna make a straight line. And you know this because you have perspective. You can see the ant walking around, going hee hee hee, look at him locally optimize for that rock, and now he's going off this way, right?_
|
| _"This is what we were, when we were writing this giant assembly-language system. Because what happened was, Microsoft eventually released a platform for mobile devices that was much faster than ours. OK? And I started going in with my debugger, going, what? What is up with this? This rendering is just really slow, it's like sluggish, you know. And I went in and found out that some title bar was getting rendered 140 times every time you refreshed the screen. It wasn't just the title bar.
| Everything was getting called multiple times._
|
| _"Because we couldn't see how the system worked anymore!"_
|
| ...I have to say, the "140 redraws by accident" part sounds like an ordinary day in web UI development using 2023 frameworks. The problem of not seeing the entire picture of what's going on isn't limited to assembly programmers. You can start from the opposite end of the abstraction spectrum and end up with the same issues.
| jaggederest wrote:
| Roller Coaster Tycoon was almost entirely written in assembler by Chris Sawyer. Pretty amazing story, and released in 1999, as well, so well past the point most people had stopped doing 100% assembler development.
|
| https://en.wikipedia.org/wiki/RollerCoaster_Tycoon_(video_ga...
| cf100clunk wrote:
| Early in the 1990s Photodex wrote their CompuPic photo management program in assembly. The shareware version of CompuPic was popular for creating/editing/retouching lowball graphics when the www soon emerged.
| masfuerte wrote:
| I'm pretty sure the 90s painting app Xara Studio was also done in assembly.
| mav88 wrote:
| That wouldn't surprise me. The original Xara could render complex SVGs in under two seconds on a 486-66. The most optimized program I have ever used.
| viler wrote:
| A couple of GUIs written in assembly this century are MenuetOS and KolibriOS.
| troad wrote:
| For anyone interested in x64 assembly, it's worth noting that a new edition of Jeff Duntemann's excellent and classic introductory book on assembly, now fully updated for x64, is sitting with his publishers and is likely to be out sometime around the summer.
|
| Source: http://www.contrapositivediary.com/?m=20230222
| xurukefi wrote:
| The "xor rax, rax" that I just saw at a quick glance makes me flinch. Still putting it on my reading list though. Sounds like a really interesting little toy project.
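For readers wondering what the two forms of the instruction look like as bytes, here is a small sketch of the encoding. It assumes only the eight legacy registers (no r8 to r15) and the `31 /r` form of XOR; `encode_xor` is a hypothetical helper written for this illustration, not part of any assembler.

```python
# Register numbers as used in the ModRM byte (legacy registers only).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
REG64 = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
         "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


def encode_xor(dst, src):
    """Encode `xor dst, src` for two same-width legacy registers."""
    if dst in REG32 and src in REG32:
        prefix, d, s = b"", REG32[dst], REG32[src]
    elif dst in REG64 and src in REG64:
        # REX.W (0x48) selects 64-bit operand size and costs one extra byte.
        prefix, d, s = b"\x48", REG64[dst], REG64[src]
    else:
        raise ValueError("expected two registers of the same width")
    # ModRM: mod=11 (register direct), reg=source, rm=destination.
    modrm = 0xC0 | (s << 3) | d
    return prefix + bytes([0x31, modrm])  # 0x31 is XOR r/m, r


short_form = encode_xor("eax", "eax")
long_form = encode_xor("rax", "rax")
print(short_form.hex(), long_form.hex())
```

Both forms leave all 64 bits of rax zeroed on x86-64, because a write to a 32-bit register clears the upper half; the 64-bit form is simply one byte longer.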
| seritools wrote:
| To explain the flinching (since I didn't catch it immediately):
|
| > In 64-bit mode, still use `xor r32, r32`, because writing a 32-bit reg zeros the upper 32. `xor r64, r64` is a waste of a byte, because it needs a REX prefix.
|
| (from https://stackoverflow.com/a/33668295/554577 )
| Solvency wrote:
| In 2023, does anyone who writes a compiler inherently have to know assembly?
|
| Or even less recently...whoever wrote the first Rust, Zig, or insert <new compiled language> here?
|
| Because don't you ultimately have to know how to make your own syntax translate into efficient assembly code?
|
| Or is there some way these days for programming language designers/creators to avoid it entirely?
| gamache wrote:
| Compiler writers can target high-level languages too; it's not uncommon to see e.g., a Blub-to-C compiler which leaves the asm parts to a different toolchain. (Lots of languages without the goal of producing native code target even higher-level languages, for example JS.)
|
| Another popular way to _sort of_ avoid assembly is to target the LLVM IR (intermediate representation), in which case LLVM takes care of optimization and producing processor-specific machine code for a bunch of CPU types. But LLVM IR is basically a fancy assembly language.
| [deleted]
| dahfizz wrote:
| LLVM abstracts the "backend" which generates the actual assembly for each target machine. You only have to write a "frontend" that generates an LLVM intermediate representation.
|
| But in general, yes. To generate assembly you need to know assembly.
| Solvency wrote:
| Is LLVM sufficiently "simpler" to learn and wield than assembly, or does it just make it easier to compile to different systems?
| jcranmer wrote:
| LLVM is definitely more complex than a toy assembly you might learn in an intro computer architecture course, but it's generally somewhat less complex than working with real assembly languages.
| Although the complexity in LLVM is a very different kind of complexity from assembly languages; LLVM is ultimately a higher-level abstraction than machine code, and the semantics of that abstraction can be complex in its own right.
| josephcsible wrote:
| > Note that Linux has a 'fun' difference, which is that the fourth parameter of a system call is actually passed using the register r10.
|
| Why is Linux singled out there? No OS can use rcx for that, since the syscall instruction itself overwrites rcx with the return address.
| fsckboy wrote:
| Are you saying "they couldn't use rcx so they use r10, just like everybody else"? Because the quote says r10 and you brought up rcx.
|
| In any case, there's a good discussion of registers and syscalls here:
|
| https://stackoverflow.com/questions/53290932/what-are-r10-r1...
| laxd wrote:
| Some use stack for syscall params.
| pkphilip wrote:
| This is cool!
| sylware wrote:
| If you write a wayland compositor in x86_64 assembly... (vulkan+drm on elf/linux), without abusing a macro processor and without obscene code generators...
___________________________________________________________________
(page generated 2023-06-01 23:00 UTC)