[HN Gopher] A 23-byte "hello, world" program assembled with DEBU...
       ___________________________________________________________________
        
       A 23-byte "hello, world" program assembled with DEBUG.EXE in MS-DOS
        
       Author : susam
       Score  : 68 points
       Date   : 2022-10-30 20:07 UTC (2 hours ago)
        
        
       | marginalia_nu wrote:
       | DEBUG.EXE is some necronomicon-tier dark magic.
        
         | Narishma wrote:
         | What do you mean? It's a simple straightforward debug tool, or
         | a monitor as it used to be called in 8-bit systems.
        
           | userbinator wrote:
           | PC magazines used to publish source code listings for little
           | utilities in the form of DEBUG scripts.
        
         | blueflow wrote:
         | The debugger and the 8086 instruction set are well documented,
         | much better than the "modern" software that I have to work with
         | at my day job. It's not magic.
        
         | pizza234 wrote:
         | It's interesting how ASM nowadays appears as dark magic, since
         | most programmers are (rightfully) very far from that level.
         | 
         | Curiously, I found learning Rust way more challenging than
         | learning 16-bit assembly (it was way simpler back then; no
         | complex instructions, less baggage, simpler processors... and
         | fewer expectations :)).
        
           | int_19h wrote:
           | I've learned x86 16-bit assembly originally, and I find that
           | most of that knowledge is still applicable when looking at
           | assembly listings while debugging C/C++ today (which is the
           | most likely area where one might have to deal with it in
           | production these days; few people get to write asm from
           | scratch).
           | 
           | For x86, at least, I wouldn't even say that it was less
           | complex. Segmented memory alone is a huge complication, and
           | then on top of that there was all the legacy CISC stuff like
           | the BCD helpers or ENTER/LEAVE; x64 is comparatively
           | streamlined.
        
         | mtrower wrote:
         | An 8 year old, using the ring bound manual that came with their
         | computer, can figure out the basics - enough to write a program
         | like this, and examine it. I'm not being dismissive here, but
         | speaking literally from experience. Personally, it's my opinion
         | that computers were a lot simpler back then.
         | 
         | Nowadays we're very, very far from the hardware - hardware that
         | has grown very complicated in comparison.
        
       | userbinator wrote:
       | You can just use ret at the end, saving 3 bytes. Also, the
       | initial value of bp is 09xx on every version of MS-DOS since 4.0,
       | so you can also start off with an xchg ax,bp to save another
       | byte.
       | 
       |     xchg ax,bp
       |     mov dx,107
       |     int 21
       |     ret
       |     db "hello, world" 0d 0a '$'
       | 
       | 22 bytes, of which 15 is the message.
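
The byte count in the listing above can be checked by hand-assembling it. The following is a minimal Python sketch, not part of the original comment; the opcode bytes are the standard 8086 encodings, and the variable names are made up for illustration.

```python
# Hand-assembled bytes for the five-line listing above (8086 encodings).
ORG = 0x100  # DOS loads a .COM image at offset 100h

code = bytes([
    0x95,              # xchg ax,bp   (AH becomes 09h if BP starts as 09xx)
    0xBA, 0x07, 0x01,  # mov dx,0107h (16-bit immediate, little-endian)
    0xCD, 0x21,        # int 21h
    0xC3,              # ret
])
message = b"hello, world\r\n$"  # '$' terminates the string for AH=09h

program = code + message
string_address = ORG + len(code)  # first byte after the RET

print(len(program), len(message), hex(string_address))  # 22 15 0x107
```

The string address works out to 107h, which is the operand used in `mov dx,107`.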
        
         | EvanAnderson wrote:
         | I was going to suggest the RET but I didn't remember the BP
         | being set to save the additional byte. Nice, albeit sacrificing
         | some compatibility.
         | 
         | For anybody interested, some DOS default register values are
         | documented at: http://www.fysnet.net/yourhelp.htm
        
           | userbinator wrote:
           | It's an old demoscene trick, so perhaps a bit obscure but
           | somewhat common in the sizecoding community.
        
         | jart wrote:
         | Wow from feedback to commit in 30 minutes.
         | https://github.com/susam/hello/commit/36fa08e7cafb7c5268b651...
        
         | susam wrote:
         | Updated the repository to use your RET suggestion. Thanks!
        
           | ithinkso wrote:
           | With no attribution nevertheless :) (/s)
           | 
           | I love how, without the 'hello, world' message itself, 25% of
           | your entire HELLO.ASM codebase is from a random HN comment
        
       | colejohnson66 wrote:
       | Impressive. Is there a reason to jump over the string instead of
       | just having the string _after_ the program? Seems like one could
       | save two bytes doing that.
        
         | donio wrote:
         | debug.exe is single pass and doesn't do labels so by having the
         | string first you know its address later.
        
           | vore wrote:
           | You can compute the address of the string yourself like a
           | two-pass assembler would, though, so that shouldn't be
           | limiting.
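
Computing the address by hand amounts to running the first pass of a two-pass assembler yourself: add up the instruction sizes until you reach the label. Here is a rough Python sketch of that idea, assuming the string-last layout discussed in this thread; the `listing` table and the `resolve_labels` helper are hypothetical names, and the sizes are the standard 8086 encodings.

```python
# Each entry: (label or None, size in bytes, source text for reference).
# Sizes: MOV r8,imm8 = 2, MOV r16,imm16 = 3, INT imm8 = 2.
ORG = 0x100  # load offset of a .COM program

listing = [
    (None, 2, "mov ah, 9"),
    (None, 3, "mov dx, helloworld"),
    (None, 2, "int 21"),
    (None, 2, "mov ah, 0"),
    (None, 2, "int 21"),
    ("helloworld", 15, "db 'hello, world', 0d, 0a, '$'"),
]

def resolve_labels(listing, origin):
    """Pass 1: walk the listing and record the address of every label."""
    addresses = {}
    location = origin
    for label, size, _text in listing:
        if label is not None:
            addresses[label] = location
        location += size
    return addresses, location

labels, end = resolve_labels(listing, ORG)
print(f"{labels['helloworld']:X} {end:X}")  # 10B 11A
```

With the address known in advance, it can be typed into DEBUG as a plain hexadecimal operand for `MOV DX`.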
        
         | pizza234 wrote:
         | This is an ASM program with a very standard structure
         | (including the standard printing API:
         | http://spike.scu.edu.au/~barry/interrupts.html#ah09) using a
         | very standard tool (DEBUG.exe, common at the time for quick
         | debugging); I'm confused why this is impressive.
        
         | Agingcoder wrote:
         | The COM executable gets loaded by DOS at address 100h, so the
         | first bytes have to be executable code, if memory serves me
         | well?
        
           | vore wrote:
           | I think OP is saying what if you wrote it as:
           | 
           |     mov ah, 9
           |     mov dx, offset helloworld
           |     int 21
           |     mov ah, 0
           |     int 21
           |     .helloworld:
           |     db 'hello, world', d, A, '$'
        
         | dmitrygr wrote:
         | You are right and sister comment to this one is wrong. Thusly:
         | 
         |     MOV AH, 9
         |     MOV DX, str
         |     INT 21
         |     MOV AH, 0
         |     INT 21
         |     Str:
         |     DB 'hello, world', d, A, '$'
        
           | q-big wrote:
           | This program can be simplified further:
           | 
           |     MOV AH, 9
           |     MOV DX, str
           |     INT 21
           |     RET
           |     str:
           |     DB 'hello, world', d, A, '$'
           | 
           | Why can
           | 
           |     MOV AH, 0
           |     INT 21
           | 
           | be replaced by RET? Here is the answer:
           | https://stackoverflow.com/a/60805758
           | 
           | UPDATE: Under https://news.ycombinator.com/item?id=33398592
           | userbinator posted an additional possible optimization.
        
             | ralferoo wrote:
             | Not a size optimisation, but a performance optimisation...
             | 
             |     INT 21
             |     RET
             | 
             | can be replaced with
             | 
             |     JP 5
        
         | susam wrote:
         | I wrote this about 20 years ago during my university days. I
         | happened to stumble upon it today in my archives and thought of
         | sharing it on GitHub. I was still learning microprocessors back
         | then. While browsing the C:\Windows directory, I fortuitously
         | happened to discover the DEBUG.EXE program. Turned out it was
         | available on any standard installation of MS-DOS as well as
         | Windows 98. That chance encounter helped me dive into the
         | world of assembly language programming long before the
         | coursework introduced me to more popular assemblers.
         | 
         | Since I was still learning the x86 CPUs, the intention here was
         | not to save bytes but instead to have something working. I
         | believe I picked up the style of having the string at the top
         | and jumping over it from other similar code I had come across
         | in those days.
         | 
         | You are right of course. Here is a complete example that moves
         | the string to the bottom and saves two bytes:
         | 
         |     C:\>DEBUG
         |     -A
         |     1165:0100 MOV AH, 9
         |     1165:0102 MOV DX, 10B
         |     1165:0105 INT 21
         |     1165:0107 MOV AH, 0
         |     1165:0109 INT 21
         |     1165:010B DB 'hello, world', D, A, '$'
         |     1165:011A
         |     -G
         |     hello, world
         | 
         |     Program terminated normally
         |     -N HELLO.COM
         |     -R CX
         |     CX 0000
         |     :1A
         |     -W
         |     Writing 0001A bytes
         |     -Q
         | 
         |     C:\>HELLO
         |     hello, world
         | 
         |     C:\>
         | 
         | I have now updated the GitHub repository with this updated
         | source code and binary. Thank you for the nice comment!
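
The `1A` entered at the `-R CX` prompt in the session above is the program length in hexadecimal: DEBUG's W command writes the number of bytes held in BX:CX, starting at offset 100h. The Python sketch below is one way to double-check that figure, using the standard 8086 encodings for the instructions in the session; the variable names are illustrative only.

```python
# Bytes corresponding to the assembled session above.
code = bytes([
    0xB4, 0x09,        # 0100  MOV AH, 9
    0xBA, 0x0B, 0x01,  # 0102  MOV DX, 10B
    0xCD, 0x21,        # 0105  INT 21
    0xB4, 0x00,        # 0107  MOV AH, 0
    0xCD, 0x21,        # 0109  INT 21
])
message = b"hello, world\r\n$"  # 010B  DB 'hello, world', D, A, '$'

image = code + message
start, end = 0x100, 0x100 + len(image)

# The length to type at the "-R CX" prompt is end minus start, in hex.
print(f"{end:X} {len(image):X}")  # 11A 1A
```

The end address 11A matches the next free offset that DEBUG shows after the DB line.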
        
           | Narishma wrote:
           | DEBUG.EXE is present in all versions of MS-DOS since the very
           | first.
        
           | owl57 wrote:
           | Curious. These other programmers probably learned the habit
           | from some even older code. Jumping over data isn't a very
           | obvious way of organising code, so probably it served some
           | purpose many years ago.
           | 
           | Maybe someone here knows what that purpose was?
        
             | userbinator wrote:
             | I remember seeing that in old Asm books too. My best guess
             | is that it avoids having too many forward references, which
             | would take up precious memory in the systems of the time
             | and perhaps reach the limit of the assembler sooner.
        
               | _the_inflator wrote:
               | Maybe also better for linking the files. At least this
               | was the reason I did it on Amiga when using absolute
               | addresses. I only had to remember the start of the
               | address area even when recompiling.
        
             | jstanley wrote:
             | I wrote a compiler for a small machine that did this so
             | that it could output the string content straight away
             | without having to buffer it in memory.
        
             | jmole wrote:
             | you can hard-code values if you know their address in
             | memory. If the data section comes first, then it doesn't
             | move around if your code size changes.
        
       | secondcoming wrote:
       | This brings back memories!
       | 
       | Many years ago I used debug.exe to create a bootable floppy that
       | did nothing but display my name on the screen when booted. I
       | peed a little when it finally worked.
       | 
       | Why did MS stop shipping it???
        
       | ivoras wrote:
       | Crazy that once upon a time just sequences of machine code were
       | written to files, without any headers, or checksums, or basically
       | any modern metadata. Just plain machine code, to be directly fed
       | into the CPU.
       | 
       | Cue lamentations of today's complexity mixed with feelings of
       | life being great because of it.
        
         | pizza234 wrote:
         | Yes, although this was for COM files only, which were limited
         | to (64k - 0x100) bytes. EXEs didn't have this limitation, but
         | they indeed had a header.
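
The (64k - 0x100) figure is simple arithmetic: a COM program lives in a single 64 KiB segment whose first 100h bytes hold the Program Segment Prefix. A tiny Python sketch of the calculation follows; the constant names are made up, and in practice the usable size is a little smaller because the stack shares the same segment.

```python
SEGMENT_SIZE = 0x10000  # one 64 KiB real-mode segment
PSP_SIZE = 0x100        # Program Segment Prefix at offsets 0000h-00FFh

# Upper bound on the size of a .COM image loaded at offset 100h.
max_com_image = SEGMENT_SIZE - PSP_SIZE
print(max_com_image, hex(max_com_image))  # 65280 0xff00
```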
        
         | Lt_Riza_Hawkeye wrote:
         | And before that, they were toggled directly into the machine
         | using physical switches!
        
         | 13of40 wrote:
         | MS-DOS v1 didn't have an assembler in debug.exe (or was it
         | .com?) so the only way to author machine code on a 5150 with no
         | extra dev tools was to code it on paper, translate it by hand,
         | and enter it in hex.
        
           | susam wrote:
           | You are right. It was DEBUG.COM. Indeed it did not have an
           | assembler. It had a disassembler though. An archived copy
           | from MS-DOS v1.25 can be found here:
           | https://github.com/microsoft/MS-DOS/blob/master/v1.25/bin/DE...
           | 
           |     C:\>DEBUG.COM
           |     -A
           |      ^ Error
           |     -N HELLO.COM
           |     -L
           |     -U 100 107
           |     0340:0100 B409          MOV     AH,09
           |     0340:0102 BA0801        MOV     DX,0108
           |     0340:0105 CD21          INT     21
           |     0340:0107 C3            RET
           |     -
           | 
           | There is another copy of the debugger for MS-DOS v2.0 here:
           | https://github.com/microsoft/MS-DOS/blob/master/v2.0/bin/DEB... .
           | This one does have an assembler.
           | 
           |     C:\>DEBUG.COM
           |     -A
           |     0482:0100 MOV AH, 2
           |     0482:0102 MOV DL, 41
           |     0482:0104 INT 21
           |     0482:0106 RET
           |     0482:0107
           |     -N A.COM
           |     -R CX
           |     CX 0000
           |     :7
           |     -W
           |     Writing 0007 bytes
           |     -Q
           | 
           |     C:\>A
           |     A
           |     C:\>
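
The hex column in the `-U 100 107` output above is the machine code itself, so the program's layout can be read straight off it. As a rough illustration, this Python sketch reassembles those bytes and decodes the little-endian operand of `BA0801`; the `dump` table simply copies the four lines of disassembly, and the other names are hypothetical.

```python
# Offset -> hex bytes, copied from the "-U 100 107" disassembly above.
dump = {
    0x100: "B409",    # MOV AH,09
    0x102: "BA0801",  # MOV DX,0108
    0x105: "CD21",    # INT 21
    0x107: "C3",      # RET
}

code = b"".join(bytes.fromhex(h) for h in dump.values())

# The two bytes after opcode BA are a 16-bit immediate, low byte first.
dx_operand = int.from_bytes(code[3:5], "little")

# The string has to start right after the last code byte.
string_address = 0x100 + len(code)

print(len(code), f"{dx_operand:04X}", f"{string_address:04X}")  # 8 0108 0108
```

Eight bytes of code plus the 15-byte message give the 23 bytes in the title.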
        
         | pjc50 wrote:
         | If you're programming a microcontroller, that time is now.
         | 
         | (OK, so it'll often be the output of a C compiler, but there's
         | usually some work to be done in asm to get the system in a
         | state with a stack, clocks, RAM etc to run a C program!)
        
       ___________________________________________________________________
       (page generated 2022-10-30 23:00 UTC)