[HN Gopher] x86 Is an Octal Machine
___________________________________________________________________
x86 Is an Octal Machine

Author : a1a106ed5
Score  : 47 points
Date   : 2022-02-20 20:41 UTC (2 hours ago)

(HTM) web link (gist.github.com)
(TXT) w3m dump (gist.github.com)

| MurrayHill1980 wrote:
| Because the Intel 8080 instruction set was octal, right?
|
| This is cool:
| https://dercuano.github.io/notes/8080-opcode-map.html
| userbinator wrote:
| This is an old article from the early 90s and I believe it may
| have been the first public mention of this fact about the x86
| encoding, although no doubt many have independently "discovered"
| it before --- especially in the times when microcomputer
| programming consisted largely of writing down Asm on paper and
| then hand-assembling the bytes into memory using something like a
| hex keypad.
|
| _All of these are features inherited from the 8080/8085/Z80._
|
| Here are the corresponding opcode tables in octal:
|
| https://dercuano.github.io/notes/8080-opcode-map.html
|
| http://www.righto.com/2013/02/8085-instruction-set-octal-tab...
|
| http://www.z80.info/decoding.htm
| pvg wrote:
| I kind of doubt it was actually the first. It's definitely an
| interesting/cute thing to notice and write up but I think the
| weird longevity of this particular piece owes more to the
| pioneering clickbaity framing (x86 is not really an octal
| machine, it's not surprising that people 'hadn't noticed'
| because obviously they had, etc) than the observations
| themselves.
| klelatti wrote:
| Thanks for these links - very interesting.
|
| Astonishing to think that we can see traces of the 8008 still
| today and that it wasn't actually an Intel designed ISA (came
| from CTC / Datapoint).
| kens wrote:
| The Datapoint 2200, the source of the 8008 instruction set,
| is an interesting machine. The CPU was built from TTL chips.
| To decode instructions, they used _decimal_ BCD decoder
| chips, specifically the 7442.
| But they'd use them as octal decoder chips, only using 8 outputs.
|
| The Datapoint 2200 documentation gave the opcodes in octal,
| so they were clearly thinking in octal. The 8008
| documentation, however, didn't use octal or hexadecimal. The
| opcodes were given in binary, but grouped in 3 bits, octal
| style, e.g. 10 111 010. (They didn't specify opcodes in octal
| or hex!) I think the 8008 was right at the time where octal
| was on the way out and hexadecimal was taking over. (The 8008
| assembler manual uses both octal and hexadecimal, but
| hexadecimal primarily.)
|
| The Intel 8080 still specified the instruction set in binary,
| not octal or hexadecimal. The 8085 had opcodes in binary in a
| 1983 manual, but now split with a line into 4-bit chunks
| (i.e. hexadecimal-style). And then an appendix gave the
| opcodes in hexadecimal.
|
| (Just some random history.)
| woodruffw wrote:
| I remember seeing a copy of this Usenet post years ago! It's one
| of my favorite "secrets" about x86's encoding.
|
| The "core" (non-E/VEX, non-SSE, etc.) x86 encoding is wonderfully
| clever _and_ terrible by modern standards, and Volume 2 of
| Intel's SDM is a great reference for how x86 manages to pack
| remarkably complicated addressing, operand, etc. semantics into
| just a handful of bytes. The result is a format that's remarkably
| hard to decode correctly, meaning that just about every software
| decoder for x86 is saturated with bugs[1] (FD: my project).
|
| [1]: https://github.com/trailofbits/mishegos
| userbinator wrote:
| In my experience it's the "second page" of opcodes (0f xx)
| where the difficulty lies; the first page has been thoroughly
| explored and documented by now.
|
| Historical note: the 286 was the first to have the second page.
| jazzyjackson wrote:
| I haven't seen the FD abbreviation, is it "full disclosure"?
| woodruffw wrote:
| Yes, that's "full disclosure."
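
The 3-bit grouping kens describes above (opcodes written as 10 111 010) can be illustrated with a short Python sketch. The helper names `octal_fields` and `as_grouped_binary` are hypothetical and not from the thread. The sketch only splits a byte into 2-3-3 bit fields and makes no claim about which instruction a given byte encodes on any particular CPU.

```python
def octal_fields(opcode: int) -> tuple:
    """Split an 8-bit opcode into its 2-3-3 bit fields (hypothetical helper)."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    top = (opcode >> 6) & 0b11   # bits 7..6, the leading octal digit (0-3)
    mid = (opcode >> 3) & 0b111  # bits 5..3, the middle octal digit (0-7)
    low = opcode & 0b111         # bits 2..0, the last octal digit (0-7)
    return top, mid, low


def as_grouped_binary(opcode: int) -> str:
    """Render a byte the way the 8008 documentation is described: 2-3-3 bits."""
    top, mid, low = octal_fields(opcode)
    return f"{top:02b} {mid:03b} {low:03b}"


if __name__ == "__main__":
    example = 0b10_111_010             # the bit pattern quoted in the thread
    print(octal_fields(example))       # (2, 7, 2)
    print(f"{example:03o}")            # 272
    print(f"{example:02X}")            # BA
    print(as_grouped_binary(example))  # 10 111 010
```

Written in octal, each digit lines up with one field (2, 7, 2), whereas the hexadecimal form BA cuts across the field boundaries.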
| jonsen wrote:
| > ALL 80x86 OPCODES ARE CODED IN OCTAL
|
| That's a backwards way of saying it. I'd rather say, given the
| hardware structure of the bit fields of the opcode register, the
| binary opcodes are perhaps better described by octal notation
| rather than hexadecimal.
| woodruffw wrote:
| I can't interpret the author's intent, but I think they're
| trying to point out a conflict between how Intel and most other
| references document the x86 opcode byte (as a byte table, with
| no clear coordinate relation between bits) versus how the byte
| is structured internally (around octal values, which _would_
| reveal a coordinate relationship if visualized).
| jonsen wrote:
| Exactly. Then there's no need to scream an odd headline.
|
| I wouldn't say there's any "conflict". Who needs to know the
| detailed hardware structure? Compiler writers maybe, but they
| experience an order of magnitude more conflicts then.
| woodruffw wrote:
| > Who needs to know the detailed hardware structure?
| > Compiler writers maybe, but they experience an order of
| > magnitude more conflicts then.
|
| I think this Usenet posting was written in the early 1990s,
| when a large number of people were probably still using
| macro assemblers to write large programs, and may have also
| been writing binary patches for those programs back when
| that was easier (no relocations to worry about!). It's
| definitely more of a "cool fact" than something you'd
| immediately apply, but it's the kind of thing I could see
| being useful to an assembly programmer of the period.
|
| For my N=1 experience: I've written compact x86 decoders in
| HDLs before, and this octal mapping of the opcode structure
| _was_ extremely useful in helping me determine an optimal
| (in terms of minimal gate counts) decoder structure. But
| that is indeed a very niche use case.
| jonsen wrote:
| I was there. I've done my share of programming in
| assembler and building hardware for embedded systems.
| I certainly needed to know the hardware details, but I
| never saw a need to rewrite the opcodes in another
| notation.
|
| I definitely appreciate the point of your last paragraph.
___________________________________________________________________
(page generated 2022-02-20 23:00 UTC)