* * * * * A branchless segment of code to generate a printable hexadecimal value I was an avid fan of assembly language back in my youth and I did a lot of it. And in that time, if I needed to convert a 4-bit quantity to a hexadecimal character, I would write the obvious code: -----[ Assembly ]----- ; x86 code add al,'0' cmp al,'9' ; if '0'-'9', no adjustment needed jbe skip ; otherwise, we need to adjust add al,7 ; the resulting character by 7 ; to get 'A'-'F' skip: -----[ END OF LINE ]----- Eight bytes and a branch instruction, and not many ways I could see to improve on that, until the other day when I came across this bit of code: -----[ Assembly ]----- ; x86 code add al,90h daa adc al,40h daa -----[ END OF LINE ]----- Not only does this convert a 4-bit value to a hexadecial character, but it's two bytes shorter and it's branchless! Now, some might say this abuses the DAA (Decimal Adjust after Addition) instruction, but it works. And how it works is pretty clever I think. The DAA instruction exists to allow BCD (Binary Coded Decimal) [1] arithmetic (back when it was a thing). For each 4-bits in a byte, the DAA instruction will check to see if it's in the range of 10 to 15 and if so, add 6 to that 4-bit value to bring it back into the 0 to 9 range, and propagate a carry bit (well, it's a bit more involved than that, but that will suffice for this post—you can check my MC6809 emulator [2] for the gory details of the DAA instruction). By adding 0x90 (or 144 in decimal) to a 4-bit value then using DAA, a carry bit will be propagated if the initial value was 10 to 15; otherwise there's no carry to propagate. The ADC (Add with Carry) of 0x40 (or 64 decimal) will then add any carry of the previous two instructions into the lower four bits of the result, and the DAA will then adjust the upper 4-bits to be either 0x3 or 0x4 due to the previous addition of 0x90 (which causes the number to act like a negative number [3] if the initial value was bewteen 0 and 9). And because of the carry if the initial 4-bit value was between 10 to 15, you get the required adjustment of 7 needed for values of 10 through 15. This means the result is 0x30 to 0x39 (the ASCII values of “0” to “9”) of the 4-bit values of 0 through 9, or 0x41 to 0x46 (the ASCII values of “A” to “F”) for values 10 through 15. Quite ingenious really. I found reference to what may be the origin of this sequence: the article “A Design Philosophy for Microcomputer Architectures [4]” from the February 1977 edition of Computer (the code appears on the third page of the article), but it's unclear if the author came up with this on his own, or it was a known sequence at the time. I just wish I found out about it earlier. [1] https://en.wikipedia.org/wiki/Binary-coded_decimal [2] https://github.com/spc476/mc6809/blob/14a2690f617d6e2c39948a8390d8a309d5bdcde8/mc6809.c#L450 [3] gopher://gopher.conman.org/0Phlog:2015/06/16.1 [4] https://www.computer.org/csdl/magazine/co/1977/02/01646378/13rRUEgs2vU --- Discussions about this page A branchless segment of code to generate a printable hexadecimal value | Lobsters https://lobste.rs/s/b1zoek A branchless segment of code to generate a printable hexadecimal value | Hacker News https://news.ycombinator.com/item?id=34934417 Email author at sean@conman.org .