[HN Gopher] Hiding messages in x86 binaries using semantic duals
       ___________________________________________________________________
        
       Hiding messages in x86 binaries using semantic duals
        
       Author : todsacerdoti
       Score  : 138 points
       Date   : 2020-08-16 14:40 UTC (8 hours ago)
        
 (HTM) web link (blog.yossarian.net)
 (TXT) w3m dump (blog.yossarian.net)
        
       | praalhans wrote:
       | About ten years ago, the same question was raised on
       | StackOverflow:
       | https://stackoverflow.com/questions/2760794/x86-cmp-instruct...
       | 
       | A comment was made: "These 1-bit degrees of freedom also provide
       | a covert channel for compilers to "phone home" - they can
       | "watermark" the binaries they produce, and the compiler vendor
       | can ask you to please explain if they find your software with
       | their watermark, but with no license on file." - Bernd Jendrissek
        
       | steamraven wrote:
       | This would provide an interesting way of signing a binary. Encode
       | using all 0s, then sign, then encode the signature
        
       | crb002 wrote:
       | You could do stenography with ordering of function/data sections
       | VS an expected permutation.
        
       | 082349872349872 wrote:
       | An old test suite (Plum Hall?) used to warn it was generated
       | uniquely per-client, with steganographic watermarking at the C
       | source level.
        
         | tom_ wrote:
         | A86 reportedly did something similar:
         | https://en.wikipedia.org/wiki/A86_(software)
        
           | jlokier wrote:
           | Yes, I remember A86 documentation saying it did that to every
           | program it assembled, to watermark that A86 was used to
           | assemble it.
           | 
           | In the 80s. Blast from the past!
        
       | mikorym wrote:
       | I think it is better to call this a noncommutative operation and
       | phrase it as such. Duals _are_ often in fact, different things.
       | 
       | But I think what you are doing is pretty cool! I don't know much
       | about machine code, but whenever XOR becomes noncommutative, for
       | whatever reason, but _executes_ commutatively, then this should
       | become possible.
       | 
       | Of course, by the way, XOR also has the vanilla ability to reveal
       | information through: plaintext XOR cypher = cyphertext and then
       | you do cyphertext XOR cypher to get back plaintext. In this case,
       | XOR is commutative, as we are only looking at execution, and
       | moreover each binary string is it's own inverse: thing XOR thing
       | = 0-string. So, you can think of XOR as reversible sequences of
       | encypherment.
        
         | saagarjha wrote:
         | I'm not sure what xor's commutativity has to do with this?
        
       | BeeOnRope wrote:
       | It might be worth noting that _strictly speaking_ this isn 't
       | safe to apply to an arbitrary binary. That is 31 c0 and 33 c0
       | might behave identically when executed as xor, but they could,
       | for example, also be part of a larger instruction where the swap
       | results in a change in behavior.
       | 
       | Presumably this implementation disassembles the binary which
       | gives a very high probability of determining whether these bytes
       | are _only_ executed as xor, but this isn 't in general enough
       | since the way the program is executed at runtime may not
       | correspond to disassembly [0].
       | 
       | Other patterns that might trip this up are programs that re-use
       | some program bytes as constants (e.g., if you needed the 16-bit
       | constant 0xc031 you could just point to this existing pattern in
       | the instruction stream [1]) or which otherwise examine or modify
       | their instruction bytes at runtime.
       | 
       | Now of course this caveat doesn't apply to almost any "vanilla
       | program" compiled by a normal compiler. It's only likely to come
       | up in hand-written assembly or as a result of some tool or
       | process that purposes does this weird stuff (e.g., as an anti-
       | debugging measure). 99.99% of the time these swaps will work out
       | fine.
       | 
       | ---
       | 
       | [0] Further, you can't even necessarily unambiguously assign a
       | byte to a particular instruction, even based on the dynamic
       | behavior, since a given byte might be used in two _different
       | instructions_ based on an earlier jump which result in different
       | parsed boundaries. This is _really really_ unusual and usually in
       | the realm of demos or anti-debugging techniques, etc.
       | 
       | [1] This is not actually a good idea for performance because
       | modern CPUs hate it when you use the same bytes for both code and
       | data.
        
         | ornxka wrote:
         | >It might be worth noting that strictly speaking this isn't
         | safe to apply to an arbitrary binary. That is 31 c0 and 33 c0
         | might behave identically when executed as xor, but they could,
         | for example, also be part of a larger instruction where the
         | swap results in a change in behavior.
         | 
         | Technically speaking, you can't apply any transformation to
         | arbitrary binaries, because then the encoding changes and the
         | code can look back at itself. This actually features heavily in
         | a recent Hacker News article
         | https://news.ycombinator.com/item?id=23557998 about reverse
         | engineering the Snapchat app, where the code regularly took
         | checksums of itself to frustrate debuggers and take different
         | codepaths if tampering (such as to insert breakpoints) was
         | detected.
         | 
         | Assembly is kind of, in a sense, the ultimate homoiconic
         | programming language.
        
         | woodruffw wrote:
         | > It might be worth noting that strictly speaking this isn't
         | safe to apply to an arbitrary binary. That is 31 c0 and 33 c0
         | might behave identically when executed as xor, but they could,
         | for example, also be part of a larger instruction where the
         | swap results in a change in behavior.
         | 
         | > Presumably this implementation disassembles the binary which
         | gives a very high probability of determining whether these
         | bytes are only executed as xor, but this isn't in general
         | enough since the way the program is executed at runtime may not
         | correspond to disassembly [0].
         | 
         | Yep: steg86 uses iced[1] internally to decode and re-encode
         | instruction sequences. Arbitrarily replacing `31 C0` with `33
         | C0` (or vice versa) in the instruction text would certainly not
         | go well.
         | 
         | And yes: it's possible to contrive pathological programs that
         | make these kinds of patches impossible to perform safely. But
         | those fall under the "what did you expect" support category, at
         | least in terms of the current implementation. A compiler-based
         | implementation would certainly have an easier time.
         | 
         | [1]: https://github.com/0xd4d/iced
        
           | BeeOnRope wrote:
           | Yes, I think the most practical scenarios this would fail in
           | the real world would be:
           | 
           | 1) Signed binaries or other similar concepts such as "anti-
           | cheat" stuff that tries to detect binary modification.
           | 
           | 2) Code that actually does do really weird stuff in order to
           | fool static disassembly, e.g., obfuscation which is not
           | uncommon in games and some other binary types.
           | 
           | Of course, if you created the binary yourself, you'd be aware
           | of all these gotchas, so this would only come as a surprise
           | if you were applying it as a "third party" to some arbitrary
           | binary.
        
       | weinzierl wrote:
       | I remember that one assembler claimed to do something similar to
       | sort of watermark the binaries. From what I remember it did it
       | only in the otherwise fully functional unregistered version to
       | encourage people to pay for it. From my memory it was NASM but I
       | cannot find anything about it ever being paid for. Does anyone
       | remember a shareware assembler from mid nineties that did this?
       | 
       | EDIT: Another comment already mentioned A86. Probably that was
       | the one.
        
         | jlmcgraw wrote:
         | :You're thinking of A86:
         | 
         | https://en.m.wikipedia.org/wiki/A86_(software)
        
           | userbinator wrote:
           | Decades ago, I remember figuring out quite a bit of that, and
           | here are my original notes from that, including tables that
           | show the pattern clearly:                   MOV rA, rB:
           | B: AL AH BL BH CL CH DL DH         A:      1  1  0  0  1  1
           | 0  0         AL  1   ** ** 8A 8A ** ** 8A 8A         AH  1
           | ** ** 8A 8A ** ** 8A 8A         BL  0   8A 8A ** ** 8A 8A **
           | **         BH  0   8A 8A ** ** 8A 8A ** **         CL  1   **
           | ** 8A 8A ** ** 8A 8A         CH  1   ** ** 8A 8A ** ** 8A 8A
           | DL  0   8A 8A ** ** 8A 8A ** **         DH  0   8A 8A ** **
           | 8A 8A ** **              * = "reversed" opcode (88)
           | MOV rA, rB ( word regs ):              B: AX BX CX DX SP BP
           | SI DI         A:      1  0  1  0  1  1  0  0         AX 1
           | ** 8B ** 8B ** ** 8B 8B         BX 0    8B ** 8B ** 8B 8B **
           | **         CX 1    ** 8B ** 8B ** ** 8B 8B         DX 0    8B
           | ** 8B ** 8B 8B ** **         SP 1    ** 8B ** 8B ** ** 8B 8B
           | BP 1    ** 8B ** 8B ** ** 8B 8B         SI 0    8B ** 8B **
           | 8B 8B ** **         DI 0    8B ** 8B ** 8B 8B ** **
           | * = "reversed" opcode (89)              The above tables
           | apply for the following two-operand instructions:
           | ADD             OR             ADC             SBB
           | AND             SUB             XOR             CMP
           | For TEST and XCHG, which are commutative, A86 always puts the
           | first         operand in the r/m field if possible, while
           | MASM puts it in the reg         field if the first operand is
           | a register.
        
           | weinzierl wrote:
           | It's been so long that I read that remark in the manual and
           | I'm glad I wasn't just imagining it;-)
        
           | astrobe_ wrote:
           | I can confirm. The author was an Intel engineer who was
           | dissatisfied with how his senior engineers did the DOS EXE
           | format in the intro of the docs of his assembler.
           | 
           | A86/D86 where fantastic, with extra bonus points for the
           | docs.
        
       | josephcsible wrote:
       | This seems like it'd be almost comically easy to detect, because
       | normal compilers and assemblers would consistently use either 31
       | C0 or 33 C0 all the time, rather than bouncing back and forth
       | between them within a single program.
        
         | woodruffw wrote:
         | Author here: it is indeed easy to detect. I didn't base my work
         | off of Hydan but it uses a similar strategy, and there are
         | several[1][2] public steganalyses of it.
         | 
         | Edit: I previously claimed that this method might be easy to
         | deny. It isn't, and I've added a note to the post as well.
         | 
         | [1]: https://cosec.inf.uc3m.es/~juan-
         | tapiador/papers/2009sec.pdf
         | 
         | [2]: https://www.sans.org/reading-
         | room/whitepapers/stenganography...
        
           | jaclaz wrote:
           | I am not sure to understand how the program can make a
           | distinction between 33 C0 (because it is written as 33 C0)
           | and 33 C0 (because it was written as 31 C0 and was later
           | changed by steg86 to 33 C0)?
           | 
           | Or, seen the other way, maybe steg86 can "extract" (from
           | already existing untouched binaries) secret messages that
           | were never intentionally written?
        
             | woodruffw wrote:
             | This is a really good question!
             | 
             | steg86 currently embeds a 32-bit header for itself. You can
             | see the relevant constants here[1]. If the header doesn't
             | validate during extraction, steg86 fails instead of
             | extracting potential garbage.
             | 
             | [1]: https://github.com/woodruffw/steg86/blob/master/src/st
             | eg86/b...
        
               | jaclaz wrote:
               | OK, but how is the 32-bit header itself embedded?
               | 
               | The header is a good thing for preventing accidental
               | extraction from binaries not treated with steg86, but
               | still you need to modify 4 bytes (or 4 instructions) to
               | embed this header, so - at least theorically - the
               | possibility of a "collision" or of a false positive seems
               | relatively high to me.
        
               | woodruffw wrote:
               | > OK, but how is the 32-bit header itself embedded?
               | 
               | The header is embedded according to the same rules as the
               | rest of the message: it's treated as a bitstring, and
               | flippable instructions are flipped appropriately to
               | encode it.
               | 
               | > the possibility of a "collision" or of a false positive
               | seems relatively high to me.
               | 
               | As others have pointed out, compilers (and assemblers)
               | tend to stick to a single selection choice for register-
               | to-register ops. Even in the case where a compiler writer
               | flips between them randomly, their 32 random choices
               | would have to align precisely with the expected header.
               | It's certainly not impossible, but pretty unlikely. It's
               | also completely remediable with a CRC32 or similar field
               | tacked onto the header; I just didn't think the
               | likelihood warranted that for the initial design.
        
               | paraboul wrote:
               | 4 bytes of the encoded header isn't 4 instructions, it's
               | 32 consecutive instructions that each translate to a bit
               | depending on their semantic form.
               | 
               | So there is virtually no chance for a compiler to
               | generate a valid sequence randomly.
        
               | jaclaz wrote:
               | I see, each changed instruction gives a single bit (not
               | byte) of information, my bad.
        
               | Legogris wrote:
               | But that's not a fundamental issue of the idea and rather
               | a quirk in the steg86 implementation, right?
        
               | woodruffw wrote:
               | Which part? The header is specific to steg86, and some of
               | its constraints (like the maximum message size) are a
               | product of that. Otherwise, there aren't any quirks in
               | the implementation that I'm currently aware of.
        
               | Legogris wrote:
               | Are there other semantic duals in the instruction sets?
               | If so, when combined with cryptography, it would offer
               | some additional level of plausible deniability over which
               | of the possible messages (due to parameter selection) is
               | the cryptotext.
        
               | jlokier wrote:
               | The two most obvious are:
               | 
               | - Register allocation choices
               | 
               | - Instruction ordering choices
        
               | schoen wrote:
               | Also conditional branches because you can conceivably do
               | the comparison, branch, and fall-through in several
               | different ways. Consider the initial case
               | cmp ax,bx       jge foo       bar:       ; code that gets
               | run if ax < bx       foo:       ; code that gets run if
               | ax >= bx
               | 
               | You could change this to cmp bx,ax and then change the
               | jge to jl, or you could change either one alone and
               | reverse the order of the bar and foo code blocks, or you
               | could change both and not reverse the order of the bar
               | and foo code blocks. (You have fewer choices if you are
               | doing the comparison as the exit condition for a loop,
               | except sometimes compilers will put the loop continue
               | condition as an _unconditional jump_ after bar or foo, in
               | which case you have even _more_ choices, like whether to
               | do that or not!)
        
             | shawnz wrote:
             | The same problem is faced when you present basically any
             | kind of encryption scheme with random data. It would be
             | easy to fix that kind of problem by adding a checksum to
             | the secret message.
             | 
             | Besides, in what scenario would you be decrypting arbitrary
             | binaries with this, such that it would be a problem for you
             | to have false positives? Just make sure to use it only on
             | files which you know contain secret messages.
        
               | jaclaz wrote:
               | >Besides, in what scenario would you be decrypting
               | arbitrary binaries with this, such that it would be a
               | problem for you to have false positives? Just make sure
               | to use it only on files which you know contain secret
               | messages.
               | 
               | Well, more or less cryptography/steganography can be used
               | to either store "secrets" or to communicate them.
               | 
               | If I used something like this to communicate, I would
               | probably tell the other part to download (say) a .iso
               | full of binaries, as opposed to a single binary.
        
           | CydeWeys wrote:
           | It's not easy to deny. Normal compilers flat out don't
           | produce binaries that look like this, so if a binary looks
           | like this then you definitely know something fishy is going
           | on even if you can't decrypt it.
           | 
           | Contrast with steganography in least significant image bits;
           | you genuinely cannot determine at all if an image is carrying
           | hidden encrypted data. That's what easy to deny looks like.
        
             | woodruffw wrote:
             | > It's not easy to deny. Normal compilers flat out don't
             | produce binaries that look like this, so if a binary looks
             | like this then you definitely know something fishy is going
             | on even if you can't decrypt it.
             | 
             | Yes, you're right. I've updated the post to include a note
             | at the bottom saying that this technique is difficult to
             | provide deniability with.
             | 
             | > Contrast with steganography in least significant image
             | bits; you genuinely cannot determine at all if an image is
             | carrying hidden encrypted data. That's what easy to deny
             | looks like.
             | 
             | This might be a misunderstanding on my part, but I
             | _thought_ that LSB-style steganography had been broken by
             | both statistical analyses and ML models for a while now[1].
             | But it 's possible that these methods only work on
             | plaintext messages; I haven't looked deeply into it.
             | 
             | [1]: https://github.com/b3dk7/StegExpose
        
               | royjacobs wrote:
               | You are right, in the sense that there are companies out
               | there (I used to work for one of them) that do image,
               | video and audio watermarking. In the simplest terms
               | that's also adding imperceptible "noise" that
               | statistically builds up to a few bits of data.
        
           | mikorym wrote:
           | I think from the perspective of steganography the question is
           | also: How often is byte code checked for anything like this?
           | 
           | Steganography is like a magician's act; it's only undetected
           | if you don't go and look for it. I think, in principle, if
           | you are the only one using the tool, and didn't tell anyone,
           | why would anyone look for it?
        
         | ivanbakel wrote:
         | Which is a point made in the article - the technique is not
         | meant to be hard to detect a la cryptography, it's meant to be
         | non-obvious to spot. By exploiting the fact that nobody is
         | going to look through (or edit) machine instructions for
         | patterns in register bytes, you can include a message that is
         | easy to overlook.
        
         | JadeNB wrote:
         | This _particular_ trick is easy to detect _when you 're looking
         | for it_, but that doesn't mean that it's easy to realise that
         | you _should_ look for this trick (I 'm not about to make "check
         | the assembler for strange compiler choices" part of my workflow
         | for installation of a new binary), or that all possible such
         | tricks could be so easily caught.
        
           | verroq wrote:
           | Surely there is some way to verify if a build tempered with,
           | _cough_ file integrity hashes.
        
       | jermier wrote:
       | Cool project. I like to hide data in audio files with Deepsound
       | https://deepsound.soft112.com/
        
         | rgovostes wrote:
         | I wrote a blog post on the weak cryptography used by Deepsound:
         | 
         | https://ryan.govost.es/2018/03/09/deepsound.html
        
           | jermier wrote:
           | That's a great writeup. Is it possible to create a really
           | long passphrase whose hash can't be reversed easily? Perhaps
           | a diceware passphrase with six randomly chosen words?
        
             | readams wrote:
             | Just encrypt your data first before giving to DeepSound.
        
             | rgovostes wrote:
             | The difficulty of breaking Deepsound is basically
             | equivalent to the difficulty of reversing a SHA-1 hash. For
             | dictionary words and shorter passwords, consider them
             | broken instantaneously through pre-computed lookup tables.
             | 
             | For more complex passphrases (and remember, only the first
             | 32 characters count here), exponential growth probably
             | works in your favor, even with today's Bitcoin-fueled
             | hyper-accelerated SHA-1 implementations.
             | 
             | Even then, the scheme where they use the password directly
             | as the AES key is flawed. For example, in ASCII, every
             | octet's most-significant bit is zero, so 32 bits of your
             | AES key are fixed. I don't know if this enables practical
             | attacks, but anyone who cares about securing their data
             | shouldn't rely on amateur cryptography like this.
             | 
             | Edit: Oh right, and aside from the password aspect, it uses
             | ECB mode for the encrypted content. That's not good.
        
               | Beldin wrote:
               | For those who are curious about ECB: see the picture &
               | encrypted picture of Tux on https://en.m.wikipedia.org/wi
               | ki/Block_cipher_mode_of_operati...
        
       | kens wrote:
       | I've been reverse-engineering the 8086 lately, and I recently
       | came across the exact register-control circuitry that makes this
       | steganography possible.
       | 
       | The steganography takes advantage of x86 instructions where you
       | can swap the source and destination by replacing an instruction
       | like hex 31 c0 with 33 c0. The difference is bit 1, the direction
       | bit, supported by many instructions.
       | 
       | Internally, the 8086 has 5-bit registers that specify the source
       | and destination, typically a register. [1] The source and
       | destination register get their value either from the register
       | specification in the instruction (bits 5-3 or bits 0-2), or
       | values in the microcode.
       | 
       | The clever part is the outputs of the source and destination
       | registers go through multiplexers that can swap them. If the
       | instruction has a direction bit, and the direction bit is set,
       | then accesses to the source and destination registers are
       | swapped.
       | 
       | The point of this is that the microcode doesn't know anything
       | about direction swapping, so it is implemented "for free" as far
       | as microcode size (but with the addition of the swapping
       | circuit).
       | 
       | The 8086 has a "Group PLA" that categorizes instructions into
       | groups; one of these groups is "instructions that have a
       | direction bit". This prevents direction swapping from happening
       | for instructions where it is not supported.
       | 
       | I hope this explanation makes sense; it should probably be a blog
       | post :-)
       | 
       | [1] You might wonder why source and destination are specified
       | with 5 bits when the instructions use 3 bits to specify the
       | register. The first reason is that many registers can be accessed
       | as half-registers, so you need another bit. (This bit comes from
       | the byte/word specification bit in instructions.) Second, this
       | mechanism is used to access the other 8086 registers, not just
       | the general-purpose registers. Third, there are also invisible
       | temporary registers that also need accessing. Thus, the internal
       | register specifications are 5 bits.
        
         | woodruffw wrote:
         | This is a fantastic explanation, thank you! I would _love_ a
         | detailed blog post on it.
        
       ___________________________________________________________________
       (page generated 2020-08-16 23:00 UTC)