CTF Circle - Airgap 2020 CTF
tcemu Challenge Writeup
Written by sen (@sarahemm)

###############
### SUMMARY ###

Provided with the 'tcemu' challenge was an executable ('cpu') and a binary file
('flag'). The CPU is an emulator for a (presumably custom) architecture, and
the binary file presumably contained the flag somewhere.

###############
### PROCESS ###

1. Ran the 'flag' code on the 'cpu' emulator executable, it outputs "I put
   the flag here somewhere". Good start, this should make it easier to start
   to figure out what the CPU is doing since we have some known behaviour we
   can try to match up with the 'flag' binary.

2. Ran 'strings' on the 'cpu' binary to try to get some info on it, there's
   a bunch of references to a stack and some error messages to do with certain
   instructions, so this gives an idea of what kind of instructions we might
   be dealing with (mov, xor, add, store, push, pop, jmp, prnt) and that it's
   a stack-based system.

3. Hexdumped the 'flag' binary to try to find the instruction set width,
   incorrectly guessed 9 bytes at first because it's almost entirely sets of
   2-3 instructions in a sequence. Figured out by randomly changing bytes here
   and there that 0x05 is an invalid instruction (error message gets emitted),
   so by changing various bytes to this to identify which bytes were
   instruction and which were data I eventually figured out the instructions
   are actually 3 bytes wide (ish, it turned out later). Everything started to
   make more sense once I figured this part out.

4. Using the instruction set width and the info from strings, figured out that
   most of the code is PUSHing a letter onto the stack (0x04), then popping the
   letter from the stack and printing it (0x08, which I'll call PRNT). The "I"
   in "I put the flag here somewhere" isn't actually in the place in the code
   that you'd expect (at least in plain text) though, whereas all the other
   letters are. I initially thought the 0x00 x3 after each PUSH was a NOOP, but
   after more testing it turned out to be part of the argument to PUSH.

5. Investigated where the "I" came from, at first I thought it was a jump
   somewhere that printed it, but eventually figured out that the code for the
   "I" seemed to PUSH a number, do something else, then PRNT. The argument for
   the "something else" instruction plus the number originally pushed added up
   to the ASCII value for "I", so I figured the mystery instruction must be ADD
   (0x10), which ADDs the immediate value to the top item on the stack.

6. Threw a tiny "disassembler" (if you can call it that) together in ruby
   that just understood the instructions I understood so far, and it turned
   out that's all the 'flag' binary used! The other instructions supported by
   the CPU aren't involved in this challenge.

7. Disassembling the 'flag' binary, it showed that after the message it
   displayed, it actually kept PUSHing more things to the stack and ADDing to
   them, but there were no more PRNT calls, so everything was invisible.

8. Added PRNT calls after each PUSH/ADD pair, which resulted in the flag
   being output!

#######################
### INSTRUCTION SET ###

Instruction set in the end (from looking at the 'cpu' binary I know there's
more, but the 'flag' binary only uses these ones). The instruction set is
variable width, the first byte is instruction, then either 2 more bytes or 5
more bytes (3 or 6 total bytes per instruction), it looks like to me?
- 0x04 - PUSH - 5 byte arg - Push argument to the stack.
- 0x08 - PRNT - 2 byte arg - Display information on the terminal, second byte
                             of argument specifies what kind of information.
			     0=print contents of register 0
			     1=pop the top item off the stack and display it
			     (only 1 is used in 'flag' binary)
- 0x10 - ADD  - 5 byte arg - Add the argument to the top item on the stack.
                             The result replaces the top item on the stack.

#######################
### LESSONS LEARNED ###

- When looking at a new/mystery architecture, start with small word sizes
  and work up. I wasted a bunch of time while I was thinking it ran on 9-byte
  words, and things made a lot more sense when I started assuming 3-byte.
  3-byte also turned out to be not entirely correct, but everything started
  getting unlocked once I made that assumption at least. It's easier to glue
  two lines of words together in your head than it is to split up longer
  words.

- I spent too much time trying to reverse the 'cpu' executable, when reversing
  the instruction set based on the 'flag' binary turned out to be much more
  straightforward. Coming up with a couple potential strategies at first and
  flipping to an alternate one when I wasn't making much progress using the
  first one was a good idea here, and I'll try to keep this in mind in the
  future.

- I need better options for CLI hex editing, that slowed me down a bit on this
  one. I eventually wrote a tiny bash script that used xxd to dump the binary
  to a hex file 3 bytes wide, edit it in vim, then use xxd -r to reassemble
  it into binary. Every time I needed to add an instruction I had to renumber
  all the following addresses though, which was a huge pain. Something to
  sort out for next time.

############
### NOTE ###

After thinking about it more and reading other writeups, it's clear that the
stack isn't actually involved here, what I thought was PUSH is actually STORE
to register, and what I was interpreting as pop/print is just 'print register'.

Not entirely sure where my idea that a stack was involved in this flag came
from, but these things happen when you're under time pressure and your
understanding is evolving as you go. I'm going to leave this writeup as-is as
it was my understanding at the time, and worked well enough to get the flag!