Compilation with security flags (1)

My current task is about enabling the security flags for the compilation of a few binaries. We are talking about embedded software.

I started by reading up a little about these flags and how they work, so I'll note down a few useful pointers here for future reference.

The protection mitigates stack smashing techniques, which I'm familiar with at a conceptual level, and only superficially at a practical level.

1. _FORTIFY_SOURCE
2. x86_64 assembly
3. Various buffer overflow protection techniques
4. The stack goes down
5. Also
6. Updates (edit from a few days later) <- Acquired wisdom.

== 1. _FORTIFY_SOURCE

The first result on DuckDuckGo leads to this article by Red Hat.

https://www.redhat.com/en/blog/enhance-application-security-fortifysource

In a nutshell, the idea is to detect problems arising from the [mis]use of a few well-known functions. This is done at compile time when possible (e.g. when constant parameters are wrong), and at runtime otherwise. The runtime check is achieved by means of special versions of the original call (e.g. memcpy being replaced with __memcpy_chk).

It is easy to understand how it works, even though the discussion is about x86_64 while I'm working with ARM architectures. Also, I'm not very used to assembly, since I mostly work with C.

== 2. x86_64 assembly

A useful book on the topic.

https://en.wikibooks.org/wiki/X86_Assembly/

Even if this is not my target platform (I'm on ARM), I appreciated a few interesting details.

In the "Address operand syntax" section there are a few examples about the operands. It is interesting how the Load Effective Address (LEA) operation comes in handy for doing some math.
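To make the LEA remark concrete, here is a minimal sketch in C of what an address operand evaluates to. The function names (effective_address, times_five) are made up for illustration and come from neither the book nor any library. LEA stores the computed address in a register instead of accessing memory at it, which is why compilers commonly use it for small multiplications and additions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes the value of an x86 address operand of the
 * form disp(base, index, scale). LEA writes this value to the destination
 * register without touching memory. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int32_t disp)
{
    /* The hardware only accepts scale factors 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Sign-extend the displacement, then rely on unsigned wrap-around. */
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* x * 5 expressed as x + x*4, which corresponds to
 * "lea (%rdi,%rdi,4), %rax" in AT&T syntax. */
uint64_t times_five(uint64_t x)
{
    return effective_address(x, x, 4, 0);
}
```

For example, times_five(7) goes through the same arithmetic as a single LEA with base and index both set to 7 and a scale of 4.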
I had to refresh my memory on the meaning of a few registers, such as the use of %ebp (%rbp on x86_64) as a base address for local variables. I also learned a little about segment support, which is disabled in favour of paging on modern operating systems, but is still in use for thread-specific data (see the segment registers FS and GS).

What I find hard about assembly is that it is mainly about conventions and, since I don't work on this stuff often, it is difficult to keep them in mind.

== 3. Various buffer overflow protection techniques

https://en.wikipedia.org/wiki/Buffer_overflow_protection

== 4. The stack goes down

Reasoning from the Wikipedia article above, the variables are positioned by design in a way that fosters stack smashing: the beginning of a buffer sits at a lower address than the control information and the return address, so a write beyond the buffer's boundary overwrites them.

Would it work to have an upwards-growing stack?

https://security.stackexchange.com/questions/44801/smashing-the-stack-if-it-grows-upwards

Short answer: no, it just improves the protection of the "closest" stack frame, but that's hardly an improvement.

== 5. Also

Keep the intermediate files (object code included) that CMake generates for its try-compile checks: pass '--debug-trycompile' when invoking cmake.

== 6. Updates (edit from a few days later)

After studying the matter and getting some idea of how stack smashing protection works, I started to experiment on the target system.

The target is a bare-metal build (no operating system) on an ARM CPU. We are using Newlib, in which I could find some code implementing stack smashing protection. That code relies on file descriptors and the like, which are not available in our firmware.

So I was expecting -D_FORTIFY_SOURCE to produce some link-time issues at least, but I did not see any. I tried to smash the stack on purpose and, unsurprisingly, nothing happened. Then I compared the binaries compiled with and without the macro: nothing changed at all.
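For reference, the runtime check that _FORTIFY_SOURCE is supposed to add (section 1) can be sketched in C as follows. This is not the Newlib or glibc implementation: sketch_memcpy_chk and SKETCH_MEMCPY are hypothetical names, and the function returns NULL where a real checked function would abort. It only mirrors the parameter shape of __memcpy_chk (dest, src, len, destlen).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fortified memcpy. Same parameter shape as
 * __memcpy_chk, but it reports an overflow by returning NULL instead of
 * aborting, so that it can be exercised safely. */
void *sketch_memcpy_chk(void *dest, const void *src, size_t len, size_t destlen)
{
    if (len > destlen) {
        return NULL;    /* a real checked function would abort here */
    }
    return memcpy(dest, src, len);
}

/* Fortified headers typically obtain destlen from the compiler builtin
 * __builtin_object_size(), which yields (size_t)-1 when the size of the
 * destination object is unknown, so the check then never fires. */
#define SKETCH_MEMCPY(dest, src, len) \
    sketch_memcpy_chk((dest), (src), (len), __builtin_object_size((dest), 0))
```

A binary in which no call was rewritten this way looks identical with and without the macro. One thing worth checking in that case: with glibc, fortification only takes effect when optimization is enabled (-O1 or higher). Whether that, or missing support in the Newlib build, explains the unchanged binaries is only a guess.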
Later (the next working day) I gave it another try, using -fstack-protector instead of _FORTIFY_SOURCE. This time I started to see some linking problems, which meant I was on the right track.

A quick analysis of the object file (`objdump -t`) showed that the compiled code depended on a symbol called __stack_chk_fail. A disassembly (`objdump -D`) showed how this is implemented: a canary value is checked and, on mismatch, the code branches (assembly instruction bl) to __stack_chk_fail.

I used the --wrap linker flag to replace the unsuitable __stack_chk_fail handler from Newlib with a function called __wrap___stack_chk_fail, which I implemented.

The handler implementation is simple: print an error message on the UART and halt the firmware execution. A small but nice detail is that the error message shows the content of the lr register. The bl instruction sets lr to the return address, so by printing it the handler can tell where, in the object code, the stack smashing was detected.
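A handler along those lines could look roughly like the sketch below. It is not the actual firmware code: uart_puts, uart_log and format_hex32 are hypothetical names, and the UART is replaced by a buffer so that the sketch also builds on a host. It assumes GCC or Clang (for __builtin_return_address) and linking with -Wl,--wrap=__stack_chk_fail.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical UART stand-in: appends to a buffer. On the target this
 * would write the characters to the UART peripheral instead. */
char uart_log[64];

void uart_puts(const char *s)
{
    strncat(uart_log, s, sizeof uart_log - strlen(uart_log) - 1);
}

/* Formats a 32-bit value as 8 hex digits; out must hold at least 9 bytes.
 * Avoids printf, which may be unavailable or too heavy on bare metal. */
void format_hex32(uint32_t value, char out[9])
{
    static const char digits[] = "0123456789abcdef";
    for (int i = 0; i < 8; i++) {
        out[i] = digits[(value >> (28 - 4 * i)) & 0xFu];
    }
    out[8] = '\0';
}

/* With --wrap=__stack_chk_fail, references to __stack_chk_fail resolve
 * to this function. */
__attribute__((noreturn)) void __wrap___stack_chk_fail(void)
{
    char hex[9];
    /* On ARM this is the value that bl left in lr when calling the handler. */
    format_hex32((uint32_t)(uintptr_t)__builtin_return_address(0), hex);
    uart_puts("stack smashing detected, lr=0x");
    uart_puts(hex);
    uart_puts("\n");
    for (;;) {
        /* halt: spin forever */
    }
}
```

The printed address can then be looked up in the `objdump -D` output to find the function whose canary check failed.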