[HN Gopher] How Debuggers Work: Getting and Setting x86 Registers
       ___________________________________________________________________
        
       How Debuggers Work: Getting and Setting x86 Registers
        
       Author : tosh
       Score  : 90 points
       Date   : 2020-10-23 15:29 UTC (7 hours ago)
        
 (HTM) web link (www.moritz.systems)
 (TXT) w3m dump (www.moritz.systems)
        
       | justicezyx wrote:
       | At Pixie, we developed a feature to dynamically trace program
       | execution context, for example the arguments and return values
       | of a function [1].
       | 
       | That was built on top of eBPF [2], and in the process we studied
       | how debuggers work, particularly how to pull rich context
       | information from the program executable's file structure
       | (symbols, ELF format) and DWARF information [3]. The other
       | significant piece is golang's own implementation details, such as
       | how interfaces are implemented.
       | 
       | It's a very refreshing learning experience. The end result is
       | that we gained a deeper understanding of how a program presents
       | itself to the operating system and chips.
       | 
       | [1] https://docs.pixielabs.ai/using-pixie/code-tracing/
       | [2] https://www.iovisor.org/technology/ebpf
       | [3] https://en.wikipedia.org/wiki/DWARF
        
       | dx87 wrote:
       | Learning debugger internals is surprisingly frustrating. I think
       | it was the GDB source code that didn't explain what anything was
       | in its source files about interacting with registers, and just
       | had a comment at the top that said something like "this header
       | file is for GDB internals, you're probably looking for the man
       | page". Then I found a post on Stack Overflow looking for the
       | exact same info I was, but it was closed for "being too narrow
       | and likely no use to anyone else".
        
         | jcranmer wrote:
         | The least annoying thing about writing and working with Linux
         | debuggers is that all of the communication layers
         | (kernel<->debugger, debugger<->C library, debugger<->loader,
         | debugger<->compiler [1]) are basically undocumented. I mean,
         | it's annoying trying to track down where in the source code
         | these things exist, but if you have experience working in large
         | codebases, this kind of task is needed sufficiently frequently
         | that it's not difficult.
         | 
         | The real issue with debuggers is that ptrace is a pretty broken
         | API. Supporting things like spawning threads, forking
         | processes, fork+exec, etc. is difficult, and full of race
         | conditions that are difficult to code correctly. Attaching to
         | running multithreaded processes is another challenge. Writing a
         | debugger that can correctly handle multithreaded applications
         | is challenging, the documentation gives you zero insight into
         | what the potential pitfalls are, and almost all examples are
         | similarly uninformative, being too complex for their use case.
         | 
         | [1] Yes, there's DWARF. But if you're dealing with GNU
         | extensions to DWARF, the documentation ceases...
        
           | krytarowski wrote:
           | Thank you for your feedback.
           | 
           | > The real issue with debuggers is that ptrace is a pretty
           | broken API.
           | 
           | Please note that this article focuses on NetBSD and FreeBSD
           | first.
           | 
           | As your comment describes only one OS (Linux) please do not
           | generalize as your comment seems to use truth sparingly.
           | 
           | The ptrace(2)/NetBSD API design and implementation is free
           | from all of the difficulties you mentioned in your post.
           | 
           | > Supporting things like spawning threads, forking processes,
           | fork+exec, etc. is difficult, and full of race conditions
           | that are difficult to code correctly.
           | 
           | The difficulty of catching LWP creation events:
           | 
           | ptrace_event_t event = {};
           | event.pe_set_event = PTRACE_LWP_CREATE;
           | ptrace(PT_SET_EVENT_MASK, child, &event, sizeof(event));
           | 
           | Then whenever the debuggee creates a thread, it's fully
           | stopped (the so-called all-stop mode from GDB) and reported to
           | the debugger by sending a signal that is wait(2)ed.
           | 
           | Then, investigate the debuggee event by checking the signal
           | passed (SIGTRAP) and inspecting the siginfo_t, which contains
           | the new thread identifier.
           | 
           | Then, you can resume the whole process with a single
           | PT_CONTINUE.
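
The stop, report, resume cycle described above can be sketched with plain POSIX calls. This is only an illustration of the wait(2) mechanics and does not use ptrace at all: the child stops itself with SIGSTOP as a stand-in for a debug event, the parent observes the stop via waitpid() with WUNTRACED, and SIGCONT plays the role that PT_CONTINUE would play in a real debugger. The function name and the exit code 7 are hypothetical choices made for the example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that stops itself, observe the stop through waitpid(),
 * resume the child, and collect its exit code.
 * Returns the child's exit code, or -1 on any failure.
 * On success *stop_signal holds the signal that stopped the child.
 */
int observe_stop_and_resume(int *stop_signal)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        raise(SIGSTOP); /* stand-in for a debug event stopping the debuggee */
        _exit(7);       /* arbitrary exit code, checked by the parent */
    }

    /* WUNTRACED makes waitpid() report stops of a child we are not tracing. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status)) {
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }
    *stop_signal = WSTOPSIG(status);

    /* Resume the child; a ptrace-based debugger would issue PT_CONTINUE here. */
    if (kill(pid, SIGCONT) != 0)
        return -1;

    /* Without WUNTRACED/WCONTINUED, this call only returns on termination. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

Calling `observe_stop_and_resume(&sig)` should return the child's exit code and leave `SIGSTOP` in `sig`. With ptrace in the picture the stop signal would typically be SIGTRAP, and the details of the event would come from siginfo_t rather than from the wait status alone.
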
           | 
           | > forking processes
           | 
           | Same for forking, use PT_SET_EVENT_MASK+PTRACE_FORK. Fork
           | events are reported for both the forking parent and the
           | forked child. As you poll for events on a single PID only
           | (all events for all threads within a process), the order is
           | deterministic: the forking parent is always reported first,
           | and then you poll for the forked child (you know its PID
           | from the SIGTRAP + siginfo_t delivered for the parent).
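
A minimal sketch of requesting both event kinds at once, assuming the event mask is a bit field so that PTRACE_FORK and PTRACE_LWP_CREATE can be combined with a bitwise OR. The function name is hypothetical. The NetBSD calls are the ones quoted in this comment; on other systems the function is only a stub that fails with ENOSYS, so the block still compiles there but does nothing useful.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ptrace.h>

/*
 * Ask the kernel to report fork and LWP-creation events for a process
 * that is already being traced by the caller.
 * Returns 0 on success, -1 with errno set on failure.
 */
int enable_fork_and_lwp_events(pid_t pid)
{
#if defined(__NetBSD__)
    ptrace_event_t event;

    memset(&event, 0, sizeof(event));
    /* Assumption: the two event flags are independent bits of one mask. */
    event.pe_set_event = PTRACE_FORK | PTRACE_LWP_CREATE;
    return ptrace(PT_SET_EVENT_MASK, pid, &event, sizeof(event));
#else
    /* Stub for non-NetBSD systems: this API does not exist there. */
    (void)pid;
    errno = ENOSYS;
    return -1;
#endif
}
```

The call is expected to fail for any process the caller is not tracing, so it belongs after a successful PT_ATTACH or in the parent of a child that requested tracing.
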
           | 
           | > fork+exec
           | 
           | This is a matter of catching EXEC and FORK events separately.
           | All exec() events are reported as SIGTRAP + siginfo_t
           | specifying TRAP_EXEC. No big deal.
           | 
           | > is difficult, and full of race conditions that are
           | difficult to code correctly
           | 
           | I push this comment to the free market of opinions of the
           | readers.
           | 
           | > Attaching to running multithreaded processes is another
           | challenge.
           | 
           | It's always a one-liner:
           | 
           | ptrace(PT_ATTACH, pid, NULL, 0);
           | 
           | No matter whether this is a single-threaded or multi-threaded
           | process.
           | 
           | > Writing a debugger that can correctly handle multithreaded
           | applications is challenging,
           | 
           | Again, I defer this question to the free market of opinions.
           | 
           | > the documentation gives you zero insight into what the
           | potential pitfalls are,
           | 
           | Please list the pitfalls so we can improve the documentation!
           | 
           | > and almost all examples are similarly uninformative, being
           | too complex for their use case.
           | 
           | There are a few hundred ptrace programs in NetBSD, each
           | exercising one small feature in minimal code. They are
           | embedded into the regression test framework (ATF). This code
           | can be reused (good license + simple) in 3rd party software.
           | 
           | For external examples, I recommend the most minimal event
           | tracker of debuggers, that I wrote here:
           | 
           | https://github.com/krytarowski/picotrace
           | 
           | In particular, you can trace all events possible in all types
           | of programs (at least in the current version of ptrace(2)) in
           | around 300 LOC, as noted here:
           | 
           | https://github.com/krytarowski/picotrace/blob/master/common/.
           | ..
           | 
           | FreeBSD has a distinct ptrace(2) API, but it is not far from
           | NetBSD's: the two are relatively comparable, and code is
           | quick to port from one BSD to another.
           | 
           | If you have got any more questions or comments, do not
           | hesitate to ask!
        
             | jcranmer wrote:
             | > Please note that this article focuses on NetBSD and
             | FreeBSD first.
             | 
             | There's a reason I prefaced my comment with Linux
             | debuggers. I haven't played enough with BSD kernels to
             | know how problematic debuggers are there.
        
           | saagarjha wrote:
           | man 2 ptrace is really one of the worst man pages in
           | existence, and it's made doubly bad because there are only
           | two real clients that you can reference, one of which is
           | pretty much the definition of an awful legacy codebase and
           | the other which is strangely- and over-engineered and at
           | times not even correct or complete. Actually, triply bad
           | because ptrace(2) itself is the worst API and it bleeds into
           | some other fairly reasonable APIs for signals and process
           | notifications.
           | 
           | I guess the silver lining is that if you ever find yourself
           | in the position of having to _implement_ ptrace (like I did
           | recently) you can get away with a surprisingly broken and
           | incomplete API and GDB at least will take it mostly in
           | stride, as long as you don't mess up a couple of the
           | fundamental operations (your wait4(2) has to mostly be
           | correct, a couple of the SIGTRAPs need the right codes). It's
           | still really difficult to do (my implementation was designed
           | by cross-checking against strace output and only supports
           | basic single-task debugging), but you can return strange
           | errors or "I don't support this" and GDB for the most part is
           | OK with that, because not only is ptrace a broken API, many
           | of its implementations are broken themselves, or certain
           | features are strangely missing in some places.
        
         | jjoonathan wrote:
         | Stack Overflow always closes good questions. It's almost a sign
         | of legitimacy at this point.
        
       ___________________________________________________________________
       (page generated 2020-10-23 23:00 UTC)