Ptrace and You

2018-06-19

🔗Ptrace & You: A comprehensive overview of ptrace

Our current project involves heavy use of ptrace. While man 2 ptrace offers an in-depth technical description, it is very difficult for beginners to follow (hence all the "introduction to ptrace" blogs around). While these blogs helped me get started, I feel there is a lack of resources offering an in-depth, comprehensive view of ptrace and its many features and pitfalls.

After a year of using ptrace, I hope to have something useful to say for newcomers and those trying to break through the description on the man page. Of course, one post is not enough to cover even the main functionality of ptrace, so this is a multi-part series on ptrace!

For another point of view, I found these blogs to be excellent:

🔗What is ptrace?

Ptrace is a Linux system call which allows a process (the tracer) to trace another process (the tracee). Through ptrace, the tracer can intercept events in the tracee such as system calls, signals, execves, and more.

The tracer can also read and write arbitrary locations in the tracee's program memory and CPU registers. This functionality is extremely powerful: programs such as gdb and strace are powered by ptrace!

🔗Ptrace at the kernel level

From the ptrace man page, "The ptrace API (ab)uses the standard UNIX parent/child signaling over waitpid(2)."

Ptrace, in some ways, replaces the tracee's true parent with our tracer. When one of many ptrace events happens, the kernel puts the tracee in a stopped state and reports the event to the tracer through waitpid. The tracer may perform arbitrary computation while the tracee is stopped. The tracer then issues one of several ptrace continue requests through a ptrace system call, and the kernel allows the stopped tracee to run once more.
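To make the stop / notify / continue cycle concrete, here is a minimal sketch of it. The function name trace_one_stop and the exit code 7 are my own choices for illustration, and having the tracee stop itself with raise(SIGSTOP) is just one convenient way to trigger a ptrace event.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a tracee, observe one ptrace stop, resume
 * it, and return the tracee's exit code (or -1 on error). */
int trace_one_stop(void) {
    pid_t child = fork();
    if (child < 0) return -1;

    if (child == 0) {
        /* Tracee: ask to be traced by our parent, then stop ourselves
         * so the tracer gets a chance to take control. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);
        _exit(7);   /* runs only after the tracer resumes us */
    }

    /* Tracer: the kernel reports the tracee's stop through waitpid. */
    int status;
    if (waitpid(child, &status, 0) < 0) return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP) return -1;

    /* The tracee is frozen here; a real tracer would inspect it now. */

    /* Resume the tracee. A data argument of 0 means the SIGSTOP is
     * not delivered to it. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) < 0) return -1;

    /* The next event reported by waitpid is the tracee exiting. */
    if (waitpid(child, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling trace_one_stop() from a main function should return the tracee's exit code, which shows that the tracee only got past raise(SIGSTOP) once the tracer let it continue.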

🔗Ptrace overhead

Ptrace is not known to be fast, and incurs significant overhead. Surprisingly, I have not been able to find any report or publication quantifying the overhead of ptrace. As a very rough test, however, we can run a simple for loop n times (for a sufficiently large n), doing a trivial system call like getpid() on each iteration:

#include <unistd.h>
#include <sys/syscall.h>

int main(){
  // Issue the raw system call directly, so every iteration enters the kernel.
  for(int i = 0; i < 10000000; i++){
    syscall(SYS_getpid);
  }

  return 0;
}

On my laptop, this program takes 0.82 seconds of user time (averaged over 3 runs using the time command). While ptraced, it takes about 13.18 seconds, roughly a 16x slowdown. Note that this is a worst-case scenario, where a program does little besides make system calls. getpid() is one of the faster system calls, so the time spent in the kernel is dwarfed by the context-switching overhead.

The overhead comes from the number of context switches required to intercept an event. Consider a system call event: usually this requires two context switches, userspace to kernelspace and back to userspace. With ptrace this is doubled: tracee to kernel, kernel to tracer, tracer to kernel, and kernel to tracee. To add to the cost, each system call is intercepted twice: once before it executes, in the pre-hook, and once after, in the post-hook. Thus we can see where the overhead comes from. (In a future post, we will see how combining seccomp + bpf + ptrace can reduce this overhead and the number of context switches.)
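The pre-hook and post-hook stops can be counted directly. The sketch below is my own illustration, not a benchmark: count_syscall_stops is a hypothetical helper, and it assumes the tracee makes no system calls other than the n getpid calls and the final exit.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: trace a child that makes n getpid system calls
 * and return how many syscall stops the tracer saw (-1 on error). */
long count_syscall_stops(int n) {
    pid_t child = fork();
    if (child < 0) return -1;

    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        kill(getpid(), SIGSTOP);        /* hand control to the tracer */
        for (int i = 0; i < n; i++)
            syscall(SYS_getpid);
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status)) return -1;

    /* TRACESYSGOOD reports syscall stops as SIGTRAP|0x80, so they
     * cannot be confused with ordinary signal stops. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
        return -1;

    long stops = 0;
    for (;;) {
        /* Let the tracee run until its next syscall entry or exit.
         * For simplicity, any pending signal is dropped (data is 0). */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0) return -1;
        if (waitpid(child, &status, 0) < 0) return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status)) break;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;
    }
    return stops;
}
```

Under those assumptions I would expect 2n + 1 stops: an entry and an exit stop for each getpid, plus one entry stop for the exit system call, which never returns. Each of those stops costs a round trip through the tracer.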

🔗Using ptrace

Let's have a look at the ptrace system call API:

long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);

As we will see, a lot of functionality and complexity is packed into this API.

request refers to the type of action we want from ptrace. Some of the more common requests include PTRACE_PEEKDATA, PTRACE_POKEDATA, PTRACE_GETREGS, PTRACE_SETREGS, PTRACE_SETOPTIONS, PTRACE_CONT, PTRACE_SYSCALL, and PTRACE_ATTACH. In the next post, we will review these requests.
pid refers to the process ID that this action is for. It must be specified since we can be tracing multiple threads or processes from a single tracer.
addr and data: the meaning of these arguments is context dependent, based on the request. addr usually refers to a memory location in the tracee's address space, while data is some value we wish ptrace to use.
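As a small illustration of how addr and data change meaning with the request, here is a sketch using PTRACE_PEEKDATA and PTRACE_POKEDATA. The names shared_word and peek_then_poke are hypothetical. The sketch relies on fork giving the tracee a copy of the global at the same address, so the tracer can pass &shared_word as addr.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A word-sized global; after fork it lives at the same address in the
 * tracee. volatile forces the tracee to reload it from memory. */
static volatile long shared_word = 41;

/* Hypothetical helper: read shared_word from a stopped tracee, write
 * back that value plus one, and return the tracee's exit code. */
int peek_then_poke(void) {
    pid_t child = fork();
    if (child < 0) return -1;

    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);
        _exit((int)shared_word);   /* reports whatever the tracer wrote */
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status)) return -1;

    /* PEEKDATA: addr is an address in the tracee and data is ignored.
     * The word comes back as the return value, so errno is the only
     * reliable way to detect failure. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, child, (void *)&shared_word, NULL);
    if (word == -1 && errno != 0) return -1;

    /* POKEDATA: same addr, but now data carries the word to store. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&shared_word,
               (void *)(word + 1)) < 0)
        return -1;

    if (ptrace(PTRACE_CONT, child, NULL, NULL) < 0) return -1;
    if (waitpid(child, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The tracer's own copy of shared_word is untouched; only the tracee's copy changes, and the tracee reports the new value through its exit code.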

In the next post we will use ptrace to trace the system calls of an arbitrary process.

🔗Future Ptrace Blog Posts

In the future I hope to also create blog posts for:

System call tracing with ptrace
Selectively Tracing System Calls: Using ptrace with seccomp + bpf
Speeding up ptrace reads/writes with process_vm_readv & process_vm_writev
Ptrace with multiple processes
Ptrace and signals
Ptrace with threads
