Ptrace and You
Last updated: 2018-06-19
🔗Ptrace & You: A comprehensive overview of ptrace
Our current project involves heavy use of ptrace. While man 2 ptrace offers an in-depth technical description, it is very difficult for beginners to follow (hence all the "introduction to ptrace" blogs around). While these blogs helped me get started, I feel there is a lack of resources offering an in-depth, comprehensive view of ptrace and its many features and pitfalls.
After a year of using ptrace, I hope to have something useful to say for newcomers and those trying to break through the description on the man page. Of course, one post is not enough to cover even the main functionality of ptrace, so this is a multi-part series on ptrace!
For another point of view, I found these blogs to be excellent:
🔗What is ptrace?
Ptrace is a Linux system call which allows one process (the tracer) to trace another process (the tracee). Through ptrace, the tracer can intercept events in the tracee such as system calls, signals, execve calls, and more.
The tracer can also read and write arbitrary locations in the tracee's program memory and CPU registers. This functionality is extremely powerful: programs such as gdb and strace are powered by ptrace!
🔗Ptrace at the kernel level
From the ptrace man page:
"The ptrace API (ab)uses the standard UNIX parent/child signaling over waitpid(2)."
Ptrace, in some ways, replaces the tracee's true parent with our tracer. When one of many ptrace events happens, the kernel puts the tracee in a stopped state and notifies the tracer, which learns of the stop through waitpid. The tracer may perform arbitrary computation while the tracee is stopped. The tracer then issues one of several resuming ptrace requests (such as PTRACE_CONT or PTRACE_SYSCALL), and the kernel allows the stopped tracee to run once more.
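The stop, notify, resume cycle described above can be sketched in a few lines of C. This is only a minimal illustration under some assumptions: the helper name count_ptrace_stops is made up for this post, the tracee stops itself with raise(SIGSTOP) purely to give the tracer an event to observe, and error handling is reduced to returning -1.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): fork a tracee, resume it
 * after every stop, and return how many stops the tracer observed.
 * Returns -1 on error. */
int count_ptrace_stops(void) {
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Tracee: ask to be traced by the parent, then stop itself. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    /* Tracer: every stop of the tracee is reported through waitpid(). */
    int stops = 0;
    for (;;) {
        int status;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;                      /* the tracee is gone */
        if (WIFSTOPPED(status)) {
            stops++;
            /* Arbitrary tracer work could happen here while the tracee is
             * stopped. Passing 0 as data resumes the tracee without
             * delivering the signal that caused the stop. */
            if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
                return -1;
        }
    }
    return stops;
}
```

In this sketch the tracee stops exactly once (for its own SIGSTOP), so a caller should see a return value of 1.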
🔗Ptrace overhead
Ptrace is not known to be fast, and it incurs significant overhead. Surprisingly, I have not been able to find any report or publication quantifying the overhead of ptrace.
As a very rough test, consider a simple for loop that runs n times (for a sufficiently large n), with each iteration making a trivial system call such as getpid():
    #define _GNU_SOURCE          /* for the syscall() declaration */
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void) {
        /* Invoke the raw system call so every iteration enters the kernel. */
        for (int i = 0; i < 10000000; i++) {
            syscall(SYS_getpid);
        }
        return 0;
    }
On my laptop, this program takes 0.82 seconds of user time (averaged over 3 runs using the time command). While ptraced, it takes about 13.18 seconds, roughly a 16x slowdown. Note that this is a worst-case scenario, in which a program does nothing but make system calls back to back. getpid() is one of the faster system calls, so the time spent in the kernel is dwarfed by the context switching overhead.
The overhead comes from the number of context switches required to intercept an event. Consider a system call event: normally this requires two context switches, userspace to kernelspace and back to userspace. With ptrace this is doubled: tracee to kernel, kernel to tracer, tracer to kernel, and kernel to tracee. To add to the cost, each system call is intercepted twice: once before it is executed (the pre-hook) and once after it (the post-hook). Thus we can see where the overhead comes from. (In a future post, we will see how combining seccomp + bpf + ptrace can mitigate this overhead by reducing the number of context switches.)
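To make the pre-hook/post-hook doubling concrete, here is a rough sketch of a tracer that resumes its tracee with PTRACE_SYSCALL and counts the syscall-stops it sees. The helper name count_syscall_stops is invented for this example. It relies on PTRACE_O_TRACESYSGOOD so that syscall-stops are reported as SIGTRAP | 0x80 and can be told apart from other stops. The absolute count includes a few system calls made by libc itself (inside raise() and on exit), so the meaningful number is the difference between two runs.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a tracee that issues n getpid system calls and
 * return the number of syscall-stops the tracer saw (-1 on error). */
long count_syscall_stops(int n) {
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);                 /* give the tracer time to set up */
        for (int i = 0; i < n; i++)
            syscall(SYS_getpid);
        _exit(0);
    }

    int status;
    /* The first stop is the tracee's own SIGSTOP. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    /* Ask the kernel to mark syscall-stops with bit 0x80 in the signal. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return -1;

    long stops = 0;
    int sig = 0;                        /* 0 suppresses the initial SIGSTOP */
    for (;;) {
        /* Resume until the next system call entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) == -1)
            return -1;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;                    /* a pre-hook or a post-hook stop */
            sig = 0;
        } else {
            sig = WSTOPSIG(status);     /* forward any other signal */
        }
    }
    return stops;
}
```

Comparing count_syscall_stops(10) with count_syscall_stops(0) should show a difference of 20: two stops, and therefore two round trips through the tracer, for every getpid call.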
🔗Using ptrace
Let's have a look at the ptrace system call API:
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
As we will see, a lot of functionality and complexity is packed into this API.
- request refers to the type of operation we want from ptrace. Some of the more common requests include PTRACE_PEEKDATA, PTRACE_POKEDATA, PTRACE_GETREGS, PTRACE_SETREGS, PTRACE_SETOPTIONS, PTRACE_CONT, PTRACE_SYSCALL, and PTRACE_ATTACH. In the next post, we will review these requests.
- pid refers to the process ID that this action is for. It must be specified since we can be tracing multiple threads or processes from a single tracer.
- addr and data: the meaning of these arguments depends on the request. addr usually refers to a location in the tracee's memory space, while data is some value we wish ptrace to use.
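As a small illustration of how addr and data change meaning with the request, the sketch below reads one word from a tracee with PTRACE_PEEKDATA and overwrites it with PTRACE_POKEDATA. The names shared_word and peek_then_poke are made up for this example. It assumes the tracee was created with fork(), so the global variable lives at the same address in both processes, and it uses the tracee's exit status to report the value the tracee ended up seeing (so new_value should fit in 0..255).

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A word-sized global. After fork() the child has its own copy of it at
 * the same virtual address. */
static volatile long shared_word = 1;

/* Hypothetical helper: store the tracee's copy of shared_word in *seen,
 * overwrite it with new_value, and return the tracee's exit status
 * (-1 on error). */
int peek_then_poke(long new_value, long *seen) {
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(255);
        shared_word = 42;               /* changes only the child's copy */
        raise(SIGSTOP);                 /* wait for the tracer */
        _exit((int)shared_word);        /* report what we see afterwards */
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* PEEKDATA: addr is the tracee address to read, data is ignored. The
     * word itself is the return value, so errors are detected via errno. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, child, (void *)&shared_word, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *seen = word;

    /* POKEDATA: addr is the tracee address to write, data is the new word. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&shared_word,
               (void *)new_value) == -1)
        return -1;

    /* CONT: addr is ignored, data is the signal to deliver (0 for none). */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return -1;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling peek_then_poke(7, &seen) should leave 42 in seen (the tracee's value, not the tracer's 1) and return 7, because the tracee exits with the word the tracer wrote into its memory.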
In the next section we will use ptrace to trace the system calls of an arbitrary process.
🔗Future Ptrace Blog Posts
In the future I hope to also create blog posts for:
- System call tracing with ptrace
- Selectively Tracing System Calls: Using ptrace with seccomp + bpf
- Speeding up ptrace reads/writes with process_vm_readv & process_vm_writev
- Ptrace with multiple processes
- Ptrace and signals
- Ptrace with threads