gdb / backtrace / running process
Sometimes you want a backtrace or a core dump from a process that you do
not want to stall. Think of a multithreaded application in which some
threads are still doing important work (like handling customer calls).
Running gdb interactively halts the process for as long as you are
gathering info, and raising a SIGABRT to get a core dump has the
negative side effect of killing the process. Neither is acceptable in a
production environment.
In comes the handy gdb(1) option -ex, which executes a single gdb command
given on the command line. See this hanging.c example that we will
examine while leaving it running.
int c() {
    while(1);
    return 64;
}
int b() {
    return c();
}
int a() {
    return b();
}
int main() {
    return a();
}
Fire it up, gather info, and keep running:
$ gcc hanging.c -o hanging -g
$ ./hanging &
[1] 787
$ time gdb -p `pidof hanging` -ex bt -ex 'thread apply all bt full' -ex detach -ex quit
...
c () at hanging.c:2
2 while(1);
#0 c () at hanging.c:2
#1 0x00000000004004d8 in b () at hanging.c:6
#2 0x00000000004004e8 in a () at hanging.c:9
#3 0x00000000004004f8 in main () at hanging.c:12
Thread 1 (process 787):
#0 c () at hanging.c:2
No locals.
#1 0x00000000004004d8 in b () at hanging.c:6
No locals.
#2 0x00000000004004e8 in a () at hanging.c:9
No locals.
#3 0x00000000004004f8 in main () at hanging.c:12
No locals.
Detaching from program: /home/walter/hanging, process 787
real 0m0.128s
user 0m0.120s
sys 0m0.020s
$ fg
./hanging
Obviously the process is still stopped while gdb gathers the required information (at most about 0.13 seconds in the run above, which includes gdb's own startup), but it resumes as soon as gdb detaches, hopefully without your users noticing it.
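For repeated use, the one-liner above can be wrapped in a small shell function. What follows is a minimal sketch, not a hardened tool. The name gdb_snapshot and the GDB override variable are hypothetical, introduced here only for illustration. It assumes a gdb that supports -batch, which the gdb documentation describes as disabling pagination and confirmation prompts and exiting once the commands have run, so gdb should not sit waiting for input while the process is stopped.

```shell
#!/bin/sh
# Sketch: dump a full backtrace of every thread of a running process,
# then detach so the process keeps running.
# gdb_snapshot and the GDB variable are hypothetical names.

gdb_snapshot() {
    # Accept only a plain decimal PID.
    case "$1" in
        ''|*[!0-9]*) echo "usage: gdb_snapshot PID" >&2; return 2 ;;
    esac
    # GDB can be overridden, e.g. GDB=echo for a dry run.
    "${GDB:-gdb}" -batch -p "$1" \
        -ex 'thread apply all bt full' \
        -ex detach
}

# Example (not executed here):
#   gdb_snapshot "$(pidof hanging)" > hanging.bt 2>&1
```

Setting GDB=echo prints the arguments instead of attaching, which is a cheap way to check the command line before pointing it at a production process.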
You can write a core dump too, if you like, but this can take a bit more time, depending on how much memory your process is using.
# cat /proc/`pidof asterisk`/status | grep VmRSS
VmRSS: 13236 kB
# time gdb -p `pidof asterisk` -ex generate-core-file -ex detach -ex quit
...
0x00007f1f01e5ebd6 in poll () from /lib/libc.so.6
Saved corefile core.313
Detaching from program: /usr/local/sbin/asterisk, process 313
real 0m1.972s
user 0m0.192s
sys 0m0.332s
# ls -lh core.313
-rw-r--r-- 1 root root 15M 2011-09-06 08:43 core.313
A couple of notes about this last example:
- RSS is only indicative of the dump size. The dump may very well turn out twice as large.
- Most of the time spent here was loading symbols. A 30MB dump won’t take twice as long. An 800MB dump will take some time though. Beware.
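Given the notes above, it may be worth checking the free disk space before dumping a large process. The sketch below reads VmRSS from a /proc status file and compares twice that value against the space available in the target directory. The factor of two is only the rough rule of thumb from the first note, not a guarantee, and the function names rss_kb, free_kb and core_probably_fits are hypothetical.

```shell
#!/bin/sh
# Sketch: rough pre-flight check before generate-core-file.
# All function names here are hypothetical helpers.

# Print the VmRSS value in kB from a /proc/<pid>/status-style file.
rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }' "$1"
}

# Print the free space in kB on the filesystem holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if twice the RSS of PID $1 fits in directory $2.
# The factor 2 is a heuristic, so treat the answer as a hint only.
core_probably_fits() {
    rss=$(rss_kb "/proc/$1/status") || return 2
    [ -n "$rss" ] || return 2
    [ $((rss * 2)) -le "$(free_kb "$2")" ]
}

# Example (not executed here):
#   core_probably_fits "$(pidof asterisk)" . &&
#       gdb -p "$(pidof asterisk)" -ex generate-core-file -ex detach -ex quit
```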