core file / docker image / auplink
For a while now, I’ve been looking at a stray /core
file in some of our daily
Xenial Docker images. Time to find out where it comes from.
Tracing with a few well-placed RUN ls -l /core || true
steps tells us that
the dump appeared after a large RUN
statement, not during one.
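The bisection can be sketched as a throwaway Dockerfile (hypothetical; the real image builds far more than this, and the apt-get step is just a stand-in for the suspect RUN):

```dockerfile
# Hypothetical bisection Dockerfile: the ls steps cost nothing while no
# core exists, and pinpoint the step after which /core first appears.
FROM ubuntu:16.04
RUN ls -l /core || true                    # nothing yet
RUN apt-get update && apt-get install -y build-essential
RUN ls -l /core || true                    # /core shows up after this step
```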
Running gdb on the core revealed that it was a dump of auplink,
a
part of Docker. Opening the core on a Xenial machine with Docker
installed showed the following backtrace:
Core was generated by `auplink /var/lib/docker/aufs/mnt/21c482c11476d6fb9842fa91c0d9e2c49cfb51c3d04dd5'.
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0 ftw_startup (
dir=0x1d66010 "/var/lib/docker/aufs/mnt/21c482c11476d6fb9842fa91c0d9e2c49cfb51c3d04dd5d0dee424d4080d0a4f",
is_nftw=1, func=0x40149c, descriptors=1048566, flags=19) at ../sysdeps/wordsize-64/../../io/ftw.c:654
#1 0x0000000000401d52 in ?? ()
#2 0x00000000004013ec in ?? ()
#3 0x00007f4331728830 in __libc_start_main (main=0x401266, argc=3, argv=0x7ffc32267318, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc32267308) at ../csu/libc-start.c:291
#4 0x0000000000401199 in ?? ()
And, for completeness’ sake, the innermost frame from bt full:
(gdb) bt full
#0 ftw_startup (
dir=0x1d66010 "/var/lib/docker/aufs/mnt/21c482c11476d6fb9842fa91c0d9e2c49cfb51c3d04dd5d0dee424d4080d0a4f",
is_nftw=1, func=0x40149c, descriptors=1048566, flags=19) at ../sysdeps/wordsize-64/../../io/ftw.c:654
data = {dirstreams = 0x7ffc31a67030, actdir = 0, maxdir = 1048566, dirbuf = 0x0,
dirbufsize = 4412750543122677053, ftw = {base = 1027423549, level = 1027423549}, flags = 0,
cvt_arr = 0xff0000, func = 0xff, dev = 18446744073709486080, known_objects = 0xffff000000000000}
st = {st_dev = 0, st_ino = 0, st_nlink = 0, st_mode = 0, st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0,
st_size = 30828704, st_blksize = 4, st_blocks = 30826928, st_atim = {tv_sec = 122, tv_nsec = 30826928},
st_mtim = {tv_sec = 50, tv_nsec = 30826624}, st_ctim = {tv_sec = 139926575187608, tv_nsec = 1048576},
__glibc_reserved = {19, 1048566, 4199580}}
result = 0
cwdfd = -1
cwd = 0x0
cp = <optimized out>
...
From this, a quick Google search turned up various pages blaming the aufs
filesystem.
Without going into detail this time, the fix was to change the Docker storage driver on the daily Docker runners. Apparently we were still using the (poor) default aufs driver. Switching to overlay2 did the trick.
$ cat /etc/docker/daemon.json
{
"storage-driver": "overlay2"
}
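Note that the daemon has to be restarted before the new storage driver takes effect, and images built under the old aufs driver are not migrated. A small, hypothetical sanity check before restarting (a syntax error in daemon.json keeps dockerd from starting at all):

```shell
# Hypothetical pre-restart check: make sure the config is valid JSON.
printf '{\n  "storage-driver": "overlay2"\n}\n' > daemon.json
python3 -m json.tool daemon.json > /dev/null && echo "daemon.json OK"
```

After restarting the daemon, docker info should list overlay2 as the active storage driver.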