ubuntu zesty / apt / dns timeout / srv records
Ever since I updated from Ubuntu/Yakkety to Zesty, my apt-get(1) would sit and wait a while before doing actual work:
$ sudo apt-get update
0% [Working]
Madness. Let’s see what it’s doing…
$ sudo strace -f -s 512 apt-get update
...
[pid 5603] connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
...
[pid 5603] sendto(3, "\1\271\1\0\0\1\0\0\0\0\0\0\5_http\4_tcp\3ppa\tlaunchpad\3net\0\0!\0\1", 46, MSG_NOSIGNAL, NULL, 0) = 46
[pid 5603] poll([{fd=3, events=POLLIN}], 1, 5000 <unfinished ...>
...
[pid 5600] select(8, [5 6 7], [], NULL, {0, 500000}) = 0 (Timeout)
...
[pid 5600] select(8, [5 6 7], [], NULL, {0, 500000}) = 0 (Timeout)
...
That is, it does a UDP sendto(2) to 127.0.0.1:53, with data that contains _http\4_tcp\3ppa\tlaunchpad\3net. It’s a DNS lookup of course, for _http._tcp.ppa.launchpad.net, for which it waits up to 5000 ms before continuing.
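If the escaped string looks odd: DNS puts each label of a name on the wire prefixed by a one-byte length. A minimal sketch to make that visible, where qname_labels is a made-up helper name and not part of any tool:

```shell
# Hypothetical helper: print each label of a DNS name with the length byte
# that precedes it on the wire. Runs in a subshell so IFS and set -f do not leak.
qname_labels() (
    set -f      # no filename globbing while splitting
    IFS=.       # split the name on dots
    for label in $1; do
        printf '%d:%s\n' "${#label}" "$label"
    done
)

qname_labels _http._tcp.ppa.launchpad.net
# 5:_http
# 4:_tcp
# 3:ppa
# 9:launchpad
# 3:net
```

So the \t that strace prints in front of launchpad is just byte 9, the length of that label. Likewise, the \0! after the name is the 16-bit query type 33, which is SRV, and the final \0\1 is class IN.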
That looks like SRV records. New in apt, apparently. And probably a first lookup before falling back to regular A record lookups.
However, it shouldn’t be timing out if there is nothing. Who is not doing its job?
$ sudo netstat -tulpen | grep 127.0.0.1:53
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 0 23600 1347/dnsmasq
udp 0 0 127.0.0.1:53 0.0.0.0:* 0 23599 1347/dnsmasq
$ dpkg -l dnsmasq | grep ^ii
ii dnsmasq 2.76-5 all Small caching DNS proxy and DHCP/TFTP server
Is it dnsmasq or is the problem upstream?
$ time dig -t srv _http._tcp.google.com. @ns1.google.com. | grep status:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32887
real 0m0.023s
user 0m0.008s
sys 0m0.000s
$ time dig -t srv _http._tcp.google.com. @127.0.0.1 | grep status:
real 0m15.011s
user 0m0.004s
sys 0m0.004s
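Okay, dnsmasq is to blame. The 15 seconds are dig’s default of three tries with a five-second timeout each, so dnsmasq never answered at all, which is also why grep found no status line.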
Interestingly, dnsmasq does answer quickly when the reply has status NOERROR, whether or not it contains any SRV records:
$ dig -t srv _http._tcp.microsoft.com. @127.0.0.1 | grep -E 'status:|^[^;].*SRV'
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32215
$ dig -t srv _sip._udp.example-voip-provider.com @127.0.0.1 | grep -E 'status:|^[^;].*SRV'
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27212
_sip._udp.example-voip-provider.com. 2212 IN SRV 60 0 5060 sip01.example-voip-provider.com
Workarounds?
Short of finding out why dnsmasq misbehaves, we can quickly work around this by either adding a static SRV record to the dnsmasq configuration, as in the file below, or by removing dnsmasq altogether.
The configuration workaround needs an entry for every affected repository host, and you will have to keep that list updated yourself. So if removing dnsmasq is feasible, you should consider doing that.
$ cat /etc/dnsmasq.d/srv-records-broken
srv-host=_http._tcp.ppa.launchpad.net,91.189.95.83,80
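To take some of the pain out of keeping that list updated, the entries could be generated from the apt sources. This is a rough sketch, not a tested fix: srv_host_lines is a made-up helper name, it only looks at http:// sources, and it assumes port 80 with the repository hostname itself as the SRV target, rather than the IP address used in the file above.

```shell
# Hypothetical helper: read sources.list-style lines on stdin and print one
# dnsmasq srv-host line per distinct http:// host.
# Assumptions: port 80, and the host itself as SRV target.
srv_host_lines() {
    # Keep only active "deb"/"deb-src" lines and extract the host part of the URL.
    sed -n 's|^deb[^ ]* .*http://\([^/: ]*\).*|\1|p' | sort -u |
    while read -r host; do
        printf 'srv-host=_http._tcp.%s,%s,80\n' "$host" "$host"
    done
}
```

Feed it the contents of /etc/apt/sources.list and the files in /etc/apt/sources.list.d/, write the output to a file in /etc/dnsmasq.d/, and restart dnsmasq. Commented-out sources are skipped because only lines starting with deb match.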