This is an update to an essay I wrote back when I was working in 18F, for an audience of fellow federal employees. Then, because of politics, my language was softer and I avoided some points entirely. Now, without aiming to be harsh, a full accounting.
A Brief History Of DNS
It is easy to look back at the naivete of the Internet’s early infrastructure but, even as far back as 1989, Steven Bellovin had already noticed that “relying on the IP source address for authentication is extremely dangerous”. The following year, he wrote an explanation as to how DNS cache poisoning would work and concluded that systems like rlogin (which we have since replaced with ssh) were “fundamentally flawed” for their “reliance on host addresses or host names for authentication”.
At the time, however, UC Berkeley’s Computer Systems Research Group (which maintained the r-commands suite) had already settled on verifying address-to-name DNS queries against name-to-address DNS queries, under the mistaken impression that an attacker could not forge responses to both. In a nutshell, a DNS server under the attacker’s control, when queried for a DNS name record, would also provide a DNS address record. Both these records would be cached by the victim’s local DNS resolver, and so when rlogin would ask for the address record for the attacker’s host, it would be given the attacker’s record from the cache instead of from the authoritative DNS server for that host. Despite this, Bellovin emphasized that “the problem here is not DNS”. Foretelling the replacement of rlogin with ssh, he wrote, “If strong (i.e., cryptographic) mechanisms were used, people playing games with [DNS] would be notable solely for their nuisance value”.
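The double-lookup check and the poisoning that defeats it can be sketched as a toy model. Everything here is an assumption made for illustration: the host names, the addresses, and the `CachingResolver` class are hypothetical, and the model ignores real DNS wire details. It shows only that a resolver which caches unsolicited extra records lets a single attacker-controlled reverse zone satisfy both lookups.

```python
# Toy model of the double-lookup check and the cache poisoning that defeats it.
# All names and addresses are made up for illustration.

ATTACKER_IP = "203.0.113.7"
TRUSTED_HOST = "trusted.example.edu"
TRUSTED_IP = "192.0.2.10"

# What the genuine authoritative servers would say for name-to-address queries.
AUTHORITATIVE_A = {TRUSTED_HOST: TRUSTED_IP}


class CachingResolver:
    """A resolver that caches every record it is handed."""

    def __init__(self, ptr_servers):
        self.ptr_servers = ptr_servers  # maps IP -> function answering name queries
        self.cache = {}                 # maps (record type, key) -> value

    def lookup_ptr(self, ip):
        if ("PTR", ip) not in self.cache:
            answer, additional = self.ptr_servers[ip](ip)
            self.cache[("PTR", ip)] = answer
            # The flaw: unsolicited extra records are cached without question.
            for name, addr in additional.items():
                self.cache[("A", name)] = addr
        return self.cache[("PTR", ip)]

    def lookup_a(self, name):
        if ("A", name) not in self.cache:
            self.cache[("A", name)] = AUTHORITATIVE_A[name]
        return self.cache[("A", name)]


def attacker_ptr_server(ip):
    """The attacker controls the reverse zone for their own address."""
    # Claim to be the trusted host, and volunteer a matching address record.
    return TRUSTED_HOST, {TRUSTED_HOST: ip}


def rlogin_style_check(resolver, peer_ip):
    """Accept the peer's name only if the forward lookup agrees with the reverse."""
    name = resolver.lookup_ptr(peer_ip)
    return name if resolver.lookup_a(name) == peer_ip else None


resolver = CachingResolver({ATTACKER_IP: attacker_ptr_server})
accepted_as = rlogin_style_check(resolver, ATTACKER_IP)
print(accepted_as)
```

If the attacker's server returned no extra record, the forward lookup would reach the genuine data and the check would fail, which is the behavior the maintainers were counting on.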
Almost two decades later, Dan Kaminsky publicly disclosed a method for an attacker to poison DNS caches without a DNS server under their control. By flooding a DNS resolver with responses to a query the attacker had asked it to make, each response containing a unique guess as to what the expected query ID might be, the attacker maximizes the odds of their response being accepted by the resolver before the authoritative nameserver’s response is received. This particular method was effectively fixed by also randomizing the source port used by the resolver, so that between the query ID and source port, the attacker could not practically win the race. Nonetheless, in the time since Bellovin first called attention to the untrustworthiness of DNS, the Internet had grown into the public and commercial sphere we know today. Shortly after the Kaminsky disclosure, the Office of Management and Budget (OMB) issued Memo M-08-23 requiring the US government to deploy DNSSEC. To this day, however, essentially no companies besides Cloudflare (which sells DNSSEC-enabled services) and PayPal (but none of its subsidiaries) have adopted it.
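The arithmetic behind that fix can be sketched roughly as follows. The 16-bit query ID is part of the DNS message format; the size of the source port pool, the number of forged responses per race, and the number of races are assumptions chosen only to show the change in scale.

```python
# Rough odds of winning the race, before and after source port randomization.

QUERY_IDS = 2 ** 16               # the DNS query ID is a 16-bit field
SOURCE_PORTS = 2 ** 16 - 1024     # assumed pool of usable source ports


def win_probability(guesses_per_race, search_space, races):
    """Chance that at least one race is won, with unique guesses in each race."""
    per_race = min(guesses_per_race / search_space, 1.0)
    return 1.0 - (1.0 - per_race) ** races


# Assumed attacker capability: 100 forged responses land per race, 10,000 races.
GUESSES = 100
RACES = 10_000

id_only = win_probability(GUESSES, QUERY_IDS, RACES)
id_and_port = win_probability(GUESSES, QUERY_IDS * SOURCE_PORTS, RACES)

print(f"query ID only:        {id_only:.6f}")
print(f"query ID and port:    {id_and_port:.6f}")
```

The attacker can trigger as many races as they like by asking about fresh names, so the per-race odds matter far more than the cost of any single attempt.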
The Failures Of DNSSEC
RFC 2065, “Domain Name System Security Extensions”, published in 1997, formalized the effort to take some of Bellovin’s advice to heart and employ cryptographic mechanisms. It has undergone significant revisions since its inception as more and more design and implementation issues have been found, and some issues have still not been addressed. Here, I’ll be referring to DNSSEC as it stands today per RFC 4033, “DNS Security Introduction and Requirements”, also known as “DNSSEC-bis”.
The problem began with taking the shortcut of employing the cryptographic mechanisms in DNS itself. Remember that DNS was not designed to be trustworthy, and services like rlogin that trusted it anyway were in error. So here, the path forks: do we re-engineer our services to not trust DNS, or do we try to re-engineer DNS to be trustworthy? A theoretically perfectly secure DNS utilized by a perfectly insecure rlogin is still vulnerable to all of the attacks that necessitated its replacement by ssh, and perfectly insecure HTTP is still vulnerable to all of the attacks that necessitated its tunneling via TLS. Re-engineering DNS simply won’t solve the problem.
Chief among the security properties of those two protocols is that they do not require trust in lower-level protocols like DNS or even in IP or UDP. Both protocols establish point-to-point security based on verifiable key material, and so if people play games with DNS while these protocols are employed, it is only for the nuisance value. DNSSEC, in a most confusing contrast, requires a chain of trust across multiple points but does not include the “last mile” chain-link to the local resolver. While each of the DNS servers involved in resolving a query must use verifiable key material, no material is readily available to the resolver to establish trust that the response to their query is coming from the server they queried. Instead, DNSSEC-aware servers include an Authenticated Data (AD) bit to indicate whether they were able to validate the signatures for all of the data in their response. It is then up to the resolver to trust the response or not, effectively moving the cache poisoning problem from the server to the resolver without actually fixing it.
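To make the last-mile gap concrete, here is a minimal sketch of what a resolver actually has to go on. The flag positions follow the standard DNS header layout; the sample messages are hand-built headers with no question or answer sections, and the query ID is arbitrary. The point is that the AD bit is a single unauthenticated flag, so a forged response can set it as easily as a genuine one.

```python
import struct

AD_MASK = 0x0020      # Authenticated Data bit in the 16-bit flags field
RCODE_MASK = 0x000F   # response code occupies the low four bits
SERVFAIL = 2


def read_header(message: bytes) -> dict:
    """Extract the fields a resolver would consult from a DNS header."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    query_id, flags = struct.unpack("!HH", message[:4])
    return {
        "id": query_id,
        "ad": bool(flags & AD_MASK),
        "rcode": flags & RCODE_MASK,
    }


def build_header(query_id: int, flags: int) -> bytes:
    """Build a bare 12-byte response header (illustrative, no record sections)."""
    return struct.pack("!HHHHHH", query_id, flags, 1, 1, 0, 0)


RESPONSE_FLAGS = 0x8180   # QR=1, RD=1, RA=1, AD=0, RCODE=0

honest = build_header(0x1234, RESPONSE_FLAGS)
forged = build_header(0x1234, RESPONSE_FLAGS | AD_MASK)   # forger sets AD too
servfail = build_header(0x1234, RESPONSE_FLAGS | SERVFAIL)

for label, message in (("honest", honest), ("forged", forged), ("servfail", servfail)):
    print(label, read_header(message))
```

Nothing in the parsed header distinguishes a flag set by the queried server from one set by whoever won the race to answer.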
This cryptographic design branches in two different problematic directions, rooted in the failure modes of DNS servers when they cannot validate a signature. For servers that respond with unsigned domains, their responses will still be carried back to the resolver; for servers that respond with invalidly signed domains, this will cause a DNS SERVFAIL error and the application will be unable to proceed. This disparity in behavior is an attractive nuisance. With the expectation that the resolver will ignore the AD bit or lack the means to surface its status to the relying application, attackers can continue to forge unsigned responses for targeted domains. With the expectation that servers will return SERVFAIL to the resolvers, attackers can forge invalidly signed responses to deny service to targeted domains. This kind of denial of service doesn’t even need to be intentional! Thanks to configuration changes that resulted in signature validation failures, DNSSEC already has a long history of causing high-profile outages.
DNSSEC is also a vector for denial of service. Distributed denial of service (DDoS) attacks attempt to overwhelm their target through the sheer force of combined bandwidth; the target has a peak bandwidth of X Gbps and by adding more and more nodes into their net, the DDoS perpetrators hope to exceed X. Here, DNSSEC is leveraged as an amplifier, since a small request from a client will trigger a much larger response from the server. By spoofing the victim’s address as the source of their small requests, attackers can direct DNSSEC-enabled servers to deliver their much larger responses to the intended victim, enabling a swarm of DDoS nodes to wield much more bandwidth than their numbers alone would allow. In fact, this exact situation befell cpsc.gov.
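The amplification arithmetic is simple enough to sketch. The byte counts and attacker bandwidth below are assumed round numbers, not measurements of any real zone or attack.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """How many bytes reach the victim for each byte the attacker sends."""
    return response_bytes / request_bytes


def reflected_gbps(attacker_gbps: float, factor: float) -> float:
    """Bandwidth arriving at the victim when requests are reflected off servers."""
    return attacker_gbps * factor


# Assumed sizes: an 80-byte query drawing a 4,000-byte signed response.
REQUEST_BYTES = 80
RESPONSE_BYTES = 4000

factor = amplification_factor(REQUEST_BYTES, RESPONSE_BYTES)
# Assumed attacker capacity of 2 Gbps across all of their nodes.
print(f"amplification: {factor:.0f}x")
print(f"traffic at victim: {reflected_gbps(2.0, factor):.0f} Gbps")
```

The signatures and keys that DNSSEC adds to responses are what push the response size, and therefore the factor, upward.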
DNSSEC signs responses for all queries, even ones about domains that do not exist. Since the non-existent domains that resolvers might ask about can’t be known in advance, though, the signing key would need to remain online and at risk of exfiltration in the event of server compromise. For organizations that have adopted DNSSEC despite the above issues, this is still an understandably unacceptable risk. So what DNSSEC does instead is return a response that in effect says “the domain you asked about, which would exist between these two other domains you didn’t ask about, doesn’t exist”. This removes the need to keep the signing key online, but at the cost of leaking possibly internal domain names to external clients. After that flaw was acknowledged, the response was modified to cryptographically hash the domain names, but advances in GPU and ASIC hardware have made it cost-effective to brute-force those hashes in offline dictionary attacks.
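An offline dictionary attack on the hashed names can be sketched as below, assuming the hashed records in question are NSEC3 as specified in RFC 5155 (iterated SHA-1 over the wire-format name with the salt appended each round). The zone, salt, iteration count, "internal" names, and word list are all made up for illustration.

```python
import base64
import hashlib

# Map the standard base32 alphabet onto the base32hex alphabet used by NSEC3.
_B32_TO_B32HEX = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    "0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def wire_format(name: str) -> bytes:
    """Encode a domain name as lowercase length-prefixed labels."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    encoded = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return encoded + b"\x00"


def nsec3_hash(name: str, salt: bytes, iterations: int) -> str:
    """Hash a name the way this sketch assumes NSEC3 does."""
    digest = hashlib.sha1(wire_format(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).decode("ascii").translate(_B32_TO_B32HEX).lower()


def dictionary_attack(observed_hashes, zone, salt, iterations, wordlist):
    """Hash each candidate name offline and report those that match."""
    recovered = {}
    for word in wordlist:
        candidate = f"{word}.{zone}"
        hashed = nsec3_hash(candidate, salt, iterations)
        if hashed in observed_hashes:
            recovered[hashed] = candidate
    return recovered


# Hypothetical zone parameters; the salt and iteration count are published
# alongside the hashes, so the attacker knows them.
ZONE = "example.com"
SALT = bytes.fromhex("aabbccdd")
ITERATIONS = 12

# Hashes an attacker might collect from denial-of-existence responses.
observed = {nsec3_hash(f"{name}.{ZONE}", SALT, ITERATIONS) for name in ("vpn", "payroll")}

recovered = dictionary_attack(observed, ZONE, SALT, ITERATIONS,
                              ["www", "mail", "vpn", "dev", "payroll"])
print(sorted(recovered.values()))
```

Because the salt and iteration count are public, they slow the attacker by only a constant factor per guess; the guessing itself never touches the target's servers.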
More could be said about the cryptographic errors in DNSSEC, but they are only tempting distractions from the fundamental issues. Yes, DNSSEC requires support of SHA-1, which is cryptographically broken, and NIST has directed agencies to stop using it in digital signature schemes. But requiring better algorithms will not address all, or even any, of the preceding issues. It is noted only as further indication of how disconnected DNSSEC is from the kind of solution we want and need.
The Alternatives To DNSSEC
The single best step towards a trustworthy Internet is wider deployment of protocols featuring cryptographic mechanisms that don’t trust DNS. We’ve already made tremendous strides in that regard, and there’s even more to come. The legacy services without a maintainer to adopt these practices, however, cannot be forgotten. Those services can still be tunneled through secure protocols, which provide the kind of security DNSSEC never could.
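As one example of such a protocol, a TLS client using Python's standard library ties authentication to the server's certificate instead of to whatever address DNS returned. This is a minimal sketch; the `connect` helper is hypothetical and is defined here but never called, so nothing touches the network.

```python
import socket
import ssl

# The default client context requires a certificate that chains to a trusted
# root and that matches the host name the client intended to reach.
context = ssl.create_default_context()

print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("host name checked:   ", context.check_hostname)


def connect(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a verified TLS connection (hypothetical helper, not invoked here)."""
    raw = socket.create_connection((host, port))
    # If DNS steered us to an impostor, the handshake fails here because the
    # impostor cannot present a valid certificate for `host`.
    return context.wrap_socket(raw, server_hostname=host)
```

A poisoned DNS answer can still send the connection to the wrong machine, but the result is a failed handshake instead of a compromised session.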
Eventually, OMB issued Memo M-15-13 requiring the US government to deploy HTTPS. In the memo, explicit mention is made of the security guarantees that HTTPS provides and DNSSEC does not, but there is no mention of any guarantees that DNSSEC provides and HTTPS does not. While OMB was unwilling to rescind the DNSSEC memo, the implicit conclusion is that DNSSEC is unnecessary. Were DNSSEC to disappear tomorrow, the world would not be at any greater risk, but pinning any hopes on DNSSEC puts us at risk because it steals the air from actually-effective practices.
Thanks to Thomas Ptacek.