The Making of BCP 235

“DNS over TCP is a thing, please don’t block it. kthxbye.” That is how I whimsically tweeted my summary of IETF RFC 9210, a new BCP co-authored with Duane Wessels. The history of the document is rooted in a chance encounter over seven years ago. For posterity, here is my version of how it came to be.

In December 2014, a former student informed me of an interaction with another instructor where my name had come up. I refer to that instructor as JPL in the Acknowledgments section of the document. As I remember it being told to me, this student had to construct a mock network design as part of a class assignment. In describing the setup in class, this former student of mine enumerated various types of traffic that were to be permitted ingress from the Internet to a make-believe organization. One such grouping of traffic was DNS. The student noted that DNS queries over both UDP and TCP to the authoritative servers were to be permitted. At this, JPL pointed out a problem, stating that DNS over TCP should be blocked. My former student, with other former students apparently there to corroborate, explained that JTK had been very adamant in a prior class that DNS over TCP should not be blocked. A short discussion ensued, ending with JPL agreeing to disagree.

When I heard the story I was proud of my former students for having learned that lesson. The student I heard this from even remembered some examples where DNS over TCP was clearly needed, such as for DoS mitigation (aka TCP switch-over) and in the presence of large answers. Well done! However, I was unhappy knowing that students who hadn’t taken my class, or who might rather accept JPL’s argument, were now being told something contrary and potentially problematic on an operational issue I considered important. Admittedly, I was also annoyed that someone was suggesting I might be wrong on this technicality, since I fancied myself the expert instructor in this area.

You see, in the classes I’ve taught I usually talk about naming (DNS) and routing (BGP) extensively for two reasons. One, I think they are two of the most important subsystems in the Internet system itself and therefore deserve significant attention in networking-related classes. Two, I have historically been the only regular computer science faculty member at the institution who has had any significant real-world network operator experience in these areas. Bringing that kind of experience into the classroom arguably provides the students a unique, and dare I say, more authoritative perspective than they are usually exposed to.

Let me state unequivocally, I have the utmost respect for JPL, whom I consider an excellent instructor and who has deservedly earned much praise from both students and peers.

I must admit, I may have taken JPL’s disagreement a little personally. That, and proving someone wrong is a heck of a work motivator. In a nutshell this is what drove what happened next. I confronted JPL, firing off the equivalent of “what gives?!” in an email. JPL was gracious and recalled what was said in class, telling me:

“Students stated that they were under the impression (from your class?? (not sure)) that proper DNS functions for a DNS client required TCP/53. I said no that unless you needed zone transfer or long responses, it was not needed and simple CNAME/A/AAAA record/MX resolutions can be done on UDP/53.”

“I also stated that in general, unless you trust the DNS admin to properly harden the service, you are better off filtering TCP/53 for inbound from the Internet. I did state that if someone said that DNS absolutely required TCP/53 for simple client resolutions that I disagreed with it.”

I have encountered that sort of position before and treated it as essentially misguided, if not potentially harmful, in light of current realities. I began to type my reply, but a funny thing happened on my way to show how smart I was. The difficulty in formulating a clear, concise, and convincing rebuttal to prove JPL wrong humbled me as I struggled to find definitive statements and support for my position in the historical record. Nevertheless, I bombarded JPL with a long screed. I sent a 1000-word response, complete with links to IETF RFCs, recent drafts, presentations by experts, a knowledge base article, and so on. When I showed this to that former student I was asked “Did you spend all day doing this?” Heh, I probably did. What I had done, but didn’t realize at the time, was create an early version of RFC 9210’s Section 2, “History of DNS over TCP”.

JPL was gracious again, promising to review and look at the issue more closely later. I probably deserved a virtual middle finger, while he likely ignored most of what I said and moved on with his life.

I didn’t move on. There was a problem to fix and now I saw it everywhere. A couple months later, in February 2015, I got myself on the NANOG 63 DNS track agenda to talk about this messaging problem with operational guidance on DNS over TCP. By this time I had invested a short talk’s worth of effort uncovering operator advice and guidance, mostly from the IETF, but also from other places, people, and organizations that have had some sway on the topic. I presented the evidence as clearly as I could, showing that what we thought was obvious, and assumed to exist in some DNS canon text, was actually DNS community dogma and hubris. In fact, it turns out there are some well-regarded references, Bellovin and Cheswick’s Firewalls book being the earliest example I could find, that perfectly align with JPL’s position. The audience seemed to agree that I was onto something. I asked for a show of hands and, as I recall, the room was in unanimous agreement that DNS over TCP queries should (or must) not be filtered, and that creating the operational equivalent of IETF RFC 5966 - DNS Transport over TCP - Implementation Requirements (now obsoleted by IETF RFC 7766) was a reasonable idea.

At this point in the story it is important to point out that a common misunderstanding at the time concerned what people thought IETF RFC 5966 (and now 7766) said, but didn’t. Even some of the best-known DNS experts in the world referred to 5966 as if it had decreed not only that DNS over TCP was required for software implementations, but also that DNS over TCP traffic must travel unencumbered across networks. Unfortunately, it not only didn’t say that, it purposefully ducked the question of network operator responsibility. Here is what is included in the Introduction of the specification that prevented me from using it as a cudgel to beat back calls for blocking DNS over TCP:

“This document therefore updates the core DNS protocol specifications such that support for TCP is henceforth a REQUIRED part of a full DNS protocol implementation.”

“Whilst this document makes no specific recommendations to operators of DNS servers, it should be noted that failure to support TCP (or the blocking of DNS over TCP at the network layer) may result in resolution failure and/or application-level timeouts.”

In today’s Internet, it is often now necessary to be very explicit, especially with those at the edges, about what they should be expected to do and why. That document gave them an out, even if that wasn’t the intention.

As a footnote to this history, Ray Bellis even pointed out that the statement “was required to get 5966 through IESG review.” Curiously, in that thread we see that Paul Vixie even appears to have been opposed to making DNS over TCP an operational requirement at that time. I’m not sure if he would still hold that view, but in the numerous opportunities to oppose the requirement as 9210 made its way through the process he never objected.

I’m not sure what took me so long, but it wasn’t until about a year later, in March 2016, that I submitted the first DNS over TCP operational requirements draft as an individual submission. A couple revisions followed over the next year based on initial feedback to that document. Then in 2017 the DNSOP working group invited me to present the draft in person at IETF 98, conveniently hosted in Chicago. Multiple attendees agreed that it should be adopted as a working group document, and thus “kristoff” in the file name was replaced with “ietf” and a new draft was submitted in June with Duane Wessels joining as a co-author.

Duane is a seasoned pro not only in the world of the DNS, but also with the IETF process. I’ve known, admired, learned from, and collaborated with Duane for many years. He expressed a willingness and interest in helping co-author the document. I welcomed his help. In fact, if it wasn’t for him, I probably would have given up on the process long ago. He was instrumental in sorting out issues, ensured we made forward progress, and contributed significantly to the document, particularly Section 4 - Network and System Considerations.

Along the way, I polled a couple of Internet security communities I knew would have better than average network clue. I wanted to gauge the positions of people in these communities on the notion of allowing or blocking DNS over TCP traffic. Results leaned toward the allow side and were about what I expected. You can read more about the polls, including the REN-ISAC survey and the survey participants, in earlier entries of this blog.

After the DNSOP group took the document on, a new round of changes was made, but then the draft progressed relatively slowly for a while. We would sometimes push out a new version with minimal changes just to keep the draft from expiring. Outside the major contributions that Duane was making, much of my effort now was on relatively minor maintenance. Most of the new content from me was in the appendix, which was first added as a result of an earlier suggestion to enumerate existing RFCs that made some pertinent statement related to DNS over TCP transport. I wasn’t a big fan of including all these references, but maybe someone finds it useful. I’m sure it is already out of date.

Maybe no one was really motivated to push the document through all that quickly, because quite a bit of time passed with relatively little movement until it eventually went to last call in August 2021. There would be three more minor revisions to address needed fixes and last-minute comments over the next few months, but it was finally just about done, sans a few more minor tweaks during the RFC Editor phase. On March 22, 2022, IETF RFC 9210 / BCP 235 officially became part of the RFC series. The document updates IETF RFC 1123 - Requirements for Internet Hosts – Application and Support, an Internet Standard, and IETF RFC 1536 - Common DNS Implementation Errors and Suggested Fixes, an Informational document.

There you have it: my version of how the making of BCP 235, or IETF RFC 9210 if you prefer, unfolded. It started with a deep search for some truth about DNS over TCP when I tried to prove someone wrong and came up a little short. There was a long slog of document updates in the IETF process over many years, then it eventually all came to fruition with the help of numerous individuals, including DNS WG participants, the chairs, ADs, the RFC Editor, and especially Duane. Now if someone disagrees with me about DNS over TCP transport, my retort goes from a pages-long rant to “BCP 235, kthxbye”.