On this date twenty years ago, the Internet came as close to a total meltdown as we’ve seen since its commercialization. A tiny UDP worm payload of just 376 bytes spread to all remotely accessible and vulnerable Microsoft SQL Server instances listening on UDP port 1434 within a matter of minutes. This tiny payload ultimately infected roughly 75,000 hosts worldwide, and the disruption it caused made international news. It was enough to bring many networks to a screeching halt. This blog post is a personal reflection on, and reconsideration of, that fateful event, which continues to resonate as one of my most vivid experiences in Internet availability.
Background
We must place the 2003 Internet in its historical context. Internet security was not quite the driving force and big business it is now, but it was clearly on its way. The network geeks, a tribe I still feel a camaraderie with, were regularly butting heads with the growing group of security specialists. The network packet pushers were generally opposed to most network-based traffic mangling and middlebox devices. The security crowd, especially in campus and enterprise environments, generally favored a sort of permission-based architecture, in which the benefits of imperfect, centralized protection at aggregation points outweighed the potential for collateral damage.
The network people would argue that the security threats of the day were best addressed as close to the problem as possible, which usually meant a focus on better host and application security practices. The security people would argue this was just not practical. The best solutions were probably some consideration of both, but like any two groups with vastly different perspectives, compromise and nuance were rare.
An aside: Here is a fun experiment you can try. Ask anyone who was involved in Internet operations at a university in those days about packet shaping devices like the Packeteer. It should be immediately clear whether they were on team network or team security.
The most popular operating systems on the Internet at this time were Windows 2000 and Windows XP. Millions of these systems were connected to the Internet, but many had no network-based protection in front of them and even more were poorly managed. Practically all these systems had relatively weak host-based protection from remote exploits by default. XP’s basic firewall (packet filtering) capability was not enabled by default until Service Pack 2 (SP2) was released, which wouldn’t happen until 2004, well into the era when IRC-based botnets were rampant. Once SP2 became the norm, the number of remotely exploitable Windows systems began to decline dramatically.
Another important observation about this time is that end hosts were getting much faster network connections. The standard LAN connection was 100 Mb/s switched Ethernet, which was usually only one order of magnitude slower than many organizations’ network backbones. It would take only a small number of hosts to saturate backbone links. Home users around the world were also moving from dial-up to higher speed cable and DSL networks. A relatively small number of hosts on a few access networks at this time were a potent weapon for a DDoS, or perhaps a high-speed spreading worm. To wit, it took only a few thousand hosts to render some root DNS server instances inaccessible in an attack in 2007.
The Exploit and The Worm
In the summer of 2002, multiple problems with Microsoft SQL Server were publicly disclosed to the BUGTRAQ mailing list with a coordinated patch release from Microsoft. One of the vulnerabilities was a buffer overflow that could be triggered by sending just a few bytes to the service in a small UDP message.
Months later, on January 25, 2003, between 0500 and 0600 UTC, around midnight for most of the continental U.S., a UDP payload began replicating to practically all Internet-connected Microsoft SQL Server systems in what came to be known as the SQL Slammer or Sapphire worm. The worm exploited one of the buffer overflow vulnerabilities for which a patch had been available for about half a year. In an article for IEEE Security & Privacy, researchers estimated the worm reached 55 million scans per second and ultimately compromised nearly all the vulnerable systems connected to the Internet, approximately 75,000 hosts, within just a few minutes of the initial infection.
Infected hosts would emit scanning traffic at near the maximum rate of their network interface. At the university where I was a netop, it took only a handful of infected hosts to completely saturate our external Internet connection with scanning traffic and render our external connectivity essentially unusable. See page 10 from my Network Defenses to Denial of Service Attacks presentation to see what our network traffic graphs looked like at the time.
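To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes each scan is a single IPv4/UDP packet carrying the 376-byte payload (404 bytes with a 20-byte IP header and an 8-byte UDP header), ignores Ethernet framing overhead, and uses uplink sizes that are purely illustrative rather than any particular network's actual capacity.

```python
# Back-of-the-envelope arithmetic; ignores Ethernet framing overhead.
UDP_PAYLOAD_BYTES = 376          # worm payload size cited in this post
IP_UDP_HEADER_BYTES = 20 + 8     # IPv4 header without options + UDP header
PACKET_BITS = (UDP_PAYLOAD_BYTES + IP_UDP_HEADER_BYTES) * 8


def packets_per_second(link_bits_per_second: float) -> float:
    """Upper bound on scan packets one host can emit at line rate."""
    return link_bits_per_second / PACKET_BITS


def hosts_to_saturate(uplink_bps: float, host_bps: float) -> int:
    """Smallest number of line-rate hosts whose traffic fills the uplink."""
    return -(-int(uplink_bps) // int(host_bps))  # ceiling division


if __name__ == "__main__":
    fast_ethernet = 100e6
    print(f"{packets_per_second(fast_ethernet):,.0f} scans/s per host")
    # Hypothetical uplink sizes, chosen only for illustration.
    for uplink in (45e6, 155e6, 1e9):
        print(f"{uplink / 1e6:>6.0f} Mb/s uplink: "
              f"{hosts_to_saturate(uplink, fast_ethernet)} host(s)")
```

Under these assumptions a single host on 100 Mb/s Ethernet can emit roughly 31,000 scan packets per second, and ten such hosts are enough to fill a 1 Gb/s link.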
As it happens, I was not on call and had gone to bed just before the outbreak began, so I literally missed all the overnight action. At the time I felt like I had missed out and wished I could have been a help to my colleagues who took the brunt of the initial response. I was the most senior person at the time, and I tended to be the one with the most connections to the network and security communities. Despite all this, my co-workers never woke me up. I guess I wasn’t as important as I thought, and now with the passage of time I don’t really feel all that bad about having gotten a good night’s sleep.
The Response
Even though the worm began spreading quickly and outside of working hours, the news and reaction spread almost as quickly. Early reports from network operators identified routing anomalies, and in the private network security communities I was privy to, traffic characterization was almost instantaneous, with recommendations to filter UDP port 1434 traffic immediately.
There were widespread reports of network instability and availability problems for many hours until network packet filters became more widely deployed and infected hosts were taken offline. A few networks had filtered UDP destination port 1434 prior to the worm’s arrival, but this was not very common. Many large network backbones with an abundance of capacity weathered the storm, but because many of them continued forwarding packets, it might be argued they didn’t do the smaller networks that connected to them any favors. Many network operators rejected, and still reject, calls to perform much, if any, general traffic filtering by default. To paraphrase a common refrain from provider networks, “Our job is to move packets received towards the destination as efficiently as possible. We are not Internet firewall providers.” Nonetheless, in extreme situations such as this, many big networks did apply temporary UDP port 1434 filters to provide relief to the many who needed it.
The Internet didn’t crash. However, a significant number of individual networks melted, at least briefly. I can’t think of any other time where a larger share of Internet-connected networks experienced a shared capacity collapse event in the 21st century.
Fallout
Perhaps due to the severe pain experienced from Slammer, many of those original UDP port 1434 filters persist to this day. But do we still see Slammer infections trying to spread in the wild? No. At least, I could find no evidence it is running anywhere. Over the last few weeks I have been running a packet capture on approximately 300 widely distributed systems on the Internet, looking for evidence of that 376-byte UDP payload to destination port 1434. Not a single Slammer payload ever arrived at any host. I suppose it is possible there are isolated systems in private networks or behind port 1434 filters somewhere that remain infected and are trying to spread, but I think it is safe to conclude that, for all practical purposes, the worm is dead or I would have seen it.
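For readers who want to try a similar check, the matching logic can be sketched in a few lines of Python. This is not the capture setup described above, only a minimal illustration that assumes you already have raw IPv4 packets as bytes (for example, read from a capture file). The names `looks_like_slammer` and `build_udp_packet` are made up for this example, and the test addresses come from the documentation ranges. It matches on destination port and payload size only, so a hit is a candidate worth inspecting, not proof of an infection.

```python
import struct

SLAMMER_DST_PORT = 1434      # UDP port targeted by the worm
SLAMMER_PAYLOAD_LEN = 376    # payload size cited in this post
IPPROTO_UDP = 17


def looks_like_slammer(ip_packet: bytes) -> bool:
    """True for an IPv4/UDP packet to port 1434 with a 376-byte payload."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return False                      # too short or not IPv4
    ihl = (ip_packet[0] & 0x0F) * 4       # IP header length in bytes
    if ihl < 20 or len(ip_packet) < ihl + 8:
        return False                      # malformed or truncated
    if ip_packet[9] != IPPROTO_UDP:
        return False
    _src_port, dst_port, udp_len, _checksum = struct.unpack(
        "!HHHH", ip_packet[ihl:ihl + 8])
    # The UDP length field counts the 8-byte UDP header plus the payload.
    return dst_port == SLAMMER_DST_PORT and udp_len - 8 == SLAMMER_PAYLOAD_LEN


def build_udp_packet(dst_port: int, payload: bytes) -> bytes:
    """Hypothetical helper: build a minimal IPv4/UDP packet for testing."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + 8 + len(payload),   # version/IHL, TOS, total length
        0, 0, 64, IPPROTO_UDP, 0,         # id, flags/frag, TTL, proto, checksum
        bytes([192, 0, 2, 1]),            # documentation-range source
        bytes([198, 51, 100, 7]))         # documentation-range destination
    udp_header = struct.pack("!HHHH", 40000, dst_port, 8 + len(payload), 0)
    return ip_header + udp_header + payload


if __name__ == "__main__":
    print(looks_like_slammer(build_udp_packet(1434, b"\x00" * 376)))
    print(looks_like_slammer(build_udp_packet(1433, b"\x00" * 376)))
```

If you are capturing with a libpcap-based tool, a filter expression along the lines of `udp dst port 1434 and udp[4:2] = 384` should select the same packets before they ever reach a script like this.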
Even after 20 years and no apparent evidence of even a single infection on today’s Internet, you can still observe the port 1434 filters in action. For example, from outside the Northwestern University campus network as of this writing, the following works:
dig -b0.0.0.0#1433 @accuvax.northwestern.edu. ns northwestern.edu.
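But this does not, presumably because the only difference is the source port (dig’s -b option sets the source address and port), so the exchange now matches a port 1434 filter somewhere along the path: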
dig -b0.0.0.0#1434 @accuvax.northwestern.edu. ns northwestern.edu.
I will note that a new and different DDoS amplification/reflection vulnerability in Microsoft SQL Server was discovered in 2015, but it is a significantly less impactful and comparatively lesser-known problem. I’d be willing to bet almost all the UDP 1434 filters that exist stem from the Slammer threat many years earlier.
One reaction to the Slammer infections I undertook in my role at DePaul University was to design and encourage deployment of what I called “shielding filters”, which would come to be known as “network seat belts”, a version of which continues on that campus network to this day. The idea was to apply a set of filters and rate limits on the first-hop router interface where client end systems reside. So, for example, all 100 Mb/s router interfaces would allow only 10 Mb/s of aggregate UDP traffic into the network that was destined to external networks. The reasoning was that anything above that rate would be undesirable, because practically no client application would ever do such a thing, not over UDP anyway. This worked quite well for a long time, as long as it was done properly and maintained, which was not always the case. We did not block UDP port 1434 outright, to avoid collateral damage, but if a Slammer-infected host bypassed any local host security protections and appeared on our network, its scanning traffic would be limited and easily identified when the edge network rate limit was reached.
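The effect of such a limit can be illustrated with a generic token-bucket policer. To be clear, this is a sketch of a common rate-limiting mechanism and not the actual router configuration used on that campus. The 10 Mb/s rate comes from the text; the burst size, the 404-byte packet size, and the class and function names are assumptions made for the example.

```python
class TokenBucket:
    """Generic token-bucket policer; tokens are measured in bits."""

    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate_bps = rate_bps      # sustained rate allowed through
        self.capacity = burst_bits    # maximum burst (assumed value)
        self.tokens = burst_bits      # start with a full bucket
        self.last = 0.0               # timestamp of the previous packet

    def allow(self, now: float, packet_bits: int) -> bool:
        """Refill for the elapsed time, then forward the packet if it fits."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False                  # over the limit: drop


def simulate(offered_bps, policer, duration_s=1.0, packet_bits=404 * 8):
    """Offer evenly spaced packets at offered_bps; return (sent, forwarded)."""
    interval = packet_bits / offered_bps
    sent = int(duration_s * offered_bps / packet_bits)
    forwarded = sum(policer.allow(i * interval, packet_bits)
                    for i in range(sent))
    return sent, forwarded


if __name__ == "__main__":
    # A host scanning at 100 Mb/s line rate versus a modest 1 Mb/s UDP client.
    for label, rate in (("scanner", 100e6), ("ordinary client", 1e6)):
        sent, forwarded = simulate(rate, TokenBucket(10e6, 100_000))
        print(f"{label}: {forwarded} of {sent} packets forwarded")
```

In this toy model the ordinary client never notices the limit, while roughly nine out of ten of the scanner's packets are dropped at the first hop. The sustained drops are also exactly the signal that makes an infected host easy to spot.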
Another reaction I undertook after the Slammer worm was to enhance IP multicast protections, an area that, for all practical purposes, is only of historical interest these days. Without getting into all the nitty-gritty details, Slammer would scan the entire IP address space, including the multicast range 224/4 (224.0.0.0/4). Due to the way IP multicast networking was commonly configured, scanning traffic to IP multicast destinations would wreak havoc on network state in routers unless mitigated with various filters and rate limiters.
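To get a feel for the scale of that multicast problem, here is a small Python calculation. It assumes scan targets are spread uniformly over the 32-bit address space, which is an idealization and not a claim about the worm's actual random number generator, and the per-host scan rate in the example is a hypothetical round number.

```python
import ipaddress

# The IPv4 multicast range, written 224/4 in shorthand.
MULTICAST = ipaddress.ip_network("224.0.0.0/4")


def multicast_fraction() -> float:
    """Share of the 32-bit IPv4 address space that is multicast."""
    return MULTICAST.num_addresses / 2**32


def multicast_scans_per_second(total_scans_per_second: float) -> float:
    """Scans per second landing in 224/4, assuming uniform random targets."""
    return total_scans_per_second * multicast_fraction()


if __name__ == "__main__":
    print(multicast_fraction())                  # one sixteenth of all addresses
    # Hypothetical host emitting 30,000 scans per second.
    print(multicast_scans_per_second(30_000))
    print(ipaddress.ip_address("239.1.2.3") in MULTICAST)
```

Under that uniform assumption, one scan in sixteen is addressed to a multicast group, so a single fast host could present a router with a couple of thousand packets per second for groups it had never seen before.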
Final Thoughts
Could something like this happen again? I suppose anything is possible, but it seems unlikely we’d experience anything quite like we did with Slammer. On the one hand, worms that spread just for the sake of spreading are much less common these days. Typically there is some ulterior motive (e.g., financial). On the other hand, I could envision how an entirely new class of vulnerable systems such as IoT devices might be taken advantage of to create a very large, fast-spreading worm. Nonetheless, I think even more now than then, the Internet as a whole seems highly unlikely to crash if such an outbreak were to occur. If it does, don’t feel compelled to wake me up this time either.