Ops: Deep Darknet Inspection - Part 1 of 3

Posted: 2010-09-01

Editor’s note: Imported from my old personal blog @ TC with minor edits to improve readability where necessary.

Darknets can provide valuable network and security insight with little to no risk for the darknet operator. Examining packets sent to unused IP addresses may highlight new threat vectors, misconfigurations, information leaks and various types of Internet backscatter such as active denial-of-service activity. Team Cymru uses darknets for statistical data gathering and reporting, but from time to time I like to take a deeper look into the raw content that an average, publicly accessible Internet host is exposed to. Doing so can help you get closer to the ground truth of what is ultimately responsible for those packets and, in a geeky sort of way, that can be a fun and educational thing to blog about. In the first of a three-part series running over the course of the next few days, here is the relevant summary detail of two packets seen at a darknet recently, in slightly modified Wireshark display notation:

delta saddr IP_id sport dport info

0.000000 114.124.162.72 32569 21269 445 TCP SYN Win=65535 MSS=1410 WS=1 TSV=0 TSER=0
2.968866 114.124.162.72 32638 21269 445 TCP SYN Win=65535 MSS=1410 WS=1 TSV=0 TSER=0
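
If you want to poke at these two lines programmatically, here is a minimal Python sketch that splits each summary line into the columns named in the header. The function name parse_summary and the dictionary layout are my own illustration, not part of any Wireshark tooling, and the parser assumes exactly the whitespace-separated layout shown above.

```python
import re

# The two summary lines exactly as shown above.
LINES = [
    "0.000000 114.124.162.72 32569 21269 445 TCP SYN Win=65535 MSS=1410 WS=1 TSV=0 TSER=0",
    "2.968866 114.124.162.72 32638 21269 445 TCP SYN Win=65535 MSS=1410 WS=1 TSV=0 TSER=0",
]


def parse_summary(line):
    """Split one summary line into its columns (hypothetical helper)."""
    # The first five columns are fixed; everything after is the info text.
    delta, saddr, ip_id, sport, dport, info = line.split(None, 5)
    # Pull NAME=value pairs such as MSS=1410 out of the info text.
    options = {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", info)}
    return {
        "delta": float(delta),
        "saddr": saddr,
        "ip_id": int(ip_id),
        "sport": int(sport),
        "dport": int(dport),
        "info": info,
        "options": options,
    }


packets = [parse_summary(line) for line in LINES]
for packet in packets:
    print(packet["delta"], packet["ip_id"], packet["options"])
```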

The source IP address purportedly originates from Indonesia and appears to be allocated specifically for mobile or cellular data usage. The destination port is 445, widely used by some Microsoft Windows systems for SMB over TCP and a common target for a number of worms and malware. In fact, according to the SANS Internet Storm Center Top 10 Reports as of this writing, port 445 leads all others by a comfortable margin. Even in Team Cymru’s own Top 10 TCP ports graph, which covers all sampled Internet traffic we see, port 445 is near the top. However, in this exercise we are doing deep darknet inspection. What more can we say, with some level of confidence, about these two packets? Perhaps more than what appears on the surface.

The IP identification field’s primary and original purpose is to aid IP datagram reassembly in the face of fragmentation. On many systems, however, the originating host simply treats this value as a per-packet counter, incrementing the IP ID field by one for each datagram transmitted. It’s possible that a middle box would rewrite this field, but the distance between the first and second value is relatively small, a difference of only 69. This suggests that the identification field is being used as a counter and that there have been a handful of packets sent to other hosts between the two we received. It’s possible that many of those intervening packets were SYN scans to other hosts, at a little over 20 per second. With today’s modern connectivity options that is not a high rate of traffic, but it should be easy to identify as anomalous at the source network. Time to look higher up the stack.
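
The arithmetic behind that estimate is simple enough to sketch in a few lines of Python. This assumes the sender really does use the IP ID as a single global counter that wraps at 16 bits, which is exactly the hypothesis being tested rather than a known fact; the function names are my own.

```python
IP_ID_MODULUS = 65536  # the IP identification field is 16 bits wide


def ip_id_delta(first_id, second_id):
    """Counter distance between two IP IDs, allowing for 16-bit wraparound."""
    return (second_id - first_id) % IP_ID_MODULUS


def packets_per_second(first_id, second_id, seconds):
    """Estimated send rate, assuming one counter increment per datagram."""
    return ip_id_delta(first_id, second_id) / seconds


delta = ip_id_delta(32569, 32638)
rate = packets_per_second(32569, 32638, 2.968866)
print(delta)           # counter distance between the two SYNs
print(round(rate, 1))  # estimated packets per second over that interval
```

Note that the distance counts the second packet itself, so the number of datagrams sent elsewhere in between is one less than the delta.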

We can assume the second packet is a retransmission since it has the same TCP characteristics as the first and it arrived approximately 3 seconds later, which is the default retransmission time for many operating systems. However, no more packets were received from that host for this session, which is odd because most systems default to sending at least two TCP connection retransmission attempts. The analysis titled “Conficker/Conflicker/Downadup as seen from the UCSD Network Telescope” identified this seemingly unique worm behavior last year. So there was probably some version of Microsoft Windows running at that address infected with Conficker. Let’s dig deeper.
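
The retransmission test described above can be written down as a small Python check. This is only a sketch: the Syn class, the field selection, and the quarter-second tolerance are assumptions of mine for illustration, not a standard heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syn:
    """The handful of SYN fields used for the comparison (illustrative)."""
    time: float
    saddr: str
    sport: int
    dport: int
    window: int
    mss: int


def looks_like_retransmission(first, second, expected_gap=3.0, tolerance=0.25):
    """True if second repeats first about expected_gap seconds later."""
    same_flow = (first.saddr, first.sport, first.dport) == (
        second.saddr, second.sport, second.dport)
    same_tcp = (first.window, first.mss) == (second.window, second.mss)
    gap = second.time - first.time
    return same_flow and same_tcp and abs(gap - expected_gap) <= tolerance


first = Syn(0.000000, "114.124.162.72", 21269, 445, 65535, 1410)
second = Syn(2.968866, "114.124.162.72", 21269, 445, 65535, 1410)
print(looks_like_retransmission(first, second))
```

A fuller version would also count how many retransmissions follow, since the single retry is the part that points toward Conficker.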

The maximum segment size (MSS) TCP option from this host was set to a value of 1410 bytes. Normally this value might be set to 1460, if at all, which allows for 1460 bytes of payload in a TCP/IP packet without options on top of a typical Ethernet frame. While this value can vary, 1410 is sometimes used by PPPoE-connected hosts. Let’s be thorough data packet archaeologists and see what else we can discover.
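
To make the MSS reasoning concrete, here is a short Python sketch that converts an advertised MSS back into the path MTU it implies. It assumes IPv4 and TCP headers of 20 bytes each with no options, and a 1500-byte Ethernet MTU as the baseline; the function names are hypothetical.

```python
IPV4_HEADER = 20     # bytes, no IP options
TCP_HEADER = 20      # bytes, no TCP options
ETHERNET_MTU = 1500  # typical Ethernet payload size


def implied_mtu(mss):
    """MTU that would yield this MSS with option-free IPv4 and TCP headers."""
    return mss + IPV4_HEADER + TCP_HEADER


def bytes_below_ethernet(mss):
    """How far the implied MTU falls short of a plain Ethernet MTU."""
    return ETHERNET_MTU - implied_mtu(mss)


for mss in (1460, 1410):
    print(mss, implied_mtu(mss), bytes_below_ethernet(mss))
```

The shortfall for 1410 is the byte budget consumed by whatever encapsulation or MSS clamping sits between the host and the Internet.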

The window scale and timestamp options are set in the SYN segments. These are not typically seen from Windows clients except for more recent versions such as Vista or Microsoft Windows Server 2008. Based on the route origin block and detail thus far, I’d say chances are good that the original host was a consumer-oriented Vista box, but we’re not done yet.

Consider the source port value of 21269. At first glance this is just a random-looking ephemeral source port. However, earlier versions of Microsoft Windows used a default ephemeral port range of 1024 to 4999, while newer systems such as Vista and Windows Server 2008 use 49152 to 65535 by default. Since 21269 is outside of both ranges, it seems likely the original host has gone through some sort of middle box, maybe a NAT-PT gateway.
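
The port range comparison is easy to sketch in Python using the two default ranges quoted above. The labels and the function name are mine, and a host with a tuned ephemeral range would defeat this check, so treat it as a hint rather than a fingerprint.

```python
# Default ephemeral port ranges as quoted in the text (upper bounds inclusive).
LEGACY_WINDOWS = range(1024, 5000)     # 1024 to 4999
VISTA_AND_LATER = range(49152, 65536)  # 49152 to 65535


def classify_source_port(port):
    """Label a source port by which default Windows range it falls in."""
    if port in LEGACY_WINDOWS:
        return "legacy Windows default range"
    if port in VISTA_AND_LATER:
        return "Vista / Server 2008 default range"
    return "outside both default ranges"


print(classify_source_port(21269))
```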

In summary, we’ve identified what appears to be a transient Windows Vista host in Indonesia, infected with Conficker, connected to the Internet with PPPoE and traversing some sort of middle box. That is a fair bit of insight from just a couple of TCP packets.

Given even just a small number of packets we can often infer a lot. We might not be able to guarantee a precise analysis, but with continuous evidence gathering and some practice, just like any good scientist, we evaluate our hypotheses and hone our results. I tend to believe there is some value in better understanding the packets our darknets receive rather than just lumping them together as protocol or port noise. It certainly doesn’t hurt to be able to explain with more clarity to a colleague, reporter, student or supervisor what all the noise is about. In part two of our deep darknet inspection expedition, we look for answers not just in packets, but in source code. In part three, we’ll wrap up our journey with more insight, but end with a mystery, highlighting the limitations of using darknets to fully comprehend the origin and nature of some packets that arrive at a darknet. ’Til then, bit mechanics, may all your evil bits be 0x0.