SMTP greet and data feeds

Posted: 2021-01-21

The DataPlane.org smtpgreet and smtpdata feeds are now publicly available. Both look much like other feeds with the usual set of entry attributes: an IPv4 or IPv6 address, associated route origin information (ASN and AS name), the most recent time stamp of the event in the past 7 days, and the feed name. The distinction between the two feeds is the extent of SMTP client behavior observed.
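For readers who want to consume entries programmatically, here is a minimal Python sketch of turning one feed line into a structured record. The post does not specify the on-disk layout, so this assumes a pipe-delimited line in the order ASN, AS name, IP address, last-seen timestamp, feed name, with `#` marking comment lines. The `FeedEntry` and `parse_line` names, the timestamp format, and the sample values are all hypothetical; check the header of the actual feed file before relying on any of it.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Union


@dataclass
class FeedEntry:
    asn: str
    as_name: str
    address: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]
    last_seen: datetime
    feed: str


def parse_line(line: str) -> Optional[FeedEntry]:
    """Parse one feed line; return None for blank or comment lines.

    Assumed layout (not confirmed by the post):
    ASN | AS name | IP address | YYYY-MM-DD HH:MM:SS | feed name
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    asn, as_name, addr, seen, feed = fields
    return FeedEntry(
        asn=asn,
        as_name=as_name,
        # ip_address() accepts both IPv4 and IPv6 text forms.
        address=ipaddress.ip_address(addr),
        last_seen=datetime.strptime(seen, "%Y-%m-%d %H:%M:%S"),
        feed=feed,
    )


# Illustrative values only (documentation ASN and address ranges).
entry = parse_line("64496 | EXAMPLE-AS | 192.0.2.10 | 2021-01-20 12:34:56 | smtpgreet")
```

Parsing the address with `ipaddress` rather than keeping it as a string makes it easier to handle IPv4 and IPv6 entries uniformly later on.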

Both feeds are derived from SMTP sensors that are neither advertised nor intended for delivery of legitimate email. The smtpgreet feed reports on SMTP clients that have been seen issuing an unsolicited HELO or EHLO command after connecting to TCP port 25 on a sensor in the wild. A number of Internet surveying projects issue such commands to facilitate SMTP daemon fingerprinting, so you will likely see some mostly harmless scanning sources show up in this feed. The smtpdata feed goes further and reports on SMTP clients that also issue a DATA command, which signals that the content of an email is to follow. Clients that reach the DATA phase of an SMTP session with the intent to deliver unsolicited email tend to be more suspicious. Often these clients are testing whether the SMTP server will actually deliver email by including a unique SMTP server identifier in the header or body. In other, more nefarious cases, the client may be trying to send or forward spam if it appears the server will accept it.
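To make the distinction between the two feeds concrete, here is a small Python sketch that looks at the command lines a client sent and reports the furthest stage it reached. This is not the sensor code; `furthest_stage` is a hypothetical helper, and requiring a greeting before DATA counts is my assumption, based on the description of smtpdata clients as those that "also" issue DATA.

```python
from typing import Iterable, Optional

# SMTP command verbs are case-insensitive, so compare in upper case.
GREET_VERBS = {"HELO", "EHLO"}


def furthest_stage(client_lines: Iterable[str]) -> Optional[str]:
    """Return "smtpdata", "smtpgreet", or None for one client session.

    Assumption: DATA only counts once the client has sent HELO or EHLO.
    """
    greeted = False
    for line in client_lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip empty lines
        verb = parts[0].upper()
        if verb in GREET_VERBS:
            greeted = True
        elif verb == "DATA" and greeted:
            # Stop here: everything after DATA is message content.
            return "smtpdata"
    return "smtpgreet" if greeted else None


# A scanner that only fingerprints the daemon stops at the greeting.
scanner = furthest_stage(["EHLO scanner.example", "QUIT"])

# A client attempting delivery continues through to DATA.
sender = furthest_stage([
    "EHLO mail.example",
    "MAIL FROM:<a@example.com>",
    "RCPT TO:<b@example.net>",
    "DATA",
])
```

In this sketch the scanner session maps to smtpgreet and the delivery attempt maps to smtpdata. Whether the real feeds list a DATA-stage client in both feeds or only in smtpdata is not something the sketch settles.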

A few security threat organizations and incident responders have expressed interest in seeing DataPlane.org provide email-based feeds. In fact, this was something planned long ago under a different, but now defunct, project (hat tip to my friend Italo Vlacy for some early prototype work in this area). I finally got around to deploying a working module and these two new feeds are the result of that effort and an attempt to meet the needs of the community.

Like all DataPlane.org feeds, these are clearly marked as not being block lists, but some people use them to automatically block traffic from reported IP addresses anyway. I can’t stop people from shooting themselves in the foot, but maybe I can at least explain how they might do so with these feeds.

The SMTP sensor systems are assigned IP addresses dedicated to nothing but sensing, but those addresses may have been assigned to legitimate email systems at some point in the past. Furthermore, I have no control over random DNS names that may point to sensor IP addresses, either accidentally or as a form of attack. Legitimate mail clients could be made to believe the sensors are legitimate email servers, even if only briefly, until they realize email never gets delivered. In these scenarios, legitimate email clients could end up in an SMTP feed.

I’ve not seen any evidence of these kinds of things happening, but it is entirely possible. Consider yourself warned.

If you’re familiar with SMTP commands, you may be wondering if other SMTP-based feeds are on the horizon. The sender address, recipients, and body content (which may include attachments, for example) provide not only more context but also additional threat intelligence. I am not ignorant of that insight, but collecting it all and making it publicly available comes with both some risk and expense. This may be project work I explore in the future.

I am anxious to gauge the community’s reaction to the availability of these feeds. Please let me know what you think of them, or of the prospect of additional insight from SMTP that DataPlane.org might be able to produce.