Editor’s note: Imported from my old personal blog @ TC with minor edits to improve readability where necessary.
For me, writing code usually means writing combat Perl code. It’s my
standard joke, but also my standard disclaimer. When I recently used C
for another project I included an addendum to my standard disclaimer.
In that case, it went like this, “I wouldn’t advise running this as
root”. Yet, being able to construct tools with your own code is a
tremendously useful skill to have and I frequently urge my networking
undergrad students to attain some competency in tool building. Being
able to parse and summarize logs is a great first tool to attempt to
build. In Perl mine was an ISC BIND named log parsing and summarization
tool called named-report.pl. It’s an abomination, but it served its
purpose and still seems to work well enough for when I need it that I
haven’t bothered to redo it. I’ve since gone on to build a number of
tools with Perl, many of the parsing and summarization variety, with
each successive incarnation hopefully a little better than the last.
The one Perl book that has helped me take a little of the combat out of
my more recent code, and one I love to recommend to other Perl coders,
is Damian Conway’s Perl Best Practices. Perl is what I
tend to reach for first, mainly out of habit. Regardless of your
preferred weaponry for combat coding, let’s just agree that being able to
build useful tools is a great thing and instead discuss what it takes to
build a tool around a service a number of us have used for many years.
If you’re like me, you make regular use of the Team Cymru IP address to
BGP route mapping service. I tend to use the whois-based service
interface for queries that I do by hand and the DNS-based service
interface in code. The Route Views Project offers a similar DNS-based
service, but it is not as widely known, nor does it have the
registry-associated data that can be handy when trying to uncover some
quick insight about an address. However, parsing the answers from the
Team Cymru DNS-based service can be a bit tricky, something I hope this
post provides some insight into, if not a few good laughs at my code
along the way.
Our task here is a seemingly simple one: pass an IP address to a subroutine and get back an autonomous system number (ASN). Here is how the start of such a Perl subroutine might look:
1. sub get_asn {
2.     my $address = shift || return;
3.     my $res = Net::DNS::Resolver->new;
4.     my $qname = get_ptr_name($address);
5.     my $query = $res->send( $qname, 'TXT', 'IN' );
6.     my $asn;
7.
8.     return if !$query;
9.     return if $query->header->ancount < 1;
The routine above expects a scalar value parameter, an IPv4 or IPv6
address, and assigns it to the $address variable in line 2. We set up a
DNS query by using the Net::DNS module in line 3 to create a new
resolver object. In line 4 we need the reverse or PTR name that will be
used in the query, so we pass the address to a utility function called
get_ptr_name(), expect the appropriate query name back and assign it to
the $qname variable. We are then ready to send the query and attempt to
do so at line 5. If the query fails or no answer data is returned, we
abruptly leave the subroutine at line 8 or 9 respectively. Any time we
leave the subroutine early we return an undefined value, so it will be
up to the caller to handle such a condition gracefully.
As an aside, let us take a quick look at the get_ptr_name() utility
routine and see what it might do:
a. sub get_ptr_name {
b.     my $addr = shift || return;
c.
d.     if ( $addr =~ /:/ ) {
e.         $addr = substr( Net::IP->new($addr)->reverse_ip, 0, -10 );
f.         $addr .= '.origin6.asn.cymru.com';
g.     }
h.     else {
i.         $addr = join( '.', reverse split( /\./, $addr ) );
j.         $addr .= '.origin.asn.cymru.com';
k.     }
l.
m.     return $addr;
n. }
This subroutine uses a simple regular expression to test for an IPv6
address. If a colon (':') character is found in the address string, the
address is presumed to be an IPv6 address, otherwise it must be an IPv4
address. In line e. we use the power of CPAN and the Net::IP module to
get the reverse nibbles for an IPv6 address, because doing so by hand is
a pita. However, in that case we also must strip off the trailing
.ip6.arpa. zone the module includes by default and in its place append
the Team Cymru IPv6 route origin zone. We perform a similar, but
simpler transformation on an IPv4 address and return the final result.
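To make the transformation concrete, here is a minimal sketch of the same query-name construction done "by hand" with core Perl only. It is not the routine from my script: ptr_name_sketch() is a hypothetical name, and the sketch assumes the core Socket module provides inet_pton(), which it does on reasonably modern Perls. The addresses are documentation examples.

```perl
use strict;
use warnings;
use Socket qw(AF_INET6 inet_pton);

# Hypothetical stand-in for get_ptr_name() that avoids Net::IP.
sub ptr_name_sketch {
    my ($addr) = @_;
    return if !defined $addr;

    if ( $addr =~ /:/ ) {
        # Pack the IPv6 address into 16 bytes, then expand to 32 hex nibbles.
        my $packed = inet_pton( AF_INET6, $addr );
        return if !defined $packed;
        my @nibbles = split //, unpack( 'H32', $packed );

        # Reverse the nibbles and append the IPv6 route origin zone.
        return join( '.', reverse @nibbles ) . '.origin6.asn.cymru.com';
    }

    # IPv4: reverse the dotted octets and append the IPv4 route origin zone.
    return join( '.', reverse split( /\./, $addr ) ) . '.origin.asn.cymru.com';
}

print ptr_name_sketch($_), "\n" for qw(192.0.2.1 2001:db8::1);
```

The IPv4 case prints 1.2.0.192.origin.asn.cymru.com. The IPv6 case prints 32 dot-separated nibbles in reverse order followed by the origin6 zone, which is exactly the part that is tedious to get right by hand.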
Presuming everything has gone well up to this point, we want to process a DNS answer we get back. What does an answer look like? This is where things can get a little hairy. We might get multiple answers and there may be multiple ASNs listed in each answer. In DNS-speak, here is what the general format of the RDATA in an RRset will look like:
"49152 [...] | 192.0.2.0/24 | AA | registry | 1970-01-01"
[...]
There are five fields per answer, separated by a pipe ('|') symbol. The first field is an ASN list. Often it will be a single ASN, but due to multiple origin autonomous system (MOAS) routes there may be more separated by whitespace. The second field is the covering route prefix. The third field is a two-letter country-code based on IP address registry allocation information. The fourth field is the registry responsible for the address allocation. The fifth and final field is the date the registry allocated the covering prefix. If there is a route, you should get an answer and at least one ASN and prefix. Beyond that, you should code defensively. Most of the time you get a single answer and a single ASN, but don’t count on it. In our case, we won’t care about more specific prefixes nor multiple ASNs. Continuing on then…
10. ANSWER:
11.     for my $answer ( $query->answer ) {
12.         next ANSWER if $answer->type ne 'TXT';
13.         ($asn) = $answer->rdatastr =~ m{ \A ["] (\d+) }xms;
14.         $asn ? last ANSWER : next ANSWER;
15.     }
16.
17.     return $asn;
18. }
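To see what the pattern on line 13 actually captures, here is a small sketch that runs it against a few made-up RDATA strings. The first_asn() wrapper is hypothetical and exists only for this illustration, and the sample strings use documentation ASNs and prefixes rather than real answers.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern used on line 13.
sub first_asn {
    my ($rdata) = @_;

    # Anchor at the start, expect the opening quote, capture the first run of digits.
    my ($asn) = $rdata =~ m{ \A ["] (\d+) }xms;
    return $asn;
}

for my $rdata (
    '"64496 | 192.0.2.0/24 | AA | registry | 1970-01-01"',          # single origin
    '"64496 64497 | 192.0.2.0/24 | AA | registry | 1970-01-01"',    # MOAS: two origins
    '64496 | 192.0.2.0/24 | AA | registry | 1970-01-01',            # no opening quote
) {
    my $asn = first_asn($rdata);
    print defined $asn ? $asn : 'no match', "\n";
}
```

Only the first ASN is captured in the MOAS case, and a string without the leading quote does not match at all, so the pattern depends on the quoted presentation form of the TXT data.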
We conclude our simple ASN mapping routine by finding the first TXT RR
in the set and capturing the first ASN in that RR before returning it to
the caller. Keep in mind that this routine is very simple and likely
not suitable for any truly robust project where you care about
multi-homing, MOAS, different covering prefix announcements, upstream
routes or a descriptive name for an ASN. Constructing code that deals
with those situations is probably more appropriate for a library than a
blog post (not a bad idea eh?). The Net::Abuse::Utils module contains a
subroutine called get_asn_info() which uses our mapping service and goes
a little further than I show here. I wrapped the routines above into a
small script called sample-tcbgp-mapping.pl which you may freely use and
expand on for your projects. It will take a list of IPv4 or IPv6
addresses, one per line via STDIN, and give back a pipe-delimited list
of the first associated ASN it finds or ‘NA’ if none. Go forth and do
battle.
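If you do want more than the first ASN, one possible starting point is to split each answer into its five pipe-separated fields. What follows is only a sketch under assumptions: parse_origin_txt() and its hash keys are names I made up for illustration, it is not part of sample-tcbgp-mapping.pl or Net::Abuse::Utils, and the sample string uses documentation values.

```perl
use strict;
use warnings;

# Hypothetical helper: split one TXT answer string into its five fields.
sub parse_origin_txt {
    my ($txt) = @_;
    return if !defined $txt;

    # Drop the surrounding quotes and any stray whitespace around them.
    $txt =~ s/\A\s*"?\s*//;
    $txt =~ s/\s*"?\s*\z//;

    my ( $asns, $prefix, $cc, $registry, $allocated ) =
        split /\s*\|\s*/, $txt, -1;

    # The first field may list several origin ASNs (MOAS), separated by whitespace.
    my @asn_list = grep { /\A\d+\z/ } split ' ', ( defined $asns ? $asns : '' );

    # Code defensively: insist on at least one ASN and a prefix.
    return if !@asn_list;
    return if !defined $prefix || $prefix eq '';

    return {
        asns      => \@asn_list,
        prefix    => $prefix,
        cc        => $cc,
        registry  => $registry,
        allocated => $allocated,
    };
}

my $rec = parse_origin_txt(
    '"64496 64497 | 192.0.2.0/24 | AA | registry | 1970-01-01"');
print join( ',', @{ $rec->{asns} } ), " originate $rec->{prefix}\n" if $rec;
```

Returning nothing for anything that lacks an ASN or a prefix keeps the caller's check as simple as the one get_asn() already requires, while the later fields are passed through untouched and may be undefined if an answer is short.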