This is part of a larger project I’m working on, but it’s useful enough to share (github version coming soon).
The fine folks at @TeamCymru have a great service to map IP addresses to ASN/BGP information en masse.
There are libraries for Python, Perl and other languages but none for R (that I could find). So, I threw together a quick set of functions to interface to @TeamCymru’s service. Unlike many other modern services, this one isn’t XML or JSON over a RESTful interface, so the code uses a socketConnection() over the standard WHOIS TCP port to post and retrieve simple text lists.
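The "simple text list" that gets posted is easy to look at on its own, without opening a socket. As a minimal sketch (the helper name `build_bulk_query` is hypothetical and only mirrors the query-building lines of the script in this post), the payload is `begin`, `verbose`, one IP address per line, then `end`:

```r
# Hypothetical helper, not part of the original script: build the plain-text
# payload for a bulk lookup (begin / verbose / one IP per line / end).
build_bulk_query <- function(ip.list) {
  lines <- c("begin", "verbose", unlist(ip.list), "end")
  # one item per line, with a trailing newline after "end"
  paste0(paste(lines, collapse = "\n"), "\n")
}

query <- build_bulk_query(c("100.43.81.11", "100.43.81.7"))
cat(query)
```

Keeping the string construction separate from the connection makes it possible to inspect exactly what would be sent before any traffic leaves your machine.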
    #
    # bulkorigin.R - perform bulk IP to ASN mapping via Team Cymru whois service
    #
    # Author: @hrbrmstr
    # Version: 0.1
    # Date: 2013-02-07
    #
    # Copyright 2013 Bob Rudis
    #
    # Permission is hereby granted, free of charge, to any person obtaining
    # a copy of this software and associated documentation files (the
    # "Software"), to deal in the Software without restriction, including
    # without limitation the rights to use, copy, modify, merge, publish,
    # distribute, sublicense, and/or sell copies of the Software, and to
    # permit persons to whom the Software is furnished to do so, subject to
    # the following conditions:
    #
    # The above copyright notice and this permission notice shall be
    # included in all copies or substantial portions of the Software.
    #
    # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
    # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
    # LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
    # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
    # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
    #

    library(plyr)

    # short function to trim leading/trailing whitespace
    trim <- function (x) gsub("^\\s+|\\s+$", "", x)

    BulkOrigin <- function(ip.list,host="v4.whois.cymru.com",port=43) {

      # Retrieves BGP Origin ASN info for a list of IP addresses
      #
      # NOTE: IPv4 version
      #
      # NOTE: Team Cymru's service is NOT a GeoIP service!
      # Do not use this function for that as your results will not
      # be accurate.
      #
      # Args:
      #   ip.list : character vector of IP addresses
      #   host: which server to hit for lookup (defaults to Team Cymru's server)
      #   port: TCP port to use (defaults to 43)
      #
      # Returns:
      #   data frame of BGP Origin ASN lookup results

      # setup query
      cmd = "begin\nverbose\n"
      ips = paste(unlist(ip.list), collapse="\n")
      cmd = sprintf("%s%s\nend\n",cmd,ips)

      # setup connection and post query
      con = socketConnection(host=host,port=port,blocking=TRUE,open="r+")
      cat(cmd,file=con)
      response = readLines(con)
      close(con)

      # trim header, split fields and convert results
      response = response[2:length(response)]
      response = lapply(response,function(n) {
        sapply(strsplit(n,"|",fixed=TRUE),trim)
      })
      response = adply(response,c(1))
      response = response[,2:length(response)]
      names(response) = c("AS","IP","BGP.Prefix","CC","Registry","Allocated","AS.Name")

      return(response)

    }

    BulkPeer <- function(ip.list,host="v4-peer.whois.cymru.com",port=43) {

      # Retrieves BGP Peer ASN info for a list of IP addresses
      #
      # NOTE: IPv4 version
      #
      # NOTE: Team Cymru's service is NOT a GeoIP service!
      # Do not use this function for that as your results will not
      # be accurate.
      #
      # Args:
      #   ip.list : character vector of IP addresses
      #   host: which server to hit for lookup (defaults to Team Cymru's server)
      #   port: TCP port to use (defaults to 43)
      #
      # Returns:
      #   data frame of BGP Peer ASN lookup results

      # setup query
      cmd = "begin\nverbose\n"
      ips = paste(unlist(ip.list), collapse="\n")
      cmd = sprintf("%s%s\nend\n",cmd,ips)

      # setup connection and post query
      con = socketConnection(host=host,port=port,blocking=TRUE,open="r+")
      cat(cmd,file=con)
      response = readLines(con)
      close(con)

      # trim header, split fields and convert results
      response = response[2:length(response)]
      response = lapply(response,function(n) {
        sapply(strsplit(n,"|",fixed=TRUE),trim)
      })
      response = adply(response,c(1))
      response = response[,2:length(response)]
      names(response) = c("Peer.AS","IP","BGP.Prefix","CC","Registry","Allocated","Peer.AS.Name")

      return(response)

    }
Both functions take a list of IPs, open a TCP connection, send a bulk query and convert the results into a data frame. Here’s a small script to test it:
    ips = c("100.43.81.11","100.43.81.7")

    origin = BulkOrigin(ips)
    str(origin)

    peers = BulkPeer(ips)
    str(peers)
That code outputs:
    'data.frame': 2 obs. of 7 variables:
     $ AS        : chr "13238" "13238"
     $ IP        : chr "100.43.81.11" "100.43.81.7"
     $ BGP.Prefix: chr "100.43.64.0/19" "100.43.64.0/19"
     $ CC        : chr "US" "US"
     $ Registry  : chr "arin" "arin"
     $ Allocated : chr "2011-12-06" "2011-12-06"
     $ AS.Name   : chr "YANDEX Yandex LLC" "YANDEX Yandex LLC"
and
    'data.frame': 8 obs. of 7 variables:
     $ Peer.AS     : chr "174" "3257" "9002" "10310" ...
     $ IP          : chr "100.43.81.11" "100.43.81.11" "100.43.81.11" "100.43.81.11" ...
     $ BGP.Prefix  : chr "100.43.64.0/19" "100.43.64.0/19" "100.43.64.0/19" "100.43.64.0/19" ...
     $ CC          : chr "US" "US" "US" "US" ...
     $ Registry    : chr "arin" "arin" "arin" "arin" ...
     $ Allocated   : chr "2011-12-06" "2011-12-06" "2011-12-06" "2011-12-06" ...
     $ Peer.AS.Name: chr "COGENT Cogent/PSI" "TINET-BACKBONE Tinet SpA" "RETN-AS ReTN.net Autonomous System" "YAHOO-1 - Yahoo!" ...
respectively for each str() call.
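The "split fields and convert results" step can also be tried offline. What follows is only a sketch in base R, not the plyr-based code from the script: `parse_bulk_response` is a hypothetical helper, and the sample lines are rebuilt from the origin output shown above, so the exact spacing the real service returns is an assumption.

```r
# Hypothetical helper: turn pipe-delimited response lines into a data frame.
# Assumes every line has the same number of fields.
parse_bulk_response <- function(lines, col.names) {
  # split each line on the literal "|" and trim surrounding whitespace
  fields <- lapply(strsplit(lines, "|", fixed = TRUE), trimws)
  # stack the per-line character vectors as rows of a character matrix
  out <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(out) <- col.names
  out
}

# Sample lines reconstructed from the origin output in this post;
# the padding between fields is assumed, not captured from the service.
sample.lines <- c(
  "13238   | 100.43.81.11     | 100.43.64.0/19      | US | arin     | 2011-12-06 | YANDEX Yandex LLC",
  "13238   | 100.43.81.7      | 100.43.64.0/19      | US | arin     | 2011-12-06 | YANDEX Yandex LLC"
)

parsed <- parse_bulk_response(
  sample.lines,
  c("AS", "IP", "BGP.Prefix", "CC", "Registry", "Allocated", "AS.Name")
)
str(parsed)
```

Every column comes back as character, the same as in the output above, so anything numeric (the AS number, for instance) still needs an explicit conversion before you do arithmetic or sorting on it.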
Nothing super-sexy, but it’s part of a mission I’m on to make IP addresses “first class citizens” in R. I’m starting with building some smaller functions that accumulate IP metadata and will ultimately collect them all into a compact R library.
In the interim, I thought these two routines might be useful to some folks.
With just these two functions, you can use various graphing libraries to get a picture of the network connectivity. Here’s a small sample to get you started:
    library(igraph)

    ips = c("100.43.81.11")
    origin = BulkOrigin(ips)
    peers = BulkPeer(ips)

    g = graph.empty() + vertices(c(ips, peers$Peer.AS, origin$AS),size=30)
    V(g)$label = c(ips, peers$Peer.AS, origin$AS)
    e = lapply(peers$Peer.AS,function(x) {
      c(origin$AS,x)
    })
    g = g + edges(unlist(e))
    g = g + edge(ips, origin$AS)
    g$layout = layout.circle
    plot(g)
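If you would rather look at the connectivity before plotting it, the same structure fits in a plain edge list. This is just a sketch with values copied from the sample output earlier in the post (only the first four peer ASNs are listed there, so only those appear here); the variable names are made up for illustration.

```r
# Values copied from the sample output in this post, not fetched live.
ip        <- "100.43.81.11"
origin.as <- "13238"
peer.as   <- c("174", "3257", "9002", "10310")

# One edge from the IP to its origin AS, then one edge from the
# origin AS to each peer AS (the same edges the igraph example adds).
edge.list <- data.frame(
  from = c(ip, rep(origin.as, length(peer.as))),
  to   = c(origin.as, peer.as),
  stringsAsFactors = FALSE
)
print(edge.list)
```

A two-column data frame like this is also a common input for graph packages; igraph, for example, can build a graph from one with graph.data.frame().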
If you know of any other R libraries or code that provide functions that operate on IP addresses or interface to services that provide IP address metadata, please drop a note in the comments or ping me on Twitter.