Here’s a quick example of a couple of additional ways to use the netintel R package I’ve been tinkering with. This could easily be done on the command line with other tools, but if you’re already doing scripting/analysis with R, this provides a quick way to tell if a list of IPs is in the @AlienVault IP reputation database. Zero revelations here for regular R users, but it might help some folks who are working to make R more of a first-class scripting citizen.
I whipped up the following bit of code to see how many IP addresses in the @Mandiant APT-1 FQDN dump were already in the AlienVault database. Reverse resolution of the Mandiant APT-1 FQDN list is a bit dubious at this point, so a cross-check with known current data is a good idea. I should also point out that not all the addresses resolved “well” (there are 2046 FQDNs and my quick dig only yielded 218 usable IPs).
    library(netintel)

    # get the @AlienVault reputation DB
    av.rep = Alien.Vault.Reputation()

    # read in resolved APT-1 FQDNs list
    apt.1 = read.csv("apt-1-ips.csv")

    # basic set operation
    whats.left = intersect(apt.1$ip,av.rep$IP)

    # how many were in the quickly resolved apt-1 ip list?
    # (count the ip column; length() on the data frame itself counts columns)
    length(apt.1$ip)
    ## [1] 218

    # how many are common across the lists?
    length(whats.left)
    ## [1] 44

    # take a quick look at them
    whats.left
    ##  [1] "12.152.124.11"   "140.112.19.195"  "161.58.182.205"  "165.165.38.19"   "173.254.28.80"
    ##  [6] "184.168.221.45"  "184.168.221.54"  "184.168.221.56"  "184.168.221.58"  "184.168.221.68"
    ## [11] "192.31.186.141"  "192.31.186.149"  "194.106.162.203" "199.59.166.109"  "203.170.198.56"
    ## [16] "204.100.63.18"   "204.93.130.138"  "205.178.189.129" "207.173.155.44"  "207.225.36.69"
    ## [21] "208.185.233.163" "208.69.32.230"   "208.73.210.87"   "213.63.187.70"   "216.55.83.12"
    ## [26] "50.63.202.62"    "63.134.215.218"  "63.246.147.10"   "64.12.75.1"      "64.12.79.57"
    ## [31] "64.126.12.3"     "64.14.81.30"     "64.221.131.174"  "66.228.132.20"   "66.228.132.53"
    ## [36] "68.165.211.181"  "69.43.160.186"   "69.43.161.167"   "69.43.161.178"   "70.90.53.170"
    ## [41] "74.14.204.147"   "74.220.199.6"    "74.93.92.50"     "8.5.1.34"
So, there is roughly a 20% overlap (44 of 218) between the quickly resolved, “clean” APT-1 FQDN IPs and the AlienVault reputation database. I’m sure there’s a more comprehensive list of resolved IPs out there.
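For anyone who wants the overlap as a number rather than eyeballing the two lengths, here is a minimal base R sketch. The two vectors are made-up placeholder addresses standing in for apt.1$ip and av.rep$IP, and the variable names are my own, so treat it as an illustration rather than part of netintel.

```r
# Toy stand-ins for the two IP vectors (hypothetical values, not real data)
resolved.ips   <- c("192.0.2.1", "192.0.2.2", "198.51.100.7",
                    "203.0.113.9", "203.0.113.10")
reputation.ips <- c("198.51.100.7", "203.0.113.9", "203.0.113.200")

# IPs present in both lists
common <- intersect(resolved.ips, reputation.ips)

# IPs we resolved that the reputation list does not contain
unlisted <- setdiff(resolved.ips, reputation.ips)

# share of distinct resolved IPs that also appear in the reputation list
n.resolved  <- length(unique(resolved.ips))
overlap.pct <- 100 * length(common) / n.resolved

cat(sprintf("%d of %d resolved IPs (%.1f%%) are in the reputation list\n",
            length(common), n.resolved, overlap.pct))
```

Plugging in the real counts from above, 100 * 44 / 218 works out to about 20.2%.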
For kicks, we can see where all the resolved APT-1 nodes live (BGP/network-wise) in relation to each other using some of the other library functions:
    library(netintel)
    library(igraph)
    library(plyr)

    apt.1 = read.csv("apt-1-ips.csv")
    # force character so vertex/edge names are not factor codes
    ips = as.character(apt.1$ip)

    # get BGP origin & peers
    origin = BulkOrigin(ips)
    peers = BulkPeer(ips)

    # start graphing
    g = graph.empty()

    # Make IP vertices; IP endpoints are red
    g = g + vertices(ips,size=1,color="red",group=1)

    # Make BGP vertices; BGP nodes are orange
    g = g + vertices(unique(c(peers$Peer.AS, origin$AS)),size=1.5,color="orange",group=2)

    # no labels
    V(g)$label = ""

    # Make IP/BGP edges
    ip.edges = lapply(ips,function(x) {
      iAS = origin[origin$IP==x,]$AS
      lapply(iAS,function(y){
        c(x,y)
      })
    })

    # Make BGP/peer edges
    bgp.edges = lapply(unique(origin$BGP.Prefix),function(x) {
      startAS = unique(origin[origin$BGP.Prefix==x,]$AS)
      lapply(startAS,function(z) {
        pAS = peers[peers$BGP.Prefix==x,]$Peer.AS
        lapply(pAS,function(y) {
          c(z,y)
        })
      })
    })

    # get total graph node count
    node.count = table(c(unlist(ip.edges),unlist(bgp.edges)))

    # add edges
    g = g + edges(unlist(ip.edges))
    g = g + edges(unlist(bgp.edges))

    # base edge weight == 1
    E(g)$weight = 1

    # simplify the graph
    g = simplify(g, edge.attr.comb=list(weight="sum"))

    # no arrows
    E(g)$arrow.size = 0

    # best layout for this
    L = layout.fruchterman.reingold(g)

    # plot the graph using the layout computed above
    plot(g,layout=L,margin=0)
If we take out the BGP peer relationships from the graph (i.e. don’t add the bgp.edges in the above code), we can see the mal-host clusters even more clearly (the pseudo “Death Star” look is unintentional but apropos):
We can also determine which ASNs the bigger clusters belong to by checking out the degree of each node. The “top” 5 clusters (ASN on the first row, degree on the second) are:

    16509 40676 36351 26496 15169
        7     8     8    13    54
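Calling igraph’s degree() on the graph is the direct route to numbers like these. As a rough, igraph-free sketch of the same idea, the counts can also be tabulated straight from an IP-to-AS edge list. Everything below is hypothetical: the edges data frame, its column names, and the addresses and AS numbers are placeholders, not the real APT-1 data.

```r
# Hypothetical IP -> origin-AS edges, one row per edge (made-up values)
edges <- data.frame(
  ip  = c("192.0.2.1", "192.0.2.2", "192.0.2.3", "198.51.100.7", "203.0.113.9"),
  asn = c("64500",     "64500",     "64500",     "64501",        "64502"),
  stringsAsFactors = FALSE
)

# drop duplicate edges, roughly what simplify() does to the graph
edges <- unique(edges)

# with only IP/AS edges present, an AS node's degree is the number
# of distinct IPs attached to it
tab <- table(edges$asn)
asn.degree <- setNames(as.vector(tab), names(tab))

# sort ascending and keep the largest clusters (biggest last)
top <- tail(sort(asn.degree), 2)
print(top)
```

Sorting ascending and taking the tail puts the biggest cluster last, which matches the ordering of the output above.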
While my library doesn’t support direct ASN detail lookup yet (an oversight), we can take those ASNs, check them out manually and see the results:
    16509 | US | arin | 2000-05-04 | AMAZON-02 - Amazon.com, Inc.
    40676 | US | arin | 2008-02-26 | PSYCHZ - Psychz Networks
    36351 | US | arin | 2005-12-12 | SOFTLAYER - SoftLayer Technologies Inc.
    26496 | US | arin | 2002-10-01 | AS-26496-GO-DADDY-COM-LLC - GoDaddy.com, LLC
    15169 | US | arin | 2000-03-30 | GOOGLE - Google Inc.
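Until the library grows a proper lookup function, pipe-delimited results like these are easy to pull into a data frame by hand. This is only a sketch in base R: the column names are labels I made up for the five fields, and the degree values are copied from the cluster output earlier in the post.

```r
# the five lookup results, as raw pipe-delimited text
raw <- c(
  "16509 | US | arin | 2000-05-04 | AMAZON-02 - Amazon.com, Inc.",
  "40676 | US | arin | 2008-02-26 | PSYCHZ - Psychz Networks",
  "36351 | US | arin | 2005-12-12 | SOFTLAYER - SoftLayer Technologies Inc.",
  "26496 | US | arin | 2002-10-01 | AS-26496-GO-DADDY-COM-LLC - GoDaddy.com, LLC",
  "15169 | US | arin | 2000-03-30 | GOOGLE - Google Inc."
)

# split each line on the pipe and trim whitespace around every field
fields <- lapply(strsplit(raw, "|", fixed = TRUE), trimws)

# one row per ASN; column names are my own labels, not an official schema
asn.info <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(asn.info) <- c("asn", "cc", "registry", "allocated", "as.name")
asn.info$allocated <- as.Date(asn.info$allocated)

# attach the degree counts reported for these ASNs and rank by them
asn.info$degree <- c(7, 8, 8, 13, 54)
ranked <- asn.info[order(-asn.info$degree), c("asn", "as.name", "degree")]
print(ranked, row.names = FALSE)
```

Having the details in a data frame makes it simple to join them back onto the graph’s AS vertices later.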
So Google servers are hosting the most mal-nodes from the resolved APT-1 list (54), followed by GoDaddy (13). I actually expected Amazon to be higher up in the list.
I’ll be adding igraph and ASN lookup functions to the netintel library soon. Also, if anyone has a better APT-1 IP list, please shoot me a link.