Here’s a quick example of a couple of additional ways to use the netintel R package I’ve been tinkering with. This could easily be done on the command line with other tools, but if you’re already doing scripting/analysis with R, this provides a quick way to tell if a list of IPs is in the @AlienVault IP reputation database. Zero revelations here for regular R users, but it might help some folks who are working to make R more of a first-class scripting citizen.
I whipped up the following bit of code to see how many IP addresses from the @Mandiant APT-1 FQDN dump were already in the AlienVault database. Resolving the Mandiant APT-1 FQDN list is a bit dubious at this point, so a cross-check with known current data is a good idea. I should also point out that not all the addresses resolved “well” (there are 2046 FQDNs and my quick dig only yielded 218 usable IPs).
library(netintel)
# get the @AlienVault reputation DB
av.rep = Alien.Vault.Reputation()
# read in resolved APT-1 FQDNs list
apt.1 = read.csv("apt-1-ips.csv")
# basic set operation
whats.left = intersect(apt.1$ip,av.rep$IP)
# how many were in the quickly resolved apt-1 ip list?
# (length() on the data frame itself would count columns, so count the ip column)
length(apt.1$ip)
[1] 218
# how many are common across the lists?
length(whats.left)
[1] 44
# take a quick look at them
whats.left
[1] "12.152.124.11" "140.112.19.195" "161.58.182.205" "165.165.38.19" "173.254.28.80"
[6] "184.168.221.45" "184.168.221.54" "184.168.221.56" "184.168.221.58" "184.168.221.68"
[11] "192.31.186.141" "192.31.186.149" "194.106.162.203" "199.59.166.109" "203.170.198.56"
[16] "204.100.63.18" "204.93.130.138" "205.178.189.129" "207.173.155.44" "207.225.36.69"
[21] "208.185.233.163" "208.69.32.230" "208.73.210.87" "213.63.187.70" "216.55.83.12"
[26] "50.63.202.62" "63.134.215.218" "63.246.147.10" "64.12.75.1" "64.12.79.57"
[31] "64.126.12.3" "64.14.81.30" "64.221.131.174" "66.228.132.20" "66.228.132.53"
[36] "68.165.211.181" "69.43.160.186" "69.43.161.167" "69.43.161.178" "70.90.53.170"
[41] "74.14.204.147" "74.220.199.6" "74.93.92.50" "8.5.1.34"
So, there’s roughly a 20% overlap (44 of 218) between the resolved & “clean” APT-1 FQDN IPs and the AlienVault reputation database. My list was resolved quickly, and I’m sure there’s a more comprehensive one.
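The overlap arithmetic above can be sketched in base R without the real data. The IPs and the toy reputation table below are made up for illustration (only the `IP` column name mirrors the code above); the point is that de-duplicating first and using `%in%` gives both the matching addresses and the percentage in one pass.

```r
# Hypothetical sample data: a few "resolved" IPs (one duplicate) and a toy reputation table
apt.ips <- c("12.152.124.11", "8.5.1.34", "10.0.0.1", "10.0.0.2", "8.5.1.34")
rep.db <- data.frame(IP = c("8.5.1.34", "12.152.124.11", "192.0.2.7"),
                     stringsAsFactors = FALSE)

# De-duplicate so the denominator counts distinct addresses
apt.unique <- unique(apt.ips)

# %in% returns a logical vector aligned with apt.unique
hit <- apt.unique %in% rep.db$IP

# The matching addresses and the share of the list they represent
overlap <- apt.unique[hit]
pct <- 100 * sum(hit) / length(apt.unique)

cat(sprintf("%d of %d IPs (%.1f%%) are in the reputation data\n",
            sum(hit), length(apt.unique), pct))
```

With the real numbers from this post, the same calculation is 100 * 44 / 218, or about 20.2%.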
For kicks, we can see where all the resolved APT-1 nodes live (BGP/network-wise) in relation to each other using some of the other library functions:
library(netintel)
library(igraph)
library(plyr)
apt.1 = read.csv("apt-1-ips.csv")
ips = apt.1$ip
# get BGP origin & peers
origin = BulkOrigin(ips)
peers = BulkPeer(ips)
# start graphing
g = graph.empty()
# Make IP vertices; IP endpoints are red
g = g + vertices(ips,size=1,color="red",group=1)
# Make BGP vertices; BGP nodes are orange
g = g + vertices(unique(c(peers$Peer.AS, origin$AS)),size=1.5,color="orange",group=2)
# no labels
V(g)$label = ""
# Make IP/BGP edges
ip.edges = lapply(ips,function(x) {
  iAS = origin[origin$IP==x,]$AS
  lapply(iAS,function(y){
    c(x,y)
  })
})
# Make BGP/peer edges
bgp.edges = lapply(unique(origin$BGP.Prefix),function(x) {
  startAS = unique(origin[origin$BGP.Prefix==x,]$AS)
  lapply(startAS,function(z) {
    pAS = peers[peers$BGP.Prefix==x,]$Peer.AS
    lapply(pAS,function(y) {
      c(z,y)
    })
  })
})
# get total graph node count
node.count = table(c(unlist(ip.edges),unlist(bgp.edges)))
# add edges
g = g + edges(unlist(ip.edges))
g = g + edges(unlist(bgp.edges))
# base edge weight == 1
E(g)$weight = 1
# simplify the graph
g = simplify(g, edge.attr.comb=list(weight="sum"))
# no arrows
E(g)$arrow.size = 0
# best layout for this
L = layout.fruchterman.reingold(g)
# plot the graph
plot(g,layout=L,margin=0)
If we take out the BGP peer relationships from the graph (i.e. don’t add the bgp.edges in the above code) we can see the mal-host clusters even more clearly (the pseudo “Death Star” look is unintentional but apropos):
We can also determine which ASNs the bigger clusters belong to by checking out the degree. The “top” 5 clusters are:
16509 40676 36351 26496 15169
    7     8     8    13    54
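For anyone who wants to reproduce that kind of ranking, here is a minimal sketch on a toy graph. The IPs and AS names are hypothetical (documentation ranges and private-use ASNs), and it uses `graph_from_data_frame()`, which assumes igraph 1.0 or later rather than the older function names used above. The degree of each ASN vertex is simply the number of IPs attached to it.

```r
library(igraph)

# Hypothetical edge list: each row links an IP to its origin ASN
edges.df <- data.frame(
  ip  = c("192.0.2.1", "192.0.2.2", "192.0.2.3",
          "198.51.100.1", "198.51.100.2", "203.0.113.1"),
  asn = c("AS64500", "AS64500", "AS64500",
          "AS64501", "AS64501", "AS64502"),
  stringsAsFactors = FALSE
)

# Build an undirected graph; vertex names come from the two columns
g.toy <- graph_from_data_frame(edges.df, directed = FALSE)

# degree() returns a vector named by vertex
deg <- degree(g.toy)

# Keep only the ASN vertices, then take the largest two
asn.deg <- deg[grepl("^AS", names(deg))]
top <- tail(sort(asn.deg), 2)
print(top)
```

On the real graph, something along the lines of `tail(sort(degree(g)), 5)` should give a table like the one above, assuming the vertex names are the IPs and AS numbers added earlier.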
While my library doesn’t support direct ASN detail lookup yet (an oversight), we can take those ASNs, check them out manually and see the results:
16509 | US | arin | 2000-05-04 | AMAZON-02 - Amazon.com, Inc.
40676 | US | arin | 2008-02-26 | PSYCHZ - Psychz Networks
36351 | US | arin | 2005-12-12 | SOFTLAYER - SoftLayer Technologies Inc.
26496 | US | arin | 2002-10-01 | AS-26496-GO-DADDY-COM-LLC - GoDaddy.com, LLC
15169 | US | arin | 2000-03-30 | GOOGLE - Google Inc.
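Since the lookup results are pipe-delimited, they are easy to pull into a data frame and line up with the degree counts. This is only a sketch: the column names (`asn`, `cc`, `registry`, `allocated`, `name`) are my own labels for illustration, not anything the lookup output or netintel defines.

```r
# The manual lookup results from above, pasted in as text
asn.lines <- c(
  "16509 | US | arin | 2000-05-04 | AMAZON-02 - Amazon.com, Inc.",
  "40676 | US | arin | 2008-02-26 | PSYCHZ - Psychz Networks",
  "36351 | US | arin | 2005-12-12 | SOFTLAYER - SoftLayer Technologies Inc.",
  "26496 | US | arin | 2002-10-01 | AS-26496-GO-DADDY-COM-LLC - GoDaddy.com, LLC",
  "15169 | US | arin | 2000-03-30 | GOOGLE - Google Inc."
)

# Split on "|" and trim the padding; column names are assumed labels
asn.info <- read.table(text = asn.lines, sep = "|", strip.white = TRUE,
                       quote = "", comment.char = "",
                       col.names = c("asn", "cc", "registry", "allocated", "name"),
                       stringsAsFactors = FALSE)

# Degree counts for the "top" 5 clusters, keyed by ASN
asn.degree <- c("16509" = 7, "40676" = 8, "36351" = 8, "26496" = 13, "15169" = 54)
asn.info$degree <- unname(asn.degree[as.character(asn.info$asn)])

# Largest clusters first
ranked <- asn.info[order(-asn.info$degree), c("asn", "name", "degree")]
print(ranked, row.names = FALSE)
```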
So Google servers are hosting the most mal-nodes from the resolved APT-1 list, followed by GoDaddy. I actually expected Amazon to be higher up in the list.
I’ll be adding igraph and ASN lookup functions to the netintel library soon. Also, if anyone has a better APT-1 IP list, please shoot me a link.