New R Package – ipapi (IP/Domain Geolocation)

I noticed that the @rOpenSci folks had an interface to [ip-api.com](http://ip-api.com/) on their [ToDo](https://github.com/ropensci/webservices/wiki/ToDo) list so I whipped up a small R package to fill said gap.

Their IP Geolocation API will take an IPv4, IPv6 or FQDN and kick back a ASN, lat/lon, address and more. The [ipapi package](https://github.com/hrbrmstr/ipapi) exposes one function – `geolocate` which takes in a character vector of any mixture of IPv4/6 and domains and returns a `data.table` of results. Since `ip-api.com` has a restriction of 250 requests-per-minute, the package also tries to help ensure you don’t get your own IP address banned (there’s a form on their site you can fill in to get it unbanned if you do happen to hit the limit). Overall, there’s nothing fancy in the package, but it gets the job done.

I notified the rOpenSci folks about it, so hopefully it’ll be one less thing on that particular to-do list.

You can see it in action in combination with the super-spiffy [leaflet](http://www.htmlwidgets.org/showcase_leaflet.html) htmlwidget:

library(leaflet)
library(ipapi)
library(maps)
 
# get top 500 domains
sites <- read.csv("http://moz.com/top500/domains/csv", stringsAsFactors=FALSE)
 
# make reproducible
set.seed(1492)
 
# pick out a random 50
sites <- sample(sites$URL, 50) 
sites <- gsub("/", "", sites)
locations <- geolocate(sites)
 
# take a quick look
dplyr::glimpse(locations)
 
## Observations: 50
## Variables:
## $ as          (fctr) AS2635 Automattic, Inc, AS15169 Google Inc., AS3561...
## $ city        (fctr) San Francisco, Mountain View, Chesterfield, Mountai...
## $ country     (fctr) United States, United States, United States, United...
## $ countryCode (fctr) US, US, US, US, US, US, JP, US, US, IT, US, US, US,...
## $ isp         (fctr) Automattic, Google, Savvis, Google, Level 3 Communi...
## $ lat         (dbl) 37.7484, 37.4192, 38.6631, 37.4192, 38.0000, 33.7516...
## $ lon         (dbl) -122.4156, -122.0574, -90.5771, -122.0574, -97.0000,...
## $ org         (fctr) Automattic, Google, Savvis, Google, AddThis, Peer 1...
## $ query       (fctr) 192.0.80.242, 74.125.227.239, 206.132.6.134, 74.125...
## $ region      (fctr) CA, CA, MO, CA, , GA, 13, MA, TX, , MA, TX, CA, , ,...
## $ regionName  (fctr) California, California, Missouri, California, , Geo...
## $ status      (fctr) success, success, success, success, success, succes...
## $ timezone    (fctr) America/Los_Angeles, America/Los_Angeles, America/C...
## $ zip         (fctr) 94110, 94043, 63017, 94043, , 30303, , 02142, 78218...
 
# all i want is the world!
world <- map("world", fill = TRUE, plot = FALSE) 
 
# kick out a a widget
leaflet(data=world) %>% 
  addTiles() %>% 
  addCircleMarkers(locations$lon, locations$lat, 
                   color = '#ff0000', popup=sites)

50 Random Top Sites

Cover image from Data-Driven Security
Amazon Author Page

2 Comments New R Package – ipapi (IP/Domain Geolocation)

  1. Brian High

    The URL for the sites CSV link needs to be “https” in order for the example code to work properly.

    sites <- read.csv(“https://moz.com/top500/domains/csv”, stringsAsFactors=FALSE)

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.