20 Day 18: Globe

20.1 Technologies/Techniques

  • Using {threejs}

20.2 Data Source: Internet Attack Data from Rapid7’s Heisenberg Network

The $DAYJOB (as of the production date of this tome) provides lots of fun data to work with. Sadly, most of it doesn’t belong in a world-geography context. However, IP addresses do carry some geolocation information. I decided to abuse my data for today’s challenge and use the {threejs} package, which wraps the three.js JavaScript library [57], to plot honeypot attack data on a 3D globe (i.e. a 3D “pew pew” map).

library(threejs)
library(tidyverse)

There are two data files for this. One is just source data with totals for a period of time. The other has source and destination data; you’ll be using that one in the “Try This At Home” section.

We can do many things with {threejs}, but for today’s challenge we’re just going to plot 3D bars on top of longitude/latitude points. We’re going to color those bars by taking log10 of the counts and cutting the result into 11 equal-width bins, one per palette color (internet data distributions are almost always going to come out on a log scale of some sort).

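# Read the source-only attack totals
attacks <- read_csv(here::here("data/attacks.csv"))
# Cut log10(n) into 11 bins and use the bin number to index a reversed 11-color RdYlBu palette
attacks$col <- scales::brewer_pal(palette = "RdYlBu", direction = -1)(11)[cut(log10(attacks$n), 11)]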

glimpse(attacks)
## Observations: 52,389
## Variables: 4
## $ src_latitude  <dbl> 24.0740, 29.8510, 55.0425, 37.2744, -6.8048, -6.7063, 2…
## $ src_longitude <dbl> 120.5384, 112.6568, 59.0400, 9.8739, 110.8405, 108.5570…
## $ n             <dbl> 10353785, 493885, 937465, 49814, 2777494, 1444967, 2057…
## $ col           <chr> "#F46D43", "#FDAE61", "#FDAE61", "#FFFFBF", "#F46D43", …
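
To see why the cut is made on the log scale, here is a minimal base R sketch. The counts are made-up values chosen to span several orders of magnitude, and `raw_bin` and `log_bin` are names used only for this illustration:

```r
# Made-up counts spanning several orders of magnitude (not the real data)
n <- c(12, 85, 430, 2900, 15000, 49814, 493885, 937465, 2777494, 10353785)

# 11 equal-width bins on the raw counts: most values pile into the first bin
raw_bin <- as.integer(cut(n, 11))

# 11 equal-width bins on log10(counts): values spread across many more bins
log_bin <- as.integer(cut(log10(n), 11))

# Compare the two binnings side by side
print(data.frame(n, raw_bin, log_bin))
```

Indexing an 11-color palette by `log_bin`, as the code above does with the bins for `attacks$n`, gives each bar a color that reflects its order of magnitude rather than its raw size.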

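# Drop incomplete rows, round every coordinate to whole degrees,
# then re-total n for each source/destination pair
read_csv(here::here("data/attacks2.csv")) %>%
  filter(complete.cases(.)) %>%
  mutate_at(vars(-n), ~round(.x, 0)) %>%
  count(src_latitude, src_longitude, dst_latitude, dst_longitude, wt=n) -> attacks2

# Same log-binned coloring as for attacks
attacks2$col <- scales::brewer_pal(palette = "RdYlBu", direction = -1)(11)[cut(log10(attacks2$n), 11)]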

glimpse(attacks2)
# attacks2 columns: src_latitude, src_longitude, dst_latitude, dst_longitude, n, col

20.3 Drawing the Map

You’d think making a 3D interactive globe with data on it would be hard, but {threejs} makes it super simple. Give globejs() the points, a value for the bar height (which we’re scaling down so it’ll look better), and a color for each bar, along with how big the points are at the base and whether we want an atmospheric effect, and we’re good to go:

globejs(
  lat = attacks$src_latitude,
  long = attacks$src_longitude,
  value = attacks$n/100000,
  color = attacks$col,
  pointsize = 0.5,
  atmosphere = TRUE
) 

20.4 In Review

We experimented with using a 3D globe to plot honeypot attack data. It turned out to be much less work than one might expect.

20.5 Try This At Home

Use the attacks2 data to do the same globe but with destination data.

Use the {threejs} capability to draw lines [58] and plot source/destination lines for subsets of the attacks2 data.
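
As a starting point for both exercises, here is a hedged sketch rather than a full solution. The `flows` data frame is a made-up stand-in that only borrows the attacks2 column names, and the cutoff of three flows, the bar height of 50, and the arc styling values are arbitrary choices. It relies on globejs() accepting an `arcs` argument with four columns (origin latitude, origin longitude, destination latitude, destination longitude):

```r
# Made-up stand-in for a few rows of attacks2 (same column names, invented values)
flows <- data.frame(
  src_latitude  = c(24, 30, 55, 37),
  src_longitude = c(121, 113, 59, 10),
  dst_latitude  = c(39, 39, 50, 1),
  dst_longitude = c(-77, -77, 9, 104),
  n             = c(5000, 1200, 300, 40)
)

# Keep only the heaviest flows so the globe stays readable
top_flows <- head(flows[order(-flows$n), ], 3)

# Arcs need four columns: origin lat, origin long, destination lat, destination long
arcs <- top_flows[, c("src_latitude", "src_longitude",
                      "dst_latitude", "dst_longitude")]

# Only build the widget if {threejs} is installed
if (requireNamespace("threejs", quietly = TRUE)) {
  globe <- threejs::globejs(
    lat = top_flows$dst_latitude,    # bars sit on the destinations
    long = top_flows$dst_longitude,
    value = 50,                      # arbitrary fixed bar height
    arcs = arcs,
    arcsColor = "#F46D43",
    arcsHeight = 0.4,
    arcsLwd = 2,
    arcsOpacity = 0.6,
    atmosphere = TRUE
  )
}
```

Printing `globe` in an interactive session displays the widget. With the real attacks2 data, the same subsetting idea applies: filter to a manageable number of rows before passing them as arcs.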