A colleague asked if I would blog about how I crafted the grid of world tile grids in this post and I accepted the challenge. The technique isn’t too hard as it just builds on the initial work by Jon Schwabish and a handy file made by Maarten Lambrechts.
The Premise
For this particular use-case, I sifted through our internet scan data and classified a series of device families from their telnet banners then paired that with our country-level attribution data for each IPv4 address. I’m not generally “a fan” of rolling things up at a country level, but since many (most) of these devices are residential or small/medium-business routers, country-level attribution has some merit.
But I’m also not a fan of country-level choropleths when it comes to “cyber”, nor am I wont to use area-skewed cartograms, since most folks still cannot interpret them. Both of those take up a ton of screen real estate, too, especially if you have more than one of them. Yet I wanted to show a map-like structure without resorting to Hilbert IPv4 heatmaps, since they are not very readable by a general audience and become skewed when you have to move up from 1 pixel == 1 Class C network block.
I think the tile grid is a great compromise since it avoids the “area” and projection skewness confusion that regular global choropleths cause while still preserving geographic & positional proximity. Sure, they’ll take some getting used to by casual readers, but I felt it was the best of all the tradeoffs.
The Setup
Here’s the data:
library(here)
library(hrbrthemes)
library(tidyverse)
wtg <- read_csv("https://gist.githubusercontent.com/maartenzam/787498bbc07ae06b637447dbd430ea0a/raw/9a9dafafb44d8990f85243a9c7ca349acd3a0d07/worldtilegrid.csv")
glimpse(wtg)
## Observations: 192
## Variables: 11
## $ name "Afghanistan", "Albania", "Algeria", "Angola",...
## $ alpha.2 "AF", "AL", "DZ", "AO", "AQ", "AG", "AR", "AM"...
## $ alpha.3 "AFG", "ALB", "DZA", "AGO", "ATA", "ATG", "ARG...
## $ country.code "004", "008", "012", "024", "010", "028", "032...
## $ iso_3166.2 "ISO 3166-2:AF", "ISO 3166-2:AL", "ISO 3166-2:...
## $ region "Asia", "Europe", "Africa", "Africa", "Antarct...
## $ sub.region "Southern Asia", "Southern Europe", "Northern ...
## $ region.code "142", "150", "002", "002", NA, "019", "019", ...
## $ sub.region.code "034", "039", "015", "017", NA, "029", "005", ...
## $ x 22, 15, 13, 13, 15, 7, 6, 20, 24, 15, 21, 4, 2...
## $ y 8, 9, 11, 17, 23, 4, 14, 6, 19, 6, 7, 2, 9, 8,...
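To make the role of the `x` and `y` columns concrete, here is a minimal sketch that draws a bare tile grid from just the first four rows shown in the `glimpse()` output. The `mini_grid` name is a hypothetical stand-in for the full `wtg` data frame, and the styling choices are assumptions for illustration only.

```r
library(ggplot2)

# First four rows of the world tile grid, copied from the glimpse() output;
# `mini_grid` is a hypothetical stand-in for the full `wtg` data frame
mini_grid <- data.frame(
  alpha.2 = c("AF", "AL", "DZ", "AO"),
  x = c(22, 15, 13, 13),
  y = c(8, 9, 11, 17)
)

# One square tile per country; y is reversed so that row 1 sits at the top
p <- ggplot(mini_grid, aes(x, y)) +
  geom_tile(fill = "white", color = "#b2b2b2") +
  geom_text(aes(label = alpha.2), size = 3) +
  scale_y_reverse() +
  coord_equal()

p
```

Each country is simply a cell address, so anything that can be mapped to `fill` can be shown on the grid without a projection or shapefile.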
routers <- read_csv(here::here("data", "routers.csv"))
routers
## # A tibble: 453,027 x 3
## type country_name country_code
##
## 1 mikrotik Slovak Republic SK
## 2 mikrotik Czechia CZ
## 3 mikrotik Colombia CO
## 4 mikrotik Bosnia and Herzegovina BA
## 5 mikrotik Czechia CZ
## 6 mikrotik Brazil BR
## 7 mikrotik Vietnam VN
## 8 mikrotik Brazil BR
## 9 mikrotik India IN
## 10 mikrotik Brazil BR
## # ... with 453,017 more rows
distinct(routers, type) %>%
arrange(type) %>%
print(n=11)
## # A tibble: 11 x 1
## type
##
## 1 asus
## 2 dlink
## 3 huawei
## 4 linksys
## 5 mikrotik
## 6 netgear
## 7 qnap
## 8 tplink
## 9 ubiquiti
## 10 upvel
## 11 zte
So, we have 11 different device families under assault by “VPNFilter” and I wanted to show the global distribution of them. Knowing the compact world tile grid would facet well, I set off to make it happen.
Let’s get some decent names for facet labels:
real_names <- read_csv(here::here("data", "real_names.csv"))
real_names
## # A tibble: 11 x 2
## type lab
##
## 1 asus Asus Device
## 2 dlink D-Link Devices
## 3 huawei Huawei Devices
## 4 linksys Linksys Devices
## 5 mikrotik Mikrotik Devices
## 6 netgear Netgear Devices
## 7 qnap QNAP Devices
## 8 tplink TP-Link Devices
## 9 ubiquiti Ubiquiti Devices
## 10 upvel Upvel Devices
## 11 zte ZTE Devices
Next, we need to summarise our scan results and pair them up with the world tile grid data and our real names:
count(routers, country_code, type) %>% # summarise the data into the number of devices per country and device family
left_join(wtg, by = c("country_code" = "alpha.2")) %>% # join them up on the common field
filter(!is.na(alpha.3)) %>% # we only want countries on the grid, and MaxMind attributes some things to meta-regions and anonymous proxies
left_join(real_names) -> wtg_routers
glimpse(wtg_routers)
## Observations: 629
## Variables: 14
## $ country_code "AE", "AE", "AE", "AF", "AF", "AF", "AG", "AL"...
## $ type "asus", "huawei", "mikrotik", "huawei", "mikro...
## $ n 1, 12, 70, 12, 264, 27, 1, 941, 2081, 7, 2, 1,...
## $ name "United Arab Emirates", "United Arab Emirates"...
## $ alpha.3 "ARE", "ARE", "ARE", "AFG", "AFG", "AFG", "ATG...
## $ country.code "784", "784", "784", "004", "004", "004", "028...
## $ iso_3166.2 "ISO 3166-2:AE", "ISO 3166-2:AE", "ISO 3166-2:...
## $ region "Asia", "Asia", "Asia", "Asia", "Asia", "Asia"...
## $ sub.region "Western Asia", "Western Asia", "Western Asia"...
## $ region.code "142", "142", "142", "142", "142", "142", "019...
## $ sub.region.code "145", "145", "145", "034", "034", "034", "029...
## $ x 20, 20, 20, 22, 22, 22, 7, 15, 15, 15, 20, 20,...
## $ y 10, 10, 10, 8, 8, 8, 4, 9, 9, 9, 6, 6, 6, 6, 1...
## $ lab "Asus Device", "Huawei Devices", "Mikrotik Dev...
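The count-join-filter step is easier to follow on a toy example. The sketch below uses made-up scan rows (`toy_routers`) and a two-country slice of the grid (`toy_grid`); both names are hypothetical, and the `"A1"` code is just a placeholder for an attribution that does not correspond to a country on the grid.

```r
library(dplyr)

# Hypothetical scan rows; "A1" is a placeholder for a non-country attribution
toy_routers <- tibble(
  type = c("mikrotik", "mikrotik", "huawei", "mikrotik"),
  country_code = c("AF", "AF", "AL", "A1")
)

# Grid coordinates for two countries, copied from the glimpse() output
toy_grid <- tibble(
  alpha.2 = c("AF", "AL"),
  alpha.3 = c("AFG", "ALB"),
  x = c(22, 15),
  y = c(8, 9)
)

toy_joined <- count(toy_routers, country_code, type) %>% # one row per country/type pair
  left_join(toy_grid, by = c("country_code" = "alpha.2")) %>% # attach grid coordinates
  filter(!is.na(alpha.3)) # "A1" has no grid cell, so it is dropped here

toy_joined
```

Filtering on a column that only the grid supplies (`alpha.3`) is a compact way of keeping just the rows that found a match in the join.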
Then, plot it:
ggplot(wtg_routers, aes(x, y, fill=n, group=lab)) +
geom_tile(color="#b2b2b2", size=0.125) +
scale_y_reverse() +
viridis::scale_fill_viridis(name="# Devices", trans="log10", na.value="white", label=scales::comma) +
facet_wrap(~lab, ncol=3) +
coord_equal() +
labs(
x=NULL, y=NULL,
title = "World Tile Grid Per-country Concentration of\nSeriously Poorly Configured Network Devices",
subtitle = "Device discovery based on in-scope 'VPNFilter' vendor device banner strings",
caption = "Source: Rapid7 Project Sonar & Censys"
) +
theme_ipsum_rc(grid="") +
theme(panel.background = element_rect(fill="#969696", color="#969696")) +
theme(axis.text=element_blank()) +
theme(legend.direction="horizontal") +
theme(legend.key.width = unit(2, "lines")) +
theme(legend.position=c(0.85, 0.1))
Doh! We forgot to ensure we had data for every country. Let’s try that again:
count(routers, country_code, type) %>%
complete(country_code, type) %>%
filter(!is.na(country_code)) %>%
left_join(wtg, c("country_code" = "alpha.2")) %>%
filter(!is.na(alpha.3)) %>%
left_join(real_names) %>%
complete(country_code, type, x=unique(wtg$x), y=unique(wtg$y)) %>%
filter(!is.na(lab)) %>%
ggplot(aes(x, y, fill=n, group=lab)) +
geom_tile(color="#b2b2b2", size=0.125) +
scale_y_reverse() +
viridis::scale_fill_viridis(name="# Devices", trans="log10", na.value="white", label=scales::comma) +
facet_wrap(~lab, ncol=3) +
coord_equal() +
labs(
x=NULL, y=NULL,
title = "World Tile Grid Per-country Concentration of\nSeriously Poorly Configured Network Devices",
subtitle = "Device discovery based on in-scope 'VPNFilter' vendor device banner strings",
caption = "Source: Rapid7 Project Sonar & Censys"
) +
theme_ipsum_rc(grid="") +
theme(panel.background = element_rect(fill="#969696", color="#969696")) +
theme(axis.text=element_blank()) +
theme(legend.direction="horizontal") +
theme(legend.key.width = unit(2, "lines")) +
theme(legend.position=c(0.85, 0.1))
That’s better.
We take advantage of ggplot2’s ability to facet and just ensure we have complete (even if NA) tiles for each panel.
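As a minimal sketch of what `complete()` contributes, consider a hypothetical two-country, two-vendor count table (`toy_counts` is made up for illustration) in which each country is missing one vendor:

```r
library(dplyr)
library(tidyr)

# Hypothetical counts: "AF" has no "asus" row and "AL" has no "huawei" row
toy_counts <- tibble(
  country_code = c("AF", "AL"),
  type = c("huawei", "asus"),
  n = c(12L, 7L)
)

# complete() adds the missing country/type pairs and fills n with NA
toy_complete <- complete(toy_counts, country_code, type)

toy_complete
```

The added rows carry `NA` counts, which the fill scale in the plot draws using `na.value = "white"`, so every facet gets a tile for each country in the data rather than a hole where the panel background shows through.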
Once consumers start seeing these used more they’ll be able to pick up key markers (or one of us will come up with a notation that makes key markers more visible) and be able to get specific information from the chart. I just wanted to show regional and global differences between vendors (and really give MikroTik users a swift kick in the patootie for being so bad with their kit).
FIN
You can find the RStudio project (code + data) here: (http://rud.is/dl/tile-grid-grid.zip)
GDPR Unintended Consequences Part 1 — Increasing WordPress Blog Exposure
I pen this mini-tome on “GDPR Enforcement Day”. The spirit of GDPR is great, but it’s just going to be another Potemkin village in most organizations, much like PCI or SOX. For now, the only things GDPR has done are make GDPR consulting companies rich, increase the use of JavaScript on web sites so they can pop up useless banners we keep telling users not to click on, and increase the size of email messages to include mandatory postscripts (that should really be at the beginning of the message, but, hey, faux privacy is faux privacy).
Those are just a few of the “unintended consequences” of GDPR. Just like Let’s Encrypt & “HTTPS Everywhere” turned into “Let’s Enable Criminals and Hurt Real People With Successful Phishing Attacks”, GDPR is going to cause a great deal of downstream issues that either the designers never thought of or decided — in their infinite, superior wisdom — were completely acceptable to make themselves feel better.
Today’s installment of “GDPR Unintended Consequences” is WordPress.
WordPress “powers” a substantial part of the internet. As such, it is a perma-target of attackers.
Since the GDPR Intelligentsia provided a far-too-long lead-time on both the inaugural and mandated enforcement dates for GDPR and also created far more confusion with the regulations than clarity, WordPress owners are flocking to “single button install” solutions to make them magically GDPR compliant (#protip: that’s not “a thing”). Here’s a short list of plugins and active installation counts (no links since I’m not going to encourage attack surface expansion):
I’m somewhat confident that a fraction of those publishers follow secure coding guidelines (it may be a small fraction). But, if I were an attacker, I’d be poking pretty hard at a few of those with six-figure installs to see if I could find a usable exploit.
GDPR just gave attackers a huge footprint of homogeneous resources to attempt at-scale exploits. They will very likely succeed (over-and-over-and-over again). This means that GDPR just increased the likelihood of losing your data privacy…the complete opposite of the intent of the regulation.
There are more unintended consequences and I’ll pepper the blog with them as the year and pain progress.