
Category Archives: cartography

I caught a re-tweet of this tweet by @harry_stevens:

Harry’s thread and Observable post are great on their own and both show the power and utility of Observable javascript notebooks.

However, the re-tweet (which I’m not posting because it’s daft) took a swipe at both Python & R. Now, I’m all for a good swipe at Python (mostly to ensure we never forget all those broken spacebars and tab keys that language has caused) but I’ll gladly defend it and R together when it comes to Getting Things Done, even on deadline.

Let’s walk through what one of us might have done had we been in the same scenario as Harry.

Mapping On A Deadline

So, we have to create a map of historical tornado frequency trends on deadline.

We emailed researchers and received three txt files. One is a set of latitudes, another longitudes, and the final one is the trend value. It’s gridded data.

Download that ZIP (and pretend you received three separate files in email rather than a tidy ZIP), make a new RStudio project called “tornado”, and put those three files in a data/ directory at the project root. Let’s read them in and look at them:

library(hrbrthemes) # not 100% necessary but i like my ggplot2 theme(s) :-)
library(tidyverse)  # data wrangling & ggplot2

tibble(
  lat = scan(here::here("data/lats.txt")),
  lon = scan(here::here("data/lons.txt")),
  trend = scan(here::here("data/trends.txt"))
) -> tornado

You very likely never use the base::scan() function directly, but it’s handy here since we just have files of doubles with each value separated by whitespace. Now, let’s see what we have:

tornado
## # A tibble: 30,000 x 3
##      lat   lon trend
##    <dbl> <dbl> <dbl>
##  1 0.897 -180.     0
##  2 0.897 -179.     0
##  3 0.897 -178.     0
##  4 0.897 -176.     0
##  5 0.897 -175.     0
##  6 0.897 -174.     0
##  7 0.897 -173.     0
##  8 0.897 -172.     0
##  9 0.897 -170.     0
## 10 0.897 -169.     0
## # … with 29,990 more rows

summary(tornado)
##      lat               lon                 trend           
## Min.   : 0.8973   Min.   :-179.99808   Min.   :-0.4733610  
## 1st Qu.:22.0063   1st Qu.: -90.00066   1st Qu.: 0.0000000  
## Median :43.1154   Median :  -0.00323   Median : 0.0000000  
## Mean   :43.1154   Mean   :  -0.00323   Mean   : 0.0002756  
## 3rd Qu.:64.2245   3rd Qu.:  89.99419   3rd Qu.: 0.0000000  
## Max.   :85.3335   Max.   : 179.99161   Max.   : 0.6314569  

#+ grid-overview
ggplot(tornado, aes(lon, lat)) +
  geom_point(aes(color = trend))

#+ trend-overview
ggplot(tornado, aes(trend)) +
  geom_histogram() +
  scale_x_continuous(breaks = seq(-0.5, 0.5, 0.05))

Since we’re looking for trends (in either direction) in just the United States, the latitude and longitude ranges will need to be shrunk down a bit (it does indeed look like globally gridded data), and we’ll be able to shrink the data set a bit more since we only want to look at large or small trends.

We don’t really need modern R/ggplot2 mapping idioms for this project (i.e. the new {sf} ecosystem), so we’ll keep it “simple” (scare quotes since that’s a loaded term) and just use the built-in maps and geom_map(). First, let’s get the U.S. states and extract their bounding boxes/limits:

maps::map("state", ".", exact = FALSE, plot = FALSE, fill = TRUE) %>% 
  fortify(map_obj) %>% 
  as_tibble() -> state_map

xlim <- range(state_map$long)
ylim <- range(state_map$lat)

NOTE: I tend not to use the handy ggplot2::map_data() function since it ends up clobbering purrr::map(), which I use heavily (though not in this post). I also try to use {sf} these days, so this tends not to be an issue anymore anyway.
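For reference, the map_data() equivalent is the one-liner below; it yields the same columns, and its attaching of {maps} is exactly what does the clobbering:

state_map <- tibble::as_tibble(ggplot2::map_data("state"))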

Now, let’s focus in on the target area in the original paper and the Axios article:

filter(
  tornado,
  between(lon, -107, xlim[2]), between(lat, ylim[1], ylim[2]), # -107 gets us ~left-edge of TX
  ((trend < -0.07) | (trend > 0.07)) # approximates notebook selection range
) -> tornado

#+ grid-overview-2
ggplot(tornado, aes(lon, lat)) +
  geom_point(aes(color = trend))

Now we’re getting close to our final solution.

As stated in the Observable notebook and implied by the word “grid”, these dots are centroids of grid rectangles. This means we really want boxes, not points. The article got all fancy here, but that’s not really necessary since we can use ggplot2::geom_tile() to get us said boxes:

#+ grid-overview-3
ggplot(tornado, aes(lon, lat)) +
  geom_tile(aes(fill = trend, color = trend))

Now, we just need to add in map layers, and tweak some aesthetics to make it look like a map. We’ll start naively:

#+ map-1
ggplot() +
  geom_tile(
    data = tornado,
    aes(lon, lat, fill = trend, color = trend)
  ) +
  geom_map(
    data = state_map, map = state_map,
    aes(long, lat, map_id = region),
    color = "black", size = 0.125, fill = NA
  )

Our gridded data is definitely covering the right/same areas so we just need to make this more suitable for an article. We’ll use Harry’s palette and layer in U.S. state borders, an overall country border, and approximate the title and legend aesthetics:

#+ map-final
c(
  "#023858", "#045a8d", "#0570b0", "#3690c0", "#74a9cf",
  "#a6bddb", "#d0d1e6", "#ece7f2", "#fff7fb", "#ffffff",
  "#ffffcc", "#ffeda0", "#fed976", "#feb24c", "#fd8d3c",
  "#fc4e2a", "#e31a1c", "#bd0026", "#800026"
) -> grad_cols # colors from article

ggplot() +

  # tile layer

  geom_tile(
    data = tornado,
    aes(lon, lat, fill = trend, color = trend)
  ) +

  # state borders

  geom_map(
    data = state_map, map = state_map,
    aes(long, lat, map_id = region),
    color = ft_cols$slate, size = 0.125, fill = NA
  ) +

  # usa border

  borders("usa", colour = "black", size = 0.5) +

  # color scales

  scale_colour_gradientn(
    colours = grad_cols,
    labels = c("Fewer", rep("", 4), "More"),
    name = "Change in tornado frequency, 1979-2017"
  ) +
  scale_fill_gradientn(
    colours = grad_cols,
    labels = c("Fewer", rep("", 4), "More"),
    name = "Change in tornado frequency, 1979-2017"
  ) +

  # make it Albers-ish and ensure we can fit the borders in 

  coord_map(
    projection = "polyconic",
    xlim = scales::expand_range(range(tornado$lon), add = 2),
    ylim = scales::expand_range(range(tornado$lat), add = 2)
  ) +

  # tweak legend aesthetics

  guides(
    colour = guide_colourbar(
      title.position = "top", title.hjust = 0.5
    ),
    fill = guide_colourbar(
      title.position = "top", title.hjust = 0.5
    )
  ) +
  labs(
    x = NULL, y = NULL
  ) +
  theme_ipsum_rc(grid="") +
  theme(axis.text = element_blank()) +
  theme(legend.position = "top") +
  theme(legend.title = element_text(size = 16, hjust = 0.5)) +
  theme(legend.key.width = unit(4, "lines")) +
  theme(legend.key.height = unit(0.5, "lines"))

FIN

I went through some extra steps for folks new to R, but the overall approach was at least as expedient as the Observable one and — despite the claims of the quite daft retweet — this is no less “shareable” or “reusable” than the Observable notebook. You can clone the repo (https://git.sr.ht/~hrbrmstr/tornado) and reuse this work immediately.

If you take a stab at an alternate approach — especially if you do use {sf} — definitely blog about it and drop a link here or on Twitter.

The CBC covered the recent (as of this post’s original publication) Quebec elections and used a well-crafted hex grid map to display the results:

They have a great ‘splainer on why they use this type of map.

Thinking that it may be useful for others, I used a browser Developer Tools inspector to yank out the javascript-created SVG, wrangled the hexes out of it with {svg2geojson}, and put them into a GeoJSON file along with some metadata that I extracted from the minified javascript on the CBC’s site and turned into a data frame using the {V8} package. Most of that work was mouse clicking plus ~8 (disjointed) lines of super-basic R code, so there’s not much to show wrangling-wise[1], but the {V8} half looked roughly like the sketch below.
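(Purely a sketch: the bundle file name and the javascript variable name here are hypothetical — you’d hunt the real one down in the minified source.)

library(V8)

ctx <- v8()
ctx$source("cbc-bundle.js")       # the minified CBC javascript, saved locally (name hypothetical)
ridings <- ctx$get("ridingData")  # "ridingData" is a stand-in for the real variable name

With the GeoJSON file in hand, here’s an example of using it in R with the {sf} package: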

library(sf)
library(ggplot2)

# get the GeoJSON file from: https://gitlab.com/hrbrmstr/quebec-hex-ridings or https://github.com/hrbrmstr/quebec-hex-ridings
sf::st_read("quebec-ridings.geojson", quiet = TRUE, stringsAsFactors = FALSE) %>% 
  ggplot() +
  geom_sf(aes(fill = regionname)) +
  coord_sf(datum = NA) +
  ggthemes::scale_fill_tableau(name = NULL, "Tableau 20") +
  ggthemes::theme_map() +
  theme(legend.position = "bottom")

And, with a little more ggplot2-tweaking and some magick, we can even put it in the CBC-styled border:

library(sf)
library(magick)
library(ggplot2)

plt <- image_graph(1488, 1191, bg = "white")
sf::st_read("quebec-ridings.geojson", quiet=TRUE, stringsAsFactors=FALSE) %>% 
  ggplot() +
  geom_sf(aes(fill=regionname)) +
  coord_sf(datum=NA) +
  scale_x_continuous(expand=c(0,2)) +
  scale_y_continuous(expand=c(0,0)) +
  ggthemes::theme_map() +
  theme(plot.margin = margin(t=150)) +
  theme(legend.position = "none")
dev.off()

# get this bkgrnd img from the repo
image_composite(plt, image_read("imgs/background.png")) %>% 
  image_write("imgs/composite-map.png")

You can tweak the border color with {magick} as needed and there’s a background2.png in the imgs directory in the repo that has the white inset that you can further composite as needed.

With a teensy bit of work you should be able to adjust the stroke color via aes() to separate things as the CBC did.
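A minimal sketch of that, mapping the stroke to the same variable as the fill (tweak size to taste):

library(sf)
library(ggplot2)

sf::st_read("quebec-ridings.geojson", quiet = TRUE, stringsAsFactors = FALSE) %>%
  ggplot() +
  geom_sf(aes(fill = regionname, color = regionname), size = 0.33) + # color = hex stroke
  coord_sf(datum = NA) +
  ggthemes::theme_map() +
  theme(legend.position = "none")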

FIN

It’s important to re-state that the CBC made the original polygons for the hexes (well, they made a set of grid points and open source software turned it into a set of SVG paths) and the background images. All I did was some extra bit of wrangling and conversionating[2].

[1] I can toss a screencast if there’s sufficient interest.
[2] Totally not a word.

In my semi-daily run of brew update I noticed that proj4 had been updated to 5.2. I kinda “squeee”-d since (as the release notes show) the Equal Earth projection was added to it (+proj=eqearth).

As the team who created the projection describes it: “The Equal Earth map projection is a new equal-area pseudocylindrical projection for world maps. It is inspired by the widely used Robinson projection, but unlike the Robinson projection, retains the relative size of areas. The projection equations are simple to implement and fast to evaluate. Continental outlines are shown in a visually pleasing and balanced way.”

They continue: “The Robinson and Equal Earth projections share a similar outer shape[…] Upon close inspection, however, the way that they differ becomes apparent. The Equal Earth with a height-to-width ratio of 1:2.05 is slightly wider than the Robinson at 1:1.97. On the Equal Earth, the continents in tropical and mid-latitude areas are more elongated (north to south) and polar areas are more flattened. This is a consequence of Equal Earth being equal-area in contrast to the Robinson that moderately exaggerates high-latitude areas.”

Here’s a comparison of it to other, similar, projections:

©Taylor & Francis Group, 2018. All rights reserved.

Map projections are a pretty nerdy thing, but this one even got the attention of Newsweek.

To use this new projection in R right now, you’ll need to install the {proj4} package from source after upgrading to the new proj4 library. That was a simple brew upgrade for me, and Linux users can do the various package manager incantations to get theirs as well. Windows users can be jealous for a while until updated package binaries make their way to CRAN (or switch to a real platform).
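The reinstall itself is a one-liner (assuming your compiler toolchain is set up):

install.packages("proj4", type = "source")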

After a fresh source install of proj4 all you have to do is:

library(ggalt)      # git[la|hu]b/hrbrmstr/ggalt
library(hrbrthemes) # git[la|hu]b/hrbrmstr/hrbrthemes
library(ggplot2)

world <- map_data("world")

ggplot() +
  geom_map(
    map = world, data = world,
    aes(long, lat, map_id = region), 
    color = ft_cols$white, fill = ft_cols$slate,
    size = 0.125
  ) +
  coord_proj("+proj=eqearth") +
  labs(
    x = NULL, y = NULL,
    title = "Equal Earth Projection (+proj=eqearth)"
  ) +
  theme_ft_rc(grid="") +
  theme(axis.text=element_blank())

to get:

Remember, friends don't let friends use Mercator.

Earlier this year, I made a package that riffed off of ProPublica’s really neat voting cartograms (maps) for the U.S. House and Senate. You can see one for disaster relief spending in the House and one for the ACA “Skinny Repeal” in the Senate.

We can replicate both here with the voteogram package (minus the interactivity, for now):

library(voteogram)
library(ggplot2)

hr_566 <- roll_call(critter="house", number=115, session=1, rcall=566)

house_carto(hr_566) +
  coord_equal() +
  theme_voteogram()

sen_179 <- roll_call(critter="senate", number=115, session=1, rcall=179)

senate_carto(sen_179) +
  coord_equal() +
  theme_voteogram()

I think folks might have more fun with the roll_call() objects though:

str(hr_566)
## List of 29
##  $ vote_id              : chr "H_115_1_566"
##  $ chamber              : chr "House"
##  $ year                 : int 2017
##  $ congress             : chr "115"
##  $ session              : chr "1"
##  $ roll_call            : int 566
##  $ needed_to_pass       : int 282
##  $ date_of_vote         : chr "October 12, 2017"
##  $ time_of_vote         : chr "03:23 PM"
##  $ result               : chr "Passed"
##  $ vote_type            : chr "2/3 YEA-AND-NAY"
##  $ question             : chr "On Motion to Suspend the Rules and Agree"
##  $ description          : chr "Providing for the concurrence by the House in the Senate amendment to H.R. 2266, with an amendment"
##  $ nyt_title            : chr "On Motion to Suspend the Rules and Agree"
##  $ total_yes            : int 353
##  $ total_no             : int 69
##  $ total_not_voting     : int 11
##  $ gop_yes              : int 164
##  $ gop_no               : int 69
##  $ gop_not_voting       : int 7
##  $ dem_yes              : int 189
##  $ dem_no               : int 0
##  $ dem_not_voting       : int 5
##  $ ind_yes              : int 0
##  $ ind_no               : int 0
##  $ ind_not_voting       : int 0
##  $ dem_majority_position: chr "Yes"
##  $ gop_majority_position: chr "Yes"
##  $ votes                :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  435 obs. of  11 variables:
##   ..$ bioguide_id         : chr [1:435] "A000374" "A000370" "A000055" "A000371" ...
##   ..$ role_id             : int [1:435] 274 294 224 427 268 131 388 320 590 206 ...
##   ..$ member_name         : chr [1:435] "Ralph Abraham" "Alma Adams" "Robert B. Aderholt" "Pete Aguilar" ...
##   ..$ sort_name           : chr [1:435] "Abraham" "Adams" "Aderholt" "Aguilar" ...
##   ..$ party               : chr [1:435] "R" "D" "R" "D" ...
##   ..$ state_abbrev        : chr [1:435] "LA" "NC" "AL" "CA" ...
##   ..$ display_state_abbrev: chr [1:435] "La." "N.C." "Ala." "Calif." ...
##   ..$ district            : int [1:435] 5 12 4 31 12 3 2 19 36 2 ...
##   ..$ position            : chr [1:435] "Yes" "Yes" "Yes" "Yes" ...
##   ..$ dw_nominate         : num [1:435] 0.493 -0.462 0.36 -0.273 0.614 0.684 0.388 NA 0.716 NA ...
##   ..$ pp_id               : chr [1:435] "LA_5" "NC_12" "AL_4" "CA_31" ...
##  - attr(*, "class")= chr [1:2] "pprc" "list"

as they hold tons of info on the votes.
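For example, a quick cross-tab of position by party (a sketch using the $votes tibble from above):

library(dplyr)

count(hr_566$votes, party, position)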

I need to explore the following a bit more but there are some definite “patterns” in the way the 115th Senate has voted this year:

library(tidyverse)  # map(), map_df() & write_rds() below live here
library(hrbrthemes)

# I made a mistake in how I exposed these that I'll correct next month
# but we need to munge it a bit anyway for this view
fills <- voteogram:::vote_carto_fill
names(fills) <- tolower(names(fills))

rcalls <- map(1:280, ~voteogram::roll_call(critter="senate", session=1, number=115, rcall=.x))
# save it off so you don't have to waste those calls again
write_rds(rcalls, "2017-115-1-sen-280-roll-calls.rds")

# do a bit of wrangling
map_df(rcalls, ~{
  mutate(.x$votes, vote_id = .x$vote_id) %>% 
    arrange(party, position) %>% 
    mutate(fill = tolower(sprintf("%s-%s", party, position))) %>% 
    mutate(ques = .x$question) %>% 
    mutate(x = 1:n())
}) -> votes_df

# plot it
ggplot(votes_df, aes(x=x, y=vote_id, fill=fill)) +
  geom_tile() +
  scale_x_discrete(expand=c(0,0)) +
  scale_y_discrete(expand=c(0,0)) +
  scale_fill_manual(name=NULL, values=fills) +
  labs(x=NULL, y=NULL, title="Senate Roll Call Votes",
       subtitle="2017 / 115th, Session 1, Votes 1-280",
       caption="Note free-Y scales") +
  facet_wrap(~ques, scales="free_y", ncol=3) +
  theme_ipsum_rc(grid="") +
  theme(axis.text = element_blank()) +
  theme(legend.position="right")

Hopefully I’ll get some time to dig into the differences and report on anything interesting. If you get to it before me definitely link to your blog post in a comment!

FIN

I still want to make an htmlwidgets version of the plots and also add the ability to get the index of roll call votes by Congress number and session to make it easier to iterate.

I’m also seriously considering creating different palettes. I used the ones from the source interactive site but am not 100% happy with them. Suggestions/PRs welcome.

Hopefully this package will make it easier for U.S. folks to track what’s going on in Congress and keep their representatives more accountable to the truth.

Everything’s on GitHub so please file issues, questions or PRs there.

Political machinations are a tad insane in the U.S. these days & I regularly hit up @ProPublica & @GovTrack sites (& sub to the GovTrack e-mail updates) as I try to be an informed citizen, especially since I’ve got a Senator and Representative who seem to be in someone else’s sway.

I’ve always appreciated the ProPublica and GovTrack cartograms as they present a great deal of information in a compact space (especially the House versions). Something nudged me into starting an R package to let folks create them in R (mainly with ggplot2 but an htmlwidget version is planned), which I’ve dubbed voteogram.

With the voteogram package, you can:

  • pull ProPublica roll call vote data for the 101st Congress up through today (via roll_call())
  • plot ProPublica-esque Senate roll call vote cartograms
  • plot ProPublica-esque House roll call vote cartograms
  • plot GovTrack-esque House roll call vote cartograms

GovTrack uses — what I’ve seen @thosjleeper refer to as — a “parliamentary plot” for their version of the Senate roll call cartogram and sir Leeper already has that type of plot covered in ggparliament, so I’ve just focused on the other ones here.

Roll Call

You need data for these cartogram generation functions and you can specify your own populated data frame (the needed columns are in the manual pages for the cartogram plotters). However, you’ll likely want to plot existing data that others have tallied and ProPublica makes that super simple since each vote is in a standalone JSON file. All you have to do is specify whether you want the roll call vote for the house or senate, the Congress number (current one is 115), the session number (current one is 1) and the roll call vote number.

For example, we can see all the ~~idiots~~ Representatives who voted, recently, to ~~kill people~~ repeal the ACA with the following function call:

(h256 <- roll_call("house", 115, 1, 256))
## 115th Congress / Session: 1 / House Roll Call: 256 / May  4, 2017
## 
## American Health Care Act
## 
## Result: Passed

str(h256, max.level = 1)
## List of 29
##  $ vote_id              : chr "H_115_1_256"
##  $ chamber              : chr "House"
##  $ year                 : int 2017
##  $ congress             : chr "115"
##  $ session              : chr "1"
##  $ roll_call            : int 256
##  $ needed_to_pass       : int 216
##  $ date_of_vote         : chr "May  4, 2017"
##  $ time_of_vote         : chr "02:18 PM"
##  $ result               : chr "Passed"
##  $ vote_type            : chr "RECORDED VOTE"
##  $ question             : chr "On Passage"
##  $ description          : chr "American Health Care Act"
##  $ nyt_title            : chr "On Passage"
##  $ total_yes            : int 217
##  $ total_no             : int 213
##  $ total_not_voting     : int 1
##  $ gop_yes              : int 217
##  $ gop_no               : int 20
##  $ gop_not_voting       : int 1
##  $ dem_yes              : int 0
##  $ dem_no               : int 193
##  $ dem_not_voting       : int 0
##  $ ind_yes              : int 0
##  $ ind_no               : int 0
##  $ ind_not_voting       : int 0
##  $ dem_majority_position: chr "No"
##  $ gop_majority_position: chr "Yes"
##  $ votes                :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  435 obs. of  11 variables:
##  - attr(*, "class")= chr [1:2] "pprc" "list"

As you can see, it has a custom print function and the usable data (for cartographic needs) is in $votes. You can go to town with just that information, making bar charts or tracking individual Congress-critter votes.
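For instance, a quick bar-chart sketch straight from $votes:

library(ggplot2)

ggplot(h256$votes, aes(party, fill = position)) +
  geom_bar(position = "dodge")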

Do your best to cache this data as you retrieve it. ProPublica is a non-profit and the JSON files are on AWS. While Amazon’s S3 service allots a certain amount of free bandwidth per month, it’s best to make sure you’re not tipping them over on any given month. Plus, the vote data doesn’t change once it’s recorded. Consider donating to them if you decide to always grab fresh copies.
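A minimal cache wrapper might look like this sketch (the helper name and the file-naming scheme are my own conventions):

cached_roll_call <- function(critter, number, session, rcall) {
  fil <- sprintf("%s-%d-%d-%d.rds", critter, number, session, rcall)
  if (file.exists(fil)) return(readRDS(fil)) # reuse the local copy if we have one
  res <- voteogram::roll_call(critter, number, session, rcall)
  saveRDS(res, fil)
  res
}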

There’s a fortify function for this object (it’s classed pprc) so you can pass it right into ggplot() for use or pipe it into a dplyr chain for aggregation & filtering.

House Rules

With the data in hand, we can make some cartograms (the real purpose of the package). I riffed off the ProPublica colors (and haven’t fully finished copying them yet as I need to search for 2 more categories of Independent voting colors) but you can replace them with anything you want. Just reset the scale and use the names in the exposed color value vectors.
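Something like this sketch (the category name below is illustrative — inspect names() on the vector first, and note ggplot2 will message about replacing the existing fill scale):

fills <- voteogram:::vote_carto_fill # internal for now (see the note further down)
fills["GOP Yes"] <- "#b2182b"        # "GOP Yes" is a guessed name; check names(fills)

house_carto(h256) +
  scale_fill_manual(name = NULL, values = fills) +
  theme_voteogram()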

There’s also a theme_voteogram() which is designed to augment any base theme (like hrbrthemes::theme_ipsum_rc()) (it’s much like ggthemes::theme_map()).

Here’s the ProPublica view for that particular vote:

house_carto(h256) +
  labs(x=NULL, y=NULL, 
       title="House Vote 256 - Passes American Health Care Act,\nRepealing Obamacare") +
  theme_ipsum_rc(plot_title_size = 24) +
  theme_voteogram()

The house_carto() function defaults to the ProPublica cartogram, but you can easily change that:

house_carto(rep, "gt") +
  labs(x=NULL, y=NULL, 
       title="House Vote 256 - Passes American Health Care Act,\nRepealing Obamacare") +
  theme_ipsum_rc(plot_title_size = 24) +
  theme_voteogram()

Senate Drools

Again, the senate_carto() function only has the ProPublica-esque cartogram available and works pretty much the same way after getting the Senate vote data:

sen <- roll_call("senate", 115, 1, 110)

senate_carto(sen) +
  labs(title="Senate Vote 110 - Invokes Cloture on Neil Gorsuch Nomination") +
  theme_ipsum_rc(plot_title_size = 24) +
  theme_voteogram()

FIN

There’s a bit of work left to do in the package (including an htmlwidget version). You’re invited to file PRs or Issues as you are so moved.

This is another purrr-focused post but it’s also an homage to the nascent magick package (R interface to ImageMagick) by @opencpu.

We’re starting to see/feel the impact of the increasing drought up here in southern Maine. I’ve used the data from the U.S. Drought Monitor before on the blog, but they also provide shapefiles and this seemed like a good opportunity to further demonstrate the utility of purrr and make animations directly using magick. Plus, I wanted to see the progression of the drought. Putting library() statements for purrr, magick and broom together was completely random, but I now feel compelled to find a set of functions to put into a cauldron package. But, I digress.

What does this demonstrate?

Apart from giving you an idea of the extent of the drought, working through this will help you:

  • use the quietly() function (which automagically captures a function’s warnings instead of printing them)
  • see another example of a formula function
  • illustrate the utility map_df(), and
  • see how to create an animation pipeline for magick

Comments are in the code and the drought gif is at the end. I deliberately only had it loop once, so refresh the image if you want to see the progression again. Also, drop a note in the comments if anything needs more exposition. (NOTE: I was fairly bad and did virtually no file cleanup in the function, so you’ll have half a year’s shapefiles in your getwd(). Consider the cleanup an exercise for the reader :-)

library(rgdal)
library(sp)
library(albersusa) # devtools::install_github("hrbrmstr/albersusa")
library(ggplot2) # devtools::install_github("hadley/ggplot2")
library(ggthemes)
library(rgeos)

# the witch's brew
library(purrr)
library(broom)
library(magick)

#' Get a drought map shapefile and turn it into a PNG
drought_map <- function(wk) {

  # need to hush some chatty functions
  hush_tidy <- quietly(tidy)

  # some are more stubborn than others
  old_warn <- getOption("warn")
  options(warn=-1)

  week <- format(wk, "%Y%m%d")

  # get the drought shapefile only if we don't have it already
  URL <- sprintf("http://droughtmonitor.unl.edu/data/shapefiles_m/USDM_%s_M.zip", week)
  (fil <- basename(URL))
  if (!file.exists(fil)) download.file(URL, fil)
  unzip(fil)

  # read in the shapefile and reduce the polygon complexity
  dr <- readOGR(sprintf("USDM_%s.shp", week),
                sprintf("USDM_%s", week),
                verbose=FALSE,
                stringsAsFactors=FALSE)

  dr <- SpatialPolygonsDataFrame(gSimplify(dr, 0.01, TRUE), dr@data)

  # separate out each drought level into its own fortified data.frame
  map(dr$DM, ~subset(dr, DM==.)) %>%
    map(hush_tidy) %>%
    map_df("result", .id="DM") -> m

  # get a conus base map (prbly cld have done map_data("usa"), too)
  usa_composite() %>%
    subset(!(iso_3166_2 %in% c("AK", "HI"))) %>%
    hush_tidy() -> usa

  usa <- usa$result # an artifact of using quietly()

  # this is all Ushey's fault. the utility of cmd-enter to run
  # the entire ggplot2 chain (in RStudio) turns out to have a
  # greater productivity boost (i measured it) than my shortcuts for
  # gg <- gg + snippets and hand-editing the "+" bits out when
  # editing old plot constructs. I'm not giving up on gg <- gg + tho

  # Note putting the "base" layer on top since we don't really
  # want to deal with alpha levels of the drought polygons and
  # we're only plotting the outline of the us/states, not filling
  # the interior(s).

  ggplot() +
    geom_map(data=m, map=m,
             aes(long, lat, fill=DM, map_id=id),
             color="#2b2b2b", size=0.05) +
    geom_map(data=usa, map=usa, aes(long, lat, map_id=id),
             color="#2b2b2b88", fill=NA, size=0.1) +
    scale_fill_brewer("Drought Level", palette="YlOrBr") +
    coord_map("polyconic", xlim=c(-130, -65), ylim=c(25, 50)) +
    labs(x=sprintf("Week: %s", wk)) +
    theme_map() +
    theme(axis.title=element_text()) +
    theme(axis.title.x=element_text()) +
    theme(axis.title.y=element_blank()) +
    theme(legend.position="bottom") +
    theme(legend.direction="horizontal") -> gg

  options(warn=old_warn) # put things back the way they were

  outfil <- sprintf("gg-dm-%s.png", wk)
  ggsave(outfil, gg, width=8, height=5)

  outfil

}

# - create a vector of weeks (minus the current one)
# - create the individual map PNGs
# - read the individual map PNGs into a list
# - join the images together
# - create the animated gif structure
# - write the gif to a file

seq(as.Date("2016-01-05"), Sys.Date(), by="1 week") %>%
  head(-1) %>%
  map(drought_map) %>%
  map(image_read) %>%
  image_join() %>%
  image_animate(fps=2, loop=1) %>%
  image_write("drought.gif")

NOTE: an updated, comment-free version of the above code block is in this gist and uses spdplyr::filter() vs subset(), keeps downloaded files tidy in a temporary directory and includes a progress bar vs raw, ugly download.file() messages.

Folks who’ve been tracking this blog on R-bloggers probably remember [this post](https://rud.is/b/2014/11/16/moving-the-earth-well-alaska-hawaii-with-r/) where I showed how to create a composite U.S. map with an Albers projection (which is commonly referred to as AlbersUSA these days thanks to D3).

I’m not sure why I didn’t think of this earlier, but you don’t _need_ to do those geographical machinations every time you want a prettier & more inclusive map (Alaska & Hawaii have been states for a while, so perhaps we should make more of an effort to include them in both data sets and maps). After doing the map transformations, the composite shape can be saved out to a shapefile, preferably GeoJSON since (a) you can use `geojsonio::geojson_write()` to save it and (b) it’s a single file vs a ZIP/directory.
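(That save step might look like this sketch, where `m` stands in for the transformed composite `SpatialPolygonsDataFrame` from the earlier post:)

geojsonio::geojson_write(m, file = "composite_us_states.geojson")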

I did just that and saved both state and country maps out with FIPS codes and other useful data slot bits and created a small data package : [`albersusa`](https://github.com/hrbrmstr/albersusa) : with some helper functions. It's not in CRAN yet so you need to `devtools::install_github("hrbrmstr/albersusa")` to use it. The github repo has some basic examples; here's a slightly more complex one.

### Mapping Obesity

I grabbed an [obesity data set](http://www.cdc.gov/diabetes/data/county.html) from the CDC and put together a compact example for how to make a composite U.S. county choropleth to show obesity rates per county (for 2012, which is the most recent data). I read in the Excel file, pull out the county FIPS code and 2012 obesity rate, then build the choropleth. It’s not a whole lot of code, but that’s one main reason for the package!

library(readxl)
library(rgeos)
library(maptools)
library(ggplot2)   # devtools::install_github("hadley/ggplot2") only if you want subtitles/captions
library(ggalt)
library(ggthemes)
library(albersusa) # devtools::install_github("hrbrmstr/albersusa")
library(viridis)
library(scales)
 
# get the data and be nice to the server and keep a copy of the data for offline use
 
URL <- "http://www.cdc.gov/diabetes/atlas/countydata/OBPREV/OB_PREV_ALL_STATES.xlsx"
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
 
# it's not a horrible Excel file, but we do need to hunt for the data
# and clean it up a bit. we just need FIPS & 2012 percent info
 
wrkbk <- read_excel(fil)
obesity_2012 <- setNames(wrkbk[-1, c(2, 61)], c("fips", "pct"))
obesity_2012$pct <- as.numeric(obesity_2012$pct) / 100
 
# I may make a version of this that returns a fortified data.frame but
# for now, we just need to read the built-in saved shapefile and turn it
# into something ggplot2 can handle
 
cmap <- fortify(counties_composite(), region="fips")
 
# and this is all it takes to make the map below
 
gg <- ggplot()
gg <- gg + geom_map(data=cmap, map=cmap,
                    aes(x=long, y=lat, map_id=id),
                    color="#2b2b2b", size=0.05, fill=NA)
gg <- gg + geom_map(data=obesity_2012, map=cmap,
                    aes(fill=pct, map_id=fips),
                    color="#2b2b2b", size=0.05)
gg <- gg + scale_fill_viridis(name="Obesity", labels=percent)
gg <- gg + coord_proj(us_laea_proj)
gg <- gg + labs(title="U.S. Obesity Rate by County (2012)",
                subtitle="Content source: Centers for Disease Control and Prevention",
           caption="Data from http://www.cdc.gov/diabetes/atlas/countydata/County_ListofIndicators.html")
gg <- gg + theme_map(base_family="Arial Narrow")
gg <- gg + theme(legend.position=c(0.8, 0.25))
gg <- gg + theme(plot.title=element_text(face="bold", size=14, margin=margin(b=6)))
gg <- gg + theme(plot.subtitle=element_text(size=10, margin=margin(b=-14)))
gg


### Fin

Note that some cartographers think of this particular map view the way I look at a pie chart, but it’s a compact & convenient way to keep the states/counties together and will make it easier to include Alaska & Hawaii in your cartographic visualizations.

The composite GeoJSON files are in:

– `system.file("extdata/composite_us_states.geojson.gz", package="albersusa")`
– `system.file("extdata/composite_us_counties.geojson.gz", package="albersusa")`

if you want to use them in another program/context.

Drop an issue [on github](https://github.com/hrbrmstr/albersusa) if you want any more default fields in the data slot and if you “need” territories (I’d rather have a PR for the latter tho :-).

As I said, I’m kinda obsessed with the “nuclear” data set. So much so that I made a D3 version that’s similar to the R version I made the other day. I tried not to code much today (too much Easter fun going on), so I left off the size & color legends, but it drops the bombs and fills the radiation meter bars as each year progresses.

Working in raw D3 reminds me how nice it is to work in R. Much cruft is abstracted away, and the “API” in R (at least list ops and ggplot2) seems more consistent, or at least less verbose.

One big annoyance (besides tweaking projection settings) is that I had to set:

script-src 'self' 'unsafe-eval';

on the [content security policy](http://content-security-policy.com/) since D3 uses “eval” a bit.
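For context, a fuller policy with that directive might look like this (illustrative; tune the other directives to whatever your page actually loads):

Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-eval';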

It’s embedded below and you can bust the frame via [this link](/projects/nucleard3/index.html). I made no effort to have it fit on tiny screens (it just requires fidgeting with the heights/widths of things), but it’s pretty complete code (including browser/touch icons, content security policy & open graph tags) and it’s annotated to help R folks map between ggplot2 and D3. If you build anything on it, drop a note in the comments.