Statebins Reimagined

A long time ago, in a github repo far, far away there lived a tiny package that made it possible to create equal area, square U.S. state cartograms in R dubbed statebins🔗. Three years have come and gone and — truth be told — I’ve never been happy with that package. It never felt “right” and that gnawing feeling finally turned into action with a “re-imagining” of the API.

Previously, on statebins

There were three different functions in the old-style package:

  • one for discrete scales (it automated ‘cuts’)
  • one for continuous scales
  • one for manual scales

It also did some hack-y stuff with grobs to try to get things to look good without putting too much burden on the user.

All that “mostly” worked, but I always ended up doing some painful workaround when I needed more custom charts (not that I have to use this package much given the line of work I’m in).

Downsizing statebins

Now, there’s just one function for making the cartograms — statebins() — and another for applying a base theme — theme_statebins(). The minimalisation has some advantages that we’ll take a look at now, starting with the most basic example (the one on the manual page):

library(statebins)
library(tidyverse)

data(USArrests)

USArrests$state <- rownames(USArrests)
statebins(USArrests, value_col="Assault", name = "Assault") +
  theme_statebins(legend_position="right")

Two things should stand out there:

  • you got scale_fill_distiller() for free!
  • labels are dark/light depending on the tile color

Before we go into ^^, it may be helpful to show the new function interface:

library(statebins)
library(tidyverse)

statebins(state_data, state_col = "state", value_col = "value",
  dark_label = "black", light_label = "white", font_size = 3,
  state_border_col = "white", state_border_size = 2,
  ggplot2_scale_function = ggplot2::scale_fill_distiller, ...)

You pass in the state name/abbreviation & value columns like the old interface but also specify colors for the dark & light labels (set hex code color with 00 ending alpha values if you don’t want labels but Muricans are pretty daft and generally need the abbreviations on the squares). You can set the font size, too (we’ll do that in a bit) and customize the border color (usually to match the background of the target medium). BUT, you also pass in the ggplot2 scale function you want to use and the named parameters for it (that’s what the ... is for).

So, yes I’ve placed more of a burden on you if you want discrete cuts, but I’ve also made the package way more flexible and made it possible to keep the labels readable without you having to lift an extra coding finger.

The theme()-ing is also moved out to a separate theme function which makes it easier for you to further customize the final output.

But that’s not all!

There are now squares for Puerto Rico, the Virgin Islands and New York City (the latter two were primarily for new features/data in cdcfluview but they are good to have available). Let’s build out a larger example with some of these customizations (we’ll make up some data to do that):

library(statebins)
library(tidyverse)
library(viridis)

data(USArrests)

# make up some data for the example

rownames_to_column(USArrests, "state") %>%
  bind_rows(
    data_frame(
      state = c("Virgin Islands", "Puerto Rico", "New York City"),
      Murder = rep(mean(max(USArrests$Murder),3)),
      Assault = rep(mean(max(USArrests$Assault),3)),
      Rape = rep(mean(max(USArrests$Rape),3)),
      UrbanPop = c(93, 95, 100)
    )
  ) -> us_arrests

statebins(us_arrests, value_col="Assault",
          ggplot2_scale_function = viridis::scale_fill_viridis) +
  labs(title="USArrests + made up data") +
  theme_statebins("right")

Cutting to the chase

I still think it makes more sense to use binned data in these cartograms, and while you no longer get that for “free”, it’s not difficult to do:

adat <- suppressMessages(read_csv(system.file("extdata", "wapostates.csv", package="statebins")))

mutate(
  adat, 
  share = cut(avgshare94_00, breaks = 4, labels = c("0-1", "1-2", "2-3", "3-4"))
) %>% 
  statebins(
    value_col = "share", 
    ggplot2_scale_function = scale_fill_brewer,
    name = "Share of workforce with jobs lost or threatened by trade"
  ) +
  labs(title = "1994-2000") +
  theme_statebins()

More manual labor

You can also still use hardcoded colors, but it’s a little more work on your end (but not much!):

library(statebins)
library(tidyverse)

election_2012 <- suppressMessages(read_csv(system.file("extdata", "election2012.csv", package="statebins")))

mutate(election_2012, value = ifelse(is.na(Obama), "Romney", "Obama")) %>% 
  statebins(
    font_size=4, dark_label = "white", light_label = "white",
    ggplot2_scale_function = scale_fill_manual,
    name = "Winner",
    values = c(Romney = "#2166ac", Obama = "#b2182b")
  ) +
  theme_statebins()

BREAKING NEWS: Rounded corners

A Twitter request ended up turning into a new feature this afternoon (after I made this post) => rounded corners:

library(statebins)
library(tidyverse)
data(USArrests)

USArrests$state <- rownames(USArrests)
statebins(USArrests, value_col="Assault", name = "Assault", round=TRUE) +
  theme_statebins(legend_position="right")

MOAR BREAKING NEWS: geom & faceting

Thomas Wood suggested that faceting wld be nice, but that would really require a Geom. So, I took a stab at one:

library(statebins)
library(cdcfluview)
library(hrbrthemes)
library(tidyverse)

flu <- ili_weekly_activity_indicators(2017)

ggplot(flu, aes(state=statename, fill=activity_level)) +
  geom_statebins() +
  coord_equal() +
  viridis::scale_fill_viridis(
    name = "ILI Activity Level  ", limits=c(0,10), breaks=0:10, option = "magma", direction = -1
  ) +
  facet_wrap(~weekend) +
  labs(title="2017-18 Flu Season ILI Activity Level") +
  theme_statebins(base_family = font_ps) +
  theme(plot.title=element_text(size=16, hjust=0)) +
  theme(plot.margin = margin(30,30,30,30))

FIN

It’ll be a while before this hits CRAN and I’m not really planning on keeping the old interface when the submission happens. So, it’ll be on GitHub for a bit to let folks chime in on what additional features you want and whether you really need to keep the deprecated functions around in the package.

So, kick the tyres and don’t hesitate to shoot over some feedback!

Cover image from Data-Driven Security
Amazon Author Page

10 Comments Statebins Reimagined

  1. Brian Adams

    I’ve been trying to follow the example but when I submit the code below:

    statebins(USArrests, value<em>col="Assault", name = "Assault") +
      theme</em>statebins(legend_position="right")
    

    I get the error message

    Error in themebw(basefamily = basefamily, basesize = basesize, baselinesize = baselinesize, :
    unused arguments (base
    linesize = baselinesize, baserectsize = baserectsize)

    Any help greatly appreciated.

    Brian

    Reply
  2. Pingback: Statebins Reimagined – Mubashir Qasim

    1. hrbrmstr

      Scratch that. I keep forgetting I use a dev-version of ggplot2. I’ve modified the DESCRIPTION accordingly and another devtools install of it shld load what’s needed (or at least indicate what’s needed).

      Reply
  3. LF

    I love your graphic work. Just a quibble: your use of brightness in the blue color scales in this post is inconsistent. I find it much more intuitive for the dark blues to signify higher values.

    Reply
    1. hrbrmstr

      I showed different scales. You have the full freedom to use any scale and scale options. That was the one of the core points of the post.

      Reply
  4. AlexKulay

    Hi, everything’s cool! I have a question – a wish for you. Is it possible to make similar maps using data files with the extension (*.kml)
    Thank you!

    Reply
  5. Jason_M

    Nice work with statebins. I’m sure it will get some use at work.

    The reversal of the usual red/blue in the electoral map messed with my mind for a minute there. I have since recovered.

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.