Quick Hit: A Different (Diminutive) Look At Distributions With {ggeconodist}

Despite being a full-on denizen of all things digital I receive a fair number of dead-tree print magazines as there’s nothing quite like seeing an amazing, large, full-color print data-driven visualization up close and personal. I also like supporting data journalism through the subscriptions since without cash we will only have insane, extreme left/right-wing perspectives out there.

One of these publications is The Economist (I’d subscribe to the Financial Times as well for the non-liberal perspective but I don’t need another mortgage payment right now). The graphics folks at The Economist are Top Notch™ and a great source of inspiration to “do better” when cranking out visuals.

After reading a recent issue, one of their visualization styles stuck in my head. Specifically, the one from this story on the costs of a mammogram. I’ve put it below:

Essentially it’s a boxplot with outliers removed along with significantly different aesthetics than the one we’re all used to seeing. I would not use this for exploratory data analysis or working with other data science team members when poking at a problem but I really like the idea of making “distributions” easier to consume for general audiences and believe The Economist graphics folks have done a superb job focusing on the fundamentals (both statistical and aesthetic).

There are ways to hack something like those out manually in {ggplot2} but it would be nice to just be able to swap out something for geom_boxplot() when deciding to go to “production” with a distribution chart. Thus begat {ggeconodist}.

Since this is just a “quick hit” post we’ll avoid some interim blathering to note that we can use {ggplot2} (and a touch of {grid}) to make the following:

with just a tiny bit of R code:

library(ggeconodist)

ggplot(mammogram_costs, aes(x = city)) +
  geom_econodist(
    aes(ymin = tenth, median = median, ymax = ninetieth), stat = "identity"
  ) +
  scale_y_continuous(expand = c(0,0), position = "right", limits = range(0, 800)) +
  coord_flip() +
  labs(
    x = NULL, y = NULL,
    title = "Mammoscams",
    subtitle = "United States, prices for a mammogram*\nBy metro area, 2016, $",
    caption = "*For three large insurance companies\nSource: Health Care Cost Institute"
  ) +
  theme_econodist() -> gg

grid.newpage()
left_align(gg, c("subtitle", "title", "caption")) %>% 
  add_econodist_legend(econodist_legend_grob(), below = "subtitle") %>% 
  grid.draw()

FIN

A future post (and, even — perhaps — a new screen sharing video) will describe what went into making this new geom, stat, and theme (along with some info on how I managed to reproduce data for the vis since none was provided).

In the interim, hit up the CINC page on {ggeconodist} to learn more about the package. You may want to take a quick look at {hrbrthemes} since it might have some helpers for using all the required theming components.

So kick the tyres, file issues/PRs, and be on the lookout for the director’s cut of the making of {ggeconodist}.

Cover image from Data-Driven Security
Amazon Author Page

2 Comments Quick Hit: A Different (Diminutive) Look At Distributions With {ggeconodist}

  1. Pingback: Quick Hit: A Different (Diminutive) Look At Distributions With {ggeconodist} – Data Science Austria

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.