Making waffle charts in R (with the new ‘waffle’ package)

NOTE: The waffle package (sans JavaScript-y goodness) is up on CRAN so you can do an install.packages("waffle") and library(waffle) vs the devtools dance.

My disdain for pie charts is fairly well-known, but I do concede that there are times one needs to communicate parts of a whole graphically versus using just words or a table. When that need arises, I’m partial to “waffle charts” or “square pie charts”. @eagereyes did a great post a while ago on them (make sure to read the ‘debate’ between Robert and @hadleywickham in the comments, too), so head there for the low-down on them. Rather than have every waffle chart I make be a one-off creation, I made an R package for them.

There is currently one function in the package, waffle, and said function doesn’t mimic all the goodness of these charts as described in Robert’s post (yet). It does, however, do a pretty decent job covering the basics. Let’s take the oft-cited New York Times “debt” graphic:

[Image: the New York Times “debt” graphic]

We can replicate that pretty closely in R. To make it as simple as possible, the waffle function takes a named numeric vector. If no names are specified, or you leave some names out, LETTERS will be used to fill in the gaps. The function takes your data quite literally, so if you give it a vector that sums to, say, 10,000, then the function will try to create a ggplot object with 10,000 geom_rect elements. Needless to say, that’s a bad idea. So, I suggest keeping the raw numbers in the vector and passing a scaled version of the vector to the function. That way, you can play with the values to get the desired look. Here’s the R version of the NYT graphic:

# devtools::install_github("hrbrmstr/waffle")
library(waffle)
savings <- c(`Mortgage ($84,911)`=84911, `Auto and\ntuition loans ($14,414)`=14414, 
             `Home equity loans ($10,062)`=10062, `Credit Cards ($8,565)`=8565)
waffle(savings/392, rows=7, size=0.5, 
       colors=c("#c7d4b6", "#a3aabd", "#a0d0de", "#97b5cf"), 
       title="Average Household Savings Each Year", 
       xlab="1 square == $392")
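To make the scaling advice concrete, here is a minimal base R sketch of working out how many squares each category gets before calling waffle. The helper squares_for() is hypothetical (it is not part of the waffle package), and rounding is my own assumption: the post doesn't say how fractional values are handled, so rounding explicitly keeps that choice in your hands.

```r
# Raw values, exactly as in the post
savings <- c(`Mortgage ($84,911)` = 84911,
             `Auto and\ntuition loans ($14,414)` = 14414,
             `Home equity loans ($10,062)` = 10062,
             `Credit Cards ($8,565)` = 8565)

# Hypothetical helper (not part of waffle): convert raw values into
# whole squares, given how much one square is worth
squares_for <- function(x, per_square) round(x / per_square)

scaled <- squares_for(savings, 392)   # 217, 37, 26 and 22 squares
total_squares <- sum(scaled)          # 302 squares in total

print(scaled)
print(total_squares)
```

With roughly 300 squares the chart stays readable. Changing per_square is a cheap way to try other granularities before plotting anything.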


This package evolved from a teensy gist I made earlier this year to help communicate the scope of the Anthem data breach in the US. Since then, a recent breach at Premera occurred and added to the tally. Here’s two views of that data, one with one square equalling one million people and another with one square equalling ten million people (using the blue shade from each of the company’s logos):

parts <- c(`Un-breached\nUS Population`=(318-11-79), `Premera`=11, `Anthem`=79)
 
waffle(parts, rows=8, size=1, colors=c("#969696", "#1879bf", "#009bda"), 
       title="Health records breaches as fraction of US Population", 
       xlab="One square == 1m ppl")


waffle(parts/10, rows=3, colors=c("#969696", "#1879bf", "#009bda"), 
       title="Health records breaches as fraction of US Population", 
       xlab="One square == 10m ppl")


I’m betting that gets a lot bluer by the end of the year.

The function returns a ggplot object, so fonts, sizes, etc. can all be customized, and the source is up on GitHub for all to play with and contribute to.
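Since the return value is a ggplot object, ordinary ggplot2 layering should apply. The sketch below assumes both the waffle and ggplot2 packages are installed (it does nothing otherwise) and restyles only the title text. The size and face values are arbitrary illustrations, not package defaults.

```r
# Sketch only: assumes the waffle and ggplot2 packages are installed;
# p stays NULL if either one is missing
p <- NULL
if (requireNamespace("waffle", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  parts <- c(`Un-breached\nUS Population` = (318 - 11 - 79),
             `Premera` = 11, `Anthem` = 79)

  # Same call as the one-million-per-square example in the post
  p <- waffle::waffle(parts, rows = 8, size = 1,
                      colors = c("#969696", "#1879bf", "#009bda"),
                      title = "Health records breaches as fraction of US Population",
                      xlab = "One square == 1m ppl")

  # Add a theme element the same way you would to any other ggplot
  p <- p + ggplot2::theme(
    plot.title = ggplot2::element_text(size = 16, face = "bold"))
}
```

Printing p (when it is not NULL) draws the chart with the restyled title.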

Along with adding support for filling in the chart as shown in the @eagereyes post, there is an htmlwidget version coming as well. Standard drill applies: issues/enhancements to GitHub issues, feedback and your own examples in the comments.

UPDATE

Thanks to a PR by @timelyportfolio, there is now a widget option in the package.


17 Comments

  1. Nicole Radziwill

    This package looks awesome!! But I don’t see it on CRAN. Is it available somewhere else (e.g. Github)? I’m looking forward to test driving it.

    1. hrbrmstr

      Aye. I should have added a devtools::install_github("hrbrmstr/waffle") as well as the links to the github repository for it. I’ve modified the post, and thx for catching this. Also, I did the CRAN submission today, so if all goes well it’ll be in CRAN next week.

  2. Michael Whitaker

    Really cool package and visualization. I was wondering if it might be possible to use icons instead of solid squares? Particularly with population data it would be nice in my opinion to use an icon of a person, along the lines of http://static1.squarespace.com/static/50060e33c4aa3dba773634ec/529d2463e4b0f373b2d5aecd/529d2465e4b0896ae07cf706/1386030188208/IncomeGuide2013Jan17RGBpage+15_15.png?format=1500w (from http://visualizingeconomics.com/blog/2013/12/2/income-distribution-in-the-united-states). I think it would be powerful even without scaling the icons. Many thanks again for a great package!

  3. Ruben C. Arslan (@_r_c_a)

    Oh man! I googled for this when I started working on exactly the same thing. Unfortunately I started one day before you put this blog post up and finished one day after you. What a coincidence :-)
    I used three different approaches though, using geom_text, _tile and _point, so maybe that’s of interest to someone as well.
    https://twitter.com/_r_c_a/status/578612775188037632

    I tried getting close to the XKCD style (with drop shadows), but didn’t have that much success:
    http://rpubs.com/rubenarslan/waffle_plots

  4. rawr

    That graphic isn’t too difficult with base r:

    waffle <- function(x, rows, cols = seq_along(x), ...) {
      # one entry per square, filled column by column into a matrix
      xx <- rep(cols, times = x)
      lx <- length(xx)
      m <- matrix(nrow = rows, ncol = (lx %/% rows) + (lx %% rows != 0))
      m[1:length(xx)] <- xx

      # restore graphical parameters when the function exits
      op <- par(no.readonly = TRUE)
      on.exit(par(op))

      par(list(...))
      plot.new()
      o <- cbind(c(row(m)), c(col(m))) + 1
      plot.window(xlim = c(0, max(o[, 2]) + 1), ylim = c(0, max(o[, 1]) + 1),
                  asp = 1, xaxs = 'i', yaxs = 'i')
      rect(o[, 2], o[, 1], o[, 2] + .85, o[, 1] + .85, col = c(m), border = NA)

      invisible(list(m = m, o = o))
    }

    savings <- c('Mortgage ($84,911)' = 84911, 'Auto and\ntuition loans ($14,414)' = 14414,
                 'Home equity loans ($10,062)' = 10062, 'Credit cards\n($8,565)' = 8565)

    pdf('./waffle.pdf', width = 7, height = 3)
    waffle(savings / 392, rows = 7, bg = 'cornsilk', mar = c(0, 0, 0, 3),
           cols = c("#c7d4b6", "#a3aabd", "#a0d0de", "#97b5cf"))
    legend(.87, 1.3, legend = '$392', xpd = NA, bty = 'n', bg = 'cornsilk',
           col = 'orange', text.col = 'grey50', pch = 15, cex = .8, pt.cex = 1.5)

    xx <- c(-.05, .7, .8, .86)
    yy <- c(-.05, -.05, -.35, -.05)
    segments(x0 = xx, y0 = .05, y1 = yy, lty = 'dotted',
             lwd = 1, xpd = NA, col = 'grey50')
    text(xx, yy + .05, labels = names(savings), xpd = NA, cex = .6, col = 'grey50', pos = 1)

    mtext('Average household savings each year', at = xx[1], font = 2,
          col = 'grey50', adj = 0, line = 1)
    mtext('Source: Federal Reserve', side = 1, font = 3, at = xx[1],
          col = 'grey50', adj = 0, cex = .6, line = 2)
    dev.off()

      1. hrbrmstr

        Thx. I’ll add some blocks later tonight when I add a new post showing an isotype pictogram version based on some of Ruben's ideas.

  5. Максим Гальченко

    Thank you! Really good visualization!
    Can I add labels to segments as in the New York Times “debt” graphic?

  6. carbonmetrics

    Good stuff! What would be good is the possibility to use color brewer. You can, but yourplot + scale_fill_brewer() will zap your legend. The good thing about brewer is that you don’t have to manually set your colors if the number of categories is dynamic.

  7. Edaphic (@edomaniac)

    So it’s a ggplot graphic? Does that mean you can theme it and remove the legend? When I tried it didn’t work.

    1. @patternproject

      Issues/Suggestions:
      1. The output waffle chart using circle instead of circle work individually, but not when combined with “iron”.
      2. How to combine them in a grid of “social class” on one and “reasons” on the other.
      3. Starting point of individual waffle chart is different. Original starts from top-left, waffle starts from bottom-left.

