Fear of WaPo Using Bad Pie Charts Has Increased Since Last Year

I woke up this morning to a [headline story from the Washington Post](https://www.washingtonpost.com/news/the-fix/wp/2015/12/10/to-many-christian-terrorists-arent-true-christians-but-muslim-terrorists-are-true-muslims/) on _“Americans are twice as willing to distance Christian extremists from their religion as Muslims”_. This post is not about the content of the headline or story. It _is_ about the horrible pie chart WaPo led the article with:

This isn’t just a rant of a madman against pie charts. While I _am_ vehemently opposed to them, we did cover them [in our book](https://books.google.com/books?id=7DqwAgAAQBAJ&pg=PA146&lpg=PA146&dq=data-driven+security+pie+chart&source=bl&ots=Cy1iJylsHd&sig=a6Hz1JB-QYLq6H0VZJpPleJgRkQ&hl=en&sa=X&ved=0ahUKEwj79uqt_tjJAhVG0iYKHS0uDn4Q6AEIMzAH#v=onepage&q=data-driven%20security%20pie%20chart&f=false) and my co-author (@jayjacobs) and the incredibly talented @annkemery both agree there are often cases where they are appropriate. Even using their less-sensitive sensibilities, this would not be one of those cases.

So what, exactly, is the problem? WaPo tried to enable comparison between pies by exploding them and using colors to indicate similar fear levels, mapping shades to entries in the top legend. Your eye has to move around a bit to take everything in, and you have to remember the mapping as you focus on each slice (which you will end up doing, given that each category is colored differently). Their whole goal was to let the reader see the change in sentiment towards terrorism since this time last year.

Hrm. Two dates. Small set of values. Desire to quickly compare change in value/slope. **This sounds like a job for a slopegraph!**

The article and graphic are based on a [survey](http://publicreligion.org/research/2015/12/survey-nearly-half-of-americans-worried-that-they-or-their-family-will-be-a-victim-of-terrorism/). Thankfully the [complete survey data was made available](http://publicreligion.org/site/wp-content/uploads/2015/12/December-2015-PRRI-RNS-Topline1.pdf), which made it easy to do a makeover (in R of course). Here’s the result:

Each category’s change is clearly visible, you don’t need to remember color associations, and you even know the actual values^*.

The R code is below and in [this gist](https://gist.github.com/hrbrmstr/9bf4f93dffc1df48fe27). How would you make the WaPo chart better (drop a note in the comments with a link to your own makeover)?

```r
library(tidyr)
library(ggplot2)
library(ggthemes)
library(scales)
library(dplyr)

# Easiest way to transcribe the PDF table
# The slope calculation will enable us to color the lines/points based on up/down
dat <- data_frame(`2014-11-01`=c(0.11, 0.22, 0.35, 0.31, 0.01),
                  `2015-12-01`=c(0.17, 0.30, 0.30, 0.23, 0.00),
                  slope=factor(sign(`2014-11-01` - `2015-12-01`)),
                  fear_level=c("Very worried", "Somewhat worried", "Not too worried",
                               "Not at all", "Don't know/refused"))

# Transform that into something we can use
dat <- gather(dat, month, value, -fear_level, -slope)

# We need real dates for the X-axis manipulation
dat <- mutate(dat, month=as.Date(as.character(month)))

# Since 2 categories have the same ending value, we need to
# take care of that (this is one of a few "gotchas" in slopegraph preparation)
end_lab <- dat %>%
  filter(month==as.Date("2015-12-01")) %>%
  group_by(value) %>%
  summarise(lab=sprintf("%s", paste(fear_level, collapse=", ")))

gg <- ggplot(dat)

# line
gg <- gg + geom_line(aes(x=month, y=value, color=slope, group=fear_level), size=1)
# points
gg <- gg + geom_point(aes(x=month, y=value, fill=slope, group=fear_level),
                      color="white", shape=21, size=2.5)

# left labels
gg <- gg + geom_text(data=filter(dat, month==as.Date("2014-11-01")),
                     aes(x=month, y=value, label=sprintf("%s — %s  ", fear_level, percent(value))),
                     hjust=1, size=3)
# right labels
gg <- gg + geom_text(data=end_lab,
                     aes(x=as.Date("2015-12-01"), y=value,
                         label=sprintf("  %s — %s", percent(value), lab)),
                     hjust=0, size=3)

# Here we do some slightly tricky x-axis formatting to ensure we have enough
# space for the in-panel labels, only show the months we need and have
# the month labels display properly
gg <- gg + scale_x_date(expand=c(0.125, 0),
                        labels=date_format("%b\n%Y"),
                        breaks=c(as.Date("2014-11-01"), as.Date("2015-12-01")),
                        limits=c(as.Date("2014-02-01"), as.Date("2016-12-01")))
gg <- gg + scale_y_continuous()

# I used colors from the article
gg <- gg + scale_color_manual(values=c("#f0b35f", "#177fb9"))
gg <- gg + scale_fill_manual(values=c("#f0b35f", "#177fb9"))
gg <- gg + labs(x=NULL, y=NULL, title="Fear of terror attacks (change since last year)\n")
gg <- gg + theme_tufte(base_family="Helvetica")
gg <- gg + theme(axis.ticks=element_blank())
gg <- gg + theme(axis.text.y=element_blank())
gg <- gg + theme(legend.position="none")
gg <- gg + theme(plot.title=element_text(hjust=0.5))
gg
```
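
If you would rather read the changes as numbers than as slopes, here is a minimal base R sketch that uses the same transcribed values. The `fear` data frame and its column names (`nov_2014`, `dec_2015`, `change_pts`, `direction`) are made up for illustration and are not part of the plotting code; treat it as one way to tabulate the change, not the only one.

```r
# Values transcribed in the post (proportions of respondents)
fear <- data.frame(
  fear_level = c("Very worried", "Somewhat worried", "Not too worried",
                 "Not at all", "Don't know/refused"),
  nov_2014 = c(0.11, 0.22, 0.35, 0.31, 0.01),
  dec_2015 = c(0.17, 0.30, 0.30, 0.23, 0.00),
  stringsAsFactors = FALSE
)

# Change in percentage points; round() guards against floating-point noise
fear$change_pts <- round((fear$dec_2015 - fear$nov_2014) * 100)

# Direction of change, the same idea the line colors encode in the slopegraph
fear$direction <- ifelse(fear$change_pts > 0, "up",
                         ifelse(fear$change_pts < 0, "down", "flat"))

# Largest increase first
print(fear[order(-fear$change_pts), c("fear_level", "change_pts", "direction")])

# Combined share of the two "worried" categories at each date
worried <- fear$fear_level %in% c("Very worried", "Somewhat worried")
worried_share <- colSums(fear[worried, c("nov_2014", "dec_2015")])
print(worried_share)
```

Summing the two "worried" rows gives 33% for November 2014 and 47% for December 2015, which is the overall shift the original headline was getting at.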

^* Well, it’s a survey. To add insult to injury, it’s a sentiment-based survey given right after a likely-to-be-attributed-terrorism attack. Also, there is a margin of error that isn’t communicated in either visualization. So while there is “data”, trust it at your own peril.
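
To get a feel for how much room a margin of error leaves, here is a rough sketch using the textbook normal approximation for a simple random sample. The sample size `n = 1000` is purely an assumption for illustration; the survey’s own documentation is the authority on its actual sample size and stated margin of error, and weighting or design effects would typically widen these intervals.

```r
# ASSUMPTION: n = 1000 is an illustrative sample size, not the survey's actual n
n <- 1000

# December 2015 proportions as transcribed in the post
p <- c(very = 0.17, somewhat = 0.30, not_too = 0.30, not_at_all = 0.23)

# Normal-approximation 95% margin of error for a simple random sample
z <- qnorm(0.975)
moe <- z * sqrt(p * (1 - p) / n)

# Estimates with lower/upper bounds, in percent
print(round(100 * cbind(estimate = p, lower = p - moe, upper = p + moe), 1))
```

Under that assumption each estimate carries roughly two to three percentage points of uncertainty on either side, so small differences between categories deserve some caution.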

15 Comments → Fear of WaPo Using Bad Pie Charts Has Increased Since Last Year

jangorecki 2015-12-14 at 06:27

This is pretty sad, as the number of deaths caused by modern militarized police forces is orders of magnitude higher.

Robert Young 2015-12-14 at 15:33

“So while there is “data”, trust it at your own peril.”

I’m not a fan of Big Data, but given this sort of topic, I’d be happier if the entire sampling plan were provided. ~1,000 observations is typical of such surveys (Gallup et al. have been in that ballpark for decades), but not, to my mind, sufficient. Just one opinion.

Tom Hiatt 2015-12-15 at 00:22

I applaud your efforts at making this clearer, but to be honest, the slope graph doesn’t do it for me. Is it just because I’m not so familiar with them? Maybe put labels in the middle rather than duplicating them, and don’t combine the labels of the categories that both become 30%? Sad as it is, a stacked area chart would do a better job at conveying this information to me, I think.

Rick Scavetta 2015-12-15 at 01:08

Lovely slope plot, but the change in colours is confusing when making comparisons to the original pie charts. In the end, I think the appropriate solution would have been a likert plot, for which this data is perfectly suited.

marco 2015-12-15 at 01:10

Sorry, but I think the original pie chart is clearer and more informative

Richard Kolodziej 2015-12-15 at 03:44

I do not see a problem with this pie chart. There is a total of four categories which were grouped to “(rather) worried” or “(rather) not worried” by similar colors. The goal of this visualization was not to show the detailed changes inside these four categories over time from November 2014 to now but to show that people are now more worried than before. No need for numbers or detailed information about changes.

The pie chart shows that “not too worried” stayed mostly the same, while “very worried” and “somewhat worried” gained and “not at all worried” lost.

The slopegraph on the other hand takes more cognitive effort to understand and is too much for just showing that people are now worried more. Although, the slopegraph does indeed do the title of the pie chart justice “Fear of terror attacks has INCREASED since last year”. “Increase” is a process and the slopegraph shows detailed information about this process.

“People now more worried of terror attacks than last year” would fit better as a title for the pie chart.

David Unwin 2015-12-15 at 04:34

They get worse. Cartographers have long known that use of an area to represent a scalar quantity is dangerous and leads to very poor estimation of the actual and relative magnitudes (check out proportionate symbol mapping) and even suggested an empirical correction for it (Flannery’s law). To compound the problem consider also the Excel way in which the pies are shown in pseudo-3D, which is really crazy.

Kristina 2015-12-15 at 05:05

The alternative shown 1) serves a different purpose; 2) is more difficult to read and understand; 3) is simply not nice to look at. So what was the purpose of this article? To show other options – ok. To criticize – fail.

Rob Knell 2015-12-15 at 07:39

I’m afraid I also think the slopegraph isn’t much of an improvement on the pie charts. The combination of “Somewhat worried” and “Not too worried” with the same value for 2015 is particularly confusing. I would just present these data with a straightforward bar chart – since what’s really important is the shape of the distribution of answers this allows the reader to easily compare the two years with a glance. For me at least this is a lot easier to interpret than the slopegraph. I can’t seem to put a figure in the comments but this code will make one.

fear<-matrix(c(31,23,35,30,22,30,11,17), nrow=2)
rownames(fear)<-c("2014","2015")
colnames(fear)<-c("Not worried","Not too worried","Somewhat worried","Very worried")
barplot(fear, beside=TRUE, ylab="Percentage of responders", col=c("orange","steelblue"),legend=TRUE)

Rob Knell 2015-12-15 at 07:41

Here’s the code with the carriage returns, I hope – they were stripped out in the previous version.

```
fear<-matrix(c(31,23,35,30,22,30,11,17), nrow=2)
rownames(fear)<-c("2014","2015")
colnames(fear)<-c("Not worried","Not too worried","Somewhat worried","Very worried")
barplot(fear, beside=TRUE, ylab="Percentage of responders", col=c("orange","steelblue"),legend=TRUE)
```

mwgrant 2015-12-15 at 09:51

I dropped WaPo several months ago. One reason was the poor graphics. They seemed enamoured with ‘pretty’ over communication to the extent that it cut into credibility.

rr 2015-12-15 at 12:36

```
dat <- data.frame('Nov 2014' = c(0.11, 0.22, 0.35, 0.31, 0.01),
                  'Dec 2015' = c(0.17, 0.30, 0.30, 0.23, 0.00),
                  fear_level = c("Very worried", "Somewhat worried", "Not too worried",
                                 "Not at all", "Don't know/refused"),
                  check.names = FALSE)

col <- c('gold2','dodgerblue2')[grepl('(?i)n.t', dat$fear_level) + 1L]
plot(col(dat[, -3]), unlist(dat[, -3]), xlim = c(0,3), col = col, pch = 16,
     ann = FALSE, axes = FALSE)
axis(1, at = 1:2, labels = names(dat)[1:2], lwd = 0)
segments(rep(1, 5), dat$`Nov 2014`, rep(2, 5), dat$`Dec 2015`, col = col, lwd = 2)
ttl <- sprintf('%s - %s%%', dat$fear_level, dat$`Nov 2014` * 100)
ttr <- sprintf('%s - %s%%', dat$fear_level, dat$`Dec 2015` * 100)
text(rep(1, 5), dat$`Nov 2014`, ttl, adj = 1, pos = 2)
text(rep(2, 5), dat$`Dec 2015` + c(0, -.01, .01, 0, 0), ttr, adj = 0, pos = 4)
```


rud.is

"In God we trust. All others must bring data"
