I follow the most excellent Pew Research folks on Twitter to stay in tune with what’s happening (statistically speaking) with the world. Today, they tweeted this excerpt from their 2015 Global Attitudes survey:
The age gap in social media use around the world https://t.co/0Dq1PcbExG pic.twitter.com/9HBM7gLxwR
— PewResearch Internet (@pewinternet) April 17, 2016
I thought it might be helpful to folks if I made a highly aesthetically tuned version of Pew’s chart (though I chose to go a bit more minimal in terms of styling than they did) with the new geom_dumbbell() in the development version of ggalt. The source (below) is annotated, but please drop a note in the comments if any of the code would benefit from more exposition.
I’ve also switched to using the Prism JavaScript library starting with this post after seeing how well it works in RStudio’s flexdashboard package. If the “light on black” is hard to read or distracting, drop a note here and I’ll switch the theme if enough folks are having issues.
library(ggplot2) # devtools::install_github("hadley/ggplot2")
library(ggalt) # devtools::install_github("hrbrmstr/ggalt")
library(dplyr) # for data_frame() & arrange()
# I'm not crazy enough to input all the data; this will have to do for the example
df <- data_frame(country=c("Germany", "France", "Vietnam", "Japan", "Poland", "Lebanon",
                           "Australia", "South\nKorea", "Canada", "Spain", "Italy", "Peru",
                           "U.S.", "UK", "Mexico", "Chile", "China", "India"),
                 ages_35=c(0.39, 0.42, 0.49, 0.43, 0.51, 0.57,
                           0.60, 0.45, 0.65, 0.57, 0.57, 0.65,
                           0.63, 0.59, 0.67, 0.75, 0.52, 0.48),
                 ages_18_to_34=c(0.81, 0.83, 0.86, 0.78, 0.86, 0.90,
                                 0.91, 0.75, 0.93, 0.85, 0.83, 0.91,
                                 0.89, 0.84, 0.90, 0.96, 0.73, 0.69),
                 # round() before as.integer() so floating-point noise can't truncate a gap by a point
                 diff=sprintf("+%d", as.integer(round((ages_18_to_34-ages_35)*100))))
# we want to keep the order in the plot, so we use a factor for country
df <- arrange(df, desc(diff))
df$country <- factor(df$country, levels=rev(df$country))
# we only want the first line values with "%" symbols (to avoid chart junk)
# quick hack; there is a more efficient way to do this
percent_first <- function(x) {
  x <- sprintf("%d%%", round(x*100))
  x[2:length(x)] <- sub("%$", "", x[2:length(x)])
  x
}
gg <- ggplot()
# doing this vs y axis major grid line
gg <- gg + geom_segment(data=df, aes(y=country, yend=country, x=0, xend=1), color="#b2b2b2", size=0.15)
# dum…dum…dum!bell
gg <- gg + geom_dumbbell(data=df, aes(y=country, x=ages_35, xend=ages_18_to_34),
                         size=1.5, color="#b2b2b2", point.size.l=3, point.size.r=3,
                         point.colour.l="#9fb059", point.colour.r="#edae52")
# text above points (negative vjust nudges the labels up)
gg <- gg + geom_text(data=filter(df, country=="Germany"),
                     aes(x=ages_35, y=country, label="Ages 35+"),
                     color="#9fb059", size=3, vjust=-2, fontface="bold", family="Calibri")
gg <- gg + geom_text(data=filter(df, country=="Germany"),
                     aes(x=ages_18_to_34, y=country, label="Ages 18-34"),
                     color="#edae52", size=3, vjust=-2, fontface="bold", family="Calibri")
# text below points (positive vjust nudges the labels down)
gg <- gg + geom_text(data=df, aes(x=ages_35, y=country, label=percent_first(ages_35)),
                     color="#9fb059", size=2.75, vjust=2.5, family="Calibri")
gg <- gg + geom_text(data=df, color="#edae52", size=2.75, vjust=2.5, family="Calibri",
                     aes(x=ages_18_to_34, y=country, label=percent_first(ages_18_to_34)))
# difference column
gg <- gg + geom_rect(data=df, aes(xmin=1.05, xmax=1.175, ymin=-Inf, ymax=Inf), fill="#efefe3")
gg <- gg + geom_text(data=df, aes(label=diff, y=country, x=1.1125), fontface="bold", size=3, family="Calibri")
gg <- gg + geom_text(data=filter(df, country=="Germany"), aes(x=1.1125, y=country, label="DIFF"),
                     color="#7a7d7e", size=3.1, vjust=-2, fontface="bold", family="Calibri")
gg <- gg + scale_x_continuous(expand=c(0,0), limits=c(0, 1.175))
gg <- gg + scale_y_discrete(expand=c(0.075,0))
gg <- gg + labs(x=NULL, y=NULL, title="The social media age gap",
                subtitle="Adult internet users or reported smartphone owners who\nuse social networking sites",
                caption="Source: Pew Research Center, Spring 2015 Global Attitudes Survey. Q74")
gg <- gg + theme_bw(base_family="Calibri")
gg <- gg + theme(panel.grid.major=element_blank())
gg <- gg + theme(panel.grid.minor=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(axis.ticks=element_blank())
gg <- gg + theme(axis.text.x=element_blank())
gg <- gg + theme(plot.title=element_text(face="bold"))
gg <- gg + theme(plot.subtitle=element_text(face="italic", size=9, margin=margin(b=12)))
gg <- gg + theme(plot.caption=element_text(size=7, margin=margin(t=12), color="#7a7d7e"))
gg
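The percent_first() helper above is flagged as a quick hack, so here is one possible tidier take. It is only a sketch, not what the chart above was built with, and the name percent_first2 is made up for illustration. It builds the suffix for each label directly instead of stripping the "%" afterwards, which also sidesteps the 2:length(x) index counting backwards when x has a single element.

```r
# Hypothetical alternative to percent_first(); base R only.
# Appends "%" to the first label and nothing to the rest.
percent_first2 <- function(x) {
  suffix <- ifelse(seq_along(x) == 1, "%", "")
  paste0(round(x * 100), suffix)
}

# Only the first value carries the percent sign
percent_first2(c(0.39, 0.42, 0.49))

# A single value keeps its percent sign and stays length one
percent_first2(0.39)
```

It should be a drop-in swap for the label= arguments in the geom_text() calls, assuming the same 0-1 proportions are passed in.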
15 Comments
This is beautiful, thank you. I think I can learn a lot by seeing how you did this.
Nice. But why leave Kenya out in your new chart? Because it is from Africa? Unconscious bias?
I left a ton out of the new chart. And, if you look carefully (vs approach it with malicious insinuation in mind) you’ll see I hand transcribed them in groups of six and stopped at 3 groups kinda b/c I had better things to do with my time. So, please feel free to redirect your typing energy into transcribing them all and provide the data so I can update the instructive post I made freely available. Thanks in advance for the data transcription offer!
I like a lot more the graphic design of your plot than the original one.
The positions of comment lines “# text below points” and “# text above points” seem to be swapped in the code listing.
Thanks for the always interesting posts!
Great job!!
I have question!
Is “geom_dumbbell” working?
I’ve got error message like below;
Error: could not find function “geom_dumbbell”
How do I fix it?
This:
library(ggalt) # devtools::install_github("hrbrmstr/ggalt")
is in the source. I put the comment after the library() call to indicate folks probably need to install ggalt from GitHub.
Hi, I have ggalt installed and called/loaded, but still get the same error as HANSOO above. I ran the script for the geomdumbbell function on Github (https://github.com/hrbrmstr/ggalt/blob/master/R/geomdumbbell.R). When I call the ggplot object after adding the geom_dumbbell, I get the error: could not find function “%| |%”.
I have all the packages installed and loaded from your example (https://gist.github.com/hrbrmstr/0d206070cea01bcb0118) and am running R 3.2.2.
I’d appreciate your help! Thanks.
Whether Hadley or Thomas will admit it or not, they broke stuff in the latest ggplot2 release and I have yet to scrounge the time (seriously complex “real life” stuff started happening around July) to fix. I’ll try to get to it soon, tho.
Ok, thanks! I’ll stay tuned!
Thanks. This is amazing to learn. I picked up the use of sprintf.
Can I ask how to fix the custom font issue. I get the following error
Warning message:
In grid.Call.graphics(L_text, as.graphicsAnnot(x$label), x$x, x$y, :
font family not found in Windows font database
If I query the existing installed fonts, available to R as follows:
$sans
[1] “TT Arial”
$mono
[1] “TT Courier New”
I have tried package showtext but still not able to fix it. How does it work in your case.
Thanks
Was able to fix it, as follows (these functions come from the extrafont package):
library(extrafont)
font_import()
fonts()
loadfonts(device = "win")
Based on http://stackoverflow.com/questions/13989644/xkcd-style-graph-error-with-registered-fonts
Thank you very much for posting this code, it’s excellent! Very much appreciated by the R community!
Great work. I like to portray time differences with geom_dumbbell, and I have learned a lot from your illustration. Thanks
Nice. I have a more complex version I’m grappling with: I’d love to see an example where the left hand side is “Before” and the right is “After”, but where the dots are color coded from red via yellow to green on a scale from 0-100%. So one can illustrate “Before it was at 10% (red), after it’s at 75% (yellow-green)”. I’m not sure this can even be done in the current incarnation of geom_dumbbell().
Dear Bob,
I used your code for my own data set. However, I cannot get the DIFF text to show using the following command:
geom_text(data = dfEstonia, aes(label = DIFF, x = (h1 + h2) / 2, y = Country), fontface = "bold", family = my_font, vjust = -2)
Here is the full code: http://rpubs.com/chidungkt/562726.
Could you show me why DIFF is not shown in my code?
Many thanks.
9 Trackbacks/Pingbacks
[…] article was first published on R – rud.is, and kindly contributed to […]
[…] The process relies on Bob Rudis’s ggalt package and the geom_dumbbell function, which does most of the heavy lifting. This tutorial is mostly a step-by-step recreation of Rudis’s code found here. […]
[…] (ggplot2) Exercising with (ggalt) dumbbells […]
[…] am still reading this: [rud.is/b/2016/04/17/ggplot2-exercising-with-ggalt-dumbbells/]. Any help or hint is highly […]