The kind folks over at @RStudio gave a nod to my recently CRAN-released epidata package in their January data package roundup, and I thought it might be useful to give it one more showcase using the recent CRAN update to ggalt and the new hrbrthemes (GitHub-only for now) packages.
Labor force participation rate
The U.S. labor force participation rate (LFPR) is an oft-overlooked and under- or mis-reported economic indicator. I’ll borrow the definition from Investopedia:
Population age distributions and other factors are necessary to honestly interpret this statistic. Parties in power usually dismiss/ignore this statistic outright and their opponents tend to wholly embrace it for criticism (it’s an easy target if you’re naive). “Yay” partisan democracy.
Since the LFPR has nuances when looked at categorically, let’s take a look at it by attained education level to see how that particular view has changed over time (at least since the gov-quants have been tracking it).
We can easily grab this data with epidata::get_labor_force_participation_rate() (and we’ll set up some library() calls while we’re at it):
library(epidata)
library(hrbrthemes) # devtools::install_github("hrbrmstr/hrbrthemes")
library(ggalt)
library(tidyverse)
library(stringi)

part_rate <- get_labor_force_participation_rate("e")

glimpse(part_rate)
## Observations: 457
## Variables: 7
## $ date <date> 1978-12-01, 1979-01-01, 1979-02-01, 1979-03-01, 1979-04-01, 1979-05-01...
## $ all <dbl> 0.634, 0.634, 0.635, 0.636, 0.636, 0.637, 0.637, 0.637, 0.638, 0.638, 0...
## $ less_than_hs <dbl> 0.474, 0.475, 0.475, 0.475, 0.475, 0.474, 0.474, 0.473, 0.473, 0.473, 0...
## $ high_school <dbl> 0.690, 0.691, 0.692, 0.692, 0.693, 0.693, 0.694, 0.694, 0.695, 0.696, 0...
## $ some_college <dbl> 0.709, 0.710, 0.711, 0.712, 0.712, 0.713, 0.712, 0.712, 0.712, 0.712, 0...
## $ bachelor's_degree <dbl> 0.771, 0.772, 0.772, 0.773, 0.772, 0.772, 0.772, 0.772, 0.772, 0.773, 0...
## $ advanced_degree <dbl> 0.847, 0.847, 0.848, 0.848, 0.848, 0.848, 0.847, 0.847, 0.848, 0.848, 0...
One of the easiest things to do is to use ggplot2 to make a faceted line chart by attained education level. But let’s change the labels so they are a bit easier on the eyes in the facets and switch the facet order from alphabetical to something more useful:
gather(part_rate, category, rate, -date) %>%
  mutate(category=stri_replace_all_fixed(category, "_", " "),
         category=stri_trans_totitle(category),
         category=stri_replace_last_regex(category, "Hs$", "High School"),
         category=factor(category, levels=c("Advanced Degree", "Bachelor's Degree", "Some College",
                                            "High School", "Less Than High School", "All"))) -> part_rate
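As an aside, gather() has been superseded in tidyr 1.0.0 and later by pivot_longer(). A minimal sketch of the same wide-to-long step on a toy data frame follows; the values are made up for illustration and are not EPI figures, and the toy and toy_long names are hypothetical.

```r
library(tidyr)

# Toy wide data: one row per date, one column per category.
# Values are made up for illustration (NOT real EPI figures).
toy <- data.frame(
  date = as.Date(c("1978-12-01", "1979-01-01")),
  all = c(0.63, 0.64),
  less_than_hs = c(0.47, 0.48)
)

# Everything except `date` is stacked into category/rate pairs,
# which is what gather(toy, category, rate, -date) does as well.
toy_long <- pivot_longer(
  toy,
  cols = -date,
  names_to = "category",
  values_to = "rate"
)

toy_long
```

The rows come out in a different order than with gather() (grouped by date rather than by category), which makes no difference to the faceted chart.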
Now, we’ll make a simple line chart, tweaking the aesthetics just a bit:
ggplot(part_rate) +
  geom_line(aes(date, rate, group=category)) +
  scale_y_percent(limits=c(0.3, 0.9)) +
  facet_wrap(~category, scales="free") +
  labs(x=paste(format(range(part_rate$date), "%Y-%b"), collapse=" to "),
       y="Participation rate (%)",
       title="U.S. Labor Force Participation Rate",
       caption="Source: EPI analysis of basic monthly Current Population Survey microdata.") +
  theme_ipsum_rc(grid="XY", axis="XY")
The “All” view is interesting in that the LFPR has held fairly “steady” between 60% & 70%. Those individual and fractional percentage points actually translate to real humans, so the “minor” fluctuations do matter.
It’s also interesting to see the direct contrast between the starting historical rate and the current rate (you could also do the same with min/max rates, etc.). We can use a “dumbbell” chart to compare the 1978 value to today’s value, but we’ll need to reshape the data a bit first:
group_by(part_rate, category) %>%
  arrange(date) %>%
  slice(c(1, n())) %>%
  spread(date, rate) %>%
  ungroup() %>%
  filter(category != "All") %>%
  mutate(category=factor(category, levels=rev(levels(category)))) -> rate_range

filter(part_rate, category=="Advanced Degree") %>%
  arrange(date) %>%
  slice(c(1, n())) %>%
  mutate(lab=lubridate::year(date)) -> lab_df
(We’ll be using the extra data frame to add labels to the chart.)
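The reshaping above keeps the first and last observation per category. The min/max variant mentioned earlier could be sketched roughly like this, again on toy data with made-up values (not EPI figures); the toy_extremes name and its column names are hypothetical.

```r
library(dplyr)

# Toy long-format data: made-up values, NOT real EPI figures.
toy <- data.frame(
  category = rep(c("High School", "Some College"), each = 3),
  date = rep(as.Date(c("1978-12-01", "1997-06-01", "2016-12-01")), times = 2),
  rate = c(0.69, 0.72, 0.58, 0.71, 0.75, 0.66)
)

# One row per category with the lowest and highest rate,
# plus the dates on which they occurred.
toy_extremes <- toy %>%
  group_by(category) %>%
  summarise(
    min_rate = min(rate),
    max_rate = max(rate),
    min_date = date[which.min(rate)],
    max_date = date[which.max(rate)]
  )

toy_extremes
```

A result shaped like this could be passed to geom_dumbbell() with x=min_rate and xend=max_rate instead of the two date columns.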
Now, we can compare the various ranges, once again tweaking aesthetics a bit:
ggplot(rate_range) +
  geom_dumbbell(aes(y=category, x=`1978-12-01`, xend=`2016-12-01`),
                size=3, color="#e3e2e1",
                colour_x = "#5b8124", colour_xend = "#bad744",
                dot_guide=TRUE, dot_guide_size=0.25) +
  geom_text(data=lab_df, aes(x=rate, y=5.25, label=lab), vjust=0) +
  scale_x_percent(limits=c(0.375, 0.9)) +
  labs(x=NULL, y=NULL,
       # lab_df$lab holds two years (first and last); use the first for the title
       title=sprintf("U.S. Labor Force Participation Rate %s-Present", lab_df$lab[1]),
       caption="Source: EPI analysis of basic monthly Current Population Survey microdata.") +
  theme_ipsum_rc(grid="X")
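The backticks around the two date columns in the dumbbell call are needed because spread() turned the dates into column names, and a name such as 1978-12-01 is not a syntactically valid R identifier. A small base R sketch of the difference, using a toy data frame with made-up values (not EPI figures):

```r
# Toy data with date-like column names (made-up values, NOT real EPI figures).
# check.names = FALSE stops data.frame() from rewriting the names.
toy <- data.frame(
  category = c("High School", "Some College"),
  `1978-12-01` = c(0.69, 0.71),
  `2016-12-01` = c(0.58, 0.66),
  check.names = FALSE
)

# With backticks, R reads the whole token as a single column name.
change <- toy$`2016-12-01` - toy$`1978-12-01`

# Without backticks, the same characters are plain arithmetic: 1978 - 12 - 1.
not_a_name <- 1978-12-01

names(toy)
not_a_name
```

The same applies inside aes(): without the backticks, ggplot2 would evaluate the subtraction and map a constant rather than the column.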
One takeaway from both these charts is that it’s probably important to take education level into account when talking about the labor force participation rate. The get_labor_force_participation_rate() function (along with most other functions in the epidata package) also has options to factor the data by sex, race and age, so you can pull in all those views to get a more nuanced & informed understanding of this economic health indicator.
I’m on a Mac, I had problems with your “theme_ipsum_rc” call, but using just plain “theme_ipsum” worked fine. Apparently, I don’t have the “Roboto Condensed Light” font family on my computer.
There is a function provided to install the font and the help does say you need to also load it on your Mac as well as in R.
Thanks for posting this! I found it pretty interesting. Really appreciate your run-down of the code and the helpful explanation too…just one question code-wise: do you know why the dates need to be enclosed in backtick (`) marks to define the x-axis limits for the dumbbells in the dumbbell chart?
I’m not sure I believe your claim that LFP is under-reported. The Atlanta Fed blog wrote a lot about it in the last several years. I see LFP/unemp/U6/U4 etc. as being like the core vs headline CPI debate. Every news cycle pundits wheedle on about the “technicalities” to score a few seriousness points while picking whichever number suits their bias.
Also Chair Bernanke drew a lot of attention to LFP several years ago, noting that depression and skill rot would set in — and some people might never get their first start in the labour market.