
Author Archives: hrbrmstr


Ingredients

  • 453g all-purpose flour (this precision is not 100% necessary, you may need to add more when folding in the wet ingredients)
  • 30-37g sugar (I’d err on the lower side for the first run)
  • 28g baking powder
  • 7g salt
  • 151g butter-flavored shortening (cold!)
  • 76g eggs (I kinda just use 2 jumbo and add a teensy bit more flour)
  • 240ml coconut milk yogurt (unsweetened)
  • egg wash (one egg and a tablespoon of oat milk and a pinch of salt)

TODO

Oven @ 425°F/218°C ; Sheet pan lined with parchment paper.

Sift dry bits together.

Cut up shortening into small squares and mix into the dry bits with a pastry blender.

Combine eggs and yogurt. Add to ^^.

Fold and knead gently just until the dough forms. Too much working the dough and the biscuits will be tough.

Press out dough on lightly floured surface to ~3cm thickness and use a 5cm cutter to cut (push down, don’t twist).

Place on lined sheet pan.

Brush with egg wash.

Bake ~15m (start watching at 7m; probably rotate the sheet pan then, and again at ~12m).

The United States Centers for Disease Control (CDC from now on) has set up two new public surveillance resources for COVID-19. Together, COVIDView and COVID-NET provide weekly surveillance data similar to what FluView does for influenza-like illnesses (ILI).

The COVIDView resources are HTML tables (O_O) and, while the COVID-NET interface provides a “download” button, there is no exposed API to make it easier for the epidemiological community to work with these datasets.

Enter {cdccovidview} — https://cinc.rud.is/web/packages/cdccovidview/ — which scrapes the tables and uses the hidden API in the same way [{cdcfluview}](https://cran.rstudio.com/web/packages/cdcfluview/index.html) does for the FluView data.

Weekly case, hospitalization, and mortality data are available at the national, state, and regional levels (where provided), and I tried to normalize the fields across each of the tables/datasets (I hate to pick on them when they’re down, but these two sites are seriously sub-optimal from a UX and general usability perspective).

After you follow the above URL for information on how to install the package, it should “just work”. No API keys are needed, but the CDC may change the layout of the tables and the field structure of the hidden API at any time, so keep an eye out for updates.
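For reference, installing from CINC generally looks something like this (treat it as a sketch and follow the canonical instructions on the package page above):

install.packages("cdccovidview", repos = c("https://cinc.rud.is", "https://cloud.r-project.org/"))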

Using it is pretty simple: just use one of the functions to grab the data you want and then work with it.

library(cdccovidview)
library(hrbrthemes)
library(tidyverse)

hosp <- laboratory_confirmed_hospitalizations()

hosp
## # A tibble: 4,590 x 8
##    catchment      network   year  mmwr_year mmwr_week age_category cumulative_rate weekly_rate
##    <chr>          <chr>     <chr> <chr>     <chr>     <chr>                  <dbl>       <dbl>
##  1 Entire Network COVID-NET 2020  2020      10        0-4 yr                   0           0  
##  2 Entire Network COVID-NET 2020  2020      11        0-4 yr                   0           0  
##  3 Entire Network COVID-NET 2020  2020      12        0-4 yr                   0           0  
##  4 Entire Network COVID-NET 2020  2020      13        0-4 yr                   0.3         0.3
##  5 Entire Network COVID-NET 2020  2020      14        0-4 yr                   0.6         0.3
##  6 Entire Network COVID-NET 2020  2020      15        0-4 yr                  NA          NA  
##  7 Entire Network COVID-NET 2020  2020      16        0-4 yr                  NA          NA  
##  8 Entire Network COVID-NET 2020  2020      17        0-4 yr                  NA          NA  
##  9 Entire Network COVID-NET 2020  2020      18        0-4 yr                  NA          NA  
## 10 Entire Network COVID-NET 2020  2020      19        0-4 yr                  NA          NA  
## # … with 4,580 more rows

c(
  "0-4 yr", "5-17 yr", "18-49 yr", "50-64 yr", "65+ yr", "65-74 yr", "75-84 yr", "85+"
) -> age_f

mutate(hosp, start = mmwr_week_to_date(mmwr_year, mmwr_week)) %>%
  filter(!is.na(weekly_rate)) %>%
  filter(catchment == "Entire Network") %>%
  select(start, network, age_category, weekly_rate) %>%
  filter(age_category != "Overall") %>%
  mutate(age_category = factor(age_category, levels = age_f)) %>%
  ggplot() +
  geom_line(
    aes(start, weekly_rate)
  ) +
  scale_x_date(
    date_breaks = "2 weeks", date_labels = "%b\n%d"
  ) +
  facet_grid(network~age_category) +
  labs(
    x = NULL, y = "Rates per 100,000 pop",
    title = "COVID-NET Weekly Rates by Network and Age Group",
    caption = sprintf("Source: COVID-NET: COVID-19-Associated Hospitalization Surveillance Network, Centers for Disease Control and Prevention.\n<https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html>; Accessed on %s", Sys.Date())
  ) +
  theme_ipsum_es(grid="XY")

FIN

This is brand new and — as noted — things may change or break due to CDC site changes. I may have also missed a table or two (it’s a truly terrible site).

If you notice things are missing or would like a different interface to various data endpoints, drop an issue or PR wherever you’re most comfortable.

Stay safe!

Ingredients

  • 2 cups adzuki (not dried)
  • 2 cups (4-6 links) andouille sliced
  • 1 tbsp olive oil
  • 1 medium onion, chopped
  • 2 bay leaves
  • 2 garlic cloves, coarse chopped
  • 0-2 dry hot peppers to taste
  • 1-2 fresh sprigs thyme
  • 4 cups stock (chicken or veg)
  • salt & pepper to taste
  • dash of vinegar

TODO

Brown sausage in oil then remove.

Sauté onion in same oil (add more if dry) til clear.

Add garlic and sauté for 1-2 minutes.

Add back sausage and add in everything else and simmer for 45 minutes.

To thicken, remove and pulse a few tablespoons of beans and add back or stir in 1 tbsp corn starch dissolved in stock or water.

Just a quick note that, thanks to a gentle nudge, an updated version of {uaparserjs} — a package that processes User Agent strings web clients send to servers — is making its way to all the CRAN mirrors and is also available on CINC. The most significant change is a much overdue update to the user agent regex dictionary.

It takes something like this Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2 and turns it into a tidy data frame:

uaparserjs::ua_parse("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2")
## # A tibble: 1 x 9
##   userAgent                                   ua.family ua.major ua.minor ua.patch os.family os.major os.minor device.family
##   <chr>                                       <chr>     <chr>    <chr>    <chr>    <chr>     <chr>    <chr>    <chr>        
## 1 Mozilla/5.0 (X11; Linux x86_64) AppleWebKi… Chromium  15       0        874      Ubuntu    11       10       Other     

The js on the end of the package name is a nod that it uses the javascript ua-parser-core module via Jeroen’s seriously awesome {V8} package. Four years ago, {uaparserjs} did not work on Windows due to V8 VM stack limitations on that platform. Today, it works on all platforms!

’Tis no slouch, either, as it processes 100 user agent strings in ~20ms. No speed demon, but it should get the job done for most use-cases.
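If you want to sanity-check that claim on your own hardware, a quick-and-dirty timing sketch (mine, not an official benchmark) looks something like this; numbers will vary by machine:

library(uaparserjs)

# 100 copies of the example UA string from above
uas <- rep(
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2",
  100
)

system.time(ua_parse(uas))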

There is an excellent C++-backed R version that is not on CRAN and has some heavy dependencies, but it is faster than the javascript version if you need to process scads of user agent strings (I tend to use at-scale Scala environments for this now, hence the long delay between updates).

Jeroen has an excellent writeup on how to use browserify to create an application bundle for javascript-backed R packages or scripts. There are some idiosyncrasies with the ua-parser-core reference implementation that were causing me no end of trouble with that method. On a lark, I tried:

$ webpack --mode="production" index.js -o bundle.js

and it worked perfectly on the first try (both in creating the app bundle and in that bundle working just as before). This is due in no small part to Jeroen getting the {V8} package to work with more recent libv8 releases (which is also why it works on Windows now). I’ll try to write up the webpack alternate method and PR it into the vignette as I get time.
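For the curious, the general pattern for using a bundle like that from R looks roughly like this (a sketch of the approach, not the package’s actual internals):

library(V8)

ctx <- v8()              # fresh javascript context
ctx$source("bundle.js")  # load the webpack-built bundle from above

# ctx$eval() / ctx$call() can then invoke whatever the bundle exposes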

FIN

As usual, kick the tyres, jump in with PRs or issues and — most of all — be safe out there!

For folks who interact with CRAN or R Core: they’re continuing to support our community during these crazy times so when you are in exchanges with them, definitely take some time out to add an extra nod of thanks for managing to do so whilst juggling the same things we all are.

Waffle House announced it was closing hundreds of stores this week due to SARS-CoV-2 (a.k.a. COVID-19). This move garnered quite a bit of media attention since former FEMA Administrator Craig Fugate used the restaurant chain as both an indicator of the immediate and overall severity of a natural disaster. [He’s not the only one](https://www.ehstoday.com/emergency-management/article/21906815/what-do-waffles-have-to-do-with-risk-management). The original concept was pretty straightforward:

For example, if a Waffle House store is open and offering a full menu, the index is green. If it is open but serving from a limited menu, it’s yellow. When the location has been forced to close, the index is red. Because Waffle House is well prepared for disasters, Kouvelis said, it’s rare for the index to hit red. For example, the Joplin, Mo., Waffle House survived the tornado and remained open.
 
“They know immediately which stores are going to be affected and they call their employees to know who can show up and who cannot,” he said. “They have temporary warehouses where they can store food and most importantly, they know they can operate without a full menu. This is a great example of a company that has learned from the past and developed an excellent emergency plan.”

SARS-CoV-2 is not a tropical storm, so conditions are a bit different and a tad more complex when it comes to gauging the severity of this particular disaster (mostly caused by inept politicians across the globe), which gave me an idea for how to make the Waffle House Index a proper index, i.e. a _”statistical measure of change in a representative group of individual data points.”_

In the case of an outbreak, rather than a simple green/yellow/red condition state, using the ratio of closed to open Waffle House locations as a numeric index — [0-1] — seems to make more sense since it may better help indicate:

  • when shelter-in-place became mandatory where a given restaurant is located
  • the severity of SARS-CoV-2-caused symptoms for a given location
  • disruptions in the supply chain for a given location due to SARS-CoV-2

I kinda desperately needed a covidistraction so I set out to see how hard it would be to build such an index metric.

Waffle House lets you find locations via a standard map/search interface. They provide lots of data via that map which can be used to figure out which stores are open and which are closed. There’s a nascent R package which contains all the recipes necessary for the data gathering. However, you don’t need to use it, since it powers wafflehouseindex.us, which collects the data whenever the store-closings info changes and provides a snapshot of the latest data daily (direct CSV link).
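Computing the index from that snapshot is basically a one-liner. Here’s a minimal sketch (the status column name is hypothetical, so inspect the CSV header first); the index values quoted below look like this ratio scaled to 0-100:

library(tidyverse)

locs <- read_csv("http://wafflehouseindex.us/data/latest.csv")

# `status` is a hypothetical column name; check names(locs) first
closed <- sum(locs$status == "closed", na.rm = TRUE)

closed / nrow(locs)         # the [0-1] index
100 * closed / nrow(locs)   # scaled the way the values below appear to be reported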

The historical data will make it to a git repo at some point in the near future.

The current index value is 21.2, which increased quickly after the first value of 18.1 (that event was the catalyst for getting the site up and the package done), and the closed locations are on the map at the beginning of the post. I went with three qualitative levels on the gauge mostly to keep things simple.

There will absolutely be more location closings and it will be interesting (and, ultimately, very depressing and likely grave) to see how high the index goes and how long it stays above zero.

FIN

The metric is — for the time being — computed across all stores. As noted earlier, this could be broken down into regional index scores to intuit the aforementioned three indicators on a more local level. The historical data (apart from the first closings announcement) is being saved so it will be possible to go back and compute regional indexes when I’ve got more time.

I shall reiterate that you should grab the data from http://wafflehouseindex.us/data/latest.csv rather than use the R package, since there’s no point in duplicating the gathering, and the historical data will be up and maintained soon.

Stay safe, folks.

Stuff you need

  • 370g all-purpose flour
  • 7g baking powder
  • 300g sugar
  • 80g shortening
  • 7.5g salt
  • 140g eggs (~2 proper jumbo)
  • 75ml coconut milk or cashew milk or almond milk yogurt
  • 75ml oat milk
  • 15ml vanilla
  • 85ml veg oil
  • 10oz bag Ghirardelli dark chocolate chips

Stuff you do

Oven @ 375°F.

Paddle sugar, shortening and salt. 3-5 mins.

Whisk eggs, milk, yogurt, vanilla & oil.

In three batches, mix/fold ^^ into the paddled mixture.

Sift together dry ingredients and mix until moist. Don’t over-mix.

Fold in chips.

Let sit for 3 mins.

While ^^, put liners in a 12-cup muffin tin.

Evenly distribute batter. It’s ~100g batter (~1/2 dry measuring cup) per muffin.

22-30m in the oven (it really depends on your oven type). You should not be afraid to skewer to test nor to move the tin around to evenly brown.

Cool on wire rack.

Über Tuesday has come and almost gone (some state results will take a while to coalesce) and I’m relieved to say that {catchpole} did indeed work, with the example code from before producing this on first run:

If we tweak the buffer space around the squares, I think the cartogram looks better:

but you should likely use a different palette (see this Twitter thread for examples).

I noted in the previous post that borders might be possible. While I haven’t solved that use-case for individual states, I did manage to come up with a method for making a light version of the cartogram usable:

library(sf)
library(hrbrthemes) 
library(catchpole)
library(tidyverse)

delegates <- read_delegates()

candidates_expanded <- expand_candidates()

gsf <- left_join(delegates_map(), candidates_expanded, by = c("state", "idx"))

m <- delegates_map()

# split off each "area" on the map so we can make a border+background
list(
  setdiff(state.abb, c("HI", "AK")),
  "AK", "HI", "DC", "VI", "PR", "MP", "GU", "DA", "AS"
) %>% 
  map(~{
    suppressWarnings(suppressMessages(st_buffer(
      x = st_union(m[m$state %in% .x, ]),
      dist = 0.0001,
      endCapStyle = "SQUARE"
    )))
  }) -> m_borders

gg <- ggplot()
for (mb in m_borders) {
  gg <- gg + geom_sf(data = mb, col = "#2b2b2b", size = 0.125)
}

gg + 
  geom_sf(
    data = gsf,
    aes(fill = candidate),
    col = "white", shape = 22, size = 3, stroke = 0.125
  ) +
  scale_fill_manual(
    name = NULL,
    na.value = "#f0f0f0",
    values = c(
      "Biden" = '#f0027f',
      "Sanders" = '#7fc97f',
      "Warren" = '#beaed4',
      "Buttigieg" = '#fdc086',
      "Klobuchar" = '#ffff99',
      "Gabbard" = '#386cb0',
      "Bloomberg" = '#bf5b17'
    ),
    limits = intersect(unique(delegates$candidate), names(delegates_pal))
  ) +
  guides(
    fill = guide_legend(
      override.aes = list(size = 4)
    )
  ) +
  coord_sf(datum = NA) +
  theme_ipsum_es(grid="") +
  theme(legend.position = "bottom")

{ssdeepr}

Researcher pals over at Binary Edge added web page hashing (pre- and post-javascript scraping) to their platform using ssdeep. This approach is in the category of context triggered piecewise hashes (CTPH) (or locality-sensitive hashing), similar to my R adaptation/packaging of Trend Micro’s tlsh.

Since I’ll be working with BE’s data off-and-on and the ssdeep project has a well-crafted library (plus we might add ssdeep support at $DAYJOB), I went ahead and packaged that up as well.

I recommend using the hash_con() function if you need to read large blobs since it doesn’t require you to read everything into memory first (though hash_file() doesn’t either, but that’s a direct low-level call to the underlying ssdeep library file reader and not as flexible as R connections are).
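For a local file, either route works; here’s a tiny sketch (the file name is made up):

library(ssdeepr)

h_a <- hash_file("big-blob.bin")        # low-level ssdeep file reader
h_b <- hash_con(file("big-blob.bin"))   # R connection route (also handles URLs, gzip, etc.)

hash_compare(h_a, h_b)  # identical content should compare as 100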

These types of hashes are great for seeing whether something has changed on a website (or how similar two things are to each other). For instance, how closely do CRAN mirrors match the mothership?

library(ssdeepr) # see the links above for installation

cran1 <- hash_con(url("https://cran.r-project.org/web/packages/available_packages_by_date.html"))
cran2 <- hash_con(url("https://cran.biotools.fr/web/packages/available_packages_by_date.html"))
cran3 <- hash_con(url("https://cran.rstudio.org/web/packages/available_packages_by_date.html"))

hash_compare(cran1, cran2)
## [1] 0

hash_compare(cran1, cran3)
## [1] 94

I picked on cran.biotools.fr as I saw they were well behind CRAN proper on the monitoring page.

I noted that BE was doing pre- and post-javascript hashing as well. Why, you may ask? Well, websites behave differently with javascript running, plus they can behave differently when different user-agents are set. Let’s grab a page from Wikipedia a few different ways to show how they are not alike at all, depending on the retrieval context. First, let’s grab some web content!

library(httr)
library(ssdeepr)
library(splashr)

# regular grab
h1 <- hash_con(url("https://en.wikipedia.org/wiki/Donald_Knuth"))

# you need Splash running for javascript-enabled scraping this way
sp <- splash(host = "mysplashhost", user = "splashuser", pass = "splashpass")

# js-enabled with one ua
sp %>%
  splash_user_agent(ua_macos_chrome) %>%
  splash_go("https://en.wikipedia.org/wiki/Donald_Knuth") %>%
  splash_wait(2) %>%
  splash_html(raw_html = TRUE) -> js1

# js-enabled with another ua
sp %>%
  splash_user_agent(ua_ios_safari) %>%
  splash_go("https://en.wikipedia.org/wiki/Donald_Knuth") %>%
  splash_wait(2) %>%
  splash_html(raw_html = TRUE) -> js2

h2 <- hash_raw(js1)
h3 <- hash_raw(js2)

# same way {rvest} does it
res <- httr::GET("https://en.wikipedia.org/wiki/Donald_Knuth")

h4 <- hash_raw(content(res, as = "raw"))

Now, let’s compare them:

hash_compare(h1, h4) # {ssdeepr} built-in vs httr::GET() => not surprising that they're equal
## [1] 100

# things look way different with js-enabled

hash_compare(h1, h2)
## [1] 0
hash_compare(h1, h3)
## [1] 0

# and with variations between user-agents

hash_compare(h2, h3)
## [1] 0

hash_compare(h2, h4)
## [1] 0

# only doing this for completeness

hash_compare(h3, h4)
## [1] 0

For this example, content size alone would (mostly) have been enough to tell the difference; note how h1 and h4 hash as equal despite the {httr} method returning slightly more characters:

length(js1)
## [1] 432914

length(js2)
## [1] 270538

nchar(
  paste0(
    readLines(url("https://en.wikipedia.org/wiki/Donald_Knuth")),
    collapse = "\n"
  )
)
## [1] 373078

length(content(res, as = "raw"))
## [1] 374099

FIN

If you were in a U.S. state with a primary yesterday and were eligible to vote (and had something to vote for, either a (D) candidate or a state/local bit of business) I sure hope you did!

The ssdeep library works on Windows, so I’ll be figuring out how to get that going in {ssdeepr} fairly soon (mostly to try out the Rtools 4.0 toolchain vs deliberately wanting to support legacy platforms).

As usual, drop issues/PRs/feature requests where you’re comfortable for any of these or other packages.

For folks who are smart enough not to go near Twitter: I’ve been on a hiatus from the platform insofar as reading the Twitter feed goes. “Why” isn’t the subject of this post so I won’t go into it, but I’ve broken this half-NYE resolution on more than one occasion and am very glad I did so in late January when I caught an RT of this tweet by WSJ’s Brian McGill:

You can find it here, and a static copy of a recent one is below:

I kinda wanted to try to make a woefully imperfect static version of it in R with {ggplot2}, so I poked around at that URL’s XHR objects and javascript to see if I could find the cartogram and the data source.

The data source was easy, as it’s an XHR-loaded JSON file: https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json.
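A quick peek confirms the structure (it’s the same layout the plotting code further down relies on):

library(jsonlite)

del <- fromJSON(
  "https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json",
  simplifyDataFrame = FALSE
)

# national delegate counts live under data$US$delCount; per-state counts sit alongside it
str(del$data$US$delCount)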

The cartogram bits… were not. Brian’s two days of manual effort still needed to be put into something that goes onto a web page, and news outlets are super-talented at making compact, fast-loading interactive visualizations, which means one of the tools they use is a “webpack”-esque bundler to combine many small javascript files into one. I did traipse through it to see if there was a back-end JSON or CSV somewhere but could not locate it. However, their cartogram library builds the SVG you see on the page. If you use Developer Tools to inspect any element of the SVG then copy the whole SVG “outer HTML” and save it to a local file:

After poking at it with an intercept proxy, it turns out this is a dynamically loaded resource, too: https://asset.wsj.net/wsjnewsgraphics/election/delegate-tracker/carto.svg.

That SVG has three top layer groups and has some wicked transforms in it. There was no way I was going to attempt a {statebins}-esque approach to this copycat project (i.e. convert the squares to a grid and map things manually like Brian did) but I had an idea and used Adobe Illustrator to remove the state names layer and the background polygon layer, then “flatten” the image (which — to over-simplify the explanation — flattens all the transforms), and save it back out.

Then, I added some magic metadata prescribed by svg2geojson to turn the SVG into a GeoJSON file (which {sf} can read!). (That sentence just made real cartographers & geocomp’ers weep, btw).
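Roughly, the tail end of that step looks like this (file names are hypothetical and the exact svg2geojson invocation may differ, so check its docs):

# shell (hypothetical): svg2geojson carto-flat.svg  ->  carto-flat.geojson

library(sf)

carto <- st_read("carto-flat.geojson")  # hypothetical output file name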

Now that I had something R could use a bit more easily, there was still work to be done. The SVG 1-px <rect> elements ended up coming across as POLYGONs, and many, many more point-squares came along for the ride (in retrospect, I think they may have been the borders around the states; more on that in a bit).

I used {purrr} and st_coordinates() to figure out where all the 1-px “polygons” were in the {sf} object and isolated them, then added an index field (1:n, n being the number of delegate squares for a given state).
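The gist of that step, continuing with the hypothetical carto object from the previous sketch (a rough sketch, not the original code; the cutoff is something you pick by eyeballing the sizes):

library(sf)
library(purrr)

# bounding-box extent of each feature
sq_size <- map_dbl(st_geometry(carto), ~{
  xy <- st_coordinates(.x)
  max(diff(range(xy[, "X"])), diff(range(xy[, "Y"])))
})

# keep only the tiny "1-px" squares; the 1e-4 cutoff is illustrative
delegate_squares <- carto[sq_size < 1e-4, ]

# st_centroid() later collapses each tiny POLYGON square down to a POINT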

I read in the original SVG with {xml2} and extracted the named state groups. Thankfully the order and number of “blocks” matched the filtered {sf} object. I merged them together, turned the 1-px POLYGONs into POINTs, and made the final {sf} object which I put in the nascent {catchpole} package (location below). Here’s a quick view of it using plot():

library(catchpole) # hrbrmstr/catchpole

plot(delegates_map()[1])

delegates_map()
## Simple feature collection with 3979 features and 2 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -121.9723 ymin: 37.36802 xmax: -121.9581 ymax: 37.37453
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## First 10 features:
##    state idx                   geometry
## 1     WY   1 POINT (-121.9693 37.37221)
## 2     WY   2 POINT (-121.9693 37.37212)
## 3     WY   3 POINT (-121.9691 37.37221)
## 4     WY   4 POINT (-121.9691 37.37212)
## 5     WY   5 POINT (-121.9691 37.37203)
## 6     WY   6 POINT (-121.9691 37.37194)
## 7     WY   7 POINT (-121.9691 37.37185)
## 8     WY   8  POINT (-121.969 37.37221)
## 9     WY   9  POINT (-121.969 37.37212)
## 10    WY  10  POINT (-121.969 37.37203)

All that was needed was to try it out with the real data.

I simplified that process quite a bit in {catchpole} but also made it possible to work with the individual bits on your own. gg_catchpole() will fetch the WSJ delegate JSON and build the basic map for you using my dark “ipsum” theme:

library(sf)
library(catchpole) # hrbrmstr/catchpole
library(hrbrthemes)
library(tidyverse)

gg_catchpole() +
  theme_ft_rc(grid="") +
  theme(legend.position = "bottom")

BONUS!

Now that you have the WSJ JSON file, you can do other, basic visualizations with it:

library(hrbrthemes) 
library(waffle)
library(geofacet)
library(tidyverse)

jsonlite::fromJSON(
  url("https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json"),
  simplifyDataFrame = FALSE
) -> del

c(
  "Biden" = "#5ac4c2",
  "Sanders" = "#63bc51",
  "Warren" = "#9574ae",
  "Buttigieg" = "#007bb1",
  "Klobuchar" = "#af973a",
  "Bloomberg" = "#AA4671",
  "Steyer" = "#4E4EAA",
  "Yang" = "#C76C48",
  "Gabbard" = "#7B8097"
) -> dcols

bind_cols(del$data$US$delCount) %>% 
  gather(candidate, delegates) %>% 
  filter(delegates > 0) %>%
  arrange(desc(delegates)) %>% 
  mutate(candidate = fct_inorder(candidate)) %>%
  ggplot(aes(candidate, delegates)) +
  geom_col(fill = ggthemes::tableau_color_pal()(1), width = 0.55) +
  labs(
    x = NULL, y = "# Delegates",
    title = "2020 Democrat POTUS Race Delegate Counts",
    subtitle = sprintf("Date: %s", Sys.Date()),
    caption = "Data source: WSJ <https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json>\n@hrbrmstr #rstats"
  ) +
  theme_ipsum_rc(grid="Y")

bind_cols(del$data$US$delCount) %>% 
  gather(candidate, delegates) %>% 
  filter(delegates > 0) %>%
  arrange(desc(delegates)) %>% 
  mutate(candidate = fct_inorder(candidate)) %>%
  ggplot(aes(fill=candidate, values=delegates)) +
  geom_waffle(color = "white", size = 0.5) +
  scale_fill_manual(name = NULL, values = dcols) +
  coord_fixed() +
  labs(
    x = NULL, y = "# Delegates",
    title = "2020 Democrat POTUS Race Delegate Counts",
    subtitle = sprintf("Date: %s", Sys.Date()),
    caption = "Data source: WSJ <https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json>\n@hrbrmstr #rstats"
  ) +
  theme_ipsum_rc(grid="") +
  theme_enhance_waffle()

state_del <- del
state_del$data[["US"]] <- NULL

map_df(state_del$data, ~bind_cols(.x$delCount), .id = "state") %>% 
  gather(candidate, delegates, -state) %>% 
  filter(delegates > 0) %>% 
  ggplot(aes(candidate, delegates)) +
  geom_col(aes(fill = candidate), col = NA, width = 0.55) +
  scale_fill_manual(name = NULL, values = dcols) +
  facet_geo(~state) +
  labs(
    x = NULL, y = "# Delegates",
    title = "2020 Democrat POTUS Race Delegate Counts by State",
    subtitle = sprintf("Date: %s", Sys.Date()),
    caption = "Data source: WSJ <https://asset.wsj.net/wsjnewsgraphics/election/2020/delegates.json>\n@hrbrmstr #rstats"
  ) +
  theme_ipsum_rc(grid="Y") +
  theme(axis.text.x = element_blank()) +
  theme(panel.spacing.x = unit(0.5, "lines")) +
  theme(panel.spacing.y = unit(0.1, "lines")) +
  theme(legend.position = c(0.95, 0.1)) +
  theme(legend.justification = c(1, 0))

FIN

More work needs to be done on the map and {catchpole} itself but there’s a sufficient base for others to experiment with (PRs and your own blog posts welcome!).

W/r/t “more on that later” bits: The extra polygons were very likely borders, and I think borders would help the cartogram, but we can make them with {sf}, too. We can also add in a layer for state names and/or just figure out the centroid for each point grouping (with {sf}) and get places for labels that way. Not sure I’ll have time for any of that (this whole process went quickly, believe it or not).
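For example, a rough way to get label anchor points (a sketch, not something in {catchpole} yet):

library(sf)
library(dplyr)
library(catchpole) # hrbrmstr/catchpole

# union each state's delegate points, then take the centroid as a label position
delegates_map() %>%
  group_by(state) %>%
  summarise() %>%                     # sf unions the grouped point geometries
  st_centroid() -> state_label_pts    # will warn about lon/lat centroids; fine for labels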

Also: ggiraph::geom_sf_interactive() can be used as a poor-dude’s popup to turn this (quickly) into an interactive piece.
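Something along these lines should do it (a sketch reusing the gsf object from the cartogram code above; the tooltip is just the candidate name here):

library(ggiraph)
library(tidyverse)

gg_int <- ggplot() +
  geom_sf_interactive(
    data = gsf,
    aes(fill = candidate, tooltip = candidate),
    col = "white", shape = 22, size = 3, stroke = 0.125
  ) +
  coord_sf(datum = NA)

girafe(ggobj = gg_int)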

If you hit up https://git.rud.is/hrbrmstr/catchpole you’ll find the package and URLs to other social coding sites (though GitUgh has been plagued with downtime and degraded performance the past few weeks, so you should really think about moving your workloads to a real service).

Have fun mapping Über Tuesday and share your creations, PR’s, ideas, etc for the package wherever you’re most comfortable.