
I like to turn coincidence into convergence whenever possible. This weekend, a user of [cdcfluview](http://github.com/hrbrmstr/cdcfluview) had a question that caused me to notice a difference between how the CDC FluView API now behaves and how the package was interacting with it, so I updated the package to accommodate both the change and the user.

Around the same time, @recology_ tweeted:

Finally, the [2015-2016 flu season](http://www.cdc.gov/flu/about/season/flu-season-2015-2016.htm) is also fast approaching (so a three-fer!), making a CRAN leap for `cdcfluview` quite timely.

Since I can’t let @quonimous have all the `#rstats` fun & glory, I added a function and two data sets to `cdcfluview`, did the CRAN dance and it’s now [on CRAN](https://cran.rstudio.com/web/packages/cdcfluview/). I also did the github dance to have its entry in the [Web Technologies Task View](https://cran.rstudio.com/web/views/WebTechnologies.html) updated.

The new function lets you grab the XML behind the high-level [Weekly US Map: Influenza Summary Update](http://www.cdc.gov/flu/weekly/usmap.htm) (I’ll be adding a function to make a similar plot that won’t require gosh-awful Flash) and the data sets provide metadata about the composition of HHS regions and Census regions, making it easier to compose factors, add labels to maps or even segment maps/combine polygons. The two existing functions grab current & historical detailed national & state influenza data.
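
Here’s a minimal sketch of the additions in action. I’m assuming the new function is `get_weekly_flu_report()` and the data sets are `hhs_regions` & `census_regions`; check the package docs if those names differ:

library(cdcfluview)
 
# grab the data behind the Weekly US Map summary
# (assumed function name -- see the package docs)
wk <- get_weekly_flu_report()
head(wk)
 
# region metadata, handy for factor levels, map labels or polygon grouping
# (assumed data set names)
data(hhs_regions)
data(census_regions)
head(hhs_regions)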

There’s an example on github and in the [original announcement post](https://rud.is/b/2015/01/10/new-r-package-cdcfluview-retrieve-flu-data-from-cdcs-fluview-portal/).

If you have any other data you need freed from the confines of the CDC FluView Flash portal, please file an issue & paste a screen shot (if you are comfortable with most browser Developer Tools views, even a dump of the request or “as cURL” URL would be awesome).

Riffing off of [the previous post](http://rud.is/b/2015/08/05/speeding-up-your-quests-for-r-stuff/), here’s a way to quickly search CRAN (the @RStudio flavor) from the Chrome search bar.

– Paste `chrome://settings/searchEngines` into your location bar and hit return/enter
– Scroll down until the input boxes show, enabling you to add a search engine
– For _”Add a new search engine”_ put “`CRAN`”
– For _”Keyword”_ put “`R`”, “`rstats`” or “`CRAN`”, but “`R`” is super easy to type, though it may not be optimal for you :-)
– For _”URL with %s in place of query”_ put the following:

https://www.google.com/search?as_q=%s&as_epq=&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=cran.rstudio.com&as_occt=any&safe=images&as_filetype=&as_rights=&gws_rd=ssl

(you may be able to trim that URL a bit, if desired)

Save that and then in the Chrome location bar hit `R` then `[TAB]` and you’ll be sending the query to a custom Google search that only looks on CRAN (specifically the @RStudio CRAN mirror).
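
If you’d rather fire off the same CRAN-only search from an R console, here’s a minimal base-R sketch (it uses a trimmed version of the URL above, so tweak the parameters if Google gets fussy):

# open a CRAN-only Google search in the default browser
search_cran <- function(query) {
  utils::browseURL(
    paste0("https://www.google.com/search?as_q=",
           utils::URLencode(query, reserved=TRUE),
           "&as_sitesearch=cran.rstudio.com&as_occt=any")
  )
}
 
search_cran("shapefile")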

I use Google quite a bit when conjuring up R projects, whether it be in a lazy pursuit of a PDF vignette or to find a package or function to fit a niche need. Inevitably, I’ll do something like [this](https://www.google.com/#q=cran+shapefile) (yeah, I’m still on a mapping kick) and the first (and best) results will come back with `https://cran.r-project.org/`-prefixed URLs. If all this works, what’s the problem? Well, the main CRAN site is, without mincing words, _slow_ much of the time. The switch to `https` on it (and its mostly-academic mirrors) has introduced noticeable delays.

Now, these aren’t productivity-crushing delays, but (a) why wait if you don’t have to; and, (b) why not spread the load to a whole [server farm](http://cran.rstudio.com/) dedicated to ensuring fast delivery of content? I was going to write a Chrome extension specifically for this, but I kinda figured this was a solved problem, and it is!

From the plethora of options in the Chrome Store, I grabbed [Switcheroo Redirector](https://chrome.google.com/webstore/detail/switcheroo-redirector/cnmciclhnghalnpfhhleggldniplelbg?hl=en) because (a) it has a decent user base and rating; (b) it’s not super-complex to use; and, (c) the source is [on github](https://github.com/ranjez/Switcheroo) and closely matches what’s in the actual installed extension (some extensions are tricksy/evil and you can even build your own with the source vs trust the Chrome Store one).

So, go install it and come back. We’ll wait.

OK, you back? Good. Let’s continue. You should have a Switcheroo icon near your location bar. Select it and you should see a popup like this:

(screenshot: the Switcheroo rules popup)

I’ve already made the entry, but you just need to tell the app to substitute all URL occurrences of `cran.r-project.org` with `cran.rstudio.com` when Chrome is trying to load a URL.

Now, when you click one of those links in the above example, it will go (speedily!) to the RStudio CRAN mirror server farm.

One nice (to security freaks like me) feature is that if you have one of the Switcheroo links open in a new tab (i.e. not directly/immediately visible to you) it will let you know that something is happening out of the ordinary:

(screenshot: Chrome’s redirect notice)

This is a tiny (and good) price to pay to know you’re not being whacked by a bad plugin.
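
Relatedly, if you also want your in-console package installs to hit the same fast mirror, you can point R’s default repository there (say, in your `~/.Rprofile`); a one-liner sketch:

# use the RStudio CRAN mirror for install.packages() & friends
options(repos = c(CRAN = "https://cran.rstudio.com/"))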

If you have another preference (or have suggestions for Safari or Firefox) please drop a note in the comments so others can benefit from your experience!

There was some chatter on the twitters this week about a relatively new D3-based charting library called [TauCharts](http://taucharts.com/) (also @taucharts). The API looked pretty clean and robust, so I started working on an htmlwidget for it and was quickly joined by the Widget Master himself, @timelyportfolio.

TauCharts definitely has a “grammar of graphics” feel about it and the default aesthetics are super-nifty. While the developers are actively adding new features and “geoms”, the core points (think scatterplot), lines and bars (including horizontal bars!) geoms are quite robust and definitely ready for your dashboards.

Between the two of us, we have a _substantial_ part of the [charting library API](http://api.taucharts.com/) covered. I think the only major thing left unimplemented is composite charts (i.e. lines + bars + points on the same chart) and some minor tweaks around the edges.

While you can find it on [github](http://github.com/hrbrmstr/taucharts) and do the normal:

devtools::install_github("hrbrmstr/taucharts")

or, even use the official initial release version:

devtools::install_github("hrbrmstr/taucharts@v0.1.0")

I’ll use the `dev` version:

devtools::install_github("hrbrmstr/taucharts@dev")

for the example below, mostly since it includes the data set I want to use to mimic the current, featured example on the [TauCharts homepage](http://taucharts.com/) and also has full documentation with examples.

Here’s all it takes to make a faceted scatterplot with:

– interactive tooltips
– interactive legend
– custom-selectable trendline annotation:

devtools::install_github("hrbrmstr/taucharts@dev")
 
library(taucharts)
 
data(cars_data)
 
tauchart(cars_data) %>% 
  tau_point("milespergallon", c("class", "price"), color="class") %>% 
  tau_guide_padding(bottom=300) %>% 
  tau_legend() %>% 
  tau_trendline() %>% 
  tau_tooltip(c("vehicle", "year", "class", "price", "milespergallon"))


(Chart: hybrid cars’ fuel economy by price and class. It seems expensive cars are less efficient.)

There are _tons_ more examples in the [TauCharts RPub](http://rpubs.com/hrbrmstr/taucharts) (and soon-to-be vignette) and @timelyportfolio will be featuring it in his weekly [widget update](http://www.buildingwidgets.com/).

Believe it or not, there are two [1] [2] questions on @StackOverflowR about how to make QR codes in R. I personally think QR codes are kinda hokey, but who am I to argue with pressing needs of the #rstats community? I found libqrencode and it’s highly brew-able and apt-able (probably yum-able, too, if you lean that way), so it was super-easy to crank out an Rcpp-based package for it.

There are a few functions in the package, but the following would be my guess as to how most folks would use it (well, two folks, anyway):

library(qrencoder)
library(raster)
 
par(mar=c(0,0,0,0))
image(qrencode_raster("http://rud.is/b"), 
      asp=1, col=c("white", "black"), axes=FALSE, 
      xlab="", ylab="")

(the generated QR code for http://rud.is/b)
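
If you’d rather get the QR matrix in a ggplot2-friendly form or emit an SVG, the package also has (if memory serves; verify against the package docs) data frame and SVG helpers:

library(qrencoder)
 
# assumed helper names -- check the package docs
qr_df <- qrencode_df("http://rud.is/b")  # long x/y/z data frame, ggplot-friendly
head(qr_df)
 
cat(qrencode_svg("http://rud.is/b"), file="qrcode.svg")  # standalone SVG file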

Since @quominus threw down yet-another gauntlet (in our own “battle of the hoRnbuRg”) after my qrencoder announcement:

I had no choice but to also crank out another package that interfaces with the PasswordRandom API. There’s at least some fairly obvious real utility to this one:

library(passwordrandom)
 
# current version
packageVersion("passwordrandom")
#> [1] '0.0.0.9000'
 
random_chars()
#>  [1] "m" "M" "Z" "p" "G" "B" "E" "m" "B" "v"
 
random_doubles()
#>  [1] 82.6272 89.6146  1.2591 77.9003 62.5740 68.8216 61.9789 37.9001 20.6352  4.6343
 
random_guids()
#>  [1] "fdf5d58e-ebe9-4db3-b429-303e8a5e1fdf" "20ad94f9-a232-4fa8-91c6-ba21e9925b96"
#>  [3] "d44c1c4b-0117-43c3-b77c-89bc33caf59f" "500ce633-1197-4c92-aff4-1eac94fd2d8d"
#>  [5] "13b1a1a0-f7fa-40b6-a9da-9e445ac26d2b" "06362286-8d5b-4dfc-9283-df564291120d"
#>  [7] "36dbb258-ede0-4a8a-b416-5609f11c8be1" "9b27dcca-26d7-4467-9a54-3862ccbd06cf"
#>  [9] "4f53fe11-d4f0-4c01-a2fc-97e35983d567" "b1f0df88-e790-4d48-8683-ebe68db9f0ca"
 
random_ints()
#>  [1] 82 17 97 20 87 91 57 42 22 62
 
random_passwords()
#>  [1] "RoeXO2{75bh"  "RiuFE6'10hj"  "TauTY1)92pj"  "XooHO8%87rv"  "MooJA1^40np"  "KyaDU3\\35tr" "KiaQY0>91nr" 
#>  [8] "XoeGO1%68nt"  "KeoFI0>33cc"  "VeaDI2$51jc"

If you’re so inclined, kick the tyres of either/both and drop a note here or issue/feature request on either repo.

In the never-ending battle for truth, justice and publishing more R
packages than [Oliver](http://twitter.com/quominus), I whipped out an R
package for the [OpenStreetMap Nominatim
API](http://wiki.openstreetmap.org/wiki/Nominatim). It actually hits the
[MapQuest Nominatim Servers](http://open.mapquestapi.com/nominatim/) for
most of the calls, but the functionality is the same.

The R package lets you:

– `address_lookup`: Lookup the address of one or multiple OSM objects
like node, way or relation.
– `osm_geocode`: Search for places by address
– `osm_search`: Search for places
– `osm_search_spatial`: Search for places, returning a list of
`SpatialPointsDataFrame`, `SpatialLinesDataFrame` or a
`SpatialPolygonsDataFrame`
– `reverse_geocode_coords`: Reverse geocode based on lat/lon
– `reverse_geocode_osm`: Reverse geocode based on OSM Type & Id

Just like Google Maps, these services are not meant to be your
freebie-access to mega-bulk-geocoding. You can and should pay for that.
But, when you need a few items geocoded (or want to lookup some
interesting things on OSM since it provides [special
phrases](http://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
to work with), Nominatim lookups can be just what’s needed.

Let’s say we wanted to see where pubs are in the Seattle metro area.
That’s a simple task for nominatim:

# devtools::install_github("hrbrmstr/nominatim")
library(nominatim)
library(dplyr)
 
sea_pubs <- osm_search("pubs near seattle, wa", limit=20)
 
glimpse(sea_pubs)
 
## Observations: 20
## Variables:
## $ place_id     (chr) "70336054", "82743439", "11272568", "21478701", "...
## $ licence      (chr) "Data © OpenStreetMap contributors, ODbL 1.0. htt...
## $ osm_type     (chr) "way", "way", "node", "node", "node", "node", "no...
## $ osm_id       (chr) "51460516", "96677583", "1077652159", "2123245933...
## $ lat          (dbl) 47.64664, 47.63983, 47.60210, 47.62438, 47.59203,...
## $ lon          (dbl) -122.3503, -122.3023, -122.3321, -122.3559, -122....
## $ display_name (chr) "Nickerson Street Saloon, 318, Nickerson Street, ...
## $ class        (chr) "amenity", "amenity", "amenity", "amenity", "amen...
## $ type         (chr) "pub", "pub", "pub", "pub", "pub", "pub", "pub", ...
## $ importance   (dbl) 0.201, 0.201, 0.201, 0.201, 0.201, 0.201, 0.201, ...
## $ icon         (chr) "http://mq-open-search-int-ls03.ihost.aol.com:800...
## $ bbox_left    (dbl) 47.64650, 47.63976, 47.60210, 47.62438, 47.59203,...
## $ bbox_top     (dbl) 47.64671, 47.63990, 47.60210, 47.62438, 47.59203,...
## $ bbox_right   (dbl) -122.3504, -122.3025, -122.3321, -122.3559, -122....
## $ bbox_bottom  (dbl) -122.3502, -122.3022, -122.3321, -122.3559, -122....

We can even plot those locations:

library(rgdal)
library(ggplot2)
library(ggthemes)
library(sp)
library(DT)
 
# Grab a neighborhood map of Seattle
url <- "https://data.seattle.gov/api/file_data/VkU4Er5ow6mlI0loFhjIw6eL6eKEYMefYMm4MGcUakU?filename=Neighborhoods.zip"
fil <- "seattle.zip"
if (!file.exists(fil)) download.file(url, fil)
if (!dir.exists("seattle")) unzip(fil, exdir="seattle")
 
# make it usable
sea <- readOGR("seattle/Neighborhoods/WGS84/Neighborhoods.shp", "Neighborhoods")
 
## OGR data source with driver: ESRI Shapefile 
## Source: "seattle/Neighborhoods/WGS84/Neighborhoods.shp", layer: "Neighborhoods"
## with 119 features
## It has 12 fields
 
sea_map <- fortify(sea)
 
# Get the extents of where the pubs are so we can "zoom in"
bnd_box <- bbox(SpatialPoints(as.matrix(sea_pubs[, c("lon", "lat")])))
 
# plot them
gg <- ggplot()
gg <- gg + geom_map(data=sea_map, map=sea_map,
                    aes(x=long, y=lat, map_id=id),
                    color="black", fill="#c0c0c0", size=0.25)
gg <- gg + geom_point(data=sea_pubs, aes(x=lon, y=lat),
                      color="#ffff33", fill="#ff7f00",
                      shape=21, size=4, alpha=1/2)
# decent projection for Seattle-y things and expand the zoom/clip a bit
gg <- gg + coord_map("gilbert",
                     xlim=extendrange(bnd_box["lon",], f=0.5),
                     ylim=extendrange(bnd_box["lat",], f=0.5))
gg <- gg + labs(title="Seattle Pubs")
gg <- gg + theme_map()
gg <- gg + theme(title=element_text(size=16))
gg

(map: Seattle pubs plotted over the neighborhood polygons)

Of course you can geocode:

addrs <- osm_geocode(c("1600 Pennsylvania Ave, Washington, DC.",
                     "1600 Amphitheatre Parkway, Mountain View, CA",
                     "Seattle, Washington"))
addrs %>% select(display_name)
 
## Source: local data frame [3 x 1]
## 
##                                                                  display_name
## 1                  Washington, District of Columbia, United States of America
## 2 Mountainview Lane, Huntington Beach, Orange County, California, 92648, Unit
## 3                  Seattle, King County, Washington, United States of America
 
addrs %>% select(lat, lon)
 
## Source: local data frame [3 x 2]
## 
##        lat        lon
## 1 38.89495  -77.03665
## 2 33.67915 -118.02588
## 3 47.60383 -122.33006

Or, reverse geocode:

# Reverse geocode Canadian embassies
# complete list of Canadian embassies here:
# http://open.canada.ca/data/en/dataset/6661f0f8-2fb2-46fa-9394-c033d581d531
 
embassies <- data.frame(lat=c("34.53311", "41.327546", "41.91534", "36.76148", "-13.83282",
                             "40.479094", "-17.820705", "13.09511", "13.09511"),
                       lon=c("69.1835", "19.818698", "12.50891", "3.0166", "-171.76462",
                             "-3.686115", "31.043559", "-59.59998", "-59.59998"), stringsAsFactors=FALSE)
 
emb_coded_coords <- reverse_geocode_coords(embassies$lat, embassies$lon)
 
emb_coded_coords %>% select(display_name)
 
## Source: local data frame [9 x 1]
## 
##                                                                  display_name
## 1                Embassy of Canada, Ch.R.Wazir Akbar Khan, Kabul, Afghanistan
## 2 Monumenti i Skënderbeut, Skanderbeg Square, Lulishtja Këshilli i Europëes, 
## 3 Nomentana/Trieste, Via Nomentana, San Lorenzo, Salario, Municipio Roma II, 
## 4 18, Avenue Khalef Mustapha, Ben Aknoun, Daïra Bouzareah, Algiers, Ben aknou
## 5                               The Hole in the Wall, Beach Road, Āpia, Samoa
## 6 Torre Espacio, 259 D, Paseo de la Castellana, Fuencarral, Fuencarral-El Par
## 7 Leopold Takawira Street, Avondale West, Harare, Harare Province, 00263, Zim
## 8                    Bishop's Court Hill, Bridgetown, Saint Michael, Barbados
## 9                    Bishop's Court Hill, Bridgetown, Saint Michael, Barbados

It can even return `Spatial` objects (somewhat experimental):

# stock example search from OSM
osm_search_spatial("[bakery]+berlin+wedding", limit=5)[[1]]
 
##            coordinates   place_id
## 1 (13.34931, 52.54165)    9039748
## 2 (13.34838, 52.54125) 2659941153
## 3 (13.35678, 52.55138)   23586341
## 4 (13.34985, 52.54158)    7161987
## 5  (13.35348, 52.5499)   29179742
##                                                                               licence
## 1 Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright
## 2 Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright
## 3 Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright
## 4 Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright
## 5 Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright
##   osm_type     osm_id      lat      lon
## 1     node  939667448 52.54165 13.34931
## 2     node 3655549445 52.54125 13.34838
## 3     node 2299953786 52.55138 13.35678
## 4     node  762607353 52.54158 13.34985
## 5     node 2661679367 52.54990 13.35348
##                                                                                    display_name
## 1            Baguetterie, Föhrer Straße, Brüsseler Kiez, Wedding, Mitte, Berlin, 13353, Germany
## 2 Föhrer Cafe & Backshop, Föhrer Straße, Brüsseler Kiez, Wedding, Mitte, Berlin, 13353, Germany
## 3               Körfez, Amsterdamer Straße, Leopoldkiez, Wedding, Mitte, Berlin, 13347, Germany
## 4             Knusperbäcker, Torfstraße, Brüsseler Kiez, Wedding, Mitte, Berlin, 13353, Germany
## 5             Hofbäckerei, Müllerstraße, Brüsseler Kiez, Wedding, Mitte, Berlin, 13353, Germany
##   class   type importance
## 1  shop bakery      0.201
## 2  shop bakery      0.201
## 3  shop bakery      0.201
## 4  shop bakery      0.201
## 5  shop bakery      0.201
##                                                                                                      icon
## 1 http://mq-open-search-int-ls04.ihost.aol.com:8000/nominatim/v1/images/mapicons/shopping_bakery.p.20.png
## 2 http://mq-open-search-int-ls04.ihost.aol.com:8000/nominatim/v1/images/mapicons/shopping_bakery.p.20.png
## 3 http://mq-open-search-int-ls04.ihost.aol.com:8000/nominatim/v1/images/mapicons/shopping_bakery.p.20.png
## 4 http://mq-open-search-int-ls04.ihost.aol.com:8000/nominatim/v1/images/mapicons/shopping_bakery.p.20.png
## 5 http://mq-open-search-int-ls04.ihost.aol.com:8000/nominatim/v1/images/mapicons/shopping_bakery.p.20.png
##    bbox_left   bbox_top bbox_right bbox_bottom
## 1 52.5416504 52.5416504  13.349306   13.349306
## 2 52.5412496 52.5412496 13.3483832  13.3483832
## 3 52.5513806 52.5513806 13.3567785  13.3567785
## 4   52.54158   52.54158 13.3498507  13.3498507
## 5 52.5499029 52.5499029 13.3534756  13.3534756

The lookup functions are vectorized but there’s a delay built in to
avoid slamming the free servers.

Some things on the TODO list are:

– enabling configuration of timeouts
– enabling switching Nominatim API server providers (you can host your
own!)
– better `Spatial` support

So, give the [code a spin](https://github.com/hrbrmstr/nominatim) and
submit feature requests/issues to github!

Despite having shown various ways to overcome D3 cartographic envy, there are always more examples that can cause the green monster to rear its ugly head.

Take the Voronoi Arc Map example.


(screenshot: the D3 Voronoi Arc Map example)

For those in need of a primer, a Voronoi tessellation/diagram is:

a partitioning of a plane into regions based on distance to points in a specific subset of the plane. That set of points (called seeds, sites, or generators) is specified beforehand, and for each seed there is a corresponding region consisting of all points closer to that seed than to any other. (Wikipedia)

We can overlay a Voronoi tessellation on top of a map in R as well thanks to the deldir package (which has been around since the “S” days!). Let’s get (most of) the package requirements cruft out of the way, first:

library(sp)
library(rgdal)
library(deldir)
library(dplyr)
library(ggplot2)
library(ggthemes)

Now we’ll [ab]use the data from the Arc Map example:

flights <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/flights.csv", stringsAsFactors=FALSE)
airports <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/airports.csv", stringsAsFactors=FALSE)

Since the D3 example cheats and only uses the continental US (CONUS) we’ll do the same and we’ll also filter out only those airports mentioned in the flights data and get the total # of incoming/outgoing flights for each airport:

conus <- state.abb[!(state.abb %in% c("AK", "HI"))]
airports <- filter(airports,
                   state %in% conus,
                   iata %in% union(flights$origin, flights$destination))
orig <- select(count(flights, origin), iata=origin, n1=n)
dest <- select(count(flights, destination), iata=destination, n2=n)
airports <- left_join(airports,
                      select(mutate(left_join(orig, dest),
                                    tot=n1+n2),
                             iata, tot)) %>% 
            filter(!is.na(tot))

Since we’re going to initially plot polygons in ggplot (and, eventually, in leaflet), we’ll need to work with Spatial objects, so let’s make those airport lat/lon pairs into a SpatialPointsDataFrame:

vor_pts <- SpatialPointsDataFrame(cbind(airports$longitude,
                                        airports$latitude),
                                  airports, match.ID=TRUE)

The deldir function returns a pretty complex object. Thankfully, the authors of the package realized that one might just want the polygons from the computation and pre-made a function, tile.list, for computing/extracting them. Those polygons aren’t, however, closed and we really want to keep the airport data associated with them, so we need to close the polygons and associate the data. Since we’re likely going to repeat this task, let’s make it a (very badly named) function:

SPointsDF_to_voronoi_SPolysDF <- function(sp) {
 
  # tile.list extracts the polygon data from the deldir computation
  vor_desc <- tile.list(deldir(sp@coords[,1], sp@coords[,2]))
 
  lapply(1:(length(vor_desc)), function(i) {
 
    # tile.list gets us the points for the polygons but we
    # still have to close them, hence the need for the rbind
    tmp <- cbind(vor_desc[[i]]$x, vor_desc[[i]]$y)
    tmp <- rbind(tmp, tmp[1,])
 
    # now we can make the Polygon(s)
    Polygons(list(Polygon(tmp)), ID=i)
 
  }) -> vor_polygons
 
  # hopefully the caller passed in good metadata!
  sp_dat <- sp@data
 
  # this way the IDs _should_ match up w/the data & voronoi polys
  rownames(sp_dat) <- sapply(slot(SpatialPolygons(vor_polygons),
                                  'polygons'),
                             slot, 'ID')
 
  SpatialPolygonsDataFrame(SpatialPolygons(vor_polygons),
                           data=sp_dat)
 
}

Before we can make the plots, we need to put the Spatial objects into the proper form for ggplot2 (and get the U.S. state map):

vor <- SPointsDF_to_voronoi_SPolysDF(vor_pts)
 
vor_df <- fortify(vor)
 
states <- map_data("state")

Now we can have some fun. Let’s try to mimic the D3 example map as closely as possible. We’ll lay down the CONUS map, add a points layer for the airports, sizing & styling them just like the D3 example. Note that we order the points so that the smallest ones appear on top (so we can still see them).

We’ll then lay down our newly created Voronoi layer. We’ll also use the same projection (Albers) that the D3 example uses:

gg <- ggplot()
# base map
gg <- gg + geom_map(data=states, map=states,
                    aes(x=long, y=lat, map_id=region),
                    color="white", fill="#cccccc", size=0.5)
# airports layer
gg <- gg + geom_point(data=arrange(airports, desc(tot)),
                      aes(x=longitude, y=latitude, size=sqrt(tot)),
                      shape=21, color="white", fill="steelblue")
# voronoi layer
gg <- gg + geom_map(data=vor_df, map=vor_df,
                    aes(x=long, y=lat, map_id=id),
                    color="#a5a5a5", fill="#FFFFFF00", size=0.25)
gg <- gg + scale_size(range=c(2, 9))
gg <- gg + coord_map("albers", lat0=30, lat1=40)
gg <- gg + theme_map()
gg <- gg + theme(legend.position="none")
gg

(the ggplot2 version of the Voronoi airport map)

While that’s pretty, it’s not exactly useful. I’m sure there are times when it’s important to show the Voronoi polygons, but they are especially useful when they are used to help with user interface interactions.

In the case of this map, some airport “bubbles” are very small and many overlap, making a “click” (or even “hover”) a potentially painstaking task for someone looking to get more data out of the visualization. The D3 example uses Voronoi polygons to make it super-easy for the user to hover over a map area and get more info about the flights for the closest airport to the mouse pointer.

We’ll use the leaflet htmlwidget to do something similar. Until I can figure out “hover” events for R+leaflet, you’ll have to live with “click”.

First we’ll need some additional packages:

library(leaflet)
library(rgeos)
library(htmltools)

And, we’ll also need a U.S. shapefile (which we simplify since the polygons are pretty detailed and that’s not necessary for this vis):

url <- "http://eric.clst.org/wupl/Stuff/gz_2010_us_040_00_500k.json"
fil <- "gz_2010_us_040_00_500k.json"
 
if (!file.exists(fil)) download.file(url, fil, cacheOK=TRUE)
 
states_m <- readOGR("gz_2010_us_040_00_500k.json", 
                    "OGRGeoJSON", verbose=FALSE)
states_m <- subset(states_m, 
                   !NAME %in% c("Alaska", "Hawaii", "Puerto Rico"))
dat <- states_m@data # gSimplify whacks the data bits
states_m <- SpatialPolygonsDataFrame(gSimplify(states_m, 0.05,
                                               topologyPreserve=TRUE),
                                     dat, FALSE)

The leaflet vis idiom is similar to the ggplot idiom. I’m using a base tile layer since I was too lazy to figure out how to change the leaflet default gray background map color. The map polygons are added, then the circles/bubbles (note that you work in meters with addCircles which lets leaflet scale the bubbles as you zoom in/out). Finally, the Voronoi layer is added. I kept the stroke visible purely for demonstration purposes. You need to keep fill=TRUE otherwise the Voronoi layer won’t get click/hover events and once I figure out how to trigger popups on hover and use a static popup layer, this will let users hover around the map to get the underlying airport flight information.

leaflet(width=900, height=650) %>%
  # base map
  addProviderTiles("Hydda.Base") %>%
  addPolygons(data=states_m,
              stroke=TRUE, color="white", weight=1, opacity=1,
              fill=TRUE, fillColor="#cccccc", smoothFactor=0.5) %>%
  # airports layer
  addCircles(data=arrange(airports, desc(tot)),
             lng=~longitude, lat=~latitude,
             radius=~sqrt(tot)*5000, # size is in m for addCircles O_o
             color="white", weight=1, opacity=1,
             fillColor="steelblue", fillOpacity=1) %>%
  # voronoi (click) layer
  addPolygons(data=vor,
              stroke=TRUE, color="#a5a5a5", weight=0.25,
              fill=TRUE, fillOpacity = 0.0,
              smoothFactor=0.5, 
              popup=sprintf("Total In/Out: %s",
                            as.character(vor@data$tot)))

I made the Voronoi layer very light, so you may want to keep it there as a cue for the user. How you work with it is completely up to you.

Now you have one less reason to be envious of the D3 cartographers!

As I was putting together the [coord_proj](https://rud.is/b/2015/07/24/a-path-towards-easier-map-projection-machinations-with-ggplot2/) ggplot2 extension I had posted a [gist](https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7) that I shared on Twitter. Said gist received a comment (several, in fact) and a bunch of us were painfully reminded of the fact that there is no built-in way to receive notifications from said comment activity.

@jennybryan posited that it could be possible to use IFTTT as a broker for these notifications, but after some checking that ended up not being directly doable since there are no “gist comment” triggers to act upon in IFTTT.

There are a few standalone Ruby gems that programmatically retrieve gist comments but I wasn’t interested in managing a Ruby workflow [ugh]. I did find a Heroku-hosted service – https://gh-rss.herokuapp.com/ – that will turn gist comments into an RSS/Atom feed (based on Ruby again). I gave it a shot and hooked it up to IFTTT but my feed is far enough down on the food chain there that it never gets updated. It was possible to deploy that app on my own Heroku instance, but—again—I’m not interested in managing a Ruby workflow.

The Ruby scripts pretty much:

– grab your main gist RSS/Atom feed
– visit each gist in the feed
– extract comments & comment metadata from them (if any)
– return a composite data structure you can do anything with

That’s super-easy to duplicate in R, so I decided to build a small R script that does all that and generates an RSS/Atom file which I added to my Feedly feeds (I’m pretty much always scanning RSS, so really didn’t need the IFTTT notification setup). I put it into a `cron` job that runs every hour. When Feedly refreshes the feed, a new entry will appear whenever there’s a new comment.

The script is below and [on github](https://gist.github.com/hrbrmstr/0ad1ced217edd137de27) (ironically as a gist). Here’s what you’ll grok from the code:

– one way to deal with the “default namespace” issue in R+XML
– one way to deal with error checking for scraping
– how to build an XML file (and, specifically, an RSS/Atom feed) with R
– how to escape XML entities with R
– how to get an XML object as a character string in R

You’ll definitely need to tweak this a bit for your own setup, but it should be a fairly complete starting point for you to work from. To see the output, grab the [generated feed](http://dds.ec/hrbrmstrgcfeed.xml).

# Roll your own GitHub Gist Comments Feed in R
 
library(xml2)    # github version
library(rvest)   # github version
library(stringr) # for str_trim & str_replace
library(dplyr)   # for data_frame & bind_rows
library(pbapply) # free progress bars for everyone!
library(XML)     # to build the RSS feed
 
who <- "hrbrmstr" # CHANGE ME!
 
# Grab the user's gist feed -----------------------------------------------
 
gist_feed <- sprintf("https://gist.github.com/%s.atom", who)
feed_pg <- read_xml(gist_feed)
ns <- xml_ns_rename(xml_ns(feed_pg), d1 = "feed")
 
# Extract the links & titles of the gists in the feed ---------------------
 
links <-  xml_attr(xml_find_all(feed_pg, "//feed:entry/feed:link", ns), "href")
titles <-  xml_text(xml_find_all(feed_pg, "//feed:entry/feed:title", ns))
 
#' This function does the hard part by iterating over the
#' links/titles and building a tbl_df of all the comments per-gist
get_comments <- function(links, titles) {
 
  bind_rows(pblapply(1:length(links), function(i) {
 
    # get gist
 
    pg <- read_html(links[i])
 
    # look for comments
 
    ref <- tryCatch(html_attr(html_nodes(pg, "div.timeline-comment-wrapper a[href^='#gistcomment']"), "href"),
                    error=function(e) character(0))
 
    # in theory if 'ref' exists then the rest will
 
    if (length(ref) != 0) {
 
      # if there were comments, get all the metadata we care about
 
      author <- html_text(html_nodes(pg, "div.timeline-comment-wrapper a.author"))
      timestamp <- html_attr(html_nodes(pg, "div.timeline-comment-wrapper time"), "datetime")
      contentpg <- str_trim(html_text(html_nodes(pg, "div.timeline-comment-wrapper div.comment-body")))
 
    } else {
      ref <- author <- timestamp <- contentpg <- character(0)
    }
 
    # bind_rows ignores length 0 tbl_df's
    if (sum(lengths(list(ref, author, timestamp, contentpg))==0)) {
      return(data_frame())
    }
 
    return(data_frame(title=titles[i], link=links[i],
                      ref=ref, author=author,
                      timestamp=timestamp, contentpg=contentpg))
 
  }))
 
}
 
comments <- get_comments(links, titles)
 
feed <- xmlTree("feed")
feed$addNode("id", sprintf("user:%s", who))
feed$addNode("title", sprintf("%s's gist comments", who))
feed$addNode("icon", "https://assets-cdn.github.com/favicon.ico")
feed$addNode("link", attrs=list(href=sprintf("https://github.com/%s", who)))
feed$addNode("updated", format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz="GMT"))
 
for (i in 1:nrow(comments)) {
 
  feed$addNode("entry", close=FALSE)
    feed$addNode("id", sprintf("gist:comment:%s:%s", who, comments[i, "timestamp"]))
    feed$addNode("link", attrs=list(href=sprintf("%s%s", comments[i, "link"], comments[i, "ref"])))
    feed$addNode("title", sprintf("Comment by %s", comments[i, "author"]))
    feed$addNode("updated", comments[i, "timestamp"])
    feed$addNode("author", close=FALSE)
      feed$addNode("name", comments[i, "author"])
    feed$closeTag()
    feed$addNode("content", saveXML(xmlTextNode(as.character(comments[i, "contentpg"])), prefix=""), 
                 attrs=list(type="html"))
  feed$closeTag()
 
}
 
rss <- str_replace(saveXML(feed), "<feed>", '<feed xmlns="http://www.w3.org/2005/Atom">')
 
writeLines(rss, con="feed.xml")

To get that RSS feed into something that an internet service can process you have to make sure that `feed.xml` is being written to a directory that translates to a publicly accessible web location (mine is at [http://dds.ec/hrbrmstrgcfeed.xml](http://dds.ec/hrbrmstrgcfeed.xml) if you want to see it).
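
For example, on a box whose (hypothetical) web root is `/var/www/html`, the last line of the script would become:

# hypothetical web root -- adjust to wherever your server serves static files
writeLines(rss, con="/var/www/html/hrbrmstrgcfeed.xml")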

On the internet-facing Ubuntu box that generated the feed I’ve got a `cron` entry:

30  * * * * /home/bob/bin/gengcfeed.R

which means it’s going to check github for comment updates once an hour (at half past). Tune said parameters to your liking.

At the top of `gengcfeed.R` I have an `Rscript` shebang:

#!/usr/bin/Rscript

and the execute bit is set on the file.

Run the file by hand, first, and then test the feed via [https://validator.w3.org/feed/](https://validator.w3.org/feed/) to ensure it’s accessible and that it validates correctly. Now you can enter that feed URL into your favorite newsfeed reader (I use @feedly).