Skip navigation

Author Archives: hrbrmstr

Don't look at me…I do what he does — just slower. #rstats avuncular • ?Resistance Fighter • Cook • Christian • [Master] Chef des Données de Sécurité @ @rapid7

Despite having shown various ways to overcome D3 cartographic envy, there are always more examples that can cause the green monster to rear it’s ugly head.

Take the Voronoi Arc Map example.


Voronoi_Arc_Map

For those in need of a primer, a Voronoi tesslation/diagram is:

a partitioning of a plane into regions based on distance to points in a specific subset of the plane. That set of points (called seeds, sites, or generators) is specified beforehand, and for each seed there is a corresponding region consisting of all points closer to that seed than to any other. Wikipedia

We can overlay a Voronoi tessalation on top of a map in R as well thanks to the deldir package (which has been around since the “S” days!). Let’s get (most of) the package requirements cruft out of the way, first:

library(sp)
library(rgdal)
library(deldir)
library(dplyr)
library(ggplot2)
library(ggthemes)

Now we’ll [ab]use the data from the Arc Map example:

flights <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/flights.csv", stringsAsFactors=FALSE)
airports <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/airports.csv", stringsAsFactors=FALSE)

Since the D3 example cheats and only uses the continental US (CONUS) we’ll do the same and we’ll also filter out only those airports mentioned in the flights data and get the total # of incoming/outgoing flights for each airport:

conus <- state.abb[!(state.abb %in% c("AK", "HI"))]
airports <- filter(airports,
                   state %in% conus,
                   iata %in% union(flights$origin, flights$destination))
orig <- select(count(flights, origin), iata=origin, n1=n)
dest <- select(count(flights, destination), iata=destination, n2=n)
airports <- left_join(airports,
                      select(mutate(left_join(orig, dest),
                                    tot=n1+n2),
                             iata, tot)) %>% 
            filter(!is.na(tot))

Since we’re going to initially plot polygons in ggplot (and, eventually, in leaflet), we’ll need to work with Spatial objects, so let’s make those airport lat/lon pairs into a SpatialPointsDataFrame:

vor_pts <- SpatialPointsDataFrame(cbind(airports$longitude,
                                        airports$latitude),
                                  airports, match.ID=TRUE)

The deldir function returns a pretty complex object. Thankfully, the authors of the package realized that one might just want the polygons from the computation and pre-made a function: tile.list for computing/extracting them. Those polygons aren’t, however, closed and we really want to keep the airport data associatd with them, so we need to close the polygons and associate the data. Since we’re likely going to repeat this task, let’s make it a (very badly named) function:

SPointsDF_to_voronoi_SPolysDF <- function(sp) {
 
  # tile.list extracts the polygon data from the deldir computation
  vor_desc <- tile.list(deldir(sp@coords[,1], sp@coords[,2]))
 
  lapply(1:(length(vor_desc)), function(i) {
 
    # tile.list gets us the points for the polygons but we
    # still have to close them, hence the need for the rbind
    tmp <- cbind(vor_desc[[i]]$x, vor_desc[[i]]$y)
    tmp <- rbind(tmp, tmp[1,])
 
    # now we can make the Polygon(s)
    Polygons(list(Polygon(tmp)), ID=i)
 
  }) -> vor_polygons
 
  # hopefully the caller passed in good metadata!
  sp_dat <- sp@data
 
  # this way the IDs _should_ match up w/the data & voronoi polys
  rownames(sp_dat) <- sapply(slot(SpatialPolygons(vor_polygons),
                                  'polygons'),
                             slot, 'ID')
 
  SpatialPolygonsDataFrame(SpatialPolygons(vor_polygons),
                           data=sp_dat)
 
}

Before we can make the plots, we need to put the Spatial objects into the proper form for ggplot2 (and get the U.S. state map):

vor <- SPointsDF_to_voronoi_SPolysDF(vor_pts)
 
vor_df <- fortify(vor)
 
states <- map_data("state")

Now we can have some fun. Let’s try to mimic the D3 example map as closely as possible. We’ll lay down the CONUS map, add a points layer for the the airports, sizing & styling them just like the D3 example. Note that we order the points so that the smallest ones appear on top (so we can still see them).

We’ll then lay down our newly created Voronoi layer. We’ll also use the same projection (Albers) that the D3 examples uses:

gg <- ggplot()
# base map
gg <- gg + geom_map(data=states, map=states,
                    aes(x=long, y=lat, map_id=region),
                    color="white", fill="#cccccc", size=0.5)
# airports layer
gg <- gg + geom_point(data=arrange(airports, desc(tot)),
                      aes(x=longitude, y=latitude, size=sqrt(tot)),
                      shape=21, color="white", fill="steelblue")
# voronoi layer
gg <- gg + geom_map(data=vor_df, map=vor_df,
                    aes(x=long, y=lat, map_id=id),
                    color="#a5a5a5", fill="#FFFFFF00", size=0.25)
gg <- gg + scale_size(range=c(2, 9))
gg <- gg + coord_map("albers", lat0=30, lat1=40)
gg <- gg + theme_map()
gg <- gg + theme(legend.position="none")
gg

ggplot-1

While that’s pretty, it’s not exactly useful. I’m sure there are times when it’s important to show the Voronoi polygons, but they are especially useful when they are used to help with user interface interactions.

In the case of this map, some airport “bubbles” are very small and many overlap, making a “click” (or even “hover”) a potentially painstaking task for someone looking to get more data out of the visualization. The D3 example uses Voronoi polygons to make it super-easy for the user to hover over a map area and get more info about the flights for the closest airport to the mouse pointer.

We’ll use the leaflet htmlwidget to do something similar. Until I can figure out “hover” events for R+leaflet, you’ll have to live with “click”.

First we’ll need some additional packages:

library(leaflet)
library(rgeos)
library(htmltools)

And, we’ll also need a U.S. shapefile (which we simplify since the polygons are pretty detailed and that’s not necessary for this vis):

url <- "http://eric.clst.org/wupl/Stuff/gz_2010_us_040_00_500k.json"
fil <- "gz_2010_us_040_00_500k.json"
 
if (!file.exists(fil)) download.file(url, fil, cacheOK=TRUE)
 
states_m <- readOGR("gz_2010_us_040_00_500k.json", 
                    "OGRGeoJSON", verbose=FALSE)
states_m <- subset(states_m, 
                   !NAME %in% c("Alaska", "Hawaii", "Puerto Rico"))
dat <- states_m@data # gSimplify whacks the data bits
states_m <- SpatialPolygonsDataFrame(gSimplify(states_m, 0.05,
                                               topologyPreserve=TRUE),
                                     dat, FALSE)

The leaflet vis idiom is similar to the ggplot idiom. I’m using a base tile layer since I was too lazy to figure out how to change the leaflet default gray background map color. The map polygons are added, then the circles/bubbles (note that you work in meters with addCircles which lets leaflet scale the bubbles as you zoom in/out). Finally, the Voronoi layer is added. I kept the stroke visible purely for demonstration purposes. You need to keep fill=TRUE otherwise the Voronoi layer won’t get click/hover events and once I figure out how to trigger popups on hover and use a static popup layer, this will let users hover around the map to get the underlying airport flight information.

leaflet(width=900, height=650) %>%
  # base map
  addProviderTiles("Hydda.Base") %>%
  addPolygons(data=states_m,
              stroke=TRUE, color="white", weight=1, opacity=1,
              fill=TRUE, fillColor="#cccccc", smoothFactor=0.5) %>%
  # airports layer
  addCircles(data=arrange(airports, desc(tot)),
             lng=~longitude, lat=~latitude,
             radius=~sqrt(tot)*5000, # size is in m for addCircles O_o
             color="white", weight=1, opacity=1,
             fillColor="steelblue", fillOpacity=1) %>%
  # voronoi (click) layer
  addPolygons(data=vor,
              stroke=TRUE, color="#a5a5a5", weight=0.25,
              fill=TRUE, fillOpacity = 0.0,
              smoothFactor=0.5, 
              popup=sprintf("Total In/Out: %s",
                            as.character(vor@data$tot)))

I made the Voronoi layer very light, so you may want to keep it there as a cue for the user. How you work with it is completely up to you.

Now you have one less reason to be envious of the D3 cartographers!

As I was putting together the [coord_proj](https://rud.is/b/2015/07/24/a-path-towards-easier-map-projection-machinations-with-ggplot2/) ggplot2 extension I had posted a (https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7) that I shared on Twitter. Said gist received a comment (several, in fact) and a bunch of us were painfully reminded of the fact that there is no built-in way to receive notifications from said comment activity.

@jennybryan posited that it could be possible to use IFTTT as a broker for these notifications, but after some checking that ended up not being directly doable since there are no “gist comment” triggers to act upon in IFTTT.

There are a few standalone Ruby gems that programmatically retrieve gist comments but I wasn’t interested in managing a Ruby workflow [ugh]. I did find a Heroku-hosted service – https://gh-rss.herokuapp.com/ – that will turn gist comments into an RSS/Atom feed (based on Ruby again). I gave it a shot and hooked it up to IFTTT but my feed is far enough down on the food chain there that it never gets updated. It was possible to deploy that app on my own Heroku instance, but—again—I’m not interested in managing a Ruby workflow.

The Ruby scripts pretty much:

– grab your main gist RSS/Atom feed
– visit each gist in the feed
– extract comments & comment metadata from them (if any)
– return a composite data structure you can do anything with

That’s super-easy to duplicate in R, so I decided to build a small R script that does all that and generates an RSS/Atom file which I added to my Feedly feeds (I’m pretty much always scanning RSS, so really didn’t need the IFTTT notification setup). I put it into a `cron` job that runs every hour. When Feedly refreshes the feed, a new entry will appear whenever there’s a new comment.

The script is below and [on github](https://gist.github.com/hrbrmstr/0ad1ced217edd137de27) (ironically as a gist). Here’s what you’ll grok from the code:

– one way to deal with the “default namespace” issue in R+XML
– one way to deal with error checking for scraping
– how to build an XML file (and, specifically, an RSS/Atom feed) with R
– how to escape XML entities with R
– how to get an XML object as a character string in R

You’ll definitely need to tweak this a bit for your own setup, but it should be a fairly complete starting point for you to work from. To see the output, grab the [generated feed](http://dds.ec/hrbrmstrgcfeed.xml).

# Roll your own GitHub Gist Comments Feed in R
 
library(xml2)    # github version
library(rvest)   # github version
library(stringr) # for str_trim & str_replace
library(dplyr)   # for data_frame & bind_rows
library(pbapply) # free progress bars for everyone!
library(XML)     # to build the RSS feed
 
who <- "hrbrmstr" # CHANGE ME!
 
# Grab the user's gist feed -----------------------------------------------
 
gist_feed <- sprintf("https://gist.github.com/%s.atom", who)
feed_pg <- read_xml(gist_feed)
ns <- xml_ns_rename(xml_ns(feed_pg), d1 = "feed")
 
# Extract the links & titles of the gists in the feed ---------------------
 
links <-  xml_attr(xml_find_all(feed_pg, "//feed:entry/feed:link", ns), "href")
titles <-  xml_text(xml_find_all(feed_pg, "//feed:entry/feed:title", ns))
 
#' This function does the hard part by iterating over the
#' links/titles and building a tbl_df of all the comments per-gist
get_comments <- function(links, titles) {
 
  bind_rows(pblapply(1:length(links), function(i) {
 
    # get gist
 
    pg <- read_html(links[i])
 
    # look for comments
 
    ref <- tryCatch(html_attr(html_nodes(pg, "div.timeline-comment-wrapper a[href^='#gistcomment']"), "href"),
                    error=function(e) character(0))
 
    # in theory if 'ref' exists then the rest will
 
    if (length(ref) != 0) {
 
      # if there were comments, get all the metadata we care about
 
      author <- html_text(html_nodes(pg, "div.timeline-comment-wrapper a.author"))
      timestamp <- html_attr(html_nodes(pg, "div.timeline-comment-wrapper time"), "datetime")
      contentpg <- str_trim(html_text(html_nodes(pg, "div.timeline-comment-wrapper div.comment-body")))
 
    } else {
      ref <- author <- timestamp <- contentpg <- character(0)
    }
 
    # bind_rows ignores length 0 tbl_df's
    if (sum(lengths(list(ref, author, timestamp, contentpg))==0)) {
      return(data_frame())
    }
 
    return(data_frame(title=titles[i], link=links[i],
                      ref=ref, author=author,
                      timestamp=timestamp, contentpg=contentpg))
 
  }))
 
}
 
comments <- get_comments(links, titles)
 
feed <- xmlTree("feed")
feed$addNode("id", sprintf("user:%s", who))
feed$addNode("title", sprintf("%s's gist comments", who))
feed$addNode("icon", "https://assets-cdn.github.com/favicon.ico")
feed$addNode("link", attrs=list(href=sprintf("https://github.com/%s", who)))
feed$addNode("updated", format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz="GMT"))
 
for (i in 1:nrow(comments)) {
 
  feed$addNode("entry", close=FALSE)
    feed$addNode("id", sprintf("gist:comment:%s:%s", who, comments[i, "timestamp"]))
    feed$addNode("link", attrs=list(href=sprintf("%s%s", comments[i, "link"], comments[i, "ref"])))
    feed$addNode("title", sprintf("Comment by %s", comments[i, "author"]))
    feed$addNode("updated", comments[i, "timestamp"])
    feed$addNode("author", close=FALSE)
      feed$addNode("name", comments[i, "author"])
    feed$closeTag()
    feed$addNode("content", saveXML(xmlTextNode(as.character(comments[i, "contentpg"])), prefix=""), 
                 attrs=list(type="html"))
  feed$closeTag()
 
}
 
rss <- str_replace(saveXML(feed), "<feed>", '<feed xmlns="http://www.w3.org/2005/Atom">')
 
writeLines(rss, con="feed.xml")

To get that RSS feed into something that an internet service can process you have to make sure that `feed.xml` is being written to a directory that translates to a publicly accessible web location (mine is at [http://dds.ec/hrbrmstrgcfeed.xml](http://dds.ec/hrbrmstrgcfeed.xml) if you want to see it).

On the internet-facing Ubuntu box that generated the feed I’ve got a `cron` entry:

30  * * * * /home/bob/bin/gengcfeed.R

which means it’s going to check github every 30 minutes for comment updates. Tune said parameters to your liking.

At the top of `gengcfeed.R` I have an `Rscript` shebang:

#!/usr/bin/Rscript

and the execute bit is set on the file.

Run the file by hand, first, and then test the feed via [https://validator.w3.org/feed/](https://validator.w3.org/feed/) to ensure it’s accessible and that it validates correctly. Now you can enter that feed URL into your favorite newsfeed reader (I use @feedly).

The $DAYJOB doesn’t afford much opportunity to work with cartographic datasets, but I really like maps and tinker with shapefiles and geo-data when I can, plus answer a ton of geo-questions on StackOverflow. R makes it easy—one might even say too easy—to work with maps. All it takes to make a map of the continental United States (CONUS) is:

library(maps)
map("state")

base_1-1

It’s a little more involved with ggplot2:

library(ggplot2)
library(ggthemes)

states <- map_data("state")

gg <- ggplot()
gg <- gg + geom_map(data=states, map=states,
                    aes(x=long, y=lat, map_id=region),
                    color="black", fill="white", size=0.25)
gg <- gg + theme_map()
gg

ggplot_1-1

Both of those maps are horrible. The maps::map function defaults to using a rectangular projection with the aspect ratio chosen so that longitude and latitude scales are equivalent at the center of the picture. ggplot will size the graphic to the device window. Sticking with these defaults is a really bad idea. Why? I’ll let Mark Monmonier (author of How to Lie with Maps) explain:

Maps have three basic attributes: scale, projection, and symbolization. Each element is a source of distortion. As a group, they describe the essence of the map’s possibilities and limitations. No one can use maps or make maps safely and effectively without understanding map scales, map projections, and map symbols.

Map projections distort five geographic relationships: areas, angles, gross shapes, distances, and directions. Although some projections preserve local angles but not areas, others preserve areas but not local angles. All distort large shapes noticeably (but some distort continental shapes more than others), and all distort at least some distances and some directions.

There are great examples in his book on how map projections can inflate or diminish the area and relative importance of countries and regions, and how a map projection can itself become a rallying point for “cartographically oppressed” regions.

If map projections are important (and they are) what should you do? Both map and ggplot give you options to use projections that are found in the mapproj library (specifically using the mapproject function). The help for mapproject even gives an example of using the Albers equal-area conic projection when plotting the CONUS:

library(mapproj)
map("state", projection="albers", par=c(lat0=30, lat1=40))

base_2-1

As it’s full name suggests, Albers is an equal-area projection which is recommended for U.S. choropleths as it preserves the relative areas of geographic features. It’s also better than the ggplot default (“Mercator“) in it’s coord_map:

gg + coord_map()

ggplot_2-1

But, we can pass in parameters to use a saner projection:

gg + coord_map("albers", lat0=30, lat1=40)

ggplot_3-1

The mapproject function exposes 41 projections, which may seem generous; however, not all of them are practical and even the ones with parameters are not very customizable. Before we dig into that a bit more, you’re probably wondering (if you don’t work with cartography for a living or hobby) how one goes about choosing a map projection…

Choosing A Map Projection

Thankfully, many smart folks have thought quite a bit about choosing map projections and there are a number of resources you can draw upon when making your own choices.

The first one is Map Projections – A Working Manual. This is a free resource from the U.S. Geological Survey and was published back in 1987. It has a “companion” resource – An Album of Map Projections. Both are outstanding resources to have on hand as they provide a great deal of information on map projections. If you’re in a hurry, the “Album” makes for a good quick reference. Here’s the entry for our buddy, Albers:


album-albers

(Go to page 10 of the “Album” for the larger version)

The image in the upper-right is a “Tissot indicatrix” (named for it’s creator Nicolas Auguste Tissot), which “puts Tissot indicatrices at every 30° of latitude and longitude, except at the poles. This shows the shape of infinitesimally small circles on the Earth as they appear when they are plotted by using a fixed finite scale at the same locations on a map. Every circle is plotted as a circle or an ellipse or, in extreme cases, as a straight line. On an equal-area projection, all these ellipses and circles are shown as having the same area. The flattening of the ellipse shows the extent of local shape distortion and how much the scale is changed and in what direction. On conformal map projections, all indicatrices remain circles, but the areas change. On other projections, both the areas and the shapes of the indicatrices change”. This makes for a great way to understand just how your creations are being distorted (though that may be obvious when you actually plot your maps).

The “Working Manual” also includes the equations necessary to compute the projections. Both provide projection usage guidance as well as the technically bits describing them.

The Institute for Environment and Sustainability has a similar guide for Map Projections for Europe.

Many countries and municipalities also have their own guides for using projections (e.g. California).

If you can handle what feels like a TARDIS trip back in time to the days of GeoCities, MapRef also provides good information on projections. You can also review Carlos A. Furuti’s compilation of projections for more information.

Using More Complex/Nuanced Projections

Despite being able to use Albers and 40 other projections via mapproject, having more flexibility would allow us to use grid systems (see the refs in the previous section for what those are) and also hyper-customize many projections without the need to write our own equations (be thankful of that as you skim the math in the “Working Manual”!).

Gerald Evenden developed a library and utility called proj for the USGS back in 1995 and said utility, library & projection specification idiom has been maintained and expanded ever since in the PROJ.4 project. This library/tool is straightforward to install on most (if not all) operating systems. PROJ.4 lets you specify projections in a (fairly complex) string format (often referred to as a proj4string, especially in R). For example, you can specify Albers for the U.S. with:

"+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96"

Cartographic Projection Procedures for the UNIX Environment—A User’s Manual (considered “the”manual for PROJ.4) explains how to build the specification. At the time of this blog post there are 134 named (i.e. +proj=NAME) projections availble for use & customization (run proj -l at a command-line to see the full list).

A simple example of why this is useful is when you need to plot a world map. Said activity should always be prefaced with a review of this seminal work to ensure you will choose a good projection—say, Winkel tripel. A quick glance of what mapproject supports will show you that you’re pretty much out of luck using the standard R tools we’ve seen so far for that but there is a handy +proj=wintri in PROJ.4 we can use.

Here’s how you’d plot that projection in the base plotting system:

library(sp)
library(rworldmap) # this pkg has waaaay better world shapefiles

plot(spTransform(getMap(), CRS("+proj=wintri")))

base_3-1

However, we can’t pass in PROJ.4 strings to ggplot’s coord_map, so we have to project the Earth first before plotting and use coord_equal:

world <- fortify(spTransform(getMap(), CRS("+proj=wintri")))

gg <- ggplot()
gg <- gg + geom_map(data=world, map=world,
                    aes(x=long, y=lat, map_id=id),
                    color="black", fill="white", size=0.25)
gg <- gg + coord_equal()
gg <- gg + theme_map()
gg

ggplot_4-1

While that works, you have to pre-project any lines, points, etc. as well before plotting them with the map. For example, here are the locations of all of the non-military rocket launch sites (yeah, I’ve done that before, but it’s a fun data set! – and I had it handy):

library(rgdal) # for readOGR
library(httr)  # for downloading

launch_sites <- paste0("https://collaborate.org/sites/collaborate.org/",
                       "files/spacecentersnon-militaryspacelaunchsites.kml")
invisible(try(GET(launch_sites, write_disk("/tmp/launch-sites.kml")), silent=TRUE))

sites <- readOGR("/tmp/launch-sites.kml", 
                 "SpaceCenters: Non-Military Space Launch Sites", 
                 verbose=FALSE)
sites_wt <- spTransform(sites, CRS("+proj=wintri"))
sites_wt <- coordinates(sites_wt)

gg <- gg + geom_point(data=as.data.frame(sites_wt), 
                      aes(x=coords.x1, y=coords.x2), 
                      color="#b2182b")
gg

ggplot_5-1

If you have more layers, that’s doable, but painful. If we could make it so ggplot allow for PROJ.4 projections in some coord_ this would help keep our plot idioms (and code) cleaner. Thus, coord_proj was born.

coord_proj mimics the functionality of coord_map but uses proj4::project instead of mapproject (and takes in any of those parameters). The world map with launch sites now looks like this (complete ggplot code):

world <- fortify(getMap())
sites <- as.data.frame(coordinates(readOGR("/tmp/launch-sites.kml", 
                                           "SpaceCenters: Non-Military Space Launch Sites", 
                                           verbose=FALSE)))

gg <- ggplot()
gg <- gg + geom_map(data=world, map=world,
                    aes(x=long, y=lat, map_id=id),
                    color="black", fill="white", size=0.25)
gg <- gg + geom_point(data=(sites), 
                      aes(x=coords.x1, y=coords.x2), 
                      color="#b2182b")
gg <- gg + coord_proj("+proj=wintri")
gg <- gg + theme_map()
gg

ggplot_6-1

Now, if you want to go all Hobo-Dyer instead of Winkel tripel, it’s one, small change:

gg + coord_proj("+proj=cea +lat_ts=37.5")

ggplot_7-1

Spatial Reference is a website by Howard Butler, Christopher Schmidt, Dane Springmeyer, and Josh Livni that helps assist others in their understanding, recording, and usage of spatial reference systems. It’s ultimate goal is to let you use URI’s to specify spatial reference systems, but you can use it to lookup Proj4 strings to meet virtually any need you have.

The Bad News

Presently, coord_proj is not part of the ggplot2 code base. I chose a really bad time to muck with this as Hadley is doing all sorts of spiffy stuff to ggplot2 and it’s not a good time to shove this in there.

You can fork your own copy of ggplot2 from Hadley’s GitHub repo and add this file to it, re-document/collate/build the package and then use it locally. It still needs some work (there’s a horrible hack in it that anyone familiar with geo-stuff will cringe at) but I welcome feedback/contributions. Hopefully this will get into ggplot2 (or out of it and into a separate package since Hadley is making ggplot2 very extensible in his update) at some point in the not-too-distant future.

Nathaniel Smith and Stéfan van der Walt presented a new colormap (for Python) at SciPy 2015 called viridis.

From the authors:

The default colourmap in Matplotlib is the colourful rainbow-map called Jet, which is deficient in many ways: small changes in the data sometimes produce large perceptual differences and vice-versa; its lightness gradient is non-monotonic; and, it is not particularly robust against color-blind viewing. Thus, a new default colormap is needed — but no obvious candidate has been found. Here, we present our proposed new default colormap for Matplotlib, and expose the theory, tools, data exploration and motivations behind its design.

You can also find out a tad more about their other colormap designs (a.k.a. the runner-ups), along with Parula, which is a proprietary MATLAB color map.

Simon Garnier (@sjmgarnier) took Nathaniel & Stéfan’s work and turned it into an R package.

Noam Ross (@noamross) & I piled on shortly thereafter to add some ggplot color scale_ functions which are (for now) only available in Simon’s github repo.

Rather than duplicate the examples already provided in the documentation of those functions, I thought there might be more efficacy in creating a post that helped showcase why you should switch from rainbow (et al) to viridis.

Since folks seem to like maps, we’ll work with one for the example, but let’s get some package machinations out of the way first:

library(viridis)
library(raster)
library(scales)
library(dichromat)
library(rasterVis)
library(httr)
library(colorspace)

Now, we’ll need a map to work with so let’s grab a U.S. max temperature GeoTIFF raster from NOAA (from the bitter cold month of February 2015) and project it to something more reasonable:

temp_raster <- "http://ftp.cpc.ncep.noaa.gov/GIS/GRADS_GIS/GeoTIFF/TEMP/us_tmax/us.tmax_nohads_ll_20150219_float.tif"
 
try(GET(temp_raster,
        write_disk("us.tmax_nohads_ll_20150219_float.tif")), silent=TRUE)
us <- raster("us.tmax_nohads_ll_20150219_float.tif")
 
# albers FTW
us <- projectRaster(us, crs="+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs")

We’ll also make a helper function to save us some typing and setup the base # of colors in the colormap:

n_col <- 64
 
img <- function(obj, col) {
  image(obj, col=col, asp=1, axes=FALSE, xaxs="i", xaxt='n', yaxt='n', ann=FALSE)
}

Let’s take a look at various color palettes with different types of vision. We’ll use a 3×2 grid and:

  • use 4 color palettes from grDevices,
  • make a gradient palette from one of the ColorBrewer sequential palettes, and
  • then (finally) use a viridis color palette.

We’ll take this grid of 6 maps and view it through the eyes of three different types of color vision as well as a fully desaturated version. Note that I’m not adding much cruft to the map display (including legends) since this isn’t about the values so much as it is about the visual perception of the colormaps.

Remember you can select/click/tap the map grids for (slightly) larger versions.

“Normal” Vision

par(mfrow=c(3, 2), mar=rep(0, 4))
img(us, rev(heat.colors(n_col)))
img(us, rev(rainbow(n_col)))
img(us, rev(topo.colors(n_col)))
img(us, rev(cm.colors(n_col)))
img(us, gradient_n_pal(brewer_pal("seq")(9))(seq(0, 1, length=n_col)))
img(us, rev(viridis(n_col)))

01_normal-1

All of the maps convey the differences in max temperature. If you happen to have “normal” color vision you should be drawn to the bottom two (ColorBrewer on the left and Viridis on the right). They are both sequential and convey the temperature changes more precisely (and they aren’t as gosh-awful ugly the the other four).

While the ColorBrewer gradient may be “good”, Viridis is designed to be:

  • colorful & “pretty”
  • sequential (to not impose other structure on exploratory data analysis visualizations)
  • perceptually uniform (i.e. changes in the data should be accurately decoded by our brains) even when desaturated
  • accessible to colorblind viewers

and seems to meet those goals quite well.

Take a look at each of the vision-adjusted examples:

Green-Blind (Deuteranopia)

par(mfrow=c(3, 2), mar=rep(0, 4))
img(us, dichromat(rev(heat.colors(n_col)), "deutan"))
img(us, dichromat(rev(rainbow(n_col)), "deutan"))
img(us, dichromat(rev(topo.colors(n_col)), "deutan"))
img(us, dichromat(rev(cm.colors(n_col)), "deutan"))
img(us, dichromat(gradient_n_pal(brewer_pal("seq")(9))(seq(0, 1, length=n_col)), "deutan"))
img(us, dichromat(rev(viridis(n_col)), "deutan"))

02_deutan-1

Red-Blind (Protanopia)

par(mfrow=c(3, 2), mar=rep(0, 4))
img(us, dichromat(rev(heat.colors(n_col)), "protan"))
img(us, dichromat(rev(rainbow(n_col)), "protan"))
img(us, dichromat(rev(topo.colors(n_col)), "protan"))
img(us, dichromat(rev(cm.colors(n_col)), "protan"))
img(us, dichromat(gradient_n_pal(brewer_pal("seq")(9))(seq(0, 1, length=n_col)), "protan"))
img(us, dichromat(rev(viridis(n_col)), "protan"))

03_protan-1

Blue-Blind (Tritanopia)

par(mfrow=c(3, 2), mar=rep(0, 4))
img(us, dichromat(rev(heat.colors(n_col)), "tritan"))
img(us, dichromat(rev(rainbow(n_col)), "tritan"))
img(us, dichromat(rev(topo.colors(n_col)), "tritan"))
img(us, dichromat(rev(cm.colors(n_col)), "tritan"))
img(us, dichromat(gradient_n_pal(brewer_pal("seq")(9))(seq(0, 1, length=n_col)), "tritan"))
img(us, dichromat(rev(viridis(n_col)), "tritan"))

04_tritan-1

Desaturated

par(mfrow=c(3, 2), mar=rep(0, 4))
img(us, desaturate(rev(heat.colors(n_col))))
img(us, desaturate(rev(rainbow(n_col))))
img(us, desaturate(rev(topo.colors(n_col))))
img(us, desaturate(rev(cm.colors(n_col))))
img(us, desaturate(gradient_n_pal(brewer_pal("seq")(9))(seq(0, 1, length=n_col))))
img(us, desaturate(rev(viridis(n_col))))

05_desatureated-1

Hopefully both the ColorBrewer gradient and Viridis palettes stood out as conveying the temperature data with more precision and more consistently across all non-standard vision types as you progressed through each one.

To see this for yourself in your own work, grab Simon’s package and start substituting viridis for some of your usual defaults to see if it makes a difference in helping you convey the story your data is trying to tell, both more accurately and for a more diverse audience. Remember, the github version (which will be on CRAN soon) have handy ggplot scale_ functions to make using viridis as painless as possible.

I also updated my Melbourne Walking EDA project to use the viridis palette instead of parula (which I was only really using in defiance of MATLAB’s inane restrictions).

It’s not on CRAN yet, but there’s a devtools-installable R package for getting data from the OMDB API.

It covers all of the public API endpoints:

  • find_by_id: Retrieve OMDB info by IMDB ID search
  • find_by_title: Retrieve OMDB info by title search
  • get_actors: Get actors from an omdb object as a vector
  • get_countries: Get countries from an omdb object as a vector
  • get_directors: Get directors from an omdb object as a vector
  • get_genres: Get genres from an omdb object as a vector
  • get_writers: Get writers from an omdb object as a vector
  • print.omdb: Print an omdb result
  • search_by_title: Lightweight omdb title search

Here’s a bit of it in action:

devtools::install_github("hrbrmstr/omdbapi")
library(dplyr)
library(pbapply)
 
search_by_title("Captain America")
 
# Source: local data frame [10 x 4]
# 
#                                                   Title  Year    imdbID   Type
# 1                    Captain America: The First Avenger  2011 tt0458339  movie
# 2                   Captain America: The Winter Soldier  2014 tt1843866  movie
# 3                                       Captain America  1990 tt0103923  movie
# 4                                       Captain America  1979 tt0078937  movie
# 5           Iron Man and Captain America: Heroes United  2014 tt3911200  movie
# 6                    Captain America II: Death Too Soon  1979 tt0078938  movie
# 7                                       Captain America  1944 tt0036697  movie
# 8                                       Captain America 1966– tt0206474 series
# 9                        Captain America: Super Soldier  2011 tt1740721   game
# 10 Comic Book Origins: Captain America - Winter Soldier  2014 tt3618126  movie
 
search_by_title("Captain America", year_of_release=2013)
 
# Source: local data frame [1 x 4]
# 
#                              Title Year    imdbID  Type
# 1 A Look Back at 'Captain America' 2013 tt3307378 movie
 
games <- search_by_title("Captain America", type="game")
 
glimpse(games)
 
# Observations: 2
# Variables:
# $ Title  (chr) "Captain America: Super Soldier", "Captain America and the A...
# $ Year   (chr) "2011", "1991"
# $ imdbID (chr) "tt1740721", "tt0421939"
# $ Type   (chr) "game", "game"
 
find_by_title(games$Title[1])
 
#      Title: Captain America: Super Soldier
#       Year: 2011
#      Rated: N/A
#   Released: 2011-07-19
#    Runtime: N/A
#      Genre: Action
#   Director: Michael McCormick, Robert Taylor
#     Writer: Christos N. Gage
#     Actors: Hayley Atwell, Chris Evans, Sebastian Stan, Neal McDonough
#       Plot: You play the Sentinel of Liberty as you raid the Red Skull's scientist
#             minion, Armin Zola's, lair.
#   Language: English
#    Country: USA
#     Awards: N/A
#     Poster: http://ia.media-imdb.com/images/M/
#             MV5BMTUwMzQ0NjE5N15BMl5BanBnXkFtZTgwODI3MzQxMTE@._V1_SX300.jpg
#  Metascore: N/A
# imdbRating: 7.2
#  imdbVotes: 271
#     imdbID: tt1740721
#       Type: game
 
find_by_title("Game of Thrones", type="series", season=1, episode=1)
 
#      Title: Winter Is Coming
#       Year: 2011
#      Rated: TV-MA
#   Released: 2011-04-17
#    Runtime: 62 min
#      Genre: Adventure, Drama, Fantasy
#   Director: Timothy Van Patten
#     Writer: David Benioff (created by), D.B. Weiss (created by), George R.R.
#             Martin ("A Song of Ice and Fire" by), David Benioff, D.B.
#             Weiss
#     Actors: Sean Bean, Mark Addy, Nikolaj Coster-Waldau, Michelle Fairley
#       Plot: Jon Arryn, the Hand of the King, is dead. King Robert Baratheon plans
#             to ask his oldest friend, Eddard Stark, to take Jon's
#             place. Across the sea, Viserys Targaryen plans to wed his
#             sister to a nomadic warlord in exchange for an army.
#   Language: English
#    Country: USA
#     Awards: N/A
#     Poster: http://ia.media-imdb.com/images/M/
#             MV5BMTk5MDU3OTkzMF5BMl5BanBnXkFtZTcwOTc0ODg5NA@@._V1_SX300.jpg
#  Metascore: N/A
# imdbRating: 8.5
#  imdbVotes: 12584
#     imdbID: tt1480055
#       Type: episode
 
get_genres(find_by_title("Star Trek: Deep Space Nine", season=5, episode=7))
 
# [1] "Action"    "Adventure" "Drama"
 
get_writers(find_by_title("Star Trek: Deep Space Nine", season=4, episode=6))
 
# [1] "Gene Roddenberry (based upon \"Star Trek\" created by)"
# [2] "Rick Berman (created by)"                              
# [3] "Michael Piller (created by)"                           
# [4] "David Mack"                                            
# [5] "John J. Ordover"
 
get_directors(find_by_id("tt1371111"))
 
# [1] "Tom Tykwer"     "Andy Wachowski" "Lana Wachowski"
 
get_countries(find_by_title("The Blind Swordsman: Zatoichi"))
 
# [1] "Japan"
 
ichi <- search_by_title("Zatoichi")
bind_rows(lapply(ichi$imdbID, function(x) {
  find_by_id(x, include_tomatoes = TRUE)
})) -> zato
 
par(mfrow=c(3,1)) 
boxplot(zato$tomatoUserMeter, horizontal=TRUE, main="Tomato User Meter", ylim=c(0, 100))
boxplot(zato$imdbRating, horizontal=TRUE, main="IMDB Rating", ylim=c(0, 10))
boxplot(zato$tomatoUserRating, horizontal=TRUE, main="Tomato User Rating", ylim=c(0, 5))

README-usage-1

You can find out more at it’s github repo

Poynter did a nice interactive piece on world population by income (i.e. “How Many Live on How Much, and Where”). I’m always on the lookout for optimized shapefiles and clean data (I’m teaching a data science certificate program starting this Fall) and the speed of the site load and the easy availability of the data set made this one a “must acquire”. Rather than just repeat Poynter’s D3-goodness, here’s a way to look at the income data in series of small multiple choropleths—using R & ggplot2—that involves:

  • downloading data & shapefiles from a web site
  • using dplyr & tidyr for data munging
  • applying custom fill color scale mapping in ggplot
  • ordering plots with a custom facet order (using factors)
  • tweaking the theme and aesthetics for a nicely finished result

By using D3, Poynter inherently made the data available. Pop open the “Developer Tools” in any browser, reload the page and look at the “Network” tab and you’ll see a list of files (you can sometimes see things in the source code, but this technique is often faster). The income data is a well-formed CSV file http://www.pewglobal.org/wp-content/themes/pew-global/interactive-global-class.csv and their highly optimized world map was also easy to discern http://www.pewglobal.org/wp-content/lib/js/world-geo.json. We’ll start by grabbing the map and using the same map projection that Poynter did (Robinson). Don’t be put off by all the library calls since one of the best parts of R is the ever-increasing repository of great packages to help you get things done.

library(httr)     # getting data
library(rgdal)    # working with shapefile
library(dplyr)    # awesome data manipulation
library(readr)    # faster reading of CSV data
library(stringi)  # string manipulation
library(stringr)  # string manipulation
library(tidyr)    # reshaping data
library(grid)     # for 'unit'
library(scales)   # for 'percent'
library(ggplot2)  # plotting
library(ggthemes) # theme_map
 
# this ensures you only download the shapefile once and hides
# errors and warnings. remove `try` and `invisible` to see messages
try(invisible(GET("http://www.pewglobal.org/wp-content/lib/js/world-geo.json",
                  write_disk("world-geo.json"))), silent=TRUE)
 
# use ogrListLayers("world-geo.json") to see file type & 
# layer info to use in the call to readOGR
 
world <- readOGR("world-geo.json", "OGRGeoJSON")
world_wt <- spTransform(world, CRS("+proj=robin"))
world_map <- fortify(world_wt)

I would have liked to do fortify(world_wt, region="name") (since that makes working with filling in countries by name much easier in the choropleth part of the code) but that generated TopologyException errors (I’ve seen this happen quite a bit with simplified/optimized shapefiles and some non-D3 geo-packages). One can sometimes fix those with a strategic rgeos::gBuffer call, but that didn’t work well in this case. We can still use country names with a slight rejiggering of the fortified data frame using dplyr:

world_map %>%
  left_join(data_frame(id=rownames(world@data), name=world@data$name)) %>%
  select(-id) %>%
  rename(id=name) -> world_map

Now it’s time to get the data. The CSV file has annoying spaces in it that causes R to interpret all the columns as strings, so we can use dplyr again to get them into the format we want them in. Note that I’m also making the percentages decimals so we can use percent later on to easily format them.

# a good exercise would be to repeat the download code above 
# rather and make repeated calls to an external resource
read_csv("http://www.pewglobal.org/wp-content/themes/pew-global/interactive-global-class.csv") %>%
  mutate_each(funs(str_trim)) %>%
  filter(id != "None") %>%
  mutate_each(funs(as.numeric(.)/100), -name, -id) -> dat

For this post, we’ll only be working with the actual share percentages, so let’s:

  • ignore the “change” columns
  • convert the data frame from wide to long
  • extract out the income levels (e.g. “Poor”, “Low Income”…)
  • set a factor order for them so our plots will be in the correct sequence
dat %>%
  gather(share, value, starts_with("Share"), -name, -id) %>%
  select(-starts_with("Change")) %>%
  mutate(label=factor(stri_trans_totitle(str_match(share, "Share ([[:alpha:]- ]+),")[,2]),
                      c("Poor", "Low Income", "Middle Income", "Upper-Middle Income", "High Income"),
                      ordered=TRUE)) -> share_dat

The stringi package is really handy (stringr is built on it, too). The stri_trans_totitle function alleviates some mundane string operations and the stri_replace_all_regex (below) also allows us to do vectorized regular expression replacements without a ton of code.

To keep the charts aligned, we’ll use Poynter’s color scale (which was easy to extract from the site’s code) and use the same legend breaks via `cut’. We’ll also format the labels for these breaks to make our legend nicer to view.

# use same "cuts" as poynter
poynter_scale_breaks <- c(0, 2.5, 5, 10, 25, 50, 75, 80, 100)
 
sprintf("%2.1f-%s", poynter_scale_breaks, percent(lead(poynter_scale_breaks/100))) %>%
  stri_replace_all_regex(c("^0.0", "-NA%"), c("0", "%"), vectorize_all=FALSE) %>%
  head(-1) -> breaks_labels
 
share_dat %>%
  mutate(`Share %`=cut(value,
                       c(0, 2.5, 5, 10, 25, 50, 75, 80, 100)/100,
                       breaks_labels))-> share_dat
 
share_pal <- c("#eaecd8", "#d6dab3", "#c2c98b", "#949D48", "#6e7537", "#494E24", "#BB792A", "#7C441C", "#ffffff")

Finally, we get to the good part and start plotting the visualization. There are only two layers: the base map and the choropleth filled map. We then:

  • apply our manual color palette to them
  • remove the line color slashes in the legend boxes
  • setup the overall plot label
  • tell ggplot which coordinate system to use (in this case coord_equal is fine since we already projected the points)
  • apply a base theme that’s good for mapping
  • tweak the text and ensure our legend is in the position we want it to be ing
gg <- ggplot()
 
gg <- gg + geom_map(data=world_map, map=world_map,
                    aes(x=long, y=lat, group=group, map_id=id),
                    color="#7f7f7f", fill="white", size=0.15)
 
gg <- gg + geom_map(data=share_dat, map=world_map,
                    aes(map_id=name, fill=`Share %`),
                    color="#7f7f7f", size=0.15)
 
gg <- gg + scale_fill_manual(values=share_pal)
 
gg <- gg + guides(fill=guide_legend(override.aes=list(colour=NA)))
gg <- gg + labs(title="World Population by Income\n")
gg <- gg + facet_wrap(~label, ncol=2)
 
gg <- gg + coord_equal()
 
gg <- gg + theme_map()
 
gg <- gg + theme(panel.margin=unit(1, "lines"))
gg <- gg + theme(plot.title=element_text(face="bold", hjust=0, size=24))
gg <- gg + theme(legend.title=element_text(face="bold", hjust=0, size=12))
gg <- gg + theme(legend.text=element_text(size=10))
gg <- gg + theme(strip.text=element_text(face="bold", size=10))
gg <- gg + theme(strip.background=element_blank())
gg <- gg + theme(legend.position="bottom")
 
gg

And, here’s the result (click for larger version):

forblog

The optimized shapefile makes for a very fast plot and you can plot individual chorpleths by filtering the data and not using facets.

While there are a number of choropleth packages out there for R, learning how to do the core components on your own can (a) make you appreciate those packages a bit more (b) give you the skills to do them on your own when you need a more customized version. Many of the theme tweaks will also apply to the ggplot-based choropleth packages.

With this base, it should be a fun exercise to the reader to do something similar with Poynter’s “percentage point change” choropleth. You’ll need to change the color palette and manipulate different data columns to get the same scales and visual representation they do. Drop a note in the comments if you give this a go!

You can find all the code in this post in one convenient gist.

I’ve been slowly prodding the [metricsgraphics package](https://github.com/hrbrmstr/metricsgraphics/) towards a 1.0.0 release, but there are some rough edges that still need sorting out. One of them is the ability to handle passing in variables for the `x` & `y` accessor values (you _can_ pass in bare and quoted strings). This can now be achieved (in the `dev01` branch) via `mjs_plot_` and in `mjs_plot` proper in the github main branch thanks to a [PR](https://github.com/hrbrmstr/metricsgraphics/pull/31) by [Jonathan Owen](https://github.com/jrowen). If everything stays stable with the PR, I’ll just fold the code into `mjs_plot` for the `0.9.0` CRAN release.

One other pending feature is the ability to turn _basic_ (single `geom_`) `ggplot` objects into `metricsgraphics` plots. Sometimes it’s just easier/nicer to “think” in `ggplot` and it may be the case that one might have coded a quick histogram/scatter/line plot in `ggplot` and want an equally quick interactive version. This can also now be achieved (again, in beta) via `as_mjsplot`. While the previous addition is fairly self-explanatory, this new one needs a few examples. Please note that the package installation is coming from the `dev01` branch:

devtools::install_github("hrbrmstr/metricsgraphics", ref="dev01") 
 
library(metricsgraphics)
library(ggplot2)
 
dat <- data.frame(year=seq(1790, 1970, 10),
                  uspop=as.numeric(uspop))
 
set.seed(5689)
movies <- movies[sample(nrow(movies), 1000), ]
 
gg1 <- ggplot(dat, aes(x=year, y=uspop)) + geom_line()
gg2 <- ggplot(dat, aes(x=year, y=uspop)) + geom_point()
gg3 <- ggplot(movies, aes(rating)) + geom_histogram()
gg4 <- ggplot(movies, aes(rating)) + geom_histogram(binwidth = 0.1)
 
gg1
as_mjsplot(gg1)
 
gg2
as_mjsplot(gg2)
 
gg3
as_mjsplot(gg3)
 
gg4
as_mjsplot(gg4)

Which you can see below:

As you can see, `as_mjsplot` will do it’s best to figure out the bins (if using `geom_histogram`) and also axis labels. Support for converting `geom_vline` and `geom_hline` to markers and baselines (respectively) is a work in progress.

I’ve only done limited testing with some basic single `geom_` constructs, but if there are any bugs with it or feature requests (remember, the MetricsGraphics.js library has a very limited repertoire) please post an issue on GitHub tagging the `dev01` branch.

@JennyBryan posted her [slides from the 2015 R Summit](https://github.com/jennybc/2015-06-28_r-summit-talk) and they are a must-read for instructors and even general stats/R-folk. She’s one of the foremost experts in R+GitHub and her personal and class workflows provide solid patterns worth emulation.

One thing she has mentioned a few times—and included in her R Summit talk—is the idea that you can lean on GitHub when official examples of a function are “kind of thin”. She uses a search for `vapply` as an example, showing how to [search for uses of `vapply` in CRAN](https://github.com/search?utf8=%E2%9C%93&q=vapply+extension%3AR+user%3Acran) (there’s a [read-only CRAN mirror on GitHub](http://www.r-pkg.org/)) and [in GitHub R code in general](https://github.com/search?utf8=%E2%9C%93&q=vapply+extension%3AR).

I remember throwing together a small function to kick up a browser from R for those URLs (in a response to one of her tweets), but realized this morning (after reading her slides last night) that it’s possible to not leave RStudio to get these GitHub search results (or, at least the first page of results). So, I threw together [this gist](https://gist.github.com/hrbrmstr/32e9c140129d7d51db52) which, when sourced, provides a `ghelp` function. This is the code:

ghelp <- function(topic, in_cran=TRUE) {
 
  require(htmltools) # for getting HTML to the viewer
  require(rvest)     # for scraping & munging HTML
 
  # github search URL base
  base_ext_url <- "https://github.com/search?utf8=%%E2%%9C%%93&q=%s+extension%%3AR"
  ext_url <- sprintf(base_ext_url, topic)
 
  # if searching with user:cran (the default) add that to the URL  
  if (in_cran) ext_url <- paste(ext_url, "+user%3Acran", sep="", collapse="")
 
  # at the time of writing, "rvest" and "xml2" are undergoing some changes, so
  # accommodate those of us who are on the bleeding edge of the hadleyverse
  # either way, we are just extracting out the results <div> for viewing in 
  # the viewer pane (it works in plain ol' R, too)
  if (packageVersion("rvest") < "0.2.0.9000") { 
    require(XML)
    pg <- html(ext_url)
    res_div <- paste(capture.output(html_node(pg, "div#code_search_results")), collapse="")
  } else {
    require(xml2)
    pg <- read_html(ext_url)
    res_div <- as.character(html_nodes(pg, "div#code_search_results"))
  }
 
  # clean up the HTML a bit   
  res_div <- gsub('How are these search results\\? <span class="removed_link" title="/contact">Tell us!</span>', '', res_div)
  # include a link to the results at the top of the viewer
  res_div <- gsub('href="/', 'href="http://github.com/', res_div)
  # build the viewer page, getting CSS from github-proper and hiding some cruft
  for_view <- sprintf('<html><head><link crossorigin="anonymous" href="https://assets-cdn.github.com/assets/github/index-4157068649cead58a7dd42dc9c0f2dc5b01bcc77921bc077b357e48be23aa237.css" media="all" rel="stylesheet" /><style>body{padding:20px}</style></head><body><span class="removed_link" title="%s">Show on GitHub</span><hr noshade size=1/>%s</body></html>', ext_url, res_div)
  # this makes it show in the viewer (or browser if you're using plain R)
  html_print(HTML(for_view))
 
}

Now, when you type `ghelp(“vapply”)`, you’ll get:

RStudio

in the viewer pane (and similar with `ghelp(“vapply”, in_cran=FALSE)`). Clicking the top link will take you to the search results page on GitHub (in your default web browser), and all the other links will pop out to a browser as well.

If you’re the trusting type, you can `devtools::source_gist(’32e9c140129d7d51db52′)` or just add this to your R startup functions (or add it to your personal helper package).

There’s definitely room for some CSS hacking and it would be fairly straightforward to get _all_ the search results into the viewer by following the pagination links and stitching them all together (an exercise left to the reader).