Category Archives: d3

It’s been a while since I’ve updated my [metricsgraphics package](https://cran.r-project.org/web/packages/metricsgraphics/index.html). The hit list for changes includes:

– Fixes for the new ggplot2 release (metricsgraphics uses the `movies` data set which is now in ggplot2movies)
– Updated all javascript libraries to the most recent versions
– Borrowed the ability to add CSS rules to a widget from taucharts (`mjs_add_css_rule`)
– Added a metricsgraphics plugin to enable line chart region annotation (`mjs_annotate_region`)
– Enabled explicit coloring of line/area charts (a new feature in the underlying MetricsGraphics.js library)
– You can use bare or quoted names when specifying the x & y accessors and can also use a variable name
– You can now use the metricsgraphics title & description capabilities, but doing so voids any predictable/specified widget height/width and the description functionality is really only suited for bootstrap templates

I think all that can be demonstrated in the following snippet:

library(metricsgraphics)
library(dplyr) # provides filter(), used below
 
dat <- read.csv("http://real-chart.finance.yahoo.com/table.csv?s=AAPL&a=07&b=9&c=1996&d=11&e=21&f=2015&g=d&ignore=.csv",
                stringsAsFactors=FALSE)
 
DATE <- "Date"
 
dat %>%
  filter(Date>="2008-01-01") %>% 
  mjs_plot(DATE, y="Low", title="AAPL Stock (2008-Present)", width=800, height=500) %>% 
  mjs_line(color="#6a3d9a") %>% 
  mjs_add_line(High, color="#ff7f00") %>% 
  mjs_axis_x(xax_format="date") %>% 
  mjs_add_css_rule("{{ID}} .blk { fill:black }") %>%
  mjs_annotate_region("2013-01-01", "2013-12-31", "Volatility", "blk") %>% 
  mjs_add_marker("2014-06-09", "Split") %>% 
  mjs_add_marker("2012-09-12", "iPhone 5") %>% 
  mjs_add_legend(c("Low", "High"))
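
A side note on the `filter(Date>="2008-01-01")` step: `Date` is still a character column at that point (the CSV was read with `stringsAsFactors=FALSE`), so the comparison is a string comparison. That works because ISO 8601 dates sort the same way as text and as dates. Here is a minimal base R sketch with made-up dates (not the AAPL data):

```r
# Made-up ISO 8601 date strings, standing in for the Date column
dates <- c("2007-12-31", "2008-01-01", "2015-12-21")

as_text  <- dates >= "2008-01-01"                    # character comparison
as_dates <- as.Date(dates) >= as.Date("2008-01-01")  # true date comparison

# Both approaches keep the same rows
identical(as_text, as_dates)
```

If the dates were in another layout (say `12/21/2015`), converting with `as.Date()` first would be the safer route.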

NOTE: I’m still trying to figure out why WebKit on Safari renders the em dashes and Chrome does not.

There was some chatter on the twitters this week about a relatively new D3-based charting library called [TauCharts](http://taucharts.com/) (also @taucharts). The API looked pretty clean and robust, so I started working on an htmlwidget for it and was quickly joined by the Widget Master himself, @timelyportfolio.

TauCharts definitely has a “grammar of graphics” feel about it and the default aesthetics are super-nifty. While the developers are actively adding new features and “geoms”, the core points (think scatterplot), lines and bars (including horizontal bars!) geoms are quite robust and definitely ready for your dashboards.

Between the two of us, we have a _substantial_ part of the [charting library API](http://api.taucharts.com/) covered. I think the only major thing left unimplemented is composite charts (i.e. lines + bars + points on the same chart) and some minor tweaks around the edges.

While you can find it on [github](http://github.com/hrbrmstr/taucharts) and do the normal:

devtools::install_github("hrbrmstr/taucharts")

or, even use the official initial release version:

devtools::install_github("hrbrmstr/taucharts@v0.1.0")

I’ll use the `dev` version:

devtools::install_github("hrbrmstr/taucharts@dev")

for the example below, mostly since it includes the data set I want to use to mimic the current, featured example on the [TauCharts homepage](http://taucharts.com/) and also has full documentation with examples.

Here’s all it takes to make a faceted scatterplot with:

– interactive tooltips
– interactive legend
– custom-selectable trendline annotation:

devtools::install_github("hrbrmstr/taucharts@dev")
 
library(taucharts)
 
data(cars_data)
 
tauchart(cars_data) %>% 
  tau_point("milespergallon", c("class", "price"), color="class") %>% 
  tau_guide_padding(bottom=300) %>% 
  tau_legend() %>% 
  tau_trendline() %>% 
  tau_tooltip(c("vehicle", "year", "class", "price", "milespergallon"))


Hybrid cars fuel economy by price and class
It seems expensive cars are less efficient.

There are _tons_ more examples in the [TauCharts RPub](http://rpubs.com/hrbrmstr/taucharts) (and soon-to-be vignette) and @timelyportfolio will be featuring it in his weekly [widget update](http://www.buildingwidgets.com/).

Despite having shown various ways to overcome D3 cartographic envy, there are always more examples that can cause the green monster to rear its ugly head.

Take the Voronoi Arc Map example.


Voronoi_Arc_Map

For those in need of a primer, a Voronoi tessellation/diagram is:

a partitioning of a plane into regions based on distance to points in a specific subset of the plane. That set of points (called seeds, sites, or generators) is specified beforehand, and for each seed there is a corresponding region consisting of all points closer to that seed than to any other. Wikipedia

We can overlay a Voronoi tessellation on top of a map in R as well thanks to the deldir package (which has been around since the “S” days!). Let’s get (most of) the package requirements cruft out of the way, first:

library(sp)
library(rgdal)
library(deldir)
library(dplyr)
library(ggplot2)
library(ggthemes)

Now we’ll [ab]use the data from the Arc Map example:

flights <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/flights.csv", stringsAsFactors=FALSE)
airports <- read.csv("http://bl.ocks.org/mbostock/raw/7608400/airports.csv", stringsAsFactors=FALSE)

Since the D3 example cheats and only uses the continental US (CONUS), we’ll do the same. We’ll also keep only those airports mentioned in the flights data and get the total number of incoming/outgoing flights for each airport:

conus <- state.abb[!(state.abb %in% c("AK", "HI"))]
airports <- filter(airports,
                   state %in% conus,
                   iata %in% union(flights$origin, flights$destination))
orig <- select(count(flights, origin), iata=origin, n1=n)
dest <- select(count(flights, destination), iata=destination, n2=n)
airports <- left_join(airports,
                      select(mutate(left_join(orig, dest),
                                    tot=n1+n2),
                             iata, tot)) %>% 
            filter(!is.na(tot))
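
One thing to watch in the snippet above, as far as I can tell: because it uses `left_join`, an airport that only ever shows up as a destination is dropped, and one that only shows up as an origin gets `tot=NA` and is filtered out. If you would rather count those as zero, here is a minimal base R sketch on a made-up flights table (the airport codes and rows are invented for illustration):

```r
# Toy stand-in for the flights data (hypothetical rows)
flights_toy <- data.frame(
  origin      = c("BOS", "BOS", "PWM", "JFK"),
  destination = c("JFK", "PWM", "BOS", "ORD"),
  stringsAsFactors = FALSE
)

# Count departures and arrivals over the full set of airport codes,
# so an airport that appears on only one side still gets a zero
codes <- sort(union(flights_toy$origin, flights_toy$destination))
n_out <- table(factor(flights_toy$origin, levels = codes))
n_in  <- table(factor(flights_toy$destination, levels = codes))

totals <- data.frame(iata = codes,
                     tot  = as.integer(n_out) + as.integer(n_in),
                     stringsAsFactors = FALSE)
totals
```

In this toy data ORD never appears as an origin, yet it still ends up with a total of 1 instead of disappearing.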

Since we’re going to initially plot polygons in ggplot (and, eventually, in leaflet), we’ll need to work with Spatial objects, so let’s make those airport lat/lon pairs into a SpatialPointsDataFrame:

vor_pts <- SpatialPointsDataFrame(cbind(airports$longitude,
                                        airports$latitude),
                                  airports, match.ID=TRUE)

The deldir function returns a pretty complex object. Thankfully, the authors of the package realized that one might just want the polygons from the computation and pre-made a function: tile.list for computing/extracting them. Those polygons aren’t, however, closed and we really want to keep the airport data associated with them, so we need to close the polygons and associate the data. Since we’re likely going to repeat this task, let’s make it a (very badly named) function:

SPointsDF_to_voronoi_SPolysDF <- function(sp) {
 
  # tile.list extracts the polygon data from the deldir computation
  vor_desc <- tile.list(deldir(sp@coords[,1], sp@coords[,2]))
 
  lapply(1:(length(vor_desc)), function(i) {
 
    # tile.list gets us the points for the polygons but we
    # still have to close them, hence the need for the rbind
    tmp <- cbind(vor_desc[[i]]$x, vor_desc[[i]]$y)
    tmp <- rbind(tmp, tmp[1,])
 
    # now we can make the Polygon(s)
    Polygons(list(Polygon(tmp)), ID=i)
 
  }) -> vor_polygons
 
  # hopefully the caller passed in good metadata!
  sp_dat <- sp@data
 
  # this way the IDs _should_ match up w/the data & voronoi polys
  rownames(sp_dat) <- sapply(slot(SpatialPolygons(vor_polygons),
                                  'polygons'),
                             slot, 'ID')
 
  SpatialPolygonsDataFrame(SpatialPolygons(vor_polygons),
                           data=sp_dat)
 
}
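
If the “closing” step in that function looks cryptic, here it is in isolation. This is only a small base R sketch on a made-up square, not actual deldir output: a ring is closed when its last vertex repeats its first, which is all the `rbind` does.

```r
# Hypothetical open ring: the four corners of a unit square
ring <- cbind(x = c(0, 1, 1, 0),
              y = c(0, 0, 1, 1))

# Append the first vertex to the end to close the ring
closed <- rbind(ring, ring[1, ])

nrow(closed)                            # one more row than the open ring
all(closed[1, ] == closed[nrow(closed), ])  # first and last vertex now match
```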

Before we can make the plots, we need to put the Spatial objects into the proper form for ggplot2 (and get the U.S. state map):

vor <- SPointsDF_to_voronoi_SPolysDF(vor_pts)
 
vor_df <- fortify(vor)
 
states <- map_data("state")

Now we can have some fun. Let’s try to mimic the D3 example map as closely as possible. We’ll lay down the CONUS map, add a points layer for the airports, sizing & styling them just like the D3 example. Note that we order the points so that the smallest ones appear on top (so we can still see them).

We’ll then lay down our newly created Voronoi layer. We’ll also use the same projection (Albers) that the D3 example uses:

gg <- ggplot()
# base map
gg <- gg + geom_map(data=states, map=states,
                    aes(x=long, y=lat, map_id=region),
                    color="white", fill="#cccccc", size=0.5)
# airports layer
gg <- gg + geom_point(data=arrange(airports, desc(tot)),
                      aes(x=longitude, y=latitude, size=sqrt(tot)),
                      shape=21, color="white", fill="steelblue")
# voronoi layer
gg <- gg + geom_map(data=vor_df, map=vor_df,
                    aes(x=long, y=lat, map_id=id),
                    color="#a5a5a5", fill="#FFFFFF00", size=0.25)
gg <- gg + scale_size(range=c(2, 9))
gg <- gg + coord_map("albers", lat0=30, lat1=40)
gg <- gg + theme_map()
gg <- gg + theme(legend.position="none")
gg
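
The “smallest on top” trick relies on points being drawn in row order, so sorting by descending `tot` puts the big bubbles down first. A tiny base R sketch of that ordering, using invented airport codes and totals:

```r
# Invented airports and flight totals, just to show the draw order
airports_toy <- data.frame(iata = c("AAA", "BBB", "CCC"),
                           tot  = c(10, 500, 75),
                           stringsAsFactors = FALSE)

# Largest first, smallest last (base R equivalent of arrange(desc(tot)))
draw_order <- airports_toy[order(-airports_toy$tot), ]
draw_order$iata
```

Rows later in the data frame are painted later, so the small bubbles land on top of the large ones.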

ggplot-1

While that’s pretty, it’s not exactly useful. I’m sure there are times when it’s important to show the Voronoi polygons, but they are especially useful when they are used to help with user interface interactions.

In the case of this map, some airport “bubbles” are very small and many overlap, making a “click” (or even “hover”) a potentially painstaking task for someone looking to get more data out of the visualization. The D3 example uses Voronoi polygons to make it super-easy for the user to hover over a map area and get more info about the flights for the closest airport to the mouse pointer.
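
The reason this works is that “which Voronoi cell am I in?” and “which seed is nearest to me?” are the same question. Here is a minimal base R sketch of the second form, with made-up seed points and plain planar distance (the function name `nearest_seed` is mine, not from any package):

```r
# Made-up seed points (think: airports) on a flat plane
seeds <- data.frame(id = c("A", "B", "C"),
                    x  = c(0, 4, 0),
                    y  = c(0, 0, 3),
                    stringsAsFactors = FALSE)

# Hypothetical helper: id of the seed closest to the point (px, py)
nearest_seed <- function(px, py, seeds) {
  d2 <- (seeds$x - px)^2 + (seeds$y - py)^2  # squared distances are enough
  seeds$id[which.min(d2)]
}

nearest_seed(3, 1, seeds)    # the "mouse" is inside B's cell
nearest_seed(0.5, 2, seeds)  # and here it is inside C's cell
```

The polygons simply precompute that answer for every spot on the map, so the browser only has to report which polygon the pointer is over.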

We’ll use the leaflet htmlwidget to do something similar. Until I can figure out “hover” events for R+leaflet, you’ll have to live with “click”.

First we’ll need some additional packages:

library(leaflet)
library(rgeos)
library(htmltools)

And, we’ll also need a U.S. shapefile (which we simplify since the polygons are pretty detailed and that’s not necessary for this vis):

url <- "http://eric.clst.org/wupl/Stuff/gz_2010_us_040_00_500k.json"
fil <- "gz_2010_us_040_00_500k.json"
 
if (!file.exists(fil)) download.file(url, fil, cacheOK=TRUE)
 
states_m <- readOGR("gz_2010_us_040_00_500k.json", 
                    "OGRGeoJSON", verbose=FALSE)
states_m <- subset(states_m, 
                   !NAME %in% c("Alaska", "Hawaii", "Puerto Rico"))
dat <- states_m@data # gSimplify whacks the data bits
states_m <- SpatialPolygonsDataFrame(gSimplify(states_m, 0.05,
                                               topologyPreserve=TRUE),
                                     dat, FALSE)

The leaflet vis idiom is similar to the ggplot idiom. I’m using a base tile layer since I was too lazy to figure out how to change the leaflet default gray background map color. The map polygons are added first, then the circles/bubbles (note that you work in meters with addCircles, which lets leaflet scale the bubbles as you zoom in/out). Finally, the Voronoi layer is added. I kept the stroke visible purely for demonstration purposes. You need to keep fill=TRUE, otherwise the Voronoi layer won’t get click/hover events. Once I figure out how to trigger popups on hover and use a static popup layer, this will let users hover around the map to get the underlying airport flight information.

leaflet(width=900, height=650) %>%
  # base map
  addProviderTiles("Hydda.Base") %>%
  addPolygons(data=states_m,
              stroke=TRUE, color="white", weight=1, opacity=1,
              fill=TRUE, fillColor="#cccccc", smoothFactor=0.5) %>%
  # airports layer
  addCircles(data=arrange(airports, desc(tot)),
             lng=~longitude, lat=~latitude,
             radius=~sqrt(tot)*5000, # size is in m for addCircles O_o
             color="white", weight=1, opacity=1,
             fillColor="steelblue", fillOpacity=1) %>%
  # voronoi (click) layer
  addPolygons(data=vor,
              stroke=TRUE, color="#a5a5a5", weight=0.25,
              fill=TRUE, fillOpacity = 0.0,
              smoothFactor=0.5, 
              popup=sprintf("Total In/Out: %s",
                            as.character(vor@data$tot)))

I made the Voronoi layer very light, so you may want to keep it there as a cue for the user. How you work with it is completely up to you.

Now you have one less reason to be envious of the D3 cartographers!

I’ve been slowly prodding the [metricsgraphics package](https://github.com/hrbrmstr/metricsgraphics/) towards a 1.0.0 release, but there are some rough edges that still need sorting out. One of them is the ability to handle passing in variables for the `x` & `y` accessor values (you _can_ pass in bare and quoted strings). This can now be achieved (in the `dev01` branch) via `mjs_plot_` and in `mjs_plot` proper in the github main branch thanks to a [PR](https://github.com/hrbrmstr/metricsgraphics/pull/31) by [Jonathan Owen](https://github.com/jrowen). If everything stays stable with the PR, I’ll just fold the code into `mjs_plot` for the `0.9.0` CRAN release.
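
For anyone curious how a function can accept a bare name, a quoted string, or a variable holding the name, here is one possible base R sketch. To be clear, `accessor_name` is a hypothetical helper I made up for illustration; it is not the code in the PR or in `mjs_plot`.

```r
# Hypothetical helper (not part of metricsgraphics): resolve an accessor
# given as a quoted string, a bare column name, or a variable holding a name
accessor_name <- function(x, data) {
  expr <- substitute(x)
  if (is.character(expr)) return(expr)   # quoted, e.g. "Low"
  nm <- deparse(expr)
  if (nm %in% names(data)) return(nm)    # bare column name, e.g. Low
  val <- eval(expr, parent.frame())      # otherwise treat it as a variable
  stopifnot(is.character(val), length(val) == 1, val %in% names(data))
  val
}

df   <- data.frame(Date = "2008-01-01", Low = 1, stringsAsFactors = FALSE)
DATE <- "Date"

accessor_name("Low", df)  # quoted
accessor_name(Low, df)    # bare
accessor_name(DATE, df)   # variable holding the column name
```

Note the design choice in this sketch: a column name wins over a variable of the same name, which is one of the ambiguities this style of interface has to settle one way or another.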

One other pending feature is the ability to turn _basic_ (single `geom_`) `ggplot` objects into `metricsgraphics` plots. Sometimes it’s just easier/nicer to “think” in `ggplot` and it may be the case that one might have coded a quick histogram/scatter/line plot in `ggplot` and want an equally quick interactive version. This can also now be achieved (again, in beta) via `as_mjsplot`. While the previous addition is fairly self-explanatory, this new one needs a few examples. Please note that the package installation is coming from the `dev01` branch:

devtools::install_github("hrbrmstr/metricsgraphics", ref="dev01") 
 
library(metricsgraphics)
library(ggplot2)
 
dat <- data.frame(year=seq(1790, 1970, 10),
                  uspop=as.numeric(uspop))
 
set.seed(5689)
movies <- movies[sample(nrow(movies), 1000), ]
 
gg1 <- ggplot(dat, aes(x=year, y=uspop)) + geom_line()
gg2 <- ggplot(dat, aes(x=year, y=uspop)) + geom_point()
gg3 <- ggplot(movies, aes(rating)) + geom_histogram()
gg4 <- ggplot(movies, aes(rating)) + geom_histogram(binwidth = 0.1)
 
gg1
as_mjsplot(gg1)
 
gg2
as_mjsplot(gg2)
 
gg3
as_mjsplot(gg3)
 
gg4
as_mjsplot(gg4)

Which you can see below:

As you can see, `as_mjsplot` will do its best to figure out the bins (if using `geom_histogram`) and also axis labels. Support for converting `geom_vline` and `geom_hline` to markers and baselines (respectively) is a work in progress.

I’ve only done limited testing with some basic single `geom_` constructs, but if there are any bugs with it or feature requests (remember, the MetricsGraphics.js library has a very limited repertoire) please post an issue on GitHub tagging the `dev01` branch.

I’m super-pleased to announce that the Benevolent CRAN Overlords [accepted the metricsgraphics package](http://cran.r-project.org/web/packages/metricsgraphics/index.html) into CRAN over the weekend. Now, you no longer need to rely on github/devtools to use [MetricsGraphics.js](http://metricsgraphicsjs.org/) charts from your R scripts. If you’re not familiar with `htmlwidgets`, take a look at [the official site for them](http://www.htmlwidgets.org/).

To make it easier to grok the package, I replicated many of the core [MetricsGraphics examples](http://metricsgraphicsjs.org/examples.htm) in the package [vignette](http://cran.r-project.org/web/packages/metricsgraphics/vignettes/introductiontometricsgraphics.html) (which is also below).

I’ll be finishing up support for all of the features of the MetricsGraphics library, most importantly `POSIX[cl]t` support for time ranges, in the not-too-distant future. You can drop feature requests, questions or problems [over at github](https://github.com/hrbrmstr/metricsgraphics/issues).

I set aside a small bit of time to give [rbokeh](https://github.com/bokeh/rbokeh) a try and figured I’d share a small bit of code that shows how to make the “same” chart in both ggplot2 and rbokeh.

#### What is Bokeh/rbokeh?

rbokeh is an [htmlwidget](http://htmlwidgets.org) wrapper for the [Bokeh](http://bokeh.pydata.org/en/latest/) visualization library that has become quite popular in Python circles. Bokeh makes creating interactive charts pretty simple and rbokeh lets you do it all with R syntax.

#### Comparing ggplot & rbokeh

This is not a comprehensive introduction to rbokeh. You can get that [here (officially)](http://hafen.github.io/rbokeh/). I merely wanted to show how a ggplot idiom would map to an rbokeh one for those that may be looking to try out the rbokeh library and are familiar with ggplot. They share a very common “grammar of graphics” base where you have a plot structure and add layers and aesthetics. They each do this a tad bit differently, though, as you’ll see.

First, let’s plot a line graph with some markers in ggplot. The data I’m using is a small time series of weekly counts, and we’ll plot its cumulative sum as a line graph. It’s small enough to fit inline:

library(ggplot2)
library(rbokeh)
library(htmlwidgets)
 
structure(list(wk = structure(c(16069, 16237, 16244, 16251, 16279,
16286, 16300, 16307, 16314, 16321, 16328, 16335, 16342, 16349,
16356, 16363, 16377, 16384, 16391, 16398, 16412, 16419, 16426,
16440, 16447, 16454, 16468, 16475, 16496, 16503, 16510, 16517,
16524, 16538, 16552, 16559, 16566, 16573), class = "Date"), n = c(1L,
1L, 1L, 1L, 3L, 1L, 3L, 2L, 4L, 2L, 3L, 2L, 5L, 5L, 1L, 1L, 3L,
3L, 3L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 7L, 1L, 2L, 6L, 7L, 1L, 1L,
1L, 2L, 2L, 7L, 1L)), .Names = c("wk", "n"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -38L)) -> by_week
 
events <- data.frame(when=as.Date(c("2014-10-09", "2015-03-20", "2015-05-15")),
                     what=c("Thing1", "Thing2", "Thing2"))

The ggplot version is pretty straightforward:

gg <- ggplot()
gg <- gg + geom_vline(data=events,
                      aes(xintercept=as.numeric(when), color=what),
                      linetype="dashed", alpha=1/2)
gg <- gg + geom_text(data=events,
                     aes(x=when, y=1, label=what, color=what),
                     hjust=1.1, size=3)
gg <- gg + geom_line(data=by_week, aes(x=wk, y=cumsum(n)))
gg <- gg + scale_x_date(expand=c(0, 0))
gg <- gg + scale_y_continuous(limits=c(0, 100))
gg <- gg + labs(x=NULL, y="Cumulative Stuff")
gg <- gg + theme_bw()
gg <- gg + theme(panel.grid=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(legend.position="none")
gg

We:

– set up a base ggplot object
– add a layer of marker lines (which are the 3 `events` dates)
– add a layer of text for the marker lines
– add a layer of the actual line – note that we can use `cumsum(n)` vs pre-computing it
– set up scale and other aesthetic properties
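
If you would rather pre-compute the running total than call `cumsum(n)` inside the aesthetic, it is a one-liner. Here is a small base R sketch with invented weekly counts (not the `by_week` data); the `running` column name is just my choice:

```r
# Invented weekly counts, standing in for by_week
toy <- data.frame(wk = as.Date("2015-01-05") + 7 * 0:3,
                  n  = c(1L, 1L, 3L, 2L))

# cumsum() depends on row order, so sort by date first
toy <- toy[order(toy$wk), ]
toy$running <- cumsum(toy$n)
toy$running
```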

That gives us this:

gg

Here’s a similar structure in rbokeh:

figure(width=550, height=375,
       logo="grey", outline_line_alpha=0) %>%
  ly_abline(v=events$when, color=c("red", "blue", "blue"), type=2, alpha=1/4) %>%
  ly_text(x=events$when, y=5, color=c("red", "blue", "blue"),
          text=events$what, align="right", font_size="7pt") %>%
  ly_lines(x=wk, y=cumsum(n), data=by_week) %>%
  y_range(c(0, 100)) %>%
  x_axis(grid=FALSE, label=NULL,
         major_label_text_font_size="8pt",
         axis_line_alpha=0) %>%
  y_axis(grid=FALSE,
         label="Cumulative Stuff",
         minor_tick_line_alpha=0,
         axis_label_text_font_size="10pt",
         axis_line_alpha=0) -> rb
rb

Here, we set the `width` and `height` and configure some of the initial aesthetic options. Note that `outline_line_alpha=0` is the equivalent of `theme(panel.border=element_blank())`.

The markers and text do not work exactly as one might expect since there’s no way to specify a `data` parameter, so we have to set the colors manually. Also, since the target is a browser, font sizes are specified the same way you would in CSS (e.g. `"7pt"`). However, it’s a pretty easy translation from `geom_[hv]line` to `ly_abline` and `geom_text` to `ly_text`.
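
Setting the colors “manually” does not have to mean typing them out per row. One option is a named lookup vector, sketched below in base R; the `pal` palette is my own invention, and the result is just a character vector you could hand to the `color` argument.

```r
events <- data.frame(when = as.Date(c("2014-10-09", "2015-03-20", "2015-05-15")),
                     what = c("Thing1", "Thing2", "Thing2"),
                     stringsAsFactors = FALSE)

# Hypothetical palette: one color per event type
pal <- c(Thing1 = "red", Thing2 = "blue")

# Index by name; as.character() guards against `what` being a factor
event_cols <- unname(pal[as.character(events$what)])
event_cols
```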

The `ly_lines` works pretty much like `geom_line`.

Notice that both ggplot and rbokeh can grok dates for plotting (though we do not need the `as.numeric` hack for rbokeh).
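
In case the `as.numeric` hack looks like magic: a `Date` in R is stored as the number of days since 1970-01-01, and the hack simply hands that number to `xintercept`. A quick base R illustration:

```r
d <- as.Date("1970-01-02")

as.numeric(d)                                  # days since the epoch
as.Date(as.numeric(d), origin = "1970-01-01")  # and back to a Date again
```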

rbokeh will auto-compute bounds like ggplot would but I wanted the scale to go from 0 to 100 in each plot. You can think of `y_range` as `ylim` in ggplot.

To configure the axes, you work directly with `x_axis` and `y_axis` parameters vs `theme` elements in ggplot. To turn off only lines, I set the alpha to 0 in each and did the same with the y axis minor tick marks.

Here’s the rbokeh result:

NOTE: you can save out the widget with:

saveWidget(rb, file="rbokeh001.html")

and I like to use the following `iframe` settings to include the widgets:

<iframe style="max-width:100%" 
        src="rbokeh001.html" 
        sandbox="allow-same-origin allow-scripts" 
        width="100%" 
        height="400" 
        scrolling="no" 
        seamless="seamless" 
        frameBorder="0"></iframe>

#### Wrapping up

Hopefully this helped a bit with translating some ggplot idioms over to rbokeh and developing a working mental model of rbokeh plots. As I play with it a bit more I’ll add some more examples here in the event there are “tricks” that need to be exposed. You can find the code [up on github](https://gist.github.com/hrbrmstr/a3a1be8132530b355bf9) and please feel free to drop a note in the comments if there are better ways of doing what I did or if you have other hints for folks.

In preparation for using some of our streamgraphs for production (PDF/print) graphics, I ended up having to hand-edit labels onto one of the graphics in an Adobe product. This bumped up the priority on adding annotation functions to the streamgraph package (you really don’t want to have to hand-edit graphics if at all possible, trust me). To illustrate them, I’ll use unemployment data that I started gathering for a course I’m teaching in the Fall.

We’ll start with the setup and initial data gathering:

library(dplyr)
library(streamgraph)
library(pbapply)
 
url <- "http://www.bls.gov/lau/ststdsadata.txt"
dat <- readLines(url)

This data is not exactly in a happy format (hit the URL in your browser and you’ll see what I mean). It was definitely made for line printers/human consumption and I feel bad for any human that has to stare at it. The function I’m using to extract data is not necessarily what I’d do to just read in the whole data, but it’s more for teaching something else than optimization. It’ll do for our purposes here:

get_state_data <- function(state) {
 
  # "\ " is not a valid escape in an R string; a plain space does the same job in the regex
  section <- paste("^%s|    (", paste0(month.name, collapse="|"), ") +[[:digit:]]{4}", sep="", collapse="")
  section <- sprintf(section, state)
  vals <- gsub("^ +| +$", "", grep(section, dat, value=TRUE))
 
  state_vals <- gsub("^.* \\.+", "", vals[seq(from=2, to=length(vals), by=2)])
 
  cols <- read.table(text=state_vals)
  cols$month <- as.Date(sprintf("01 %s", vals[seq(from=1, to=length(vals), by=2)]),
                        format="%d %B %Y")
  cols$state <- state
 
  cols %>%
    select(8:9, 1:8) %>%
    mutate(V1=as.numeric(gsub(",", "", V1)),
           V2=as.numeric(gsub(",", "", V2)),
           V4=as.numeric(gsub(",", "", V4)),
           V6=as.numeric(gsub(",", "", V6)),
           V3=V3/100,
           V5=V5/100,
           V7=V7/100) %>%
    rename(civ_pop=V1,
           labor_force=V2, labor_force_pct=V3,
           employed=V4, employed_pct=V5,
           unemployed=V6, unemployed_pct=V7)
 
}
 
state_unemployment <- bind_rows(pblapply(state.name, get_state_data))
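
To make the parsing steps in `get_state_data` a bit easier to follow, here is a stripped-down base R sketch run on a few made-up lines. The lines are only loosely modeled on the file layout (a month header followed by a state row) and the numbers are invented, so treat it as an illustration of the steps, not of the real data.

```r
# Made-up, simplified stand-ins for the matched lines:
# a month header, then that month's row for one state
vals <- c("January 1976",
          "Maine ........ 785,000  436,000  55.5  397,000  50.6  39,000  8.9",
          "February 1976",
          "Maine ........ 786,000  437,000  55.6  398,000  50.6  39,000  8.9")

# Odd entries are the month headers, even entries are the state rows
months <- vals[seq(from = 1, to = length(vals), by = 2)]
rows   <- vals[seq(from = 2, to = length(vals), by = 2)]

# Drop the state name and dot leaders, then split on whitespace
rows <- gsub("^.* \\.+", "", rows)
cols <- read.table(text = rows, stringsAsFactors = FALSE)

# Numbers with thousands separators come in as text; strip the commas
cols$V1 <- as.numeric(gsub(",", "", cols$V1))

# Month names are parsed with %B, which depends on the session locale
cols$month <- as.Date(sprintf("01 %s", months), format = "%d %B %Y")
cols
```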

This will give us a data frame of employment(/unemployment) rates for all the (US) states. I only wanted to focus on New England and a few others for the course example, so this bit keeps just those states:

state_unemployment %>%
  filter(state %in% c("California", "Ohio", "Rhode Island", "Maine",
                      "Massachusetts", "Connecticut", "Vermont",
                      "New Hampshire", "Nebraska")) -> some

With that setup out of the way, let me introduce the two new functions: `sg_add_marker` and `sg_annotate`. `sg_add_marker` adds a vertical, dotted line that spans the height of the graph and is placed at the designated spot on the x axis. You can add an optional label for the marker by specifying the y position, label text, color, size, space away from the line and how it’s aligned – start (left), middle (center), or end (right). This is primarily useful for placing the label on either side of the line.

`sg_annotate` is for adding text anywhere on the streamgraph. The original use for it was to label streams, but you can use it any way you think would add meaning to your streamgraph. You can see them both in action below, where I plot the streamgraph for unemployment (%) for the selected states, then label the start of each recession since 1980 (with the peak national unemployment rate) with a marker and also label each stream:

streamgraph(some, "state", "unemployed_pct", "month") %>%
  sg_axis_x(tick_interval=10, tick_units = "year", tick_format="%Y") %>%
  sg_axis_y(0) %>%
  sg_add_marker(x=as.Date("1981-07-01"), "1981 (10.8%)", anchor="end") %>%
  sg_add_marker(x=as.Date("1990-07-01"), "1990 (7.8%)", anchor="start") %>%
  sg_add_marker(x=as.Date("2001-03-01"), "2001 (6.3%)", anchor="end") %>%
  sg_add_marker(x=as.Date("2007-12-01"), "2007 (10.1%)", anchor="end") %>%
  sg_annotate(label="Vermont", x=as.Date("1978-04-01"), y=0.6, color="#ffffff") %>%
  sg_annotate(label="Maine", x=as.Date("1978-03-01"), y=0.30, color="#ffffff") %>%
  sg_annotate(label="Nebraska", x=as.Date("1977-06-01"), y=0.41, color="#ffffff") %>%
  sg_annotate(label="Massachusetts", x=as.Date("1977-06-01"), y=0.36, color="#ffffff") %>%
  sg_annotate(label="New Hampshire", x=as.Date("1978-03-01"), y=0.435, color="#ffffff") %>%
  sg_annotate(label="California", x=as.Date("1978-02-01"), y=0.175, color="#ffffff") %>%
  sg_annotate(label="Rhode Island", x=as.Date("1977-11-01"), y=0.55, color="#ffffff") %>%
  sg_annotate(label="Ohio", x=as.Date("1978-06-01"), y=0.485, color="#ffffff") %>%
  sg_annotate(label="Connecticut", x=as.Date("1978-01-01"), y=0.235, color="#ffffff") %>%
  sg_fill_tableau() %>%
  sg_legend(show=TRUE)

Selected State Unemployment Figures Since 1976

I probably could have positioned the annotations a bit better, but this should be a good enough example to get the general idea. I may add an option to place the marker vertical lines behind streamgraph and will be adding some toggle options to the interactive version (to hide/show markers and/or annotations).

As usual, the package is up [on github](https://github.com/hrbrmstr/streamgraph) and a contiguous copy of the above snippets is in [this gist](https://gist.github.com/hrbrmstr/4e181ae045807ca3a858).

Three final notes. First, I suggest enabling the y axis when you’re trying to figure out where the y position for a label should be (since the y axis range is calculated by the summed span of the data). Second, the x axis works with both dates and continuous values, but you need to match what you set up the streamgraph with. Finally, just a tip: I’ve found [SVG Crowbar 2](http://nytimes.github.io/svg-crowbar/) to be super-helpful when I need to extract these streamgraphs out for non-interactive reproduction. Just yank the SVG out with it and hand it (or a converted form of it) to whomever is handling final production and they should be able to work with it.
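
On that first note, you can also get a rough feel for the y span without rendering anything by summing the values per x position. The base R sketch below uses invented numbers and assumes the streams are simply stacked; the widget’s layout offset may shift where that span sits, so treat the result as a ballpark for picking `y` values.

```r
# Invented values: three streams at two x positions
toy <- data.frame(month = rep(c("2015-01", "2015-02"), each = 3),
                  state = rep(c("A", "B", "C"), times = 2),
                  unemployed_pct = c(0.05, 0.06, 0.07,
                                     0.04, 0.06, 0.09),
                  stringsAsFactors = FALSE)

# Height of the stack at each x position
stack_height <- tapply(toy$unemployed_pct, toy$month, sum)
stack_height

# The tallest stack gives a ballpark for the overall y span
max(stack_height)
```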

A post on [StackOverflow](http://stackoverflow.com/questions/28725604/streamgraphs-dataviz-in-r-wont-plot) asked about using a continuous variable for the x-axis (vs dates) in my [streamgraph package](http://github.com/hrbrmstr/streamgraph). While I provided a workaround for the question, it helped me bump up the priority for adding support for continuous x axis scales. With the [DBIR](http://www.verizonenterprise.com/DBIR/) halfway behind me now, I kicked out a new rev of the package/widget that has support for continuous scales.

Using the data from the SO post, you can see there’s not much difference in how you use continuous vs date scales:

library(streamgraph)
 
dat <- read.table(text="week variable value
40     rev1  372.096
40     rev2  506.880
40     rev3 1411.200
40     rev4  198.528
40     rev5   60.800
43     rev1  342.912
43     rev2  501.120
43     rev3  132.352
43     rev4  267.712
43     rev5   82.368
44     rev1  357.504
44     rev2  466.560", header=TRUE)
 
dat %>% 
  streamgraph("variable","value","week", scale="continuous") %>% 
  sg_axis_x(tick_format="d")

Product Revenue

I’ll be adding support for using a categorical variable on the x axis soon. Once that’s done, it’ll be time to do the CRAN dance.