
I tend to obsess over fonts and the latest obsession is Lato.

It was created over three years ago by Łukasz Dziedzic but I only discovered it recently when searching for something to use with my Live Maine Power Outages visualization.

The Lato main site provides desktop fonts and web fonts, but it’s also available at Google Fonts, which (despite caving in to Google’s insidious tracking) will probably make the fonts load faster and save some bandwidth on your site(s).

Now I just need to import Lato into R and remember that I can tweak the matplotlib font config to use it with IPython as well.

I’ve been getting a huge uptick in views of my Slopegraphs in Python post and I think it’s due to @edwardtufte’s recent slopegraph contest announcement.

The original Python code is crufty and a mess, mostly because I only worked on it intermittently, wanted to keep dependencies down, and was hacking rather than programming. I’ve been wanting to do a D3 version for a while, so I went a bit overboard once I learned of Mr. Tufte’s challenge and made more of a “workbench” for making slopegraphs:

[Image: D3 Slopegraph Workshop]

It’s all in D3/HTML5/JavaScript/CSS and requires no server-side components at all.

You can play with a live, alpha-quality version and check out the rest of the components on GitHub.

It needs work, but it should be a good starting point for folks.

As my track record for “winning” things is scant, if you do end up using the code, passing on word of my upcoming book with @jayjacobs would be #spiffy :-)

It started with a local R version and migrated to a Shiny version and is now in full D3 glory.

Some down time gave me the opportunity to start a basic D3 version of the outage map, but it needs a bit of work as it relies on a page meta refresh to update (every 5 minutes) vs an inline element dynamic refresh. The fam was getting a bit irked at coding time on Thanksgiving, so keep watching the following gists for updates after the holiday:

Even though I really liked Origin, the performance issues associated with it were just too much to debug and I have a ton of other work to do. Back to Frank it is. Page loads are much faster and there are far fewer warnings in Google’s PageSpeed diagnostics (and the remaining ones I can live with).

I decided to forgo the D3 map mentioned in the previous post in favor of a Shiny one since I had 90% of the mapping code written.

I binned the ranges into three groups, changed the color scheme to something more pleasant (with RColorBrewer), added an interactive table for the counties with outages and set the elements to update every minute.
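For anyone wanting to try the binning step, here is a minimal sketch in base R. It is not the code from the Shiny app: the counts, the break points and the hex colors are all made-up placeholders, and a three-color palette could just as well come from RColorBrewer.

```r
# Hypothetical outage counts per county (not real CMP data)
outages <- data.frame(
  subregion = c("cumberland", "york", "kennebec", "oxford"),
  without.power = c(12, 450, 3200, 0),
  stringsAsFactors = FALSE
)

# Assumed break points: three bins need four boundaries
breaks <- c(0, 100, 1000, Inf)
labels <- c("0-99", "100-999", "1000+")

# right=FALSE makes each bin include its lower bound: [0,100), [100,1000), [1000,Inf)
outages$bin <- cut(outages$without.power, breaks = breaks,
                   labels = labels, right = FALSE)

# Placeholder colors, one per bin
bin.colors <- c("0-99" = "#FFEDA0", "100-999" = "#FEB24C", "1000+" = "#F03B20")
outages$fill <- unname(bin.colors[as.character(outages$bin)])
```

With ggplot2, mapping fill to the bin column and adding scale_fill_manual(values = bin.colors) would give each county one of the three colors.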

You can see the Live Outage Map over at its live Shiny server. Source is below or over at GitHub if you’ve got blockers enabled.

UPDATE: A Shiny (dynamic) version of this is now available.

We had yet another power outage this morning due to the weird weather patterns of the week and it was the final catalyst I needed to crank out some R code to map the affected counties.

Central Maine Power provides an outage portal where folks can both report outages and see areas impacted by outages. They use an SAP web service that generates the outage table and the aforelinked page just embeds that URL (http://www3.cmpco.com/OutageReports/CMP.html) as an iframe. We can use the XML package in R to grab that HTML file, parse it, extract the table and then send the data to ggplot.

It should be a good starting point for anyone wishing to do something similar. The next itch to scratch for me on this is a live D3 map that uses the outage table with drill-down capabilities to the linked data.

library(maps)
library(maptools)
library(ggplot2)
library(plyr)
library(XML)
 
cmp.url <- "http://www3.cmpco.com/OutageReports/CMP.html"
# get outage table (first one on the cmp.url page)
cmp.node <- getNodeSet(htmlParse(cmp.url),"//table")[[1]]
cmp.tab <- readHTMLTable(cmp.node,
                         header=c("subregion","total.customers","without.power"),
                         skip.rows=c(1,2,3),
                         trim=TRUE, stringsAsFactors=FALSE)
 
# clean up the table so it's easier to work with
cmp.tab <- cmp.tab[-nrow(cmp.tab),] # get rid of last row
cmp.tab$subregion <- tolower(cmp.tab$subregion)
cmp.tab$total.customers <- as.numeric(gsub(",","",cmp.tab$total.customers))
cmp.tab$without.power <- as.numeric(gsub(",","",cmp.tab$without.power))
 
# get maine map with counties
county.df <- map_data('county')
me <- subset(county.df, region=="maine")
 
# get a copy with just the affected counties
out <- subset(me, subregion %in% cmp.tab$subregion)
 
# add outage info to it (match rows on the shared subregion column)
out <- join(out, cmp.tab, by="subregion")
 
# plot the map
gg <- ggplot(me, aes(long, lat, group=group))
gg <- gg + geom_polygon(fill=NA, colour='gray50', size=0.25)
gg <- gg + geom_polygon(data=out, aes(long, lat, group=group, fill=without.power), 
                        colour='gray50', size=0.25)
gg <- gg + scale_fill_gradient2(low="#FFFFCC", mid="#FD8D3C", high="#800026")
gg <- gg + coord_map()
gg <- gg + theme_bw()
gg <- gg + labs(x="", y="", title="CMP (Maine) Customers Without Power by County")
gg <- gg + theme(panel.border = element_blank(),
                 panel.background = element_blank(),
                 panel.grid = element_blank(),
                 axis.text = element_blank(),
                 axis.ticks = element_blank(),
                 legend.position="left",
                 legend.title=element_blank())
gg

[Image: Plot Zoom]

Data Driven Security launches in February 2014. @jayjacobs & I have seen half of the book in PDF form so far and it’s hard to believe that this journey is almost over.

[Image: Data Driven Security Amazon Sales Rank Tracker]

We set up a live Amazon “sales rank” tracker over at the book’s web site and provided some Python and JavaScript code to show folks how to use the AWS API in conjunction with the dygraphs charting library to do the same for any ISBN. In the coming weeks, we’ll have a Google App Engine component you can clone to set up something similar without the need for your own server(s).

Since @jayjacobs & I are down to the home stretch on Data Driven Security, I thought it would be interesting to do some post-writing pseudo-analyses of the book itself. I won’t have exact page or word counts for a bit, but I wanted to see how many R packages we ended up relying on for the examples in the chapters. It was fairly straightforward to run a grep for calls to library() or require() across all the source files, and I grouped the results into four categories: “analysis”, “core”, “munging” and “visualization”.
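The grep step can also be sketched in R itself. This is an illustration, not the command used for the book: extract_packages is a hypothetical helper, and the pattern assumes one library() or require() call per line with the package name as the first argument.

```r
# Hypothetical helper: pull package names out of library()/require() calls
extract_packages <- function(lines) {
  # capture group 2 is the package name, with or without surrounding quotes
  pattern <- "(library|require)\\([[:space:]]*[\"']?([A-Za-z][A-Za-z0-9.]*)"
  hits <- regmatches(lines, regexec(pattern, lines))
  # each hit is c(full match, function name, package name), or empty on no match
  pkgs <- vapply(hits,
                 function(m) if (length(m) == 3) m[3] else NA_character_,
                 character(1))
  unique(pkgs[!is.na(pkgs)])
}

# Made-up sample lines standing in for the chapter source files
sample.lines <- c("library(ggplot2)",
                  "require(plyr)",
                  "x <- 1",
                  "library(\"XML\")",
                  "library(ggplot2)  # loaded again in a later chapter")
pkgs <- extract_packages(sample.lines)
```

To run it over real files, the lines could be collected with readLines() over the result of list.files() on whatever directory holds the sources. The helper only finds the names, so sorting them into categories would still be a separate step.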

Since I <3 D3 circular dendrograms, I figured that would be a fun way to show the groupings. For those who dislike spinning your noggin around, a more traditional one is also presented. You'll need an SVG-capable browser to see the visualizations (below). Stay on the lookout for more "behind the scenes" posts.

[Two dendrograms, one circular and one traditional, showing the same package groupings:]

visualization: aplpack, colorspace, ggdendro, ggplot2, ggthemes, gridExtra, igraph, maps, maptools, RColorBrewer, vcd
analysis: binom, car, effects, portfolio, splines, scales, verisr, zoo, rgdal
core: devtools, stats
munging: bitops, gdata, reshape, plyr, rjson, RJSONIO