As many folks know, I live in semi-rural Maine and we were hit pretty hard with a wind+rain storm Sunday to Monday. The hrbrmstr compound has had no power (besides a generator) and no stable/high-bandwidth internet (Verizon LTE was heavily congested) since 0500 Monday, and that is still the case as I write this post.
I’ve played with scraping power outage data from Central Maine Power, but there’s a great Twitter account, PowerOutage_us, that has done much of the legwork for the entire country. They don’t cover everything and do not provide easily accessible historical data (likely because evil folks would steal it without payment or credit), but they do have a site you can poke at and do provide updates via Twitter. As you’ve seen in a previous post, we can use the rtweet package to easily read Twitter data. And, the power outage tweets are regular enough to identify and parse. But raw data is so…raw.
While one could graph data just for one’s self, I decided to marry this power scraping capability with a recent idea I’ve been toying with adding to hrbrthemes or ggalt: gg_tweet(). Imagine being able to take a ggplot2 object and “plot” it to Twitter, fully conforming to Twitter’s stream or card image sizes. Images that conform to these size constraints don’t get cropped in the timeline view (if you allow images to be previewed in-timeline). This is even more powerful if you have some helper functions for proper theme-ing (font sizes especially need to be tweaked). Enter gg_tweet().
Power Scraping
We’ll cover scraping @PowerOutage_us first, but we’ll start with all the packages we’ll need and a helper function to convert power outage estimates to numeric values:
library(httr)
library(magick)
library(rtweet)
library(stringi)
library(hrbrthemes)
library(tidyverse)
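
# Convert the leading count in each tweet (e.g. "1.2+ Million") to a number
words_to_num <- function(x) {
  map_dbl(x, ~{
    # Grab everything in front of the "#PowerOutages" hashtag
    val <- stri_match_first_regex(.x, "^([[:print:]]+) #PowerOutages")[,2]
    # Work out the multiplier from a K (thousand) or M (million) marker
    mul <- case_when(
      stri_detect_regex(val, "[Kk]") ~ 1000,
      stri_detect_regex(val, "[Mm]") ~ 1000000,
      TRUE ~ 1
    )
    # Keep only digits and the decimal point, then scale
    val <- stri_replace_all_regex(val, "[^[:digit:]\\.]", "")
    as.numeric(val) * mul
  })
}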
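To make the conversion concrete, here is a dependency-free sketch of the same idea in base R. The name words_to_num_base() and the sample strings are made up for illustration (they are not real @PowerOutage_us tweets), and it swaps case_when() for nested ifelse() calls, so treat it as an approximation of the helper above rather than a drop-in replacement.

```r
# Hypothetical base-R variant of words_to_num(); not the version used in this post.
words_to_num_base <- function(x) {
  # Keep only the text in front of the "#PowerOutages" hashtag (NA if absent)
  val <- ifelse(grepl(" #PowerOutages", x, fixed = TRUE),
                sub(" #PowerOutages.*$", "", x), NA_character_)
  # Pick a multiplier from a K/M marker ("850K", "1.2+ Million")
  mul <- ifelse(grepl("[Kk]", val), 1e3, ifelse(grepl("[Mm]", val), 1e6, 1))
  # Drop everything except digits and the decimal point, then scale
  as.numeric(gsub("[^0-9.]", "", val)) * mul
}

# Made-up sample strings, for illustration only
samples <- c("1.2+ Million #PowerOutages in the #NorthEast",
             "850K #PowerOutages in the #EastCoast",
             "no outage count here")
words_to_num_base(samples)
```

Strings without the hashtag come back as NA, which is probably what you want before plotting.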
Now, I can’t cover setting up rtweet OAuth here. The vignette and package web site do that well.
The bot tweets infrequently enough that this is really all we need (though, bump up n as you need to):
outage <- get_timeline("PowerOutage_us", n=300)
Yep, that gets the last 300 tweets from said account. It’s amazingly simple.
Now, the outage tweets for the east coast / northeast are not individually uniform but collectively they are (there’s a pattern; it may change, but you can tweak the code if it does):
filter(outage, stri_detect_regex(text, "\\#(EastCoast|NorthEast)")) %>%
  mutate(created_at = lubridate::with_tz(created_at, 'America/New_York')) %>%
  mutate(number_out = words_to_num(text)) %>%
  ggplot(aes(created_at, number_out)) +
  geom_segment(aes(xend=created_at, yend=0), size=5) +
  scale_x_datetime(date_labels = "%Y-%m-%d\n%H:%M", date_breaks="2 hours") +
  scale_y_comma(limits=c(0,2000000)) +
  labs(
    x=NULL, y="# Customers Without Power",
    title="Northeast Power Outages",
    subtitle="Yay! Twitter as a non-blather data source",
    caption="Data via: @PowerOutage_us <https://twitter.com/PowerOutage_us>"
  ) -> gg
That pipe chain looks for key hashtags (for my area), rejiggers the time zone, and calls the helper function to, say, convert 1.2+ Million to 1200000. Finally it builds a mostly complete ggplot2 object (you should make the max Y limit more dynamic).
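One way to replace the hard-coded upper limit of 2000000 in scale_y_comma() is to round the largest observed value up to a "nice" number. The helper below is a minimal sketch in base R; nice_upper_limit() is a hypothetical name, and the 1 / 2 / 2.5 / 5 / 10 step choice is just one reasonable convention, not something the plot requires.

```r
# Hypothetical helper: round the largest value up to the next "nice" step
# (1, 2, 2.5, 5 or 10 times a power of ten).
nice_upper_limit <- function(x) {
  top <- suppressWarnings(max(x, na.rm = TRUE))
  # Fall back to 1 when there is nothing usable to plot
  if (!is.finite(top) || top <= 0) return(1)
  magnitude <- 10^floor(log10(top))
  steps <- c(1, 2, 2.5, 5, 10) * magnitude
  # First step that is at least as large as the maximum
  steps[steps >= top][1]
}

# Made-up outage counts, for illustration only
nice_upper_limit(c(850000, 1200000, NA))
```

To use it you would need the filtered data in a variable before plotting (say, a hypothetical outage_df), and then pass limits=c(0, nice_upper_limit(outage_df$number_out)) to scale_y_comma().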
You can plot that on your own (print(gg)). We’re here to tweet, so let’s go into the next section.
Magick Tweeting
@opencpu made it possible to shunt plot output to a magick device. This means we have really precise control over ggplot2 output size as well as the ability to add other graphical components to a ggplot2 plot using magick idioms. One thing we need to take into account is “retina” plots. They are, essentially, double resolution plots (72 => 144 pixels per inch). For the best looking plots we need to go retina, but that also means kicking up base plot theme font sizes a bit. Let’s build on hrbrthemes::theme_ipsum_rc() and make a theme_tweet_rc():
theme_tweet_rc <- function(grid = "XY", style = c("stream", "card"), retina=TRUE) {

  # NOTE: `retina` is accepted but not used in the body yet
  style <- match.arg(tolower(style), c("stream", "card"))

  # Font sizes: title, subtitle, axis title, axis text, caption
  switch(
    style,
    stream = c(24, 18, 16, 14, 12),
    card = c(22, 16, 14, 12, 10)
  ) -> font_sizes

  theme_ipsum_rc(
    grid = grid,
    plot_title_size = font_sizes[1],
    subtitle_size = font_sizes[2],
    axis_title_size = font_sizes[3],
    axis_text_size = font_sizes[4],
    caption_size = font_sizes[5]
  )

}
Now, we just need a way to take a ggplot2 object and shunt it off to Twitter. The following gg_tweet() function does not (for now) use rtweet, as I’ll likely add it to either ggalt or hrbrthemes and want to keep dependencies to a minimum. I may add an opt-in way to bypass the current method, since it relies on environment variables rather than an RDS file for app credential storage. Regardless, one thing I wanted to do here was provide a way to preview the image before tweeting.
So you pass in a ggplot2 object (likely adding the tweet theme to it) and a Twitter status text (there’s a TODO to check the length for 140c compliance) plus choose a style (stream or card, defaulting to stream) and decide on whether you’re cool with the “retina” default.
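The length check mentioned as a TODO could start out as simple as the sketch below. check_status_length() is a hypothetical helper, not part of gg_tweet(), and it assumes a plain character count is good enough; Twitter’s own counting rules (for URLs, for instance) are more involved.

```r
# Hypothetical pre-flight check for the status text.
# Assumption: a plain character count approximates Twitter's limit.
check_status_length <- function(status, limit = 140L) {
  stopifnot(is.character(status), length(status) == 1L)
  n <- nchar(status, type = "chars")
  if (n > limit) {
    stop(sprintf("Status is %d characters; the limit is %d.", n, limit),
         call. = FALSE)
  }
  # Return the status invisibly so the call can sit inside a pipeline
  invisible(status)
}

check_status_length("Progress! #rtweet #gg_tweet")
```

Calling it at the top of the tweeting function would stop an over-long status before any image is rendered or uploaded.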
Unless you tell it to send the tweet it won’t, giving you a chance to preview the image and tweak it a bit before committing it to the Twitterverse. It also returns the magick object it creates in the event you want to do something more with it:
gg_tweet <- function(g, status = "ggplot2 image", style = c("stream", "card"),
                     retina=TRUE, send = FALSE) {

  style <- match.arg(tolower(style), c("stream", "card"))

  switch(
    style,
    stream = c(w=1024, h=512),
    card = c(w=800, h=320)
  ) -> dims

  dims["res"] <- 72

  if (retina) dims <- dims * 2

  fig <- image_graph(width=dims["w"], height=dims["h"], res=dims["res"])
  print(g)
  dev.off()

  if (send) {

    message("Posting image to twitter...")

    tf <- tempfile(fileext = ".png")
    image_write(fig, tf, format="png")

    # Create an app at apps.twitter.com w/callback URL of http://127.0.0.1:1410
    # Save the app name, consumer key and secret to the following
    # Environment variables

    app <- oauth_app(
      appname = Sys.getenv("TWITTER_APP_NAME"),
      key = Sys.getenv("TWITTER_CONSUMER_KEY"),
      secret = Sys.getenv("TWITTER_CONSUMER_SECRET")
    )

    twitter_token <- oauth1.0_token(oauth_endpoints("twitter"), app)

    POST(
      url = "https://api.twitter.com/1.1/statuses/update_with_media.json",
      config(token = twitter_token),
      body = list(
        status = status,
        media = upload_file(path.expand(tf))
      )
    ) -> res

    warn_for_status(res)

    unlink(tf)

  }

  fig

}
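If the size logic is hard to follow inside the larger function, it can be pulled out and inspected on its own. The sketch below mirrors the switch/doubling steps from gg_tweet() in base R; tweet_dims() is a hypothetical name used only for illustration.

```r
# Hypothetical helper mirroring the size logic inside gg_tweet().
tweet_dims <- function(style = c("stream", "card"), retina = TRUE) {
  style <- match.arg(style)
  dims <- switch(
    style,
    stream = c(w = 1024, h = 512),
    card   = c(w = 800,  h = 320)
  )
  dims["res"] <- 72
  if (retina) dims <- dims * 2  # doubles width, height and resolution together
  dims
}

tweet_dims("card")
tweet_dims("stream", retina = FALSE)
```

Because width, height and resolution are all doubled together, the plot keeps the same physical size in inches and simply gets twice the pixel density.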
Two Great Tastes That Taste Great Together
We can combine the power outage scraper & plotter with the tweeting code and just do:
gg_tweet(
  gg + theme_tweet_rc(grid="Y"),
  status = "Progress! #rtweet #gg_tweet",
  send=TRUE
)
That was, in fact, the last power outage tweet I sent.
Next Steps
Ironically, given current levels of U.S. news and public “discourse” on Twitter and some inane machinations in my own area of domain expertise (cyber), gg_tweet() is likely one of the few ways I’ll be interacting with Twitter for a while. You can ping me on Keybase (hrbrmstr) or join the rstats Keybase team via keybase team request-access rstats if you need to poke me for anything in the meantime.
FIN
Kick the tyres and watch for gg_tweet() ending up in ggalt or hrbrthemes. Don’t hesitate to suggest (or code up) feature requests. This is still an idea in progress and definitely not ready for prime time without a bit more churning. (Also, words_to_num() can be optimized; it was hastily crafted.)