
Author Archives: hrbrmstr


Much of what I need to do for work-work involves using tools that are (for the moment) not in R. Today, I needed to test the validity of (and do other processing on) DMARC records, and I’m loath to either reinvent the wheel or reticulate bits from a fragmented programming language ecosystem unless absolutely necessary. Thankfully, there’s libopendmarc, which works well on sane operating systems, but it is a C library that needs an interface to be usable from R.

However, I also really didn’t want to start a new package for this just yet (there will eventually be one, though, and I prefer working in a package context for Rcpp work). I just needed to run opendmarc_policy_store_dmarc() against a decent-sized chunk of domain names and already-retrieved DMARC TXT records. So, I decided to write a small “inline” cppFunction() to get’er done.

Why am I blogging about this?

Despite growing popularity and a nice examples site, many newcomers to Rcpp (literally the way you want to go when it comes to bridging C[++] and R) still voice discontent about there not being enough “easy” examples. Granted, they are quite likely looking for full-bore tutorials covering different, explicit use cases. The aforelinked Gallery has some of those and there are codified examples in — literally — RcppExamples. But, there definitely need to be more blog posts, books and such linking to them and expanding upon them.

Having mentioned that I’m using cppFunction(), one could, further, ask “cppFunction() has a help page with an example, so why blather about using it?”. Fair point! And, there is a reason, which was hinted at in the opening paragraph.

I need to use libopendmarc and that requires making a “plugin” if I’m going to do this “inline”. For some other DMARC processing I also need to use libresolv, since the library makes DNS requests and uses resolv. You don’t need a plugin for a package version, as you just need to boilerplate some “find these libraries and get their paths right” logic for Makevars.in and add the linking flags in there as well. Here, we need to register two plugins that provide metadata for the magic that happens under the covers when Rcpp takes your inline code, compiles it and makes the function’s shared object available in R.

Plugins can be complex and do transformations, but the two I needed to write are just helping ensure the right #include lines are there along with the right linker libraries. Here they are:

library(Rcpp)

registerPlugin(
  name = "libresolv",
  plugin = function(x) {
    list(
      includes = "",
      env = list(PKG_LIBS="-lresolv")
    )
  }
)

registerPlugin(
  name = "libopendmarc",
  plugin = function(x) {
    list(
      includes = "#include <opendmarc/dmarc.h>",
      env = list(PKG_LIBS="-lopendmarc")
    )
  }
)

All they do is make data structures available in the environment. We can use inline::getPlugin() to see them:

inline::getPlugin("libresolv")
## $includes
## [1] ""
##
## $env
## $env$PKG_LIBS
## [1] "-lresolv"


inline::getPlugin("libopendmarc")
## $includes
## [1] "#include <opendmarc/dmarc.h>"
## 
## $env
## $env$PKG_LIBS
## [1] "-lopendmarc"

Finally, the tiny bit of C/C++ code to take in the necessary parameters and return the result. In this case, we’re passing in a character vector of domain names and a character vector of already-retrieved DMARC records, and getting back a logical vector with the test results. Apart from the necessary initialization and cleanup code for libopendmarc, this is an idiom you’ll recognize if you look over packages that use Rcpp.

cppFunction('
std::vector< bool > is_dmarc_valid(std::vector< std::string > domains,
                                   std::vector< std::string > dmarc_records) {

  std::vector< bool > out(dmarc_records.size());

  DMARC_POLICY_T *pctx;
  OPENDMARC_STATUS_T status;

  // the IP address is required by the API but irrelevant for record parsing
  pctx = opendmarc_policy_connect_init((u_char *)"1.2.3.4", 0);

  for (unsigned int i=0; i<dmarc_records.size(); i++) {

    status = opendmarc_policy_store_dmarc(
      pctx,
      (u_char *)dmarc_records[i].c_str(),
      (u_char *)domains[i].c_str(),
      NULL
    );

    out[i] = (status == DMARC_PARSE_OKAY);

    // reset the policy context for the next record
    pctx = opendmarc_policy_connect_rset(pctx);

  }

  pctx = opendmarc_policy_connect_shutdown(pctx);

  return(out);

}
',
plugins = c("libresolv", "libopendmarc"))


Right at the end, the final parameter tells cppFunction() which plugins to use.

Executing that call shunts a modified version of the function source to disk, compiles it and makes the function available in R (use the cacheDir, showOutput and verbose parameters to control how many gory details lie underneath this pristine shell).
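
For instance, here’s a minimal sketch (using a trivial stand-in function so it compiles quickly; the cache directory name is just an example) of those diagnostic parameters in action:

library(Rcpp)

# showOutput exposes the compiler invocation, verbose shows the generated
# scaffolding and cacheDir keeps the source + shared object between sessions
cppFunction(
  "int add_one(int x) { return x + 1; }",
  showOutput = TRUE,
  verbose = TRUE,
  cacheDir = "rcpp-cache"
)

add_one(41)
## [1] 42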

After running the function, is_dmarc_valid() is available in the environment and ready to use.

domains <- c("bit.ly", "bizible.com", "blackmountainsystems.com", "blackspoke.com")
dmarc <-  c("v=DMARC1; p=none; pct=100; rua=mailto:dmarc@bit.ly; ruf=mailto:ruf@dmarc.bitly.net; fo=1;", 
            "v=DMARC1; p=reject; fo=1; rua=mailto:postmaster@bizible.com; ruf=mailto:forensics@bizible.com;", 
            "v=DMARC1; p=quarantine; pct=100; rua=mailto:demarcrecords@blkmtn.com, mailto:ttran@blkmtn.com", 
            "user.cechire.com.")

is_dmarc_valid(domains, dmarc)
## [1]  TRUE  TRUE  TRUE FALSE

Processing those four took just about 10 microseconds, which meant I could process the ~1,000,000 domains+DMARCs in no time at all. And, I have something I can use in a DMARC utility package (coming “soon”).
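
If you want to verify timings on your own system, a quick sketch (assuming the microbenchmark package is installed):

library(microbenchmark)

microbenchmark(
  is_dmarc_valid(domains, dmarc),
  times = 1000
)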

Hopefully this was a useful reference for both hooking up external libraries to “inline” Rcpp functions and for how to go about doing this type of thing in general.

Time for another look at what’s new and interesting in the world with the help of Peter Meissner’s (@marvin_dpr) crossword.r.

The answers to last week’s puzzle have been posted (it seemed to make more sense posting the answers a week later vs the Monday after).

There is a dedicated category — puzzler — to make it easier to find these later on, all in one place. That category also has its own RSS feed.

Peter Meissner (@marvin_dpr) released crossword.r to CRAN today. It’s a spiffy package that makes it dead simple to generate crossword puzzles.

He also made a super spiffy JavaScript library to pair with it, which can turn crossword model output into an interactive puzzle.

I thought I’d combine those two creations with a way to highlight new/updated packages from the previous week, cool/useful packages in general, and some R functions that might come in handy. Think of it as a weekly way to get some R information while having a bit of fun!

This was a quick, rough creation and I’ll be changing the styles a bit for next Friday’s release, but Peter’s package is so easy to use that I have absolutely no excuse to not keep this a regular feature of the blog.

I’ll release a static, ggplot2 solution to each puzzle the following Monday(s). If you solve it before then, tweet a screen shot of your solution with the tag #rstats #puzzler and I’ll pick the first time-stamped one to highlight the following week.

I’ll also get a GitHub setup for suggestions/contributions to this effort + to hold the puzzle data.

ANSWERS

library(crossword.r)

cw <- Crossword$new(rows = 12, columns = 12)

cw$add_words(

  words = c(
    "crosswordr",
    "searchr",
    "kerasformula",
    "fs",
    "crypto",
    "mgcv",
    "startsWith",
    "akima",
    "rcompgen",
    "broom",
    "nord"
  ),

  # clues are paired positionally with the words above
  clues = c(
    "New package to generate crosswords (w/o '.')",
    "Package facilitating searching Bing, Google and more from an R console",
    "New, high-level interface package to 'keras'",
    "Consistent, cross-platform, vectorised filesystem operations package",
    "Interface package to all digital/crypto currency market data",
    "Core generalized additive modelling package",
    "base function to test if a string starts with another string",
    "utils function to generate command completion networks",
    "Interpolation-focused package",
    "Package that makes it easy to tidy statistical analyses objects",
    "An arctic, north-bluish color palette package"
  )

)

We’re doing some interesting studies (cybersecurity-wise, not finance-wise) on digital currency networks at work-work and — while I’m loath to create a geo-map from IPv4 geolocation data — we:

  • do get (often, woefully inaccurate) latitude & longitude data from our geolocation service (I won’t name-and-shame here); and,
  • there are definite geo-aspects to the prevalence of mining nodes — especially Bitcoin; and,
  • I have been itching to play with the nascent nord palette in a cartographical context…

so I went on a small diversion to create a bubble plot of geographical Bitcoin node-prevalence.

I tweeted out said image and someone asked if there was code, hence this post.

You’ll be able to read about the methodology we used to capture the Bitcoin node data that underpins the map below later this year. For now, all I can say is that it wasn’t garnered from joining the network-proper.

I’m including the geo-data in the gist, but not the other data elements (you can easily find Bitcoin node data out on the internets from various free APIs and our data is on par with them).

I’m using swatches for the nord palette since I was hand-picking colors, but you should use @jakekaupp’s most excellent nord package if you want to use the various palettes more regularly.
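
For example, a quick sketch of that package-based route (assuming the CRAN nord package; "polarnight" is one of its palette names):

library(nord)

# preview a palette and pull a few colors from it; no .ase file required
nord_show_palette("polarnight")
nord("polarnight", 4)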

I’ve blathered a bit about nord, so let’s start with that (and include the various other packages we’ll use later on):

library(swatches)
library(ggalt) # devtools::install_github("hrbrmstr/ggalt")
library(hrbrthemes) # devtools::install_github("hrbrmstr/hrbrthemes")
library(tidyverse)

nord <- read_palette("nord.ase")

show_palette(nord)

It may not be a perfect palette (accounting for all forms of vision issues and other technical details) but it was designed very well (IMO).

The rest is pretty straightforward:

  • read in the bitcoin geo-data
  • count up by lat/lng
  • figure out which colors to use (that took a bit of trial-and-error)
  • tweak the rest of the ggplot2 canvas styling (that took a wee bit longer)

I’m using development versions of two packages due to their added functionality not being on CRAN (yet). If you’d rather not use a dev-version of hrbrthemes, just swap in a different ipsum theme for the new theme_ipsum_tw(), e.g. the drop-in below.
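
Something like this (a sketch using CRAN hrbrthemes; you lose only the dev-version typeface) should slot into the plotting code below unchanged:

library(hrbrthemes)

# CRAN-safe stand-in for the dev-only theme_ipsum_tw() used below
theme_ipsum(plot_title_size = 24, subtitle_size = 12)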

read_csv("bitc.csv") %>%
  count(lng, lat, sort = TRUE) -> bubbles_df

world <- map_data("world")
world <- world[world$region != "Antarctica", ]

ggplot() +
  geom_cartogram(
    data = world, map = world,
    aes(x = long, y = lat, map_id = region),
    color = nord["nord3"], fill = nord["nord0"], size = 0.125
  ) +
  geom_point(
    data = bubbles_df, aes(lng, lat, size = n), fill = nord["nord13"],
    shape = 21, alpha = 2/3, stroke = 0.25, color = "#2b2b2b"
  ) +
  coord_proj("+proj=wintri") +
  scale_size_area(name = "Node count", max_size = 20, labels = scales::comma) +
  labs(
    x = NULL, y = NULL,
    title = "Bitcoin Network Geographic Distribution (all node types)",
    subtitle = "(Using bubbles seemed appropriate for some, odd reason)",
    caption = "Source: Rapid7 Project Sonar"
  ) +
  theme_ipsum_tw(plot_title_size = 24, subtitle_size = 12) +
  theme(plot.title = element_text(color = nord["nord14"], hjust = 0.5)) +
  theme(plot.subtitle = element_text(color = nord["nord14"], hjust = 0.5)) +
  theme(panel.grid = element_blank()) +
  theme(plot.background = element_rect(fill = nord["nord3"], color = nord["nord3"])) +
  theme(panel.background = element_rect(fill = nord["nord3"], color = nord["nord3"])) +
  theme(legend.position = c(0.5, 0.05)) +
  theme(axis.text = element_blank()) +
  theme(legend.title = element_text(color = "white")) +
  theme(legend.text = element_text(color = "white")) +
  theme(legend.key = element_rect(fill = nord["nord3"], color = nord["nord3"])) +
  theme(legend.background = element_rect(fill = nord["nord3"], color = nord["nord3"])) +
  theme(legend.direction = "horizontal")

As noted, the RStudio project associated with this post is in this gist. Also, upon further data-inspection by @jhartftw, we’ve discovered yet more inconsistencies in the geo-mapping service data (there are way too many nodes in Paris, for example), but the main point of the post was mostly to show and play with the nord palette.

NOTE: The likelihood of this recipe being added to the recent practice bookdown book is slim, but I’ll try to keep the same format for the blog post.

Problem

You want to collect all the tweets in a Twitter tweet thread.

Solution

Use a few key functions in rtweet to piece the thread elements back together.

Discussion

In Twitterland, a “thread” is a series of tweets by an author that are in a reply chain to each other which enables them to be displayed sequentially to form a larger & (ostensibly) more cohesive message. Even with the recent 280 character tweet-length increase, threads are still popular and used daily. They’re very easy to distinguish on Twitter but there is no Twitter API call to collect up all the pieces of these threads.

Let’s build a function — get_thread() — that will take as input a starting thread URL or status id and return a data frame of all the tweets in the thread (in order). As a bonus, we’ll also include a way to include all first-level retweets and replies to each threaded tweet (that, too, happens quite a bit).

There are documentation snippets in the code block (below), but the essence of the function is:

  • first, finding the tweet that belongs to the status id to get some metadata
  • then doing a search for tweets from the author that occurred after that tweet (we do this to save on API calls and we grab a bunch of them)
  • rather than do a bunch of things by hand, we make from/to pairs to feed in as vertex edges into igraph
  • once that’s done, separate out the graph into unique subgraphs and find the one containing the starting status id
  • since that subgraph is just a set of status ids, rebuild the data frame from it and put it in order.

There may be occasions where we want to grab the replies or RTs of any of the original thread tweets. They aren’t always useful, but when they are it’d be good to have this context. So, we’ll add an option that — if TRUE — will cause the function to go down the list of threaded tweets and pull the first-level replies and RTs (excluding the ones from the author). We’ll do this using the Twitter search API as it’ll ultimately save on API calls and it puts the filtering closer to the data (I’m generally “a fan” of putting computation as close to the data as possible for any given task). If there were any, they’ll be in a replies column which can be unnested at-will.

Here’s the complete function:

get_thread <- function(first_status, include_replies=FALSE, .timeline_history=3000) {

  require(rtweet, quietly=TRUE, warn.conflicts=FALSE)
  require(igraph, quietly=TRUE, warn.conflicts=FALSE)
  require(tidyverse, quietly=TRUE, warn.conflicts=FALSE)

  first_status <- if (str_detect(first_status[1], "^https?://")) basename(first_status[1]) else first_status[1]

  # get first status
  orig <- rtweet::lookup_tweets(first_status)

  # grab the author's timeline to search
  author_timeline <- rtweet::get_timeline(orig$screen_name, n=.timeline_history, since_id=first_status)

  # build a data frame containing from/to pairs (anything the author
  # replied to) that also includes the `first_status` id.
  suppressWarnings(
    dplyr::filter(
      author_timeline,
      (status_id == first_status) | (reply_to_screen_name == orig$screen_name)
    ) %>%
      dplyr::select(status_id, reply_to_status_id) %>%
      igraph::graph_from_data_frame() -> g
  ) # build a graph from this

  # decompose the graph into unique subgraphs and return them to data frames
  igraph::decompose(g) %>%
    purrr::map(igraph::as_data_frame) -> threads_dfs

  # find the thread with our `first_status` ids

  thread_df <- purrr::keep(threads_dfs, ~any(which(unique(unlist(.x, use.names=FALSE)) == first_status)))

  # BONUS: we get them in the order we need!
  thread_order <- purrr::discard(rev(unique(unlist(thread_df))), str_detect, "NA")

  # filter out the thread from the timeline corpus & sort it
  dplyr::filter(author_timeline, status_id %in% pull(thread_df[[1]], from)) %>%
    dplyr::mutate(status_id = factor(status_id, levels=thread_order)) %>%
    dplyr::arrange(status_id) -> tweet_thread

  if (include_replies) {
    # for each status, lookup 1st-level references to it, excluding ones from the original author
    mutate(
      tweet_thread,
      replies = purrr::map(
        as.character(status_id),
        ~rtweet::search_tweets(sprintf("%s -from:%s", .x, orig$screen_name[1]))
      )
    ) -> tweet_thread
  }

  class(tweet_thread) <-  c("tweet_thread", class(tweet_thread))

  return(tweet_thread)

}

Now, if we grab this thread, the function will return the following:

xdf <- get_thread("https://twitter.com/petersagal/status/952910499825451009")

glimpse(select(xdf, 1:5))
## Observations: 10
## Variables: 5
## $ status_id   <fctr> 952910499825451009, 952910695804305408, 952911012990193664, 952911632077852679, 9529121...
## $ created_at  <dttm> 2018-01-15 14:29:02, 2018-01-15 14:29:48, 2018-01-15 14:31:04, 2018-01-15 14:33:31, 201...
## $ user_id     <chr> "14985228", "14985228", "14985228", "14985228", "14985228", "14985228", "14985228", "149...
## $ screen_name <chr> "petersagal", "petersagal", "petersagal", "petersagal", "petersagal", "petersagal", "pet...
## $ text        <chr> "Funny you mention that. I talked to Minniejean (Brown) Trickey, one of the Little Rock ...

purrr::map(xdf$text, strwrap) %>% 
  purrr::map_chr(paste0, collapse="\n") %>% 
  cat(sep="\n\n")
## Funny you mention that. I talked to Minniejean (Brown) Trickey, one of the Little Rock Nine, about
## that very day in front of CHS for my documentary, "Constitution USA." https://t.co/MRwtlfZtvp
## 
## You would think that of all people, she would be satisfied with the government's response to racism
## and hate. Ike sent the 101st Airborne to escort her to class!
## 
## But what I didn't know is that after the 101st left, CHS expelled her on a trumped up charge of
## assault after she spilled some chili on a white student.
## 
## She spilled some chili. After being tripped by another white kid. "We got rid of one of them!" the
## teachers bragged.
## 
## Then, of course, rather than continue to allow black students to attend CHS, the governor of Alabama
## closed the schools. https://t.co/2DfBEI0OTL"The_Lost_Year"
## 
## Ms Brown looked around the country post high school. She saw Jim Crow, firehoses turned on Blacks,
## the murder of the Birmingham Four and the Mississippi Three. She moved to Canada.
## 
## As of 2012, she found herself coming back to Little Rock, a place she told me she never wanted to
## see again. But she had family. And the National Historic Site center was there. She liked to drop
## by, talk to the kids about what happened.
## 
## Now she lives in Little Rock full time. She doesn't care that her name is inscribed on a bench in
## front of the school. She doesn't care that your dad welcomed her back in '99.  She spends time at
## the Center, telling people what really happened. You should go talk to her.
## 
## (Sorry: Arkansas, obviously. Typing too quickly.)
## 
## Here's me, talking to Ms Trickey and Marty Sammon, who served with the 101st at Little Rock. Buddy
## Squiers on camera. CHS is off to the left. https://t.co/ft4LUBf3sr
## 
## https://t.co/EHLbe1finj

The replies data frame looks much the same as the thread data frame — it’s essentially just another rtweet data frame, so we won’t waste electrons showing it.

While that map/map/cat sequence isn’t bad to type, it’d be more convenient if we had a print() method for this structure (this is one reason we added a class to it). It’d be even spiffier if this print() method made it easier to distinguish the main thread from the RTs/replies — but still show those extra bits of info. We’ll use the crayon package for added emphasis:

print.tweet_thread <- function(x, ...) {
  
  cat(crayon::cyan(sprintf("@%s - %s\n\n", x$screen_name[1], x$created_at[1])))
  
  if (!("replies" %in% colnames(x))) x$replies <- purrr::map(1:nrow(x), ~list())
  
  purrr::walk2(x$text, x$replies, ~{
    
    cat(crayon::green(paste0(strwrap(.x), collapse="\n")), "\n\n", sep="")
    
    if (length(.y) > 0) {
      purrr::walk2(.y$screen_name, .y$text, ~{
        sprintf("@%s\n%s", .x, .y) %>%
          strwrap(indent=8, exdent=8) %>%
          paste0(collapse="\n") %>%
          crayon::silver$italic() %>%
          cat("\n\n", sep="")
      })
    }
    
  })
  
}

Let’s re-capture the tweet thread but also include replies this time and print it out:

ydf <- get_thread("https://twitter.com/petersagal/status/952910499825451009", include_replies=TRUE)

ydf
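
If you want those replies flattened out of their list-column, a sketch along these lines (assuming dplyr and tidyr are available) will do it:

library(dplyr)
library(tidyr)

# expand the per-tweet replies data frames into one long data frame,
# keeping the id of the thread tweet each reply references
ydf %>%
  transmute(thread_status_id = as.character(status_id), replies) %>%
  unnest(replies)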

See Also

I’ve git-chatted with Sir Kearney to see where to best put this function. I mention that as there are some upcoming posts that kick the aforeblogged tweet_shot() up a notch or two and all of this may work better in a tweetview package.

Regardless, drop a note in the comments if there are other bits of functionality or function options you think belong in get_thread().

The new year begins with me being on the hook to crank out a book on advanced web-scraping in R by July (more on that in a future blog post). The bookdown package seemed to be the best way to go about doing this, but I had only played with its toy/default examples and wanted to test out the platform with a “Hello, World”-like example of a “real” book to iron out issues and avoid more refactoring later on than I know I will have to do. I’ve been on an rtweet kick as of late (I have no idea why) and had an e-copy of O’Reilly’s 21 Recipes for Mining Twitter in their synced Dropbox folder (it was a free giveaway a few years ago), so I decided to make an rtweet version of it in a bookdown project.

You can find the GitHub repo for it here and the rendered version here. NOTE: I will likely not finish the remaining two chapters (I need to spend the time on the real book :-) but will gladly add you as a co-author if you shoot over a PR.

I began with Sean Kross’ quick start and decided to work primarily in Sublime Text and use a Makefile to manage the build process. Since the goal was to iron out kinks for a real production book, here’s a bullet list of some tips as a result of figuring out what worked for me:

  • Get Yihui Xie’s book. I have a physical copy, but having either the print or electronic version will help you when things get frustrating (and they do get frustrating at times).
  • Use git. However you instantiate the project, use git source control so you don’t lose your hard work. However, some directories are not tracked in git by default! You may want to modify the line with *.rds in .gitignore to be a bit less brutal if you happen to generate rds files outside of the project but use them in chapter examples. Also, make sure to put other, sensitive items (like .httr-oauth) in that .gitignore to avoid having to reset credentials.
  • Use a Makefile. I like RStudio, but have far more editing tools in Sublime Text for book-ish work. Plus it has an easy build system manager, and I find it easier to navigate files.
  • Make liberal use of code chunks. Chapter 16 has a structure that I used in many of the chapters. One block for library calls (no caching); load fonts (hidden, and primarily for PDF rendering); named, cached logical sections that go with the flow of the chapter text; custom figure dimensions to ensure they come out as desired. Caching will speed up rendering time immensely.
  • Use saved data and a mixture of echo=FALSE, eval=TRUE, echo=TRUE, eval=FALSE for things you generated outside of the book source code (because they may be long running things you don’t want to wait for even once in rendering) but want to show in the book (perhaps with slightly modified source).
  • Despite using git, create a daily compressed archive of the directory tree and stick it on Dropbox (that can be part of the Makefile). Your work is valuable and you need to make sure it’s backed up.
  • Learn about references. Yihui Xie’s book shows how to deal with in- and cross-chapter references, read and use them!
  • Use a bookdown::word_document2 output vs PDF and make a custom Word template for it. The default PDF output is fine for basic things, but you’ll want to generate a better one from Word (see the sketch after this list).
  • When things stop rendering properly save your recently edited files and go back in time with git to a working start. This happened to me a few times as I worked across different machines. git makes glitches almost stress free.
  • Use rsync for publishing. I need to add this to the Makefile but one, short command-line call can publish your work in seconds to a web server.
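
As a concrete sketch of that Word-first tip: the reference docx is a styled Word document you save once and keep reusing (the file names here are hypothetical):

# render the whole book to Word, styled by a custom reference document
bookdown::render_book(
  "index.Rmd",
  bookdown::word_document2(reference_docx = "templates/book-reference.docx")
)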

I’ll likely have more tips as the year goes on and will have a follow-up post for using web server access logs to generate “kindle-like” reading statistics for your tomes.

(You can find all R⁶ posts here)

UPDATE 2018-01-01 — this has been added to rtweet (GH version).

A Twitter discussion that spawned from Maëlle’s recent look-back post turned into a quick function for capturing an image of a Tweet/thread using webshot, rtweet, magick and glue.

Pass in a status id or a twitter URL and the function will grab an image of the mobile version of the tweet.

The ultimate goal is to make a function that builds a tweet using only R and magick. This will have to do until the new year.

tweet_shot <- function(statusid_or_url, zoom=3) {

  require(glue, quietly=TRUE)
  require(rtweet, quietly=TRUE)
  require(magick, quietly=TRUE)
  require(webshot, quietly=TRUE)

  x <- statusid_or_url[1]

  is_url <- grepl("^https?://", x)

  if (is_url) {

    is_twitter <- grepl("twitter", x)
    stopifnot(is_twitter)

    is_status <- grepl("status", x)
    stopifnot(is_status)

    already_mobile <- grepl("://mobile\\.", x)
    if (!already_mobile) x <- sub("://twi", "://mobile.twi", x)

  } else {

    x <- rtweet::lookup_tweets(x)
    stopifnot(nrow(x) > 0)
    x <- glue_data(x, "https://mobile.twitter.com/{screen_name}/status/{status_id}")

  }

  tf <- tempfile(fileext = ".png")
  on.exit(unlink(tf), add=TRUE)

  webshot(url=x, file=tf, zoom=zoom)

  img <- image_read(tf)
  img <- image_trim(img)

  if (zoom > 1) img <- image_scale(img, scales::percent(1/zoom))

  img

}

Now just do one of these:

tweet_shot("947082036019388416")
tweet_shot("https://twitter.com/jhollist/status/947082036019388416")

to get the tweet rendered back as a trimmed magick image.
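
Since the return value is a standard magick image, it can also go straight to disk (the file name here is just an example):

library(magick)

# capture and save in one go
image_write(tweet_shot("947082036019388416"), path = "tweet.png")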

2017 is nearly at an end. We humans seem to need these cycles to help us on our path forward and have, throughout history, used these annual demarcation points as a time of reflection on what was, what is and what shall come next.

To that end, I decided it was about time to help quantify a part of the soon-to-be previous annum in R through the fabrication of a reusable template. Said template contains various incantations that will enable the wielder to enumerate their social contributions on:

  • StackOverflow
  • GitHub
  • Twitter
  • WordPress

through the use of a parameterized R Markdown document.
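
Mechanically, that just means the Rmd’s YAML header declares params you can override at render time; a sketch (the file and parameter names here are hypothetical, not necessarily the template’s own):

# render the year-in-review template against your own accounts
rmarkdown::render(
  "year-in-review.Rmd",
  params = list(
    twitter_user = "hrbrmstr",
    github_user = "hrbrmstr"
  )
)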

The result of one such execution can be found here (for those who want a glimpse of what I was publicly up to in 2017).

Want to see where you contributed the most on SO? There’s a vis for that:

What about your GitHub activity? There’s a vis for that, too:

Perhaps you just want to see your top blog posts for the year. There’s also a vis for that:

Or — maybe — you just want to see how much you blathered on Twitter. There’s even a vis for that:

Take the Rmd for a spin. File issues & PRs for things that need work and take some time to look back on 2017 with a more quantified eye than you may have in years past.

Here’s to 2018 being full of magic, awe, wonder and delight for us all!