Quick Hit: Automating Production Graphics Uploads in R Markdown Documents with googledrive

As someone who measures all kinds of things on the internet as part of his $DAYJOB, I can say with some authority that huge swaths of organizations are using cloud-services such as Google Apps, Dropbox and Office 365 as part of their business process workflows. For me, one regular component that touches the “cloud” is when I have to share R-generated charts with our spiffy production team for use in reports, presentations and other general communications.

These are typically project-based tasks and data science team members typically use git- and AWS-based workflows for gathering data, performing analyses and generating output. While git is great at sharing code and ensuring the historical integrity of our analyses, we don’t expect the production team members to be or become experts in git to use our output. They live in Google Drive and thanks to the googledrive? package we can bridge the gap between code and output with just a few lines of R code.

We use “R projects” to organize things and either use spinnable R scripts or R markdown documents inside those projects to gather, clean and analyze data.

For 2019, we’re using new, work-specific R markdown templates that have one new YAML header parameter:

params:
  gdrive_folder_url: "https://drive.google.com/drive/u/2/SOMEUSELESSHEXSTRING"

which just defines the Google Drive folder URL for the final output directory in the ☁️.

Next is a new pre-configured knitr chunk call at the start of these production chart-generating documents:

knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE, dev = c("png", "cairo_pdf"),
  echo = FALSE,
  fig.retina = 2,
  fig.width = 10,
  fig.height = 6,
  fig.path = "prod/charts/"
)

since the production team wants PDF so they can work with it in their tools. Our testing showed that cairo_pdf produces the best/most consistent output, but PNGs show up better in the composite HTML documents so we use that order deliberately.

The real change is the consistent naming of the fig.path directory. By doing this, all we have to do is add a few lines (again, automatically generated) to the bottom of the document to have all the output automagically go to the proper Google Drive folder:

# Upload to production ----------------------------------------------------

googledrive::drive_auth()

# locate the folder
gdrive_prod_folder <- googledrive::as_id(params$gdrive_folder_url)

# clean it out
gdrls <- googledrive::drive_ls(gdrive_prod_folder)
if (nrow(gdrls) > 0) {
  dplyr::pull(gdrls, id) %>%
    purrr::walk(~googledrive::drive_rm(googledrive::as_id(.x)))
}

# upload new
list.files(here::here("prod/charts"), recursive = TRUE, full.names = TRUE) %>%
  purrr::walk(googledrive::drive_upload, path = gdrive_prod_folder)

Now, we never have to remember to drag documents into a browser and don’t have to load invasive Google applications onto our systems to ensure the right folks have the right files at the right time. We just have to use the new R markdown document type to generate a starter analysis document with all the necessary boilerplate baked in. Plus, .httr-oauth file is automatically ignored in .gitignore so there’s no information leakage to shared git repositories.

FIN

If you want to experiment with this, you can find a pre-configured template in the markdowntemplates package over at sr.ht, GitLab, or GitHub.

If you install the package you’ll be able to select this output type right from the new document dialog:

and new template will be ready to go with no copying, cutting or pasting.

Plus, since the Google Drive folder URL is an R markdown parameter, you can also use this in script automation (provided that you’ve wired up oauth correctly for those scripts).

Cover image from Data-Driven Security
Amazon Author Page

6 Comments Quick Hit: Automating Production Graphics Uploads in R Markdown Documents with googledrive

  1. Pingback: Quick Hit: Automating Production Graphics Uploads in R Markdown Documents with googledrive – Data Science Austria

  2. balciserdar

    thank you for the post. I will definitely use it. the github link has two .com and goes to another website.

    Reply
  3. balciserdar

    I started using it. I am rendering multiple Rmd files inside this template and then upload pdf files to googledrive. Thank you for sharing :)

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.