Engaging the tidyverse Clean Slate Protocol

I caught the 0.7.0 release of dplyr on my home CRAN server early Friday morning and immediately set out to install it since I’m eager to finish up my sergeant package and get it on CRAN. “Tidyverse” upgrades aren’t trivial for me as I tinker quite a bit with the tidyverse and create packages that depend on various components. The sergeant package provides — amongst other things — a dplyr back-end for Apache Drill, so it has more tidyverse tendrils than other bits of code I maintain.

macOS binaries weren’t available yet (it generally takes 24-48 hrs for that) so I did an install.packages("dplyr", type="source") and was immediately hit with gcc 7 compilation errors. This seemed odd, but switching back to clang worked fine.

I, then, proceeded to run chunks in an Rmd I’m working on and hit “Encoding” errors on mutate() calls. Not having time to debug further I reverted to 0.5.0 of dplyr and went about my day and promised the tidyverse maintainers that I’d work on a reproducible example after work.

I made R data files from the data frames that were tossing errors and extracted & tweaked a code snippet that consistently generated the error and created a rocker container on one of my linux boxes to validate that this was an error and a cross-platform one. The rocker container used a full fresh-from-source copy of the tidyverse including dplyr 0.7.0. The code worked and no error was generated, so I immediately suspected package rot on my main dev macOS box.

Now, my situation is complicated by an insanely hasty migration to macOS 10.13β1 (I refuse to use the Apple macOS catchy names anymore since the most recent one is just silly) and a move to the gcc 7 toolchain (initially prompted to both get rJava working nicely and reproduce some CRAN noted errors with some packages). Further complications were also created by many invocations of install_github() of various packages regularly overwriting bits of the tidyverse over the past few weeks since the R 3.4.0 release. In other words, the integrity of the “tidyverse” was in serious question on my system and it was time for the Clean Slate Protocol.

Rather than itemize package versions and surgically nipping and tucking, I opted to use packrat to get to my desired end-state of a full-integrity tidyverse install. There are many ways to do this. Feel free to “one-up” me and show your l33t method in the comments. This one will likely be accessible to most — if not all — R users.

I started a new RStudio project in a new session and told it to use packrat. In the new project console, I did install.packages("tidyverse", type="source") and let it go for many minutes. I, then, navigated to the packrat subdirectory where the 3.4 package binaries are housed (just follow the project packrat tree down to the R version directory) and moved all 51 packages (yes, 51 O_o) to the main R library path (which you can figure out by running .libPaths() in any non-packrat-maintained project).

After doing that, I fired up the originally failing Rmd and everything worked fine. ?

I don’t do the Clean Slate Protocol too often (we all get to for new R dot-releases) but it came in handy this time. If you run into errors when trying to get the new dplyr working, you may benefit from the Clean Slate Protocol as well.

If you haven’t seen the changes in 0.6.0/0.7.0 you should check them out and give it a go.

Cover image from Data-Driven Security
Amazon Author Page

6 Comments Engaging the tidyverse Clean Slate Protocol

  1. Pingback: Engaging the tidyverse Clean Slate Protocol | A bunch of data

  2. Pingback: Engaging the tidyverse Clean Slate Protocol – Mubashir Qasim

  3. Pingback: Engaging the tidyverse Clean Slate Protocol – Cyber Security

    1. hrbrmstr

      hrm. i am not sure but i don’t think so. it’s kinda not ready for prime time but I’ve been assured by RStudio that it’s getting the full “tidyverse treatment” soon.

      Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.