Author Archives: hrbrmstr

Don't look at me…I do what he does — just slower. #rstats avuncular • ?Resistance Fighter • Cook • Christian • [Master] Chef des Données de Sécurité @ @rapid7

The latest round of the 2020 Democratic debates is over and the data from all the 2019 editions of the debates have been added to {ggchicklet}. The structure of the debates2019 built-in dataset has changed a bit:

library(ggchicklet)
library(hrbrthemes)
library(tidyverse)

debates2019
## # A tibble: 641 x 7
##    elapsed timestamp speaker   topic   debate_date debate_group night
##      <dbl> <time>    <chr>     <chr>   <date>             <dbl> <dbl>
##  1   1.04  21:03:05  Warren    Economy 2019-09-13             1     1
##  2   1.13  21:04:29  Klobuchar Economy 2019-09-13             1     1
##  3   1.13  21:06:02  O'Rourke  Economy 2019-09-13             1     1
##  4   0.226 21:07:20  O'Rourke  Economy 2019-09-13             1     1
##  5   1.06  21:07:54  Booker    Economy 2019-09-13             1     1
##  6   0.600 21:09:08  Booker    Economy 2019-09-13             1     1
##  7   0.99  21:09:50  Warren    Economy 2019-09-13             1     1
##  8   0.872 21:11:03  Castro    Economy 2019-09-13             1     1
##  9   1.07  21:12:00  Gabbard   Economy 2019-09-13             1     1
## 10   1.11  21:13:20  de Blasio Economy 2019-09-13             1     1
## # … with 631 more rows

There are now debate_date, debate_group and night columns to make it easier to segment out or group together the debate nights.
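As a rough illustration of what those columns make possible, here's a minimal base R sketch that totals speaking time per debate and night. The `toy` data frame is made up for the example (it only mimics the debates2019 column names), so the numbers mean nothing:

```r
# hypothetical stand-in that mimics the debates2019 column layout
toy <- data.frame(
  elapsed      = c(1.0, 0.5, 2.0, 1.5, 0.75),
  speaker      = c("A", "B", "A", "C", "B"),
  debate_group = c(1, 1, 2, 2, 2),
  night        = c(1, 2, 1, 1, 1)
)

# total minutes for each debate/night combination
per_night <- aggregate(elapsed ~ debate_group + night, data = toy, FUN = sum)
per_night
```

The same formula should work against the real dataset, assuming the column names shown in the tibble above.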

The topic names across the online JavaScript data for the June, July and September debates weren’t uniform so they’ve been cleaned up as well:

distinct(debates2019, topic) %>% 
  arrange(topic) %>% 
  print(n=nrow(.))
## # A tibble: 26 x 1
##    topic                  
##    <chr>                  
##  1 Abortion               
##  2 Age                    
##  3 Campaign Finance Reform
##  4 Civil Rights           
##  5 Climate                
##  6 Closing                
##  7 Economy                
##  8 Education              
##  9 Elections Reform       
## 10 Foreign Policy         
## 11 Gun Control            
## 12 Healthcare             
## 13 Immigration            
## 14 Lead                   
## 15 Opening                
## 16 Other                  
## 17 Party Strategy         
## 18 Politics               
## 19 Race                   
## 20 Resilience             
## 21 Socialism              
## 22 Statement              
## 23 Trade                  
## 24 Trump                  
## 25 Veterans               
## 26 Women's Rights 

This should make it easier to compare speaker times per-topic across the debates.
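For instance, with uniform topic names a speaker-by-debate table for a single topic is one xtabs() call. This is only a sketch on invented rows (the `toy` data frame is hypothetical), not output from the real data:

```r
# hypothetical rows using the same column names as debates2019
toy <- data.frame(
  elapsed      = c(1.0, 0.5, 2.0, 1.5),
  speaker      = c("A", "A", "B", "A"),
  topic        = c("Economy", "Economy", "Economy", "Climate"),
  debate_group = c(1, 2, 2, 2)
)

# minutes per speaker (rows) and debate (columns) for one topic
econ <- subset(toy, topic == "Economy")
by_debate <- xtabs(elapsed ~ speaker + debate_group, data = econ)
by_debate
```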

Here’s how to generate the chart in the featured image slot for the September debate:

debates2019 %>%
  filter(debate_group == 3) %>% 
  mutate(speaker = fct_reorder(speaker, elapsed, sum, .desc=FALSE)) %>%
  mutate(topic = fct_inorder(topic)) %>% 
  ggplot(aes(speaker, elapsed, group = timestamp, fill = topic)) +
  geom_chicklet(width = 0.75) +
  scale_y_continuous(
    expand = c(0, 0.0625),
    position = "right",
    breaks = seq(0, 18, 2),
    labels = c(0, sprintf("%d min.", seq(2, 18, 2))),
    limits = c(0, 18)
  ) +
  ggthemes::scale_fill_tableau("Tableau 20") +
  guides(
    fill = guide_legend(nrow = 2)
  ) +
  coord_flip() +
  labs(
    x = NULL, y = NULL, fill = NULL,
    title = "How Long Each Candidate Spoke",
    subtitle = "September 2019 Democratic Debates",
    caption = "Each bar segment represents the length of a candidate’s response to a question.\nOriginal <https://www.nytimes.com/interactive/2019/09/12/us/elections/debate-speaking-time.html>\n#rstats reproduction by @hrbrmstr"
  ) +
  theme_ipsum_rc(grid="X") +
  theme(axis.text.x = element_text(color = "gray60", size = 10)) +
  theme(legend.position = "top")

Now that the field has been thinned a bit (yes, others are still running, but really?) we can see who has blathered the most on stage so far:

debates2019 %>%
  filter(debate_group == 3) %>% 
  distinct(speaker) %>% 
  left_join(debates2019) %>% 
  count(speaker, wt=elapsed, sort=TRUE) %>% 
  mutate(speaker = fct_inorder(speaker) %>% fct_rev()) %>% 
  ggplot(aes(speaker, n)) +
  geom_col(fill = ft_cols$slate, width=0.55) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0.55), position = "right") +
  labs(
    x = NULL, y = "Speaking time (minutes)",
    title = "Total Speaking Time Across All 2019 Debates\nfor Those Left Standing in September"
  ) +
  theme_ipsum_es(grid="X")


And, here’s what they’ve all blathered about:

debates2019 %>%
  filter(debate_group == 3) %>% 
  distinct(speaker) %>% 
  left_join(debates2019) %>% 
  count(topic, wt=elapsed, sort=TRUE) %>% 
  mutate(topic = fct_inorder(topic) %>% fct_rev()) %>% 
  ggplot(aes(topic, n)) +
  geom_col(fill = ft_cols$slate, width=0.55) +
  coord_flip() +
  scale_y_continuous(expand = c(0, 0.25), position = "right") +
  labs(
    x = NULL, y = "Topic time (minutes)",
    title = "Total Topic Time Across All 2019 Debates\nfor Those Left Standing in September"
  ) +
  theme_ipsum_es(grid="X")

A minor update to RSwitch has been released. Apart from some internal code reorganization there are three user-facing changes.

First, RSwitch is now notarized! That means you won’t get a notice about it being from an “unidentified developer” nor will folks on Catalina see a warning about being unable to check the download for malware. You can use {mactheknife} to check out the application signature and notarization info:

check_sig("/Applications/RSwitch.app") %>% 
  print(n=nrow(.))
## # A tibble: 25 x 2
##    key                         value                                                               
##    <chr>                       <chr>                                                               
##  1 Executable                  /Applications/RSwitch.app/Contents/MacOS/RSwitch                    
##  2 Identifier                  is.rud.bob.RSwitch                                                  
##  3 Format                      app bundle with Mach-O thin (x86_64)                                
##  4 CodeDirectory v             20500 size=1342 flags=0x10000(runtime) hashes=33+5 location=embedded
##  5 VersionPlatform             1                                                                   
##  6 VersionMin                  658944                                                              
##  7 VersionSDK                  659200                                                              
##  8 Hash type                   sha256 size=32                                                      
##  9 CandidateCDHash sha256      efa512a9daabfb9402af8a91697f008b89ffa81e                            
## 10 CandidateCDHashFull sha256  efa512a9daabfb9402af8a91697f008b89ffa81ea014452821e39a5365d80fe6    
## 11 Hash choices                sha256                                                              
## 12 CMSDigest                   efa512a9daabfb9402af8a91697f008b89ffa81ea014452821e39a5365d80fe6    
## 13 CMSDigestType               2                                                                   
## 14 Page size                   4096                                                                
## 15 CDHash                      efa512a9daabfb9402af8a91697f008b89ffa81e                            
## 16 Signature size              8968                                                                
## 17 Authority                   Developer ID Application: Bob Rudis (CBY22P58G8)                    
## 18 Authority                   Developer ID Certification Authority                                
## 19 Authority                   Apple Root CA                                                       
## 20 Timestamp                   Sep 1, 2019 at 08:46:41                                             
## 21 Info.plist entries          26                                                                  
## 22 TeamIdentifier              CBY22P58G8                                                          
## 23 Runtime Version             10.15.0                                                             
## 24 Sealed Resources version    2 rules=13 files=26                                                 
## 25 Internal requirements count 1 size=212
check_notarization("/Applications/RSwitch.app")
## # A tibble: 4 x 2
##   key         value                                           
##   <chr>       <chr>                                           
## 1 application /Applications/RSwitch.app                       
## 2 status      accepted                                        
## 3 source      Notarized Developer ID                          
## 4 origin      Developer ID Application: Bob Rudis (CBY22P58G8)

Note: you may need to ensure RSwitch is granted “Full Disk Access” in System Preferences -> Security & Privacy -> [Privacy] tab so it can operate where it needs to (I’m working on installing it in a fresh Catalina VM to know definitively).


Next, since it’s possible to have an old set of just package libraries at a given /Library/Frameworks/R.framework/Versions/#.# path but no R binary in said locations, the script now does what the R Core RSwitch app (Simon was kind enough to forward that R-Forge SVN web link) does and performs some extra validation to see if a given R version directory is, indeed, switchable to. Directories that aren’t switchable are shown but grayed out (as in the image, below) and marked as “incomplete”. You probably should clean out those old paths.
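To make the idea concrete, here is a small R sketch of that kind of validation. It is not the check RSwitch or the R Core app actually performs (RSwitch itself is written in Swift); the `is_switchable()` helper and the assumption that a usable version directory contains `Resources/bin/R` are mine, and the example runs against a throwaway directory rather than /Library/Frameworks:

```r
# Hypothetical check: a version directory counts as "switchable" only if it
# contains an R executable. The Resources/bin/R location is an assumption
# about the framework layout, not something taken from the RSwitch source.
is_switchable <- function(version_dir) {
  file.exists(file.path(version_dir, "Resources", "bin", "R"))
}

# build a fake "Versions" directory: 3.6 is complete, 3.5 only has a library
versions <- tempfile("Versions")
dir.create(file.path(versions, "3.6", "Resources", "bin"), recursive = TRUE)
file.create(file.path(versions, "3.6", "Resources", "bin", "R"))
dir.create(file.path(versions, "3.5", "Resources", "library"), recursive = TRUE)

dirs <- list.dirs(versions, recursive = FALSE)
status <- vapply(dirs, is_switchable, logical(1))
names(status) <- basename(dirs)
status
```

Anything that comes back FALSE here would be the equivalent of an “incomplete” entry in the menu.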

Finally, in the same image keen observers will see a few more relevant links have been added to the “bookmarks”. I added them because I frequent them as I work on R-related things.

Folks who are running 1.4.0 should be able to use the “Check for update…” menu item to get to the releases page. You can also get it from the RSwitch landing page or download it directly via: https://rud.is/rswitch/releases/RSwitch-1.4.1.app.zip.

FIN

RSwitch feels feature complete so the pace of development and releases will likely slow a bit. Some spiffy folks have offered both a new app icon and a request to make it easier to switch between running RStudio/R GUI instances and I’m working on incorporating both of those ideas into the app. If you do have a problem, question, or feature request, definitely file an issue on your favorite social coding site. Links to where RSwitch source code can be found to file said issue(s) are at the bottom of the RSwitch landing page.

Swift 5 has been so much fun to hack on that there’s a new update to the macOS R-focused menubar utility RSwitch available. Along with the app comes a new dedicated RSwitch landing page and a new user’s guide since it has enough features to warrant such documentation. Here’s the new menu:

The core changes/additions include:

  • a reorganized menu (see above)
  • the use of notifications instead of alerts
  • disabling of download menu entries while download is in progress
  • the ability to start new R GUI or RStudio instances
  • the ability to switch to and make running R GUI or RStudio instances active
  • additional “bookmarks” in the reorganized web resources submenu
  • built-in check for updates

To make RSwitch launch at startup, just add it as a login item to your user in the “Users & Groups” pane of “System Preferences”.

The guide has information on how all the existing and new features work plus provides documentation on how to install the alternate R versions available at the R for macOS Developer’s Page. There’s also a slightly expanded set of information on how to contribute to RSwitch development.

FIN

As usual, kick the tyres, file feature requests or bug reports where you’re comfortable, & — if you’re macOS-dev-curious — join in the Swift 5 fun (it really is a pretty fun language).

It’s only been a couple days since the initial version of my revamped take on RSwitch but there have been numerous improvements since then worth mentioning.

For starters, there’s a new app icon that uses the blue and gray from the official (modern) R logo to help visually associate it with R:

In similar fashion, the menubar icon now looks better in dark mode (I may still tweak it a bit, tho).

There are also some new features in the menu bar from v1.0.0:

  • numbered shortcuts for each R version
  • handy menubar links to R resources on the internet
  • two menubar links to make it easier to download the latest RStudio dailies by hand (if you’re not using something like Homebrew already for that) and the latest R-devel macOS distribution tarball
  • saner/cleaner alerts

On tap for 1.4.0 is using Notification Center for user messaging vs icky alerts and, perhaps, some TouchBar icons for Mac folk with capable MacBook Pro models.

FIN

As usual, kick the tyres & file issues, questions, feature requests & PRs where you like.

I caught this post on The Surprising Number Of Programmers Who Can’t Program from the Hacker News RSS feed. Said post links to another, classic post on the same subject and you should read both before continuing.

Back? Great! Let’s dig in.

Why does hrbrmstr care about this?

Offspring #3 completed his Freshman year at UMaine Orono last year but wanted to stay academically active over the summer (he’s majoring in astrophysics and knows he’ll need some programming skills to excel in his field) and took an introductory C++ course from UMaine that was held virtually, with 1 lecture per week (14 weeks IIRC) and 1 assignment due per week with no other grading.

After seeing what passes for a standard (UMaine is not exactly on the top list of institutions to attend if one wants to be a computer scientist) intro C++ course, I’m not really surprised “Johnny can’t code”. Thirteen weeks in, the class finally started covering OO concepts, and the course is ending with a scant intro to polymorphism. Prior to this, most of the assignments were just variations on each other (read from stdin, loop with conditionals, print output) with no program going over 100 LoC (that includes comments and spacing). This wasn’t a “compsci for non-compsci majors” course, either. Anyone majoring in an area of study that requires programming could have taken this course to fulfill one of the requirements, and they’d be set on a path of forever using StackOverflow copypasta to try to get their future work done.

I’m fairly certain most of #3’s classmates could not program fizzbuzz without googling and even more certain most have no idea they weren’t really “coding in C++” most of the course.

If this is how most other middling colleges are teaching the basics of computer programming, it’s no wonder employers are having a difficult time finding qualified talent.

You have an “R” tag — actually, a few language tags — on this post, so where’s the code?

After the article triggered the lament in the previous section, a crazy, @coolbutuseless-esque thought came into my head: “I wonder how many different language FizzBuzz solutions can be created from within R?”.

The criterion for that notion is/was that there needed to be some Rcpp::cppFunction(), reticulate::py_run_string(), V8 context eval()-type way to have the code in-R but then run through those far-superior-to-any-other-language’s polyglot extensibility constructs.

Before getting lost in the weeds, there were some other thoughts on language inclusion:

  • Should Java be included? I :heart: {rJava}, but cat()-ing Java code out and running system() to compile it first seemed like cheating (even though that’s kinda just what cppFunction() does). Toss a note into a comment if you think a Java example should be added (or add said Java example in a comment or link to it in one!).
  • I think Julia should be in this example list but do not care enough about it to load {JuliaCall} and craft an example (again, link or post one if you can crank it out quickly).
  • I think Lua could be in this example given the existence of {luar}. If you agree, give it a go!
  • Go & Rust compiled code can also be called in R (thanks to Romain & Jeroen) once they’re turned into C-compatible libraries. Should this polyglot example show this as well?
  • What other languages am I missing?

The aforementioned “weeds”

One criterion for each language’s fizzbuzz example is that it needs to be readable, not hacky-cool. That doesn’t mean the solutions can’t still be a bit creative. We’ll lightly go through each one I managed to code up. First we’ll need some helpers:

suppressPackageStartupMessages({
  library(purrr)
  library(dplyr)
  library(reticulate)
  library(V8)
  library(Rcpp)
})

The R, JavaScript, and Python implementations are all in the microbenchmark() call way down below. Up here are C and C++ versions. The C implementation is boring and straightforward, but we’re using Rprintf() so we can capture the output vs have any output buffering woes impact the timings.

cppFunction('
void cbuzz() {

  // super fast plain C

  for (int i=1; i<=100; i++) { // plain int so it matches the %d format below
    if      (i % 15 == 0) Rprintf("FizzBuzz\\n");
    else if (i %  3 == 0) Rprintf("Fizz\\n");
    else if (i %  5 == 0) Rprintf("Buzz\\n");
    else Rprintf("%d\\n", i);
  }

}
')

The cbuzz() example is just fine even in C++ land, but we can take advantage of some C++11 vectorization features to stay formally in C++-land and play with some fun features like lambdas. This will be a bit slower than the C version plus consume more memory, but shows off some features some folks might not be familiar with:

cppFunction('
void cppbuzz() {

  std::vector<int> numbers(100); // will eventually be 1:100
  std::iota(numbers.begin(), numbers.end(), 1); // kinda sorta equivalent of our R 1:100 but not exactly true

  std::vector<std::string> fb(100); // fizzbuzz strings holder

  // transform said 1..100 into fizzbuzz strings
  std::transform(
    numbers.begin(), numbers.end(), 
    fb.begin(),
    [](int i) -> std::string { // lambda expressions are cool like a fez
        if      (i % 15 == 0) return("FizzBuzz");
        else if (i %  3 == 0) return("Fizz");
        else if (i %  5 == 0) return("Buzz");
        else return(std::to_string(i));
    }
  );

  // round it out with use of for_each and another lambda
  // this turns out to be slightly faster than range-based for-loop
  // collection iteration syntax.
  std::for_each(
    fb.begin(), fb.end(), 
    [](std::string s) { Rcout << s << std::endl; }
  );

}
', 
plugins = c('cpp11'))

Both of those functions are now available to R.

Next, we need to prepare to run JavaScript and Python code, so we’ll initialize both of those environments:

ctx <- v8()

py_config() # not 100% necessary but I keep my needed {reticulate} options in env vars for reproducibility

Then, we tell R to capture all the output. Using sink() is a bit better than capture.output() in this use-case since it avoids nesting calls, and we need to handle Python stdout the same way py_capture_output() does to be fair in our measurements:

output_tools <- import("rpytools.output")
restore_stdout <- output_tools$start_stdout_capture()

cap <- rawConnection(raw(0), "r+")
sink(cap)

There are a few implementations below across the tidy and base R multiverse. Some use vectorization; some do not. This will let us compare overall “speed” of solution. If you have another suggestion for a readable solution in R, drop a note in the comments:

microbenchmark::microbenchmark(

  # tidy_vectors_case() is slowest but you get all sorts of type safety 
  # for free along with very readable idioms.

  tidy_vectors_case = map_chr(1:100, ~{ 
    case_when(
      (.x %% 15 == 0) ~ "FizzBuzz",
      (.x %%  3 == 0) ~ "Fizz",
      (.x %%  5 == 0) ~ "Buzz",
      TRUE ~ as.character(.x)
    )
  }) %>% 
    cat(sep="\n"),

  # tidy_vectors_if() has old-school if/else syntax but still
  # forces us to ensure type safety which is cool.

  tidy_vectors_if = map_chr(1:100, ~{ 
    if (.x %% 15 == 0) return("FizzBuzz")
    if (.x %%  3 == 0) return("Fizz")
    if (.x %%  5 == 0) return("Buzz")
    return(as.character(.x))
  }) %>% 
    cat(sep="\n"),

  # walk() just replaces `for` but stays in vector-land which is cool

  tidy_walk = walk(1:100, ~{
    if      (.x %% 15 == 0) cat("FizzBuzz\n")
    else if (.x %%  3 == 0) cat("Fizz\n")
    else if (.x %%  5 == 0) cat("Buzz\n")
    else cat(.x, "\n", sep="")
  }),

  # vapply() gets us some similar type assurance, albeit with arcane syntax

  base_proper = vapply(1:100, function(.x) {
    if (.x %% 15 == 0) return("FizzBuzz")
    if (.x %%  3 == 0) return("Fizz")
    if (.x %%  5 == 0) return("Buzz")
    return(as.character(.x))
  }, character(1), USE.NAMES = FALSE) %>% 
    cat(sep="\n"),

  # sapply() is def lazy but this can outperform vapply() in some
  # circumstances (like this one) and is a bit less arcane.

  base_lazy = sapply(1:100, function(.x) {
    if (.x %% 15 == 0)  return("FizzBuzz")
    if (.x %%  3 == 0) return("Fizz")
    if (.x %%  5 == 0) return("Buzz")
    return(.x)
  }, USE.NAMES = FALSE) %>% 
    cat(sep="\n"),

  # for loops...ugh. might as well just use C

  base_for = for(.x in 1:100) {
    if      (.x %% 15 == 0) cat("FizzBuzz\n")
    else if (.x %%  3 == 0) cat("Fizz\n")
    else if (.x %%  5 == 0) cat("Buzz\n")
    else cat(.x, "\n", sep="")
  },

  # ok, we'll just use C!

  c_buzz = cbuzz(),

  # we can go back to vector-land in C++

  cpp_buzz = cppbuzz(),

  # some <3 for javascript

  js_readable = ctx$eval('
for (var i=1; i <101; i++){
  if      (i % 15 == 0) console.log("FizzBuzz")
  else if (i %  3 == 0) console.log("Fizz")
  else if (i %  5 == 0) console.log("Buzz")
  else console.log(i)
}
'),

  # icky readable, non-vectorized python

  python = reticulate::py_run_string('
for x in range(1, 101):
  if (x % 15 == 0):
    print("FizzBuzz")
  elif (x % 5 == 0):
    print("Buzz")
  elif (x % 3 == 0):
    print("Fizz")
  else:
    print(x)
')

) -> res

Turn off output capturing:

sink()
if (!is.null(restore_stdout)) invisible(output_tools$end_stdout_capture(restore_stdout))
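Nothing in this post reads the captured text back, but if you want to peek at what landed in the connection, the same sink()/rawConnection() pattern can be sketched on its own. The two cat() calls are toy stand-ins for the benchmarked expressions:

```r
# divert cat() output into an in-memory raw connection
cap <- rawConnection(raw(0), "r+")
sink(cap)
cat("Fizz\n")
cat("Buzz\n")
sink() # stop diverting output

# pull the bytes back out before closing the connection
captured <- rawToChar(rawConnectionValue(cap))
close(cap)

strsplit(captured, "\n")[[1]]
```

Closing the connection once you're done with it avoids leaving an open connection lying around in the session.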

We used microbenchmark(), so here are the results:

res
## Unit: microseconds
##               expr       min         lq        mean     median         uq       max neval   cld
##  tidy_vectors_case 20290.749 21266.3680 22717.80292 22231.5960 23044.5690 33005.960   100     e
##    tidy_vectors_if   457.426   493.6270   540.68182   518.8785   577.1195   797.869   100  b   
##          tidy_walk   970.455  1026.2725  1150.77797  1065.4805  1109.9705  8392.916   100   c  
##        base_proper   357.385   375.3910   554.13973   406.8050   450.7490 13907.581   100  b   
##          base_lazy   365.553   395.5790   422.93719   418.1790   444.8225   587.718   100 ab   
##           base_for   521.674   545.9155   576.79214   559.0185   584.5250   968.814   100  b   
##             c_buzz    13.538    16.3335    18.18795    17.6010    19.4340    33.134   100 a    
##           cpp_buzz    39.405    45.1505    63.29352    49.1280    52.9605  1265.359   100 a    
##        js_readable   107.015   123.7015   162.32442   174.7860   187.1215   270.012   100 ab   
##             python  1581.661  1743.4490  2072.04777  1884.1585  1985.8100 12092.325   100    d 

Said results are 🤷🏻‍♀️ since this is a toy example, but I wanted to show that Jeroen’s {V8} can be super fast, especially when there’s no value marshaling to be done and that some things you may have thought should be faster, aren’t.
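One readable option that is not in the benchmark above is a fully vectorized base R version that fills in a character vector by logical indexing. It wasn't timed here, so this is just a sketch of the idea with no claim about where it would land in the table; `fizzbuzz()` is a name made up for the example:

```r
fizzbuzz <- function(n = 100) {
  x <- seq_len(n)
  out <- as.character(x)           # start with the plain numbers
  out[x %% 3 == 0]  <- "Fizz"      # multiples of 3
  out[x %% 5 == 0]  <- "Buzz"      # multiples of 5
  out[x %% 15 == 0] <- "FizzBuzz"  # multiples of both overwrite the above
  out
}

fb <- fizzbuzz()
cat(fb, sep = "\n")
```

The assignment order matters: the 15 case goes last so it overwrites the "Fizz" and "Buzz" entries set by the two earlier lines.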

FIN

Definitely add links or code for changes or additions (especially the aforementioned other languages). Hopefully my lament about the computer science program at UMaine is not universally true for all the programming courses there.

At the bottom of the R for macOS Developer’s Page there’s mention of an “other binary” called “RSwitch” that is “a small GUI that allows you to switch between R versions quickly (if you have multiple versions of R framework installed).” Said switching requires you to use the “tar.gz” versions of R from the R for macOS Developer’s Page since the official CRAN binary installers clean up after themselves quite nicely to prevent potentially wacky behavior.

All the RSwitch GUI did was change the Current alias target in /Library/Frameworks/R.framework/Versions to the appropriate version. You can do that from the command line but the switcher GUI was created so that means some folks prefer click-switching and I have found myself using the GUI on occasion (before it stopped working on macOS Vista^wCatalina).
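For the curious, the swap can be sketched from R itself, assuming the Current alias is an ordinary symbolic link. The `switch_current()` helper is hypothetical and the example works in a temporary directory so nothing under /Library is touched:

```r
# throwaway stand-in for /Library/Frameworks/R.framework/Versions
versions <- tempfile("Versions")
dir.create(file.path(versions, "3.5"), recursive = TRUE)
dir.create(file.path(versions, "3.6"))

# hypothetical helper: repoint the "Current" symlink at a version directory
switch_current <- function(versions_dir, version) {
  link <- file.path(versions_dir, "Current")
  existing <- Sys.readlink(link) # "" or NA when there is no symlink yet
  if (!is.na(existing) && nzchar(existing)) file.remove(link)
  file.symlink(version, link)    # relative target, e.g. Current -> 3.6
}

switch_current(versions, "3.5")
switch_current(versions, "3.6")
Sys.readlink(file.path(versions, "Current"))
```

Doing this against the real framework path would typically need appropriate permissions, which is part of why a small GUI is handy.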

Since I:

  • work on Catalina most of the day
  • play with oldrel and devel versions of R
  • needed to brush up on Swift 5 coding
  • wanted RSwitch as a menubar app vs one with a dialog that I could easily lose across 15 desktops
  • decided to see if it was possible to make it work sandboxed (TLDR: it isn’t)
  • really wanted a different icon for the binary
  • couldn’t sleep last night

there was sufficient justification to create a 64-bit version of this app.

You can clone the project from any of the following social coding sites:

and, you can either compile it yourself — which is recommended since it’s 2019 and the days of even remotely trusting binaries off the internet are long gone — or build it. It should work on 10.14+ since I set that as the target, but file an issue where you like if you have, well, issues with the code or binary.

Once you do have it working, there will be a dial-switch menu in the menubar and a menu that should look something like:

The item with the checkbox is the Current alias.

FIN

Kick the tyres, file issues & PRs as you’re wont to do and prepare for the forthcoming clickpocalypse as Apple nears their GA release of Catalina.

It’s been yet-another weirdly busy summer but I’m finally catching up on noting some recent-ish developments in the blog.

First up is a full rewrite of the {wand} package which still uses magic but is 100% R code (vs a mix of compiled C and R code) and now works on Windows. A newer version will be up on CRAN in a bit that has additional MIME mappings and enables specifying a custom extension mapping data frame. You’ve seen this beast on the blog before, though, by another name.

wand::get_content_type("/etc/syslog.conf")
## [1] "text/plain"
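If you haven't run into "magic" detection before, the general idea is to compare a file's leading bytes against known signatures. The sketch below is a toy version of that idea in base R and is not how {wand} is implemented; the three-entry `signatures` table and the `guess_type()` helper are invented for the example:

```r
# hypothetical mini signature table: leading bytes -> MIME type
signatures <- list(
  "image/png"       = as.raw(c(0x89, 0x50, 0x4e, 0x47)),
  "application/pdf" = charToRaw("%PDF"),
  "application/zip" = as.raw(c(0x50, 0x4b, 0x03, 0x04))
)

guess_type <- function(path) {
  header <- readBin(path, what = "raw", n = 4) # first four bytes
  for (type in names(signatures)) {
    if (identical(header, signatures[[type]])) return(type)
  }
  NA_character_ # nothing matched
}

# write a file that starts with the PDF signature and test it
tmp <- tempfile(fileext = ".bin")
writeBin(charToRaw("%PDF-1.4\n"), tmp)
guess_type(tmp)
```

Note the file extension plays no part here, which is the point of looking at content rather than names.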

Next is the {ulid} package (which I’ve also previously discussed, here) which is also now on CRAN to meet all your Universally Unique Lexicographically Sortable Identifiers-generation needs.

ulid::ulid_generate()
## [1] "0001EKRGTCRSVA4ACSCQJA61A0"

ulid::unmarshal("0001EKRGTCRSVA4ACSCQJA61A0")
##                    ts              rnd
## 1 2019-07-27 08:27:56 RSVA4ACSCQJA61A0
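The "lexicographically sortable" part comes from the first ten characters, which hold the timestamp in Crockford's base32. A hand-rolled decode in base R shows the idea; `decode_ulid_ts()` is a made-up helper, and it treats the decoded number as seconds since the epoch because that is what lines up with the ts column above (the result is shown in UTC, so the hour differs from the local-time output):

```r
# Crockford base32 alphabet (no I, L, O or U)
crockford <- strsplit("0123456789ABCDEFGHJKMNPQRSTVWXYZ", "")[[1]]

# hypothetical helper: decode the 10-character timestamp prefix of a ULID
decode_ulid_ts <- function(ulid) {
  chars  <- strsplit(substr(ulid, 1, 10), "")[[1]]
  digits <- match(chars, crockford) - 1 # 0-based digit values
  sum(digits * 32^(9:0))                # base-32 positional value
}

secs <- decode_ulid_ts("0001EKRGTCRSVA4ACSCQJA61A0")
format(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"), "%Y-%m-%d %H:%M:%S")
## [1] "2019-07-27 12:27:56"
```

Because the timestamp leads the string, sorting ULIDs as plain text also sorts them by creation time.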

The {testthat} gravity well has caught over 4,000 CRAN packages but it’s not the only testing game in town. The {tinytest} package takes a slightly more minimalist approach and has the added benefit that the tests come along for the ride with the package, which makes it easier to solicit said test results from package users having problems with your code.

I still use {testthat} in most of my packages but gave {tinytest} a spin for a few of my more recent ones and it’s pretty nifty. The biggest thing I missed when switching to it was Cmd-Shift-T support for it in RStudio. Since I kinda still want Cmd-Shift-T for all the packages I have that use {testthat} I whipped up an RStudio addin that adds an Addin context menu item (below) and placed it in {hrbraddins}.

If you load up that package you can then bind something like Cmd-Option-Shift-T to that function and have equally quick keystroke access to package tests during development.

> hrbraddins:::run_tiny_test() # from within the {wand} pkg
Running test_wand.R...................   52 tests OK
All ok (52 results)

Finally, I got bit in a recent CRAN submission because I have remote CRAN checks turned off (soooo sloooowww) but had a 404’ing URL in the documentation of one of the methods in the package. Since I have a few more submissions coming up in the next 6-8 weeks I decided to whip up an RStudio addin for an on-the-fly package URL checker that wraps the exact same checks CRAN does for these submissions. (I keybound this to Cmd-Option-Shift-U and you can catch a glimpse of it in the addins menu in the above screenshot).

The output of one run is below. I deliberately modified two working URLs to show what you get output-wise when everything isn’t perfect:

> hrbraddins:::check_package_urls()
Gathering URLs for {wand} (this may take a bit)
- Looking in HTML files...
- Looking in metadata files...
- Looking in news files...
- Looking in Rd files...
- Looking in README files...
- Looking in source files...
- Looking in PDF files...

Checking found URLs (this may also take a bit)
# A tibble: 13 x 4
   url                                                        parent      status is_https
   <chr>                                                      <chr>        <dbl> <lgl>   
 1 https://gitlab.com/hrbrmstr/wand/issuessss                 DESCRIPTION    599 TRUE    
 2 http://gitlab.com/hrbrmstr/wand                            DESCRIPTION    200 FALSE   
 3 https://github.com/jshttp/mime-db                          man/wand.Rd    200 TRUE    
 4 https://github.com/threatstack/libmagic/tree/master/magic/ man/wand.Rd    200 TRUE    
 5 https://ci.appveyor.com/project/hrbrmstr/wand              README.md      200 TRUE    
 6 https://codecov.io/gh/hrbrmstr/wand                        README.md      200 TRUE    
 7 https://cranchecks.info/pkgs/wand                          README.md      200 TRUE    
 8 https://github.com/r-lib/remotes                           README.md      200 TRUE    
 9 https://github.com/threatstack/libmagic/tree/master/magic/ README.md      200 TRUE    
10 https://keybase.io/hrbrmstr                                README.md      200 TRUE    
11 https://travis-ci.org/hrbrmstr/wand                        README.md      200 TRUE    
12 https://www.r-pkg.org/pkg/wand                             README.md      200 TRUE    
13 https://www.repostatus.org/#active                         README.md      200 TRUE  
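The "gathering" half of that is easy to approximate without touching the network. This is a deliberately simple base R sketch and not the CRAN check the addin wraps; the `txt` lines are invented and the regex will miss plenty of edge cases:

```r
# hypothetical package text; the real addin walks the package's files
txt <- c(
  "URL: http://gitlab.com/hrbrmstr/wand",
  "BugReports: https://gitlab.com/hrbrmstr/wand/issues",
  "No links on this line."
)

# pull out anything that looks like an http(s) URL
urls <- unlist(regmatches(txt, gregexpr("https?://[^[:space:]<>\"]+", txt)))

# flag which ones are already https, like the is_https column above
found <- data.frame(
  url      = urls,
  is_https = startsWith(urls, "https://"),
  stringsAsFactors = FALSE
)
found
```

Checking the HTTP status of each URL is the slow part and is left out here on purpose.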

The {hrbraddins} package itself is just a playground and will never see CRAN, so do not hesitate to yank anything from it and put it in a safer and/or more accessible location for your own work.

FIN

For the packages, file issues and PRs the same way you always would. Same goes for the addins.

Despite being a full-on denizen of all things digital I receive a fair number of dead-tree print magazines as there’s nothing quite like seeing an amazing, large, full-color print data-driven visualization up close and personal. I also like supporting data journalism through the subscriptions since without cash we will only have insane, extreme left/right-wing perspectives out there.

One of these publications is The Economist (I’d subscribe to the Financial Times as well for the non-liberal perspective but I don’t need another mortgage payment right now). The graphics folks at The Economist are Top Notch™ and a great source of inspiration to “do better” when cranking out visuals.

After reading a recent issue, one of their visualization styles stuck in my head. Specifically, the one from this story on the costs of a mammogram. I’ve put it below:

Essentially it’s a boxplot with outliers removed along with significantly different aesthetics than the one we’re all used to seeing. I would not use this for exploratory data analysis or working with other data science team members when poking at a problem but I really like the idea of making “distributions” easier to consume for general audiences and believe The Economist graphics folks have done a superb job focusing on the fundamentals (both statistical and aesthetic).

There are ways to hack something like those out manually in {ggplot2} but it would be nice to just be able to swap out something for geom_boxplot() when deciding to go to “production” with a distribution chart. Thus begat {ggeconodist}.

Since this is just a “quick hit” post we’ll avoid some interim blathering to note that we can use {ggplot2} (and a touch of {grid}) to make the following:

with just a tiny bit of R code:

library(ggeconodist)

ggplot(mammogram_costs, aes(x = city)) +
  geom_econodist(
    aes(ymin = tenth, median = median, ymax = ninetieth), stat = "identity"
  ) +
  scale_y_continuous(expand = c(0,0), position = "right", limits = range(0, 800)) +
  coord_flip() +
  labs(
    x = NULL, y = NULL,
    title = "Mammoscams",
    subtitle = "United States, prices for a mammogram*\nBy metro area, 2016, $",
    caption = "*For three large insurance companies\nSource: Health Care Cost Institute"
  ) +
  theme_econodist() -> gg

grid.newpage()
left_align(gg, c("subtitle", "title", "caption")) %>% 
  add_econodist_legend(econodist_legend_grob(), below = "subtitle") %>% 
  grid.draw()
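The chart above feeds precomputed tenth, median and ninetieth columns to the geom via stat = "identity". If you are starting from raw values instead, those three summaries are just quantiles. Here's a base R sketch on invented prices (the `prices` data frame is hypothetical and has nothing to do with the real mammogram data):

```r
# made-up raw prices for two made-up cities
prices <- data.frame(
  city = rep(c("A", "B"), each = 11),
  cost = c(seq(100, 200, by = 10), seq(300, 800, by = 50))
)

# 10th, 50th and 90th percentiles per city (one row per city after t())
qs <- t(sapply(
  split(prices$cost, prices$city),
  quantile, probs = c(0.1, 0.5, 0.9), names = FALSE
))

summary_df <- data.frame(
  city      = rownames(qs),
  tenth     = qs[, 1],
  median    = qs[, 2],
  ninetieth = qs[, 3],
  row.names = NULL
)
summary_df
```

A data frame shaped like `summary_df` has the same column names the aes() call above expects.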

FIN

A future post (and, even — perhaps — a new screen sharing video) will describe what went into making this new geom, stat, and theme (along with some info on how I managed to reproduce data for the vis since none was provided).

In the interim, hit up the CINC page on {ggeconodist} to learn more about the package. You may want to take a quick look at {hrbrthemes} since it might have some helpers for using all the required theming components.

So kick the tyres, file issues/PRs, and be on the lookout for the director’s cut of the making of {ggeconodist}.