

{"id":9496,"date":"2018-04-08T07:41:14","date_gmt":"2018-04-08T12:41:14","guid":{"rendered":"https:\/\/rud.is\/b\/?p=9496"},"modified":"2018-04-08T07:41:14","modified_gmt":"2018-04-08T12:41:14","slug":"dissecting-r-package-utility-belts","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/","title":{"rendered":"Dissecting R Package &#8220;Utility Belts&#8221;"},"content":{"rendered":"<p>Many R package authors (including myself) lump a collection of small, useful functions into some type of <code>utils.R<\/code> file and usually do not export the functions since they are (generally) designed to work on package internals rather than expose their functionality via the exported package API. Just like Batman&#8217;s utility belt, which can be customized for any mission, any set of utilities in a given R package will also likely be different from those in other packages.<\/p>\n<p>I thought it would be neat to take a look at:<\/p>\n<ul>\n<li>just how many packages have one or more <code>util*.R<\/code> files and what the most common file names are for them;<\/li>\n<li>utility function naming preferences &#8212; specifically snake-case, camel-case or dot-case;<\/li>\n<li>what the most common &#8220;utility&#8221; function names are across the packages;<\/li>\n<li>coding style &#8212; specifically comparing the ratios of white space and full-line comments to code size<\/li>\n<\/ul>\n<p>for all the published packages on CRAN.<\/p>\n<p>There are <em>many<\/em> more questions one can ask and then use this corpus to answer, so we&#8217;ll close out the post with a link to it so any intrepid readers can do just that, especially since reproducing the first bit of this post would require a local CRAN mirror (which most folks &#8212; rightly so &#8212; do not have handy).<\/p>\n<h3>Acquiring and Transforming the Data We Need<\/h3>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9499\" 
data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/cran\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/cran.png?fit=730%2C578&amp;ssl=1\" data-orig-size=\"730,578\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"cran\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/cran.png?fit=510%2C404&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/cran.png?resize=243%2C192&#038;ssl=1\" alt=\"\" width=\"243\" height=\"192\" class=\"alignright size-full wp-image-9499\" \/>Since I have a local CRAN mirror, it&#8217;s just a matter of iterating through all the package <code>tar.gz<\/code> files in <code>src\/contrib<\/code> and <code>grep<\/code>ping through a <code>tar<\/code> listing of each for a pattern like <code>\"R\/util.*$\"<\/code>. That pattern isn&#8217;t perfect but it&#8217;s quick and we&#8217;ll be able to filter out any files it catches that don&#8217;t belong. I chose to use a small <code>bash<\/code> script for this but it&#8217;s possible to do this with R as well (an exercise left to the reader). 
The resultant data file looks a bit like the output from an <code>ls -l<\/code> (linux-ish) directory listing:<\/p>\n<pre><code>-rw-r--r--  0 hornik users    1658 Jun  5  2016 AHR\/R\/util.R\n-rw-r--r--  0 ligges users   12609 Dec 13  2016 ALA4R\/R\/utilities_internal.R\n-rw-r--r--  0 hornik users       0 Feb 24  2017 AWR.Kinesis\/R\/utils.R\n-rw-r--r--  0 ligges users    4127 Aug 30  2017 AlphaVantageClient\/R\/utils.R\n-rw-r--r--  0 ligges users     121 Jan 19  2017 AmyloGram\/R\/utils.R\n-rw-r--r--  0 ligges users    2873 Jan 17 23:04 DT\/R\/utils.R\n-rw-r--r--  0 ligges users    3055 Jan 17  2017 cleanr\/inst\/source\/R\/utils.R\ndrwxr-xr-x  0 ligges users       0 Sep 24  2017 JGR\/java\/org\/rosuda\/JGR\/util\/\n<\/code><\/pre>\n<p>I made sure to show a few examples of where a better search pattern would have helped ensure lines like the three at the bottom of that listing aren&#8217;t included. But, we all often have to deal with imperfect data, so we&#8217;ll make sure to deal with that during the ingestion &amp; cleanup process.<\/p>\n<pre id=\"utilitybelt01\"><code class=\"language-r\">library(stringi)\r\nlibrary(hrbrthemes)\r\nlibrary(archive) # devtools::install_github(&quot;jimhester\/archive&quot;)\r\nlibrary(tidyverse)\r\n\r\n# I ran readr::type_convert() once and it returns this column type spec. 
By using it \r\n# for subsequent conversions, we&#039;ll gain reproducibility and data format change \r\n# detection capabilities &quot;for free&quot;\r\n\r\ncols(\r\n  permsissions = col_character(),\r\n  links = col_integer(),\r\n  owner = col_character(),\r\n  group = col_character(),\r\n  size = col_integer(),\r\n  month = col_character(),\r\n  day = col_integer(),\r\n  year_hr = col_character(),\r\n  path = col_character()\r\n) -&gt; tar_cols\r\n\r\n# Now, we parse the tar verbose (&#039;ls -l&#039;) listing\r\n\r\nstri_read_lines(&quot;~\/Data\/pkutils.txt&quot;) %&gt;% # stringi was loaded so might as well use it\r\n  stri_split_regex(&quot; +&quot;, 9, simplify = TRUE) %&gt;% # split input into 9 columns\r\n  as_data_frame() %&gt;% # ^^ returns a matrix but data frames are more useful for our work\r\n  set_names(names(tar_cols$cols)) %&gt;% # column names are useful and we can use our colspec for it\r\n  type_convert(col_types = tar_cols) %&gt;% # see comment block before cols()\r\n  mutate(day = sprintf(&quot;%02d&quot;, day)) %&gt;% # now we&#039;ll work on getting the date pieces to be a Date\r\n  mutate(year_hr = case_when( # the year_hr field can be either %Y or %H:%M depending on file &#039;recency&#039;\r\n    stri_detect_fixed(year_hr, &quot;:&quot;) &amp;\r\n      (month %in% c(&quot;Jan&quot;, &quot;Feb&quot;, &quot;Mar&quot;, &quot;Apr&quot;)) ~ &quot;2018&quot;, # if %H:%M but &#039;starter&#039; months it&#039;s 2018\r\n    stri_detect_fixed(year_hr, &quot;:&quot;) &amp;\r\n      (month %in% c(&quot;Dec&quot;, &quot;Nov&quot;, &quot;Oct&quot;, &quot;Sep&quot;, &quot;Aug&quot;, &quot;Jul&quot;, &quot;Jun&quot;)) ~ &quot;2017&quot;, # %H:%M &amp; &#039;end&#039; months\r\n    TRUE ~ year_hr # already in %Y format\r\n  )) %&gt;%\r\n  mutate(date= lubridate::mdy(sprintf(&quot;%s %s, %s&quot;, month, day, year_hr))) %&gt;% # get a Date\r\n  mutate(pkg = stri_match_first_regex(path, &quot;^(.*)\/R\/&quot;)[,2]) %&gt;% # extract package name 
(stri_extract is also usable here)\r\n  mutate(fil = basename(path)) %&gt;% # extract just the file name\r\n  filter(!is.na(pkg)) %&gt;% # handle one type of wrongly included file\r\n  filter(!stri_detect_fixed(pkg, &quot;\/&quot;)) %&gt;% # and another\r\n  filter(!is.na(path)) -&gt; xdf # and another; but we&#039;re done so we close with an assignment\r\n\r\nglimpse(xdf)\r\n## Observations: 1,746\r\n## Variables: 12\r\n## $ permsissions &lt;chr&gt; &quot;-rw-r--r--&quot;, &quot;-rw-r--r--&quot;, &quot;-rw-r--r--&quot;, &quot;-rw-r-...\r\n## $ links        &lt;int&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...\r\n## $ owner        &lt;chr&gt; &quot;hornik&quot;, &quot;ligges&quot;, &quot;hornik&quot;, &quot;ligges&quot;, &quot;ligges&quot;,...\r\n## $ group        &lt;chr&gt; &quot;users&quot;, &quot;users&quot;, &quot;users&quot;, &quot;users&quot;, &quot;users&quot;, &quot;her...\r\n## $ size         &lt;int&gt; 1658, 12609, 0, 4127, 121, 52, 36977, 34198, 3676...\r\n## $ month        &lt;chr&gt; &quot;Jun&quot;, &quot;Dec&quot;, &quot;Feb&quot;, &quot;Aug&quot;, &quot;Jan&quot;, &quot;Aug&quot;, &quot;Jan&quot;, ...\r\n## $ day          &lt;chr&gt; &quot;05&quot;, &quot;13&quot;, &quot;24&quot;, &quot;30&quot;, &quot;19&quot;, &quot;10&quot;, &quot;06&quot;, &quot;10&quot;, &quot;...\r\n## $ year_hr      &lt;chr&gt; &quot;2016&quot;, &quot;2016&quot;, &quot;2017&quot;, &quot;2017&quot;, &quot;2017&quot;, &quot;2017&quot;, &quot;...\r\n## $ path         &lt;chr&gt; &quot;AHR\/R\/util.R&quot;, &quot;ALA4R\/R\/utilities_internal.R&quot;, &quot;...\r\n## $ date         &lt;date&gt; 2016-06-05, 2016-12-13, 2017-02-24, 2017-08-30, ...\r\n## $ pkg          &lt;chr&gt; &quot;AHR&quot;, &quot;ALA4R&quot;, &quot;AWR.Kinesis&quot;, &quot;AlphaVantageClien...\r\n## $ fil          &lt;chr&gt; &quot;util.R&quot;, &quot;utilities_internal.R&quot;, &quot;utils.R&quot;, &quot;uti...<\/code><\/pre>\n<p>To the analysis!<\/p>\n<h3>Finding the Utility of 
&#8216;util&#8217;s<\/h3>\n<p>A careful look at the <code>glimpse()<\/code> listing shows we have 1,746 files that begin with <code>util<\/code>, but how many <em>packages<\/em> have at least one <code>util<\/code> file?<\/p>\n<pre id=\"utilitybelt02\"><code class=\"language-r\">nrow(distinct(xdf, pkg))\r\n## [1] 1397<\/code><\/pre>\n<p>That&#8217;s roughly 10% of CRAN, but doesn&#8217;t mean other packages do not have &#8220;utility belt&#8221; functions since other authors may have just been more creative or deliberate with their file naming conventions.<\/p>\n<p>Readers with keen eyes may have noticed we spent some deliberate CPU cycles to get a <code>Date<\/code> column. Part of that was to show how to do that (mostly as an example for folks new to R) but we also did it to ask temporal questions, such as &#8220;Are package &#8216;utility belts&#8217; a &#8220;new&#8221; thing?&#8221;. The data suggests that utility belts are products\/attributes of more recently published or updated packages:<\/p>\n<pre id=\"utilitybelt03\"><code class=\"language-r\">distinct(xdf, pkg, date) %&gt;%\r\n  mutate(yr = as.integer(lubridate::year(date))) %&gt;%\r\n  count(yr) %&gt;%\r\n  complete(yr, fill=list(n=0)) %&gt;%\r\n  ggplot(aes(yr, n)) +\r\n  geom_col(fill=&quot;lightslategray&quot;, width=0.65) +\r\n  labs(\r\n    x = NULL, y = &quot;Package count&quot;,\r\n    title = &quot;Recently published or updated packages tend to have more &#039;util&#039;\\nthan older\/less actively-maintained ones&quot;,\r\n    subtitle = &quot;Count of packages (by year) with &#039;util&#039;s&quot;\r\n  ) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;)<\/code><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9508\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/util-time\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-time.png?fit=1738%2C772&amp;ssl=1\" 
data-orig-size=\"1738,772\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"util-time\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-time.png?fit=510%2C227&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-time.png?resize=510%2C227&#038;ssl=1\" alt=\"\" width=\"510\" height=\"227\" class=\"aligncenter size-full wp-image-9508\" \/><\/p>\n<p>We <em>could<\/em> answer this more completely by going through the CRAN archives for all these packages, but for now we&#8217;ll just see which packages might have helped set this trend going:<\/p>\n<pre id=\"utilitybelt04\"><code class=\"language-r\">distinct(xdf, pkg, date) %&gt;%\r\n      arrange(date) %&gt;% \r\n      print(n=20)\r\n    ## # A tibble: 1,540 x 2\r\n    ##    date       pkg       \r\n    ##  1 1980-01-01 bsts      \r\n    ##  2 2006-06-28 evdbayes  \r\n    ##  3 2006-11-29 hexView   \r\n    ##  4 2006-12-17 StatDataML\r\n    ##  5 2007-10-05 tpr       \r\n    ##  6 2007-11-07 seqinr    \r\n    ##  7 2007-11-26 registry  \r\n    ##  8 2008-07-25 ramps     \r\n    ##  9 2008-10-23 RobAStBase\r\n    ## 10 2009-02-23 vcd       \r\n    ## 11 2009-06-26 ttutils   \r\n    ## 12 2009-07-03 histogram \r\n    ## 13 2009-11-27 polynom   \r\n    ## 14 2009-11-27 tau       \r\n    ## 15 2010-01-05 itertools \r\n    ## 16 2010-01-22 tableplot \r\n    ## 17 2010-06-09 rbugs     \r\n    ## 18 2011-03-17 playwith  \r\n    ## 19 2011-05-11 marelac   \r\n    ## 20 2011-10-11 timeSeries\r\n    ## # ... 
with 1,520 more rows<\/code><\/pre>\n<p>Going back to our corpus, what are the most common names for these utility belt files?<\/p>\n<pre id=\"utilitybelt06\"><code class=\"language-r\">count(xdf, fil, sort=TRUE) %&gt;% \r\n  mutate(pct = scales::percent(n\/sum(n))) %&gt;% \r\n  print(n=20)\r\n    ## # A tibble: 409 x 3\r\n    ##    fil                      n pct  \r\n    ##  1 utils.R                865 49.5%\r\n    ##  2 utilities.R            145 8.3% \r\n    ##  3 util.R                 134 7.7% \r\n    ##  4 utils.r                 68 3.9% \r\n    ##  5 utility.R               47 2.7% \r\n    ##  6 utility_functions.R     25 1.4% \r\n    ##  7 util.r                  16 0.9% \r\n    ##  8 utilities.r             14 0.8% \r\n    ##  9 utils-pipe.R             9 0.5% \r\n    ## 10 utilityFunctions.R       6 0.3% \r\n    ## 11 utils-format.r           3 0.2% \r\n    ## 12 util_functions.R         2 0.1% \r\n    ## 13 util_rescale.R           2 0.1% \r\n    ## 14 util-aux.R               2 0.1% \r\n    ## 15 util-checkparam.R        2 0.1% \r\n    ## 16 util-startarg.R          2 0.1% \r\n    ## 17 utilcmst.R               2 0.1% \r\n    ## 18 utilhot.R                2 0.1% \r\n    ## 19 utilities_internal.R     2 0.1% \r\n    ## 20 utility-functions.R      2 0.1% \r\n    ## # ... 
with 389 more rows<\/code><\/pre>\n<p>Over 50% of these files are named <code>utils.R<\/code> or <code>utils.r<\/code>, so plenty of other CRAN package authors are as &#8220;un-creative&#8221; as I am when it comes to naming these files.<\/p>\n<p>Let&#8217;s see how packed these belts are:<\/p>\n<pre id=\"utilitybelt07\"><code class=\"language-r\">ggplot(xdf, aes(x=&quot;&quot;, size)) +\r\n  ggbeeswarm::geom_quasirandom(\r\n    fill=&quot;lightslategray&quot;, color=&quot;white&quot;,\r\n    alpha=1\/2, stroke=0.25, size=3, shape=21\r\n  ) +\r\n  geom_boxplot(fill=&quot;#00000000&quot;, outlier.colour = &quot;#00000000&quot;) +\r\n  geom_text(\r\n    data=data_frame(), aes(x=-Inf, y=median(xdf$size), label=&quot;Median:\\n2,717&quot;),\r\n    hjust = 0, family = font_rc, size = 3, color = &quot;lightslateblue&quot;\r\n  ) +\r\n  scale_y_comma(\r\n    name = &quot;File size&quot;, trans=&quot;log10&quot;, limits=c(NA, 200000),\r\n    breaks = c(10, 100, 1000, 10000, 100000) \r\n  ) +\r\n  labs(\r\n    x = NULL, \r\n    title = &quot;Most &#039;util&#039; files are between 1K and 10K in size&quot;,\r\n    caption = &quot;Note y-axis log10 scale&quot;\r\n  ) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;)<\/code><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9506\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/util-size\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-size.png?fit=1224%2C1400&amp;ssl=1\" data-orig-size=\"1224,1400\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"util-size\" data-image-description=\"\" 
data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-size.png?fit=510%2C583&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-size.png?resize=510%2C583&#038;ssl=1\" alt=\"\" width=\"510\" height=\"583\" class=\"aligncenter size-full wp-image-9506\" \/><\/p>\n<p>We&#8217;ll need to do a bit more data collection to answer the last two questions.<\/p>\n<h3>Focus on Functions<\/h3>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9497\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/is_dir\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/is_dir.png?fit=1292%2C1118&amp;ssl=1\" data-orig-size=\"1292,1118\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"is_dir\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/is_dir.png?fit=510%2C441&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/is_dir.png?resize=215%2C186&#038;ssl=1\" alt=\"\" width=\"215\" height=\"186\" class=\"alignright size-full wp-image-9497\" \/>To examine function names and source code statistics, we&#8217;ll need to read in the contents of each file and parse them. 
Let&#8217;s do that first bit with some help from the <a href=\"https:\/\/github.com\/jimhester\/archive\"><code>archive<\/code><\/a> package, which will help us open up these compressed <code>tar<\/code> files and pull out the file(s) we need from them rather than having to code this up more manually.<\/p>\n<p>Again, this code is only reproducible if you have a local CRAN mirror handy, but soon (<em>promise!<\/em>) you&#8217;ll have a file you can work with for the remainder of the post:<\/p>\n<pre id=\"utilitybelt08\"><code class=\"language-r\">extract_source &lt;- function(pkg, fil, .pb = NULL) {\r\n\r\n  if (!is.null(.pb)) .pb$tick()$print()\r\n\r\n  list.files(\r\n    path = &quot;\/cran\/src\/contrib&quot;, # my path to local CRAN\r\n    pattern = sprintf(&quot;^%s_.*gz&quot;, pkg), # rough pattern for the package archive filename\r\n    recursive = FALSE,\r\n    full.names = TRUE\r\n  ) -&gt; tgt\r\n\r\n  con &lt;- archive_read(tgt[1], fil)\r\n  src &lt;- readLines(con, warn = FALSE)\r\n  close(con)\r\n\r\n  paste0(src, collapse=&quot;\\n&quot;)\r\n\r\n}\r\n\r\npb &lt;- progress_estimated(nrow(xdf))\r\nxdf &lt;- mutate(xdf, file_src = map2_chr(pkg, path, extract_source, .pb=pb))<\/code><\/pre>\n<p>That (on-drive) ~10MB data frame is in <a href=\"https:\/\/rud.is\/dl\/utility-belt.rds\">https:\/\/rud.is\/dl\/utility-belt.rds<\/a>. The rest of the post builds off of it so you can start coding along at home now.<\/p>\n<p>Let&#8217;s extract the function names:<\/p>\n<pre id=\"utilitybelt09\"><code class=\"language-r\"># we&#039;ll use these two functions to help test whether bits \r\n# of our parsed code are, indeed, functions. 
\r\n#\r\n# Alternately: &quot;I heard you liked functions so I made\r\n# functions to help you find functions&quot;\r\n#\r\n# we could have used `rlang` helpers here, but I had these\r\n# handy from pre-`rlang` days.\r\n\r\nis_assign &lt;- function(x) {\r\n  as.character(x) %in% c(&#039;&lt;-&#039;, &#039;=&#039;, &#039;&lt;&lt;-&#039;, &#039;assign&#039;)\r\n}\r\n\r\nis_func &lt;- function(x) {\r\n  is.call(x) &amp;&amp;\r\n    is_assign(x[[1]]) &amp;&amp;\r\n    is.call(x[[3]]) &amp;&amp;\r\n    (x[[3]][[1]] == quote(`function`))\r\n}\r\n\r\nread_rds(&quot;~\/Data\/utility-belt.rds&quot;) %&gt;% # I have this file in ~\/Data; change this for your location\r\n  mutate(parsed = map(file_src, ~parse(text = .x, keep.source = TRUE))) %&gt;% # parse each file\r\n  mutate(func_names = map(parsed, ~{ # go through parsed file\r\n    keep(.x, is_func) %&gt;% # and only keep functions\r\n      map(~as.character(.x[[2]])) %&gt;% # extract the function name\r\n      flatten_chr() # return a character vector\r\n  })) -&gt; xdf<\/code><\/pre>\n<p>With those handy, we can see if there are any commonalities across all these packages:<\/p>\n<pre id=\"utilitybelt10\"><code class=\"language-r\">select(xdf, pkg, fil, func_names) %&gt;%\r\n  unnest() %&gt;%\r\n  count(func_names, sort=TRUE) %&gt;%\r\n  print(n=20)\r\n##    func_names                 n\r\n##  1 %||%                      84\r\n##  2 compact                   19\r\n##  3 isFALSE                   19\r\n##  4 assertthat::on_failure    16\r\n##  5 is_windows                16\r\n##  6 trim                      14\r\n##  7 .on Load                   13 # (IRL there&#039;s no space here but the WP input sanitizer hates this word due to js abuse\r\n##  8 names2                    12\r\n##  9 dots                      11\r\n## 10 is_string                 11\r\n## 11 vlapply                   11\r\n## 12 .onAttach                 10\r\n## 13 error.bars                10\r\n## 14 normalize                 10\r\n## 15 vcapply    
               10\r\n## 16 cat0                       9\r\n## 17 collapse                   9\r\n## 18 err                        9\r\n## 19 getmin                     9\r\n## 20 is_dir                     9\r\n## # ... with 1.252e+04 more rows<\/code><\/pre>\n<p>We can also see if there are common case conventions:<\/p>\n<pre id=\"utilitybelt11\"><code class=\"language-r\">select(xdf, pkg, fil, func_names) %&gt;%\r\n  unnest() %&gt;%\r\n  mutate(is_camel = (!stri_detect_fixed(func_names, &quot;_&quot;)) &amp;\r\n           (!stri_detect_regex(func_names, &quot;[[:alpha:]]\\\\.[[:alpha:]]&quot;)) &amp;\r\n           (stri_detect_regex(func_names, &quot;[A-Z]&quot;))) %&gt;%\r\n  mutate(is_dotcase = stri_detect_regex(func_names, &quot;[[:alpha:]]\\\\.[[:alpha:]]&quot;)) %&gt;%\r\n  mutate(is_snake = stri_detect_fixed(func_names, &quot;_&quot;) &amp;\r\n           (!stri_detect_regex(func_names, &quot;[[:alpha:]]\\\\.[[:alpha:]]&quot;))) -&gt; case_hunt\r\n\r\ncount(case_hunt, is_camel, is_dotcase, is_snake) %&gt;% \r\n  mutate(pct = scales::percent(n\/sum(n))) %&gt;% \r\n  mutate(description = c(\r\n    &quot;one-&#039;word&#039; names&quot;,\r\n    &quot;snake_case&quot;,\r\n    &quot;dot.case&quot;,\r\n    &quot;camelCase&quot;\r\n  )) %&gt;% \r\n  arrange(n) %&gt;% \r\n  mutate(description = factor(description, description)) %&gt;% \r\n  ggplot(aes(description, n)) +\r\n  geom_col(fill=&quot;lightslategray&quot;, width=0.65) +\r\n  geom_label(aes(y = n, label=pct), label.size=0, family=font_rc, nudge_y=150) +\r\n  scale_y_comma(&quot;Number of functions&quot;) +\r\n  labs(\r\n    x=NULL,\r\n    title = &quot;dot.case does not seem to be en-vogue for utility belt functions&quot;\r\n  ) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;)<\/code><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9515\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/util-case\/\" 
data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-case.png?fit=1578%2C1302&amp;ssl=1\" data-orig-size=\"1578,1302\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"util-case\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-case.png?fit=510%2C421&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-case.png?resize=510%2C421&#038;ssl=1\" alt=\"\" width=\"510\" height=\"421\" class=\"aligncenter size-full wp-image-9515\" \/><\/p>\n<p>I had a hunch that <code>isX\u2026()<\/code>\/<code>is_x\u2026()<\/code> could be likely names for utility belt functions, so let&#8217;s normalize the function names to snake_case and see if that&#8217;s true:<\/p>\n<pre id=\"utilitybelt12\"><code class=\"language-r\">select(xdf, pkg, fil, func_names) %&gt;%\r\n  unnest() %&gt;%\r\n  filter(stri_detect_regex(func_names, &quot;^(\\\\.is|is)&quot;)) %&gt;%\r\n  mutate(func_names = snakecase::to_snake_case(func_names)) %&gt;%\r\n  count(func_names, sort=TRUE)\r\n## # A tibble: 547 x 2\r\n##    func_names       n\r\n##  1 is_false        24\r\n##  2 is_windows      19\r\n##  3 is_string       18\r\n##  4 is_empty        13\r\n##  5 is_dir          11\r\n##  6 is_formula      11\r\n##  7 is_installed    11\r\n##  8 is_linux         9\r\n##  9 is_na            9\r\n## 10 is_error         8\r\n## # ... 
with 537 more rows<\/code><\/pre>\n<p>Only about 6% (819 of 14,123) of the extracted function names start with <code>is_<\/code>; not overwhelming, but a respectable slice.<\/p>\n<p>There are more questions we could ask of function names and styles, but we&#8217;ll leave some work for y&#8217;all to do on your own.<\/p>\n<p>Let&#8217;s head over to the final <strike>rooftop<\/strike> exercise.<\/p>\n<h3>Code, Comment &amp; Blank Line Density<\/h3>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9521\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/comment\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/comment.png?fit=762%2C574&amp;ssl=1\" data-orig-size=\"762,574\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"comment\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/comment.png?fit=510%2C384&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/comment.png?resize=254%2C191&#038;ssl=1\" alt=\"\" width=\"254\" height=\"191\" class=\"alignright size-full wp-image-9521\" \/>Since we have the raw source, we can also take a look at coding style. There are many questions we could ask here and more than a few packages we could draw on to help answer them. 
For now, we&#8217;ll just take a look at the mean ratios of comments and blank lines to code across the packages in this utility belt corpus and give you the opportunity to tease out other interesting tidbits such as &#8220;what base R and other package functions are most often used in utility belt functions?&#8221; or &#8220;are package authors using evil <code>=<\/code> for assignment or proper <code>&lt;-<\/code>?&#8221;.<\/p>\n<pre id=\"utilitybelt13\"><code class=\"language-r\">xdf %&gt;%\r\n  mutate(\r\n    num_lines = stri_count_fixed(xdf$file_src, &quot;\\n&quot;),\r\n    num_blank_lines = stri_count_regex(xdf$file_src, &quot;^[[:space:]]*$&quot;, opts_regex = stri_opts_regex(multiline=TRUE)),\r\n    # cmnt_df is not defined in this post; as a stand-in (an assumption), count lines that start with &#039;#&#039;\r\n    num_whole_line_comments = stri_count_regex(xdf$file_src, &quot;^[[:space:]]*#&quot;, opts_regex = stri_opts_regex(multiline=TRUE)),\r\n    comment_density = num_whole_line_comments \/ (num_lines - num_blank_lines - num_whole_line_comments),\r\n    blank_density = num_blank_lines \/ (num_lines - num_whole_line_comments)\r\n  ) %&gt;%\r\n  select(-permsissions, -links, -owner, -group, -month, -day, -year_hr) -&gt; xdf\r\n\r\n# now compute mean ratios\r\ngroup_by(xdf, pkg) %&gt;%\r\n  summarise(\r\n    `Comment-to-code Ratio` = mean(comment_density),\r\n    `Blank lines-to-code Ratio` = mean(blank_density)\r\n  ) %&gt;%\r\n  ungroup() %&gt;%\r\n  filter(!is.infinite(`Comment-to-code Ratio`)) %&gt;%\r\n  filter(!is.nan(`Comment-to-code Ratio`)) %&gt;%\r\n  filter(!is.infinite(`Blank lines-to-code Ratio`)) %&gt;%\r\n  filter(!is.nan(`Blank lines-to-code Ratio`)) %&gt;%\r\n  gather(measure, value, -pkg) -&gt; code_ratios\r\n\r\n# we want to label the median values\r\ngroup_by(code_ratios, measure) %&gt;%\r\n  summarise(median = median(value)) -&gt; code_ratio_meds\r\n\r\nggplot(code_ratios, aes(measure, value, group=measure)) +\r\n  ggbeeswarm::geom_quasirandom(\r\n    fill=&quot;lightslategray&quot;, color=&quot;#2b2b2b&quot;, alpha=1\/2,\r\n    stroke=0.25, size=3, shape=21\r\n  ) +\r\n  geom_boxplot(fill=&quot;#00000000&quot;, outlier.colour 
= &quot;#00000000&quot;) +\r\n  geom_label(\r\n    data = code_ratio_meds,\r\n    aes(-Inf, c(0.3, 5), label=sprintf(&quot;Median:\\n%s&quot;, round(median, 2)), group=measure),\r\n    family = font_rc, size=3, color=&quot;lightslateblue&quot;, hjust = 0, label.size=0\r\n  ) +\r\n  scale_y_continuous() +\r\n  labs(\r\n    x = NULL, y = NULL,\r\n    caption = &quot;Note free y scale&quot;\r\n  ) +\r\n  facet_wrap(~measure, scales=&quot;free&quot;) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;, strip_text_face = &quot;bold&quot;) +\r\n  theme(axis.text.x=element_blank())<\/code><\/pre>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"9517\" data-permalink=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/util-ratio\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-ratio.png?fit=1578%2C1302&amp;ssl=1\" data-orig-size=\"1578,1302\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"util-ratio\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-ratio.png?fit=510%2C421&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/util-ratio.png?resize=510%2C421&#038;ssl=1\" alt=\"\" width=\"510\" height=\"421\" class=\"aligncenter size-full wp-image-9517\" \/><\/p>\n<h3>FIN<\/h3>\n<p>You can find the (on-drive) ~10MB data frame at: <a href=\"https:\/\/rud.is\/dl\/utility-belt.rds\">https:\/\/rud.is\/dl\/utility-belt.rds<\/a>.<\/p>\n<p>All the above code is in this gist: <a 
href=\"https:\/\/gist.github.com\/hrbrmstr\/33d29bb39eaa7f2f1e95308038f85b59\">https:\/\/gist.github.com\/hrbrmstr\/33d29bb39eaa7f2f1e95308038f85b59<\/a>.<\/p>\n<p>If you do your own &#8216;utility belt&#8217; analyses, drop a note in the comments with a link to your findings!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many R package authors (including myself) lump a collection of small, useful functions into some type of utils.R file and usually do not export the functions since they are (generally) designed to work on package internals rather than expose their functionality via the exported package API. Just like Batman&#8217;s utility belt, which can be customized [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":9498,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[91],"tags":[],"class_list":["post-9496","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-r"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Dissecting R Package &quot;Utility Belts&quot; - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta 
property=\"og:title\" content=\"Dissecting R Package &quot;Utility Belts&quot; - rud.is\" \/>\n<meta property=\"og:description\" content=\"Many R package authors (including myself) lump a collection of small, useful functions into some type of utils.R file and usually do not export the functions since they are (generally) designed to work on package internals rather than expose their functionality via the exported package API. Just like Batman&#8217;s utility belt, which can be customized [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-08T12:41:14+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"891\" \/>\n\t<meta property=\"og:image:height\" content=\"375\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Dissecting R Package &#8220;Utility Belts&#8221;\",\"datePublished\":\"2018-04-08T12:41:14+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/\"},\"wordCount\":1055,\"commentCount\":2,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/r-utility-belt-final.png?fit=891%2C375&ssl=1\",\"articleSection\":[\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/\",\"name\":\"Dissecting R Package \\\"Utility Belts\\\" - 
rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/r-utility-belt-final.png?fit=891%2C375&ssl=1\",\"datePublished\":\"2018-04-08T12:41:14+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/r-utility-belt-final.png?fit=891%2C375&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/r-utility-belt-final.png?fit=891%2C375&ssl=1\",\"width\":\"891\",\"height\":\"375\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/08\\\/dissecting-r-package-utility-belts\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Dissecting R Package &#8220;Utility Belts&#8221;\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Dissecting R Package \"Utility Belts\" - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/","og_locale":"en_US","og_type":"article","og_title":"Dissecting R Package \"Utility Belts\" - rud.is","og_description":"Many R package authors (including myself) lump a collection of small, useful functions into some type of utils.R file and usually do not export the functions since they are (generally) designed to work on package internals rather than expose their functionality via the exported package API. Just like Batman&#8217;s utility belt, which can be customized [&hellip;]","og_url":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/","og_site_name":"rud.is","article_published_time":"2018-04-08T12:41:14+00:00","og_image":[{"width":891,"height":375,"url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","type":"image\/png"}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Dissecting R Package &#8220;Utility Belts&#8221;","datePublished":"2018-04-08T12:41:14+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/"},"wordCount":1055,"commentCount":2,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","articleSection":["R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/","url":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/","name":"Dissecting R Package \"Utility Belts\" - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","datePublished":"2018-04-08T12:41:14+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","width":"891","height":"375"},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2018\/04\/08\/dissecting-r-package-utility-belts\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Dissecting R Package &#8220;Utility Belts&#8221;"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/r-utility-belt-final.png?fit=891%2C375&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-2ta","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":6491,"url":"https:\/\/rud.is\/b\/2017\/09\/28\/sodd-stackoverflow-driven-development\/","url_meta":{"origin":9496,"position":0},"title":"SODD \u2014 StackOverflow Driven-Development","author":"hrbrmstr","date":"2017-09-28","format":false,"excerpt":"I occasionally hang out on StackOverflow and often use an answer as an opportunity to fill a package void for a particular need. 
docxtractr and qrencoder are two (of many) packages that were birthed from SO answers. I usually try to answer with inline code first then expand the functionality\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":10834,"url":"https:\/\/rud.is\/b\/2018\/05\/29\/the-fix-is-in-finding-infix-functions-inside-contributed-r-package-utilities-files\/","url_meta":{"origin":9496,"position":1},"title":"The Fix Is In: Finding infix functions inside contributed R package &#8220;utilities&#8221; files","author":"hrbrmstr","date":"2018-05-29","format":false,"excerpt":"Regular readers will recall the \"utility belt\" post from back in April of this year. This is a follow-up to a request made asking for a list of all the % infix functions in those files. We're going to: collect up all of the sources parse them find all the\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3574,"url":"https:\/\/rud.is\/b\/2015\/08\/02\/two-new-r-packages-qrencoder-passwordrandom\/","url_meta":{"origin":9496,"position":2},"title":"Two New R Packages &#8211; qrencoder &#038; passwordrandom","author":"hrbrmstr","date":"2015-08-02","format":false,"excerpt":"Believe it or not, there are two [1] [2] questions on @StackOverflowR about how to make QR codes in R. I personally think QR codes are kinda hokey, but who am I to argue with pressing needs of the #rstats community? 
I found libqrencode and it's highly brew-able and apt-able\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3622,"url":"https:\/\/rud.is\/b\/2015\/08\/21\/doh-i-could-have-had-just-used-v8\/","url_meta":{"origin":9496,"position":3},"title":"Doh! I Could Have Had Just Used V8!","author":"hrbrmstr","date":"2015-08-21","format":false,"excerpt":"An R user recently had the need to split a \"full, human name\" into component parts to retrieve first & last names. The full names could be anything from something simple like _\"David Regan\"_ to more complex & diverse such as _\"John Smith Jr.\"_, _\"Izaque Iuzuru Nagata\"_ or _\"Christian Schmit\u2026","rel":"","context":"In &quot;Javascript&quot;","block_context":{"text":"Javascript","link":"https:\/\/rud.is\/b\/category\/javascript\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4968,"url":"https:\/\/rud.is\/b\/2017\/02\/01\/exploring-news-coverage-with-newsflash\/","url_meta":{"origin":9496,"position":4},"title":"Exploring News Coverage With newsflash","author":"hrbrmstr","date":"2017-02-01","format":false,"excerpt":"I was enthused to see a mention of this on the GDELT blog since I've been working on an R package dubbed newsflash to work with the API that the form front-ends. 
Given the current climate, I feel compelled to note that I'm neither a Clinton supporter\/defender\/advocate nor a ?\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/02\/clinton_plot-1-1.png?fit=1200%2C600&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/02\/clinton_plot-1-1.png?fit=1200%2C600&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/02\/clinton_plot-1-1.png?fit=1200%2C600&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/02\/clinton_plot-1-1.png?fit=1200%2C600&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/02\/clinton_plot-1-1.png?fit=1200%2C600&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":12609,"url":"https:\/\/rud.is\/b\/2020\/01\/03\/writing-frictionless-r-package-wrappers-building-a-basic-r-package\/","url_meta":{"origin":9496,"position":5},"title":"Writing Frictionless R Package Wrappers \u2014 Building A Basic R Package","author":"hrbrmstr","date":"2020-01-03","format":false,"excerpt":"Before we start wrapping foreign language code we need to make sure that basic R packages can be created. If you've followed along from the previous post you have everything you need to get started here. 
Just to make sure, you should be able to fire up a new RStudio\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/9496","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=9496"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/9496\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/9498"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=9496"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=9496"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=9496"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}