

{"id":6215,"date":"2017-09-04T19:19:58","date_gmt":"2017-09-05T00:19:58","guid":{"rendered":"https:\/\/rud.is\/b\/?p=6215"},"modified":"2018-03-07T17:05:00","modified_gmt":"2018-03-07T22:05:00","slug":"readability-redux","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/","title":{"rendered":"Readability Redux"},"content":{"rendered":"<p>I <a href=\"https:\/\/rud.is\/b\/2017\/08\/24\/reticulating-readability\/\">recently posted<\/a> about using a Python module to convert HTML to usable text. Since then, a new package has hit CRAN dubbed <a href=\"https:\/\/cran.r-project.org\/web\/packages\/htm2txt\/index.html\"><code>htm2txt<\/code><\/a> that is 100% R and uses regular expressions to strip tags from text.<\/p>\n<p>I <a href=\"https:\/\/rud.is\/rpubs\/htmltst\/\">gave it a spin<\/a> so folks could compare some basic output, but you should definitely give <code>htm2txt<\/code> a try on your own conversion needs since each method produces different results.<\/p>\n<p>On my macOS systems, the <code>htm2txt<\/code> calls ended up invoking XQuartz (the X11 environment on macOS) and they felt kind of sluggish (base R regular expressions don&#8217;t have a &#8220;compile&#8221; feature and can be sluggish compared to other types of regular expression computations).<\/p>\n<p>I decided to spend some of Labor Day (in the U.S.) laboring (not for long, though) on a (currently small) <code>rJava<\/code>-based R package dubbed <a href=\"https:\/\/github.com\/hrbrmstr\/jericho\"><code>jericho<\/code><\/a> which builds upon work created by Martin Jericho which is used in at-scale initiatives like the Internet Archive. 
Yes, I&#8217;m trading Java for Python, but the combination of Java+R has been around for much longer and there are many solved problems in Java-space that don&#8217;t need to be re-invented (if you do know a header-only, cross-platform, C++ HTML-to-text library, definitely leave a comment).<\/p>\n<p>Is it worth it to get <code>rJava<\/code> up and running to use <code>jericho<\/code> vs <code>htm2txt<\/code>? Let&#8217;s take a look:<\/p>\n<pre id=\"jericho01\"><code class=\"language-r\">library(jericho) # devtools::install_github(&quot;hrbrmstr\/jericho&quot;)\r\nlibrary(microbenchmark)\r\nlibrary(htm2txt)\r\nlibrary(tidyverse)\r\n\r\nc(\r\n  &quot;https:\/\/medium.com\/starts-with-a-bang\/science-knows-if-a-nation-is-testing-nuclear-bombs-ec5db88f4526&quot;,\r\n  &quot;https:\/\/en.wikipedia.org\/wiki\/Timeline_of_antisemitism&quot;,\r\n  &quot;http:\/\/www.healthsecuritysolutions.com\/2017\/09\/04\/watch-out-more-ransomware-attacks-incoming\/&quot;\r\n) -&gt; urls\r\n\r\nmap_chr(urls, ~paste0(read_lines(.x), collapse=&quot;\\n&quot;)) -&gt; sites_html\r\n\r\nmicrobenchmark(\r\n  jericho_txt = {\r\n    a &lt;- html_to_text(sites_html[1])\r\n  },\r\n  jericho_render = {\r\n    a &lt;- render_html_to_text(sites_html[1])\r\n  },\r\n  htm2txt = {\r\n    a &lt;- htm2txt(sites_html[1])\r\n  },\r\n  times = 10\r\n) -&gt; mb1\r\n\r\n# microbenchmark(\r\n#   jericho_txt = {\r\n#     a &lt;- html_to_text(sites_html[2])\r\n#   },\r\n#   jericho_render = {\r\n#     a &lt;- render_html_to_text(sites_html[2])\r\n#   },\r\n#   htm2txt = {\r\n#     a &lt;- htm2txt(sites_html[2])\r\n#   },\r\n#   times = 10\r\n# ) -&gt; mb2\r\n\r\nmicrobenchmark(\r\n  jericho_txt = {\r\n    a &lt;- html_to_text(sites_html[3])\r\n  },\r\n  jericho_render = {\r\n    a &lt;- render_html_to_text(sites_html[3])\r\n  },\r\n  htm2txt = {\r\n    a &lt;- htm2txt(sites_html[3])\r\n  },\r\n  times = 10\r\n) -&gt; mb3<\/code><\/pre>\n<p>The second benchmark is commented out because I really didn&#8217;t 
have time to wait for it to complete (FWIW <code>jericho<\/code> goes fast in that test). Here&#8217;s what the other two look like:<\/p>\n<pre id=\"jericho02\"><code class=\"language-r\">mb1\r\n## Unit: milliseconds\r\n##            expr         min          lq        mean      median          uq         max neval\r\n##     jericho_txt    4.121872    4.294953    4.567241    4.405356    4.734923    5.621142    10\r\n##  jericho_render    5.446296    5.564006    5.927956    5.719971    6.357465    6.785791    10\r\n##         htm2txt 1014.858678 1021.575316 1035.342729 1029.154451 1042.642065 1082.340132    10\r\n\r\nmb3\r\n## Unit: milliseconds\r\n##            expr        min         lq       mean     median         uq        max neval\r\n##     jericho_txt   2.641352   2.814318   3.297543   3.034445   3.488639   5.437411    10\r\n##  jericho_render   3.034765   3.143431   4.708136   3.746157   5.953550   8.931072    10\r\n##         htm2txt 417.429658 437.493406 446.907140 445.622242 451.443907 484.563958    10<\/code><\/pre>\n<p>You should run the conversion functions on your own systems to compare the results (they&#8217;re somewhat large to incorporate here). I&#8217;m fairly certain they do a comparable &#8212; if not better &#8212; job of extracting clean, pertinent text.<\/p>\n<p>I need to separate the package into two (one for the base JAR and the other for the conversion functions) and add some more tests before a CRAN submission, but I think this would be a good addition to the budding arsenal of HTML-to-text conversion options in R.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently posted about using a Python module to convert HTML to usable text. Since then, a new package has hit CRAN dubbed htm2txt that is 100% R and uses regular expressions to strip tags from text. 
I gave it a spin so folks could compare some basic output, but you should definitely give htm2txt [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[764,36,91],"tags":[810],"class_list":["post-6215","post","type-post","status-publish","format-standard","hentry","category-data-wrangling","category-html5","category-r","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Readability Redux - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Readability Redux - rud.is\" \/>\n<meta property=\"og:description\" content=\"I recently posted about using a Python module to convert HTML to usable text. Since then, a new package has hit CRAN dubbed htm2txt that is 100% R and uses regular expressions to strip tags from text. 
I gave it a spin so folks could compare some basic output, but you should definitely give htm2txt [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2017-09-05T00:19:58+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T22:05:00+00:00\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Readability Redux\",\"datePublished\":\"2017-09-05T00:19:58+00:00\",\"dateModified\":\"2018-03-07T22:05:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\"},\"wordCount\":348,\"commentCount\":4,\"publisher\":{\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"keywords\":[\"post\"],\"articleSection\":[\"data 
wrangling\",\"HTML5\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\",\"url\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\",\"name\":\"Readability Redux - rud.is\",\"isPartOf\":{\"@id\":\"https:\/\/rud.is\/b\/#website\"},\"datePublished\":\"2017-09-05T00:19:58+00:00\",\"dateModified\":\"2018-03-07T22:05:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/rud.is\/b\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Readability Redux\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/rud.is\/b\/#website\",\"url\":\"https:\/\/rud.is\/b\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/rud.is\/b\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\/\/rud.is\"],\"url\":\"https:\/\/rud.is\/b\/author\/hrbrmstr\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Readability Redux - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/","og_locale":"en_US","og_type":"article","og_title":"Readability Redux - rud.is","og_description":"I recently posted about using a Python module to convert HTML to usable text. 
Since then, a new package has hit CRAN dubbed htm2txt that is 100% R and uses regular expressions to strip tags from text. I gave it a spin so folks could compare some basic output, but you should definitely give htm2txt [&hellip;]","og_url":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/","og_site_name":"rud.is","article_published_time":"2017-09-05T00:19:58+00:00","article_modified_time":"2018-03-07T22:05:00+00:00","author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Readability Redux","datePublished":"2017-09-05T00:19:58+00:00","dateModified":"2018-03-07T22:05:00+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/"},"wordCount":348,"commentCount":4,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"keywords":["post"],"articleSection":["data wrangling","HTML5","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/","url":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/","name":"Readability Redux - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"datePublished":"2017-09-05T00:19:58+00:00","dateModified":"2018-03-07T22:05:00+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2017\/09\/04\/readability-redux\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Readability Redux"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1Cf","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":13042,"url":"https:\/\/rud.is\/b\/2021\/04\/25\/a-small-macos-big-sur-to-extract-indicators-of-compromise\/","url_meta":{"origin":6215,"position":0},"title":"A Small macOS (Big Sur+) App to Extract Indicators of Compromise","author":"hrbrmstr","date":"2021-04-25","format":false,"excerpt":"There's a semi-infrequent-but-frequent-enough-to-be-annoying manual task at $DAYJOB that involves extracting a particular set of strings (identifiable by a fairly benign set of regular expressions) from various interactive text sources (so, not static documents or documents easily scrape-able). Rather than hack something onto Sublime Text or VS Code I made a\u2026","rel":"","context":"In &quot;Information Security&quot;","block_context":{"text":"Information Security","link":"https:\/\/rud.is\/b\/category\/information-security\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":7138,"url":"https:\/\/rud.is\/b\/2017\/11\/16\/new-ibm-plex-sans-support-in-hrbrthemes-automating-axis-text-justification\/","url_meta":{"origin":6215,"position":1},"title":"New IBM Plex Sans Support in hrbrthemes + Automating Axis Text Justification","author":"hrbrmstr","date":"2017-11-16","format":false,"excerpt":"IBM has a new set of corporate typefaces --- dubbed \"Plex\" --- and has released them with a generous open license. 
IBM Plex Sans is not too shabby: (that image was grifted from a Font Squirrel preview page) The digit glyphs are especially nice for charts and the font iself\u2026","rel":"","context":"In &quot;ggplot&quot;","block_context":{"text":"ggplot","link":"https:\/\/rud.is\/b\/category\/ggplot\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/README-unnamed-chunk-7-1.png?fit=1200%2C840&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/README-unnamed-chunk-7-1.png?fit=1200%2C840&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/README-unnamed-chunk-7-1.png?fit=1200%2C840&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/README-unnamed-chunk-7-1.png?fit=1200%2C840&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/README-unnamed-chunk-7-1.png?fit=1200%2C840&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":12185,"url":"https:\/\/rud.is\/b\/2019\/05\/12\/quick-hit-updates-to-quicklookr-and-rdatainfo\/","url_meta":{"origin":6215,"position":2},"title":"Quick Hit: Updates to QuickLookR and {rdatainfo}","author":"hrbrmstr","date":"2019-05-12","format":false,"excerpt":"I'm using GitUgh links here b\/c the issue was submitted there. Those not wishing to be surveilled by Microsoft can find the macOS QuickLook plugin project and {rdatainfo} project in SourceHut and GitLab (~hrbrmstr and hrbrmstr accounts respectively). I hadn't touched QuickLookR? or {rdatainfo}? 
at all since 2016 since it\u2026","rel":"","context":"In &quot;macOS&quot;","block_context":{"text":"macOS","link":"https:\/\/rud.is\/b\/category\/macos\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12645,"url":"https:\/\/rud.is\/b\/2020\/02\/06\/prying-r-script-files-away-from-xcode-et-al-on-macos\/","url_meta":{"origin":6215,"position":3},"title":"Prying &#8220;.R&#8221; Script Files Away from Xcode (et al) on macOS","author":"hrbrmstr","date":"2020-02-06","format":false,"excerpt":"As the maintainer of RSwitch --- and developer of my own (for personal use) macOS, iOS, watchOS, iPadOS and tvOS apps --- I need the full Apple Xcode install around (more R-focused macOS folk can get away with just the command-line tools being installed). As an Apple Developer who insanely\u2026","rel":"","context":"In &quot;macOS&quot;","block_context":{"text":"macOS","link":"https:\/\/rud.is\/b\/category\/macos\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6154,"url":"https:\/\/rud.is\/b\/2017\/08\/13\/r%e2%81%b6-exploring-macos-applications-with-codesign-gatekeeper-r\/","url_meta":{"origin":6215,"position":4},"title":"R\u2076 \u2014 Exploring macOS Applications with codesign, Gatekeeper &#038; R","author":"hrbrmstr","date":"2017-08-13","format":false,"excerpt":"(General reminder abt \"R\u2076\" posts in that they are heavy on code-examples, minimal on expository. I try to design them with 2-3 \"nuggets\" embedded for those who take the time to walk through the code examples on their systems. 
I'll always provide further expository if requested in a comment, so\u2026","rel":"","context":"In &quot;macOS&quot;","block_context":{"text":"macOS","link":"https:\/\/rud.is\/b\/category\/macos\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6071,"url":"https:\/\/rud.is\/b\/2017\/06\/10\/engaging-the-tidyverse-clean-slate-protocol\/","url_meta":{"origin":6215,"position":5},"title":"Engaging the tidyverse Clean Slate Protocol","author":"hrbrmstr","date":"2017-06-10","format":false,"excerpt":"I caught the 0.7.0 release of dplyr on my home CRAN server early Friday morning and immediately set out to install it since I'm eager to finish up my sergeant package and get it on CRAN. \"Tidyverse\" upgrades aren't trivial for me as I tinker quite a bit with the\u2026","rel":"","context":"In &quot;dplyr&quot;","block_context":{"text":"dplyr","link":"https:\/\/rud.is\/b\/category\/dplyr\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/CSP.png?fit=1000%2C416&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/CSP.png?fit=1000%2C416&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/CSP.png?fit=1000%2C416&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/06\/CSP.png?fit=1000%2C416&ssl=1&resize=700%2C400 
2x"},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/6215","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=6215"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/6215\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=6215"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=6215"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=6215"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}