

{"id":3374,"date":"2015-03-31T20:36:08","date_gmt":"2015-04-01T01:36:08","guid":{"rendered":"http:\/\/rud.is\/b\/?p=3374"},"modified":"2018-03-07T16:44:04","modified_gmt":"2018-03-07T21:44:04","slug":"more-airline-crashes-via-the-hadleyverse","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/","title":{"rendered":"More Airline Crashes via the Hadleyverse"},"content":{"rendered":"<p>I saw a fly-by `#rstats` mention of more airplane accident data on &#8212; of all places &#8212; LinkedIn (email) today which took me to a [GitHub repo](https:\/\/github.com\/philjette\/CrashData) by @philjette. It seems there&#8217;s a [web site](http:\/\/www.planecrashinfo.com\/) (run by what seems to be a single human) that tracks plane crashes. Here&#8217;s a tweet from @philjette announcing it:<\/p>\n<blockquote class=\"twitter-tweet\" lang=\"en\">\n<p>Wrote some R code for looking at historical crash data&#10;<a href=\"https:\/\/t.co\/bgrMj3PIZu\">https:\/\/t.co\/bgrMj3PIZu<\/a>&#10;<a href=\"https:\/\/mobile.twitter.com\/hashtag\/data?src=hash\">#data<\/a> <a href=\"https:\/\/mobile.twitter.com\/hashtag\/r?src=hash\">#r<\/a> <a href=\"https:\/\/mobile.twitter.com\/hashtag\/DataMining?src=hash\">#DataMining<\/a> <a href=\"http:\/\/t.co\/zYzpOyD9JY\">pic.twitter.com\/zYzpOyD9JY<\/a><\/p>\n<p>&mdash; PhilJ (@philjette) <a href=\"https:\/\/mobile.twitter.com\/philjette\/status\/581115214554353664\">March 26, 2015<\/a><\/p><\/blockquote>\n<p><script async src=\"\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<p>The repo contains the R code that scrapes the site and it&#8217;s (mostly) in old-school R and works really well. I&#8217;m collecting and conjuring many bits of R for the classes I&#8217;m teaching in the fall and thought that it would be useful to replicate @philjette&#8217;s example in modern Hadleyverse style (i.e. `dplyr`, `rvest`, etc). 
I even submitted a [pull request](https:\/\/github.com\/philjette\/CrashData\/pull\/1) to him with the additional version. I&#8217;ve replicated it below with some additional comments for those wanting to jump into the Hadleyverse. No shiny `ggplot2` graphs this time, I&#8217;m afraid. This is all raw code, but will hopefully be useful to those learning the modern ropes. <\/p>\n<p>Just to get the setup bits out of the way, here are all the packages I&#8217;ll be using:<\/p>\n<pre lang=\"rsplus\">library(dplyr)\r\nlibrary(rvest)\r\nlibrary(magrittr)\r\nlibrary(stringr)\r\nlibrary(lubridate)\r\nlibrary(pbapply)<\/pre>\n<p>Phil made a function to grab data for a whole year, so I did the same and gave it a default parameter of the current year (programmatically). I also tossed in some parameter checking for good measure. <\/p>\n<p>The basic setup is to:<\/p>\n<p>&#8211; grab the HTML for the page of a given year<br \/>\n&#8211; extract and format the crash dates<br \/>\n&#8211; extract location &#038; operator information, which is made slightly annoying since the site uses a `&lt;br \/&gt;` and includes spurious newlines within a single `&lt;td&gt;` element<br \/>\n&#8211; extract aircraft type and registration (same issues as previous column)<br \/>\n&#8211; extract accident details, which are embedded in a highly formatted column that requires `str_match_all` to handle (well)<\/p>\n<p>Some things worth mentioning:<\/p>\n<p>&#8211; `data_frame` is super-helpful in not creating `factors` from the character vectors<br \/>\n&#8211; `bind_rows` and `bind_cols` are a nice alternative to using `data.table` functions<br \/>\n&#8211; I think `stringr` needs a more pipe-friendly replacement for `gsub` and, perhaps, even `ifelse` (yes, I guess I could submit a PR). The `.` just feels wrong in pipes to me, still<br \/>\n&#8211; if you&#8217;re not using `pbapply` functions (free progress bars for everyone!) 
you _should_ be, especially for long scraping operations<br \/>\n&#8211; sometimes XPath entries can be less verbose than CSS (and easier to craft) and I have no issue mixing them in scraping code when necessary<\/p>\n<p>Here&#8217;s the new `get_data` function (_updated per comment and to also add some more hadleyverse goodness_):<\/p>\n<pre lang=\"rsplus\">#' retrieve crash data for a given year\r\n#' defaults to current year\r\n#' earliest year in the database is 1920\r\nget_data <- function(year=as.numeric(format(Sys.Date(), \"%Y\"))) {\r\n\r\n  crash_base <- \"http:\/\/www.planecrashinfo.com\/%d\/%s.htm\"\r\n\r\n  if (year < 1920 || year > as.numeric(format(Sys.Date(), \"%Y\"))) {\r\n    stop(\"year must be >=1920 and <=current year\", call.=FALSE)\r\n  }\r\n\r\n  # get crash date\r\n\r\n  pg <- read_html(sprintf(crash_base, year, year))\r\n  pg %>%\r\n    html_nodes(\"table > tr > td:nth-child(1)\") %>%\r\n    html_text() %>%\r\n    extract(-1) %>%\r\n    dmy() %>%\r\n    data_frame(date=.) -> date\r\n\r\n  # get location and operator\r\n\r\n  loc_op <- bind_rows(lapply(1:length(date), function(i) {\r\n\r\n    pg %>%\r\n      html_nodes(xpath=sprintf(\"\/\/table\/tr\/td[2]\/*\/br[%d]\/preceding-sibling::text()\", i)) %>%\r\n      html_text() %>%\r\n      str_trim() %>%\r\n      str_replace_all(\"^(Near|Off) \", \"\") -> loc\r\n\r\n    pg %>%\r\n      html_nodes(xpath=sprintf(\"\/\/table\/tr\/td[2]\/*\/br[%d]\/following-sibling::text()\", i)) %>%\r\n      html_text() %>%\r\n      str_replace_all(\"(^[[:space:]]*|[[:space:]]*$|\\\\n)\", \"\") -> op\r\n\r\n    data_frame(location=loc, operator=op)\r\n\r\n  }))\r\n\r\n  # get type & registration\r\n\r\n  type_reg <- bind_rows(lapply(1:length(date), function(i) {\r\n\r\n    pg %>%\r\n      html_nodes(xpath=sprintf(\"\/\/table\/tr\/td[3]\/*\/br[%d]\/preceding-sibling::text()\", i)) %>%\r\n      html_text() %>%\r\n      str_replace_all(\"(^[[:space:]]*|[[:space:]]*$|\\\\n)\", \"\") %>%\r\n      ifelse(.==\"?\", NA, .) 
-> typ\r\n\r\n    pg %>% html_nodes(xpath=sprintf(\"\/\/table\/tr\/td[3]\/*\/br[%d]\/following-sibling::text()\", i)) %>%\r\n      html_text() %>%\r\n      str_replace_all(\"(^[[:space:]]*|[[:space:]]*$|\\\\n)\", \"\") %>%\r\n      ifelse(.==\"?\", NA, .) -> reg\r\n\r\n    data_frame(type=typ, registration=reg)\r\n\r\n  }))\r\n\r\n  # get fatalities\r\n\r\n  pg %>% html_nodes(\"table > tr > td:nth-child(4)\") %>%\r\n    html_text() %>%\r\n    str_match_all(\"([[:digit:]]+)\/([[:digit:]]+)\\\\(([[:digit:]]+)\\\\)\") %>%\r\n    lapply(function(x) {\r\n      data_frame(aboard=as.numeric(x[2]), fatalities=as.numeric(x[3]), ground=as.numeric(x[4]))\r\n    }) %>%\r\n    bind_rows %>% tail(-1) -> afg\r\n\r\n  bind_cols(date, loc_op, type_reg, afg)\r\n\r\n}<\/pre>\n<p>While that gets one year, it&#8217;s super-simple to get all crashes since 1950:<\/p>\n<pre lang=\"rsplus\">crashes <- bind_rows(pblapply(1950:2015, get_data))<\/pre>\n<p>Yep. That's it. Now `crashes` contains a `data.frame` (well, `tbl_df`) of all the crashes since 1950, ready for further analysis.<\/p>\n<p>For the class I'm teaching, I'll be extending this to grab the extra details for each crash link and then performing more data science-y operations.<\/p>\n<p>If you've got any streamlining tips or alternate ways to handle the scraping Hadleyverse-style please drop a note in the comments. Also, definitely check out Phil's great solution, especially to compare it to this new version.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I saw a fly-by `#rstats` mention of more airplane accident data on &#8212; of all places &#8212; LinkedIn (email) today which took me to a [GitHub repo](https:\/\/github.com\/philjette\/CrashData) by @philjette. It seems there&#8217;s a [web site](http:\/\/www.planecrashinfo.com\/) (run by what seems to be a single human) that tracks plane crashes. 
Here&#8217;s a tweet from @philjette announcing it: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[677,91],"tags":[810],"class_list":["post-3374","post","type-post","status-publish","format-standard","hentry","category-data-analysis-2","category-r","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>More Airline Crashes via the Hadleyverse - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"More Airline Crashes via the Hadleyverse - rud.is\" \/>\n<meta property=\"og:description\" content=\"I saw a fly-by `#rstats` mention of more airplane accident data on &#8212; of all places &#8212; LinkedIn (email) today which took me to a [GitHub repo](https:\/\/github.com\/philjette\/CrashData) by @philjette. It seems there&#8217;s a [web site](http:\/\/www.planecrashinfo.com\/) (run by what seems to be a single human) that tracks plane crashes. 
Here&#8217;s a tweet from @philjette announcing it: [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2015-04-01T01:36:08+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T21:44:04+00:00\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"More Airline Crashes via the Hadleyverse\",\"datePublished\":\"2015-04-01T01:36:08+00:00\",\"dateModified\":\"2018-03-07T21:44:04+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/\"},\"wordCount\":580,\"commentCount\":10,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"keywords\":[\"post\"],\"articleSection\":[\"Data 
Analysis\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/\",\"name\":\"More Airline Crashes via the Hadleyverse - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"datePublished\":\"2015-04-01T01:36:08+00:00\",\"dateModified\":\"2018-03-07T21:44:04+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/03\\\/31\\\/more-airline-crashes-via-the-hadleyverse\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"More Airline Crashes via the Hadleyverse\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"More Airline Crashes via the Hadleyverse - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/","og_locale":"en_US","og_type":"article","og_title":"More Airline Crashes via the Hadleyverse - rud.is","og_description":"I saw a fly-by `#rstats` mention of more airplane accident data on &#8212; of all places &#8212; LinkedIn (email) today which took me to a [GitHub repo](https:\/\/github.com\/philjette\/CrashData) by @philjette. It seems there&#8217;s a [web site](http:\/\/www.planecrashinfo.com\/) (run by what seems to be a single human) that tracks plane crashes. Here&#8217;s a tweet from @philjette announcing it: [&hellip;]","og_url":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/","og_site_name":"rud.is","article_published_time":"2015-04-01T01:36:08+00:00","article_modified_time":"2018-03-07T21:44:04+00:00","author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"More Airline Crashes via the Hadleyverse","datePublished":"2015-04-01T01:36:08+00:00","dateModified":"2018-03-07T21:44:04+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/"},"wordCount":580,"commentCount":10,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"keywords":["post"],"articleSection":["Data Analysis","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/","url":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/","name":"More Airline Crashes via the Hadleyverse - rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"datePublished":"2015-04-01T01:36:08+00:00","dateModified":"2018-03-07T21:44:04+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2015\/03\/31\/more-airline-crashes-via-the-hadleyverse\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"More Airline Crashes via the 
Hadleyverse"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-Sq","jetpack_likes_enabled":true,"jetpack-related-posts":[],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3374","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=3374"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3374\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=3374"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=3374"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=3374"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}