{"id":5854,"date":"2017-04-30T07:27:42","date_gmt":"2017-04-30T12:27:42","guid":{"rendered":"https:\/\/rud.is\/b\/?p=5854"},"modified":"2018-03-07T17:18:36","modified_gmt":"2018-03-07T22:18:36","slug":"r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/","title":{"rendered":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles"},"content":{"rendered":"<p>Once I realized that my planned, larger post would not come to fruition today I took the R\u2076 post (i.e. &#8220;minimal expository, keen focus&#8221;) route, prompted by a Twitter discussion with some R mates who needed to convert &#8220;lightly formatted&#8221; Microsoft Word (<code>docx<\/code>) documents to markdown. Something like this:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"5855\" data-permalink=\"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/simple\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/simple.png?fit=936%2C742&amp;ssl=1\" data-orig-size=\"936,742\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"simple\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/simple.png?fit=510%2C404&amp;ssl=1\" 
src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/simple.png?resize=468%2C371&#038;ssl=1\" alt=\"\" width=\"468\" height=\"371\" class=\"aligncenter wp-image-5855\" \/><\/p>\n<p>to:<\/p>\n<pre id=\"pandoc-01\"><code class=\"language-markdown\">Does pandoc work?\r\n=================\r\n\r\nSimple document with **bold** and *italics*.<\/code><\/pre>\n<p>This is definitely a job that <a href=\"http:\/\/pandoc.org\/\"><code>pandoc<\/code><\/a> can handle.<\/p>\n<p><code>pandoc<\/code> is a <a href=\"https:\/\/www.haskell.org\/\">Haskell<\/a> (yes, Haskell) program created by <a href=\"http:\/\/johnmacfarlane.net\/\">John MacFarlane<\/a> and is an amazing tool for transcoding documents. And, if you&#8217;re a &#8220;modern&#8221; R\/RStudio user, you likely use it every day because it&#8217;s ultimately what powers <code>rmarkdown<\/code> \/ <code>knitr<\/code>.<\/p>\n<p>Yes, you read that correctly. Your beautiful PDF, Word and HTML R reports are powered by, and would not be possible without, Haskell.<\/p>\n<p>Doing the aforementioned conversion from <code>docx<\/code> to markdown is super-simple from R:<\/p>\n<pre id=\"pandoc-02\"><code class=\"language-r\">rmarkdown::pandoc_convert(&quot;simple.docx&quot;, &quot;markdown&quot;, output=&quot;simple.md&quot;)<\/code><\/pre>\n<p>Read the help for <code>rmarkdown::pandoc_convert()<\/code> as well as the very thorough and helpful documentation over at <a href=\"http:\/\/pandoc.org\"><code>pandoc.org<\/code><\/a> to see the power available at your command.<\/p>\n<h3>Just One More Thing<\/h3>\n<p>This section technically violates the R\u2076 principle, so you can stop reading if you&#8217;re a purist :-)<\/p>\n<p>There&#8217;s a neat, not-on-CRAN package by Fran\u00e7ois Keck called <code>subtools<\/code> (<a href=\"https:\/\/github.com\/fkeck\/subtools\">https:\/\/github.com\/fkeck\/subtools<\/a>) which can slice, dice and reformat digital content subtitles. 
There are <a href=\"https:\/\/en.wikipedia.org\/wiki\/Comparison_of_video_player_software#Subtitle_ability\">multiple formats<\/a> for these subtitle files and <code>subtools<\/code> seems to be able to handle them all.<\/p>\n<p>There was a post (earlier in April) about <a href=\"http:\/\/datameetsmedia.com\/ranking-the-negativity-of-black-mirror-episodes-with-sentiment-analysis\/\">Ranking the Negativity of Black Mirror Episodes<\/a>. That post uses Python and I&#8217;ve never had time to fully replicate it in R.<\/p>\n<p>Here&#8217;s a snippet (sans expository) that can get you started pulling subtitles into R and <code>tidytext<\/code>. I would have written scraper code but the various subtitle aggregation sites make that a task suited for something like my <a href=\"https:\/\/github.com\/hrbrmstr\/splashr\"><code>splashr<\/code><\/a> package and I just had no cycles to write the code. So, I grabbed the first season of &#8220;The Flash&#8221; and used the Bing sentiment lexicon from <code>tidytext<\/code> to see how the season looked.<\/p>\n<p>The overall scoring for a given episode is naive and can definitely be improved upon.<\/p>\n<p>Definitely drop a link to anything you create in the comments!<\/p>\n<pre id=\"pandoc-03\"><code class=\"language-r\"># devtools::install_github(&quot;fkeck\/subtools&quot;)\r\n\r\nlibrary(subtools)\r\nlibrary(tidytext)\r\nlibrary(hrbrthemes)\r\nlibrary(tidyverse)\r\n\r\ndata(stop_words)\r\n\r\n# sentiment lexicons via tidytext\r\nbing &lt;- get_sentiments(&quot;bing&quot;)\r\nafinn &lt;- get_sentiments(&quot;afinn&quot;)\r\n\r\n# one .srt file per episode of season 1\r\nfils &lt;- list.files(&quot;flash\/01&quot;, pattern = &quot;srt$&quot;, full.names = TRUE)\r\n\r\npb &lt;- progress_estimated(length(fils))\r\n\r\n# read each episode, tokenize it and keep only words found in the lexicons\r\nmap_df(seq_along(fils), ~{\r\n\r\n  pb$tick()$print()\r\n\r\n  read.subtitles(fils[.x]) %&gt;%\r\n    sentencify() %&gt;%\r\n    .$subtitles %&gt;%\r\n    unnest_tokens(word, Text) %&gt;%\r\n    anti_join(stop_words, by=&quot;word&quot;) %&gt;%\r\n    inner_join(bing, by=&quot;word&quot;) %&gt;%\r\n    
inner_join(afinn, by=&quot;word&quot;) %&gt;%\r\n    mutate(season = 1, ep = .x)\r\n\r\n}) %&gt;% as_tibble() -&gt; season_sentiments\r\n\r\n\r\n# group by episode so pct is the share of words within each episode\r\ncount(season_sentiments, ep, sentiment) %&gt;%\r\n  group_by(ep) %&gt;%\r\n  mutate(pct = n\/sum(n),\r\n         pct = ifelse(sentiment == &quot;negative&quot;, -pct, pct)) %&gt;%\r\n  ungroup() -&gt; bing_sent\r\n\r\nggplot() +\r\n  geom_ribbon(data = filter(bing_sent, sentiment==&quot;positive&quot;),\r\n              aes(ep, ymin=0, ymax=pct, fill=sentiment), alpha=3\/4) +\r\n  geom_ribbon(data = filter(bing_sent, sentiment==&quot;negative&quot;),\r\n              aes(ep, ymin=0, ymax=pct, fill=sentiment), alpha=3\/4) +\r\n  scale_x_continuous(expand=c(0,0.5), breaks=seq(1, 23, 2)) +\r\n  scale_y_continuous(expand=c(0,0), limits=c(-1,1), breaks=seq(-1, 1, 0.5),\r\n                     labels=c(&quot;100%\\nnegative&quot;, &quot;50%&quot;, &quot;0&quot;, &quot;50%&quot;, &quot;positive\\n100%&quot;)) +\r\n  labs(x=&quot;Season 1 Episode&quot;, y=NULL, title=&quot;The Flash \u2014 Season 1&quot;,\r\n       subtitle=&quot;Sentiment balance per episode&quot;) +\r\n  scale_fill_ipsum(name=&quot;Sentiment&quot;) +\r\n  guides(fill = guide_legend(reverse=TRUE)) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;) +\r\n  theme(axis.text.y=element_text(vjust=c(0, 0.5, 0.5, 0.5, 1)))<\/code><\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"5866\" data-permalink=\"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/flash\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&amp;ssl=1\" data-orig-size=\"1504,806\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"flash\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=510%2C273&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?resize=510%2C273&#038;ssl=1\" alt=\"\" width=\"510\" height=\"273\" class=\"aligncenter size-full wp-image-5866\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Once I realized that my planned, larger post would not come to fruition today I took the R\u2076 post (i.e. &#8220;minimal expository, keen focus&#8221;) route, prompted by a Twitter discussion with some R mates who needed to convert &#8220;lightly formatted&#8221; Microsoft Word (docx) documents to markdown. 
Something like this: to: This is definitely a job [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":5866,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":"","jetpack_post_was_ever_published":false},"categories":[91],"tags":[788,789,810,787],"class_list":["post-5854","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-r","tag-markdown","tag-pandoc","tag-post","tag-r6"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.0 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2017\/04\/30\/r\u2076-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - rud.is\" \/>\n<meta property=\"og:description\" content=\"Once I realized that my planned, larger post would not come to fruition today I took the R\u2076 post (i.e. &#8220;minimal expository, keen focus&#8221;) route, prompted by a Twitter discussion with some R mates who needed to convert &#8220;lightly formatted&#8221; Microsoft Word (docx) documents to markdown. 
Something like this: to: This is definitely a job [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2017\/04\/30\/r\u2076-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2017-04-30T12:27:42+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T22:18:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"1504\" \/>\n\t<meta property=\"og:image:height\" content=\"806\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles\",\"datePublished\":\"2017-04-30T12:27:42+00:00\",\"dateModified\":\"2018-03-07T22:18:36+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/\"},\"wordCount\":375,\"commentCount\":6,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/04\\\/flash.png?fit=1504%2C806&ssl=1\",\"keywords\":[\"markdown\",\"pandoc\",\"post\",\"r6\"],\"articleSection\":[\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-fro
m-r-a-neat-package-for-reading-subtitles\\\/\",\"name\":\"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/04\\\/flash.png?fit=1504%2C806&ssl=1\",\"datePublished\":\"2017-04-30T12:27:42+00:00\",\"dateModified\":\"2018-03-07T22:18:36+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/04\\\/flash.png?fit=1504%2C806&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/04\\\/flash.png?fit=1504%2C806&ssl=1\",\"width\":1504,\"height\":806},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/04\\\/30\\\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"R\u2076 \u2014 Using pandoc 
from R + A Neat Package For Reading Subtitles\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2017\/04\/30\/r\u2076-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/","og_locale":"en_US","og_type":"article","og_title":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - rud.is","og_description":"Once I realized that my planned, larger post would not come to fruition today I took the R\u2076 post (i.e. &#8220;minimal expository, keen focus&#8221;) route, prompted by a Twitter discussion with some R mates who needed to convert &#8220;lightly formatted&#8221; Microsoft Word (docx) documents to markdown. Something like this: to: This is definitely a job [&hellip;]","og_url":"https:\/\/rud.is\/b\/2017\/04\/30\/r\u2076-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/","og_site_name":"rud.is","article_published_time":"2017-04-30T12:27:42+00:00","article_modified_time":"2018-03-07T22:18:36+00:00","og_image":[{"width":1504,"height":806,"url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","type":"image\/png"}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles","datePublished":"2017-04-30T12:27:42+00:00","dateModified":"2018-03-07T22:18:36+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/"},"wordCount":375,"commentCount":6,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","keywords":["markdown","pandoc","post","r6"],"articleSection":["R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/","url":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/","name":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","datePublished":"2017-04-30T12:27:42+00:00","dateModified":"2018-03-07T22:18:36+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","width":1504,"height":806},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2017\/04\/30\/r%e2%81%b6-using-pandoc-from-r-a-neat-package-for-reading-subtitles\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"R\u2076 \u2014 Using pandoc from R + A Neat Package For Reading Subtitles"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/04\/flash.png?fit=1504%2C806&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1wq","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":9579,"url":"https:\/\/rud.is\/b\/2018\/04\/12\/convert-epub-to-text-for-processing-in-r\/","url_meta":{"origin":5854,"position":0},"title":"Convert epub to Text for Processing in R","author":"hrbrmstr","date":"2018-04-12","format":false,"excerpt":"@RMHoge asked the following on Twitter: Hello #rstats hyve mind! Is there a package that reads epub into R? 
I can not find any, I now convert to text and parse the text but you sort of lose the structure of the text. Pinging @dataandme @hrbrmstr\u2014 Roel (@RMHoge) April 12,\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4867,"url":"https:\/\/rud.is\/b\/2017\/01\/10\/knit-directly-to-jupyter-notebooks-from-rstudio\/","url_meta":{"origin":5854,"position":1},"title":"Knit directly to jupyter notebooks from RStudio","author":"hrbrmstr","date":"2017-01-10","format":false,"excerpt":"Did you know that you can completely replace the \"knitting\" engine in R Markdown documents? Well, you can! Why would you want to do this? Well, in the case of this post, to commit the unpardonable sin of creating a clunky jupyter notebook from a pristine Rmd file. I'm definitely\u2026","rel":"","context":"In &quot;Python&quot;","block_context":{"text":"Python","link":"https:\/\/rud.is\/b\/category\/python-2\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11812,"url":"https:\/\/rud.is\/b\/2019\/01\/24\/quick-hit-automating-production-graphics-uploads-in-r-markdown-documents-with-googledrive\/","url_meta":{"origin":5854,"position":2},"title":"Quick Hit: Automating Production Graphics Uploads in R Markdown Documents with googledrive","author":"hrbrmstr","date":"2019-01-24","format":false,"excerpt":"As someone who measures all kinds of things on the internet as part of his $DAYJOB, I can say with some authority that huge swaths of organizations are using cloud-services such as Google Apps, Dropbox and Office 365 as part of their business process workflows. 
For me, one regular component\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/gdrive-auto-upload.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/gdrive-auto-upload.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/gdrive-auto-upload.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/gdrive-auto-upload.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/gdrive-auto-upload.png?resize=1050%2C600&ssl=1 3x"},"classes":[]},{"id":3914,"url":"https:\/\/rud.is\/b\/2016\/02\/04\/alternate-r-markdown-templates\/","url_meta":{"origin":5854,"position":3},"title":"Alternate R Markdown Templates","author":"hrbrmstr","date":"2016-02-04","format":false,"excerpt":"The `knitr`\/R markdown system is a great way to organize reports and analyses. However, the built-in ones (that come with RStudio\/the `rmarkdown` package) rely on Bootstrap and also use jQuery. 
There's nothing wrong with that, but the generated standalone HTML documents (which are a great way to distribute reports) don't\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12094,"url":"https:\/\/rud.is\/b\/2019\/03\/17\/handling-sharing-pcaps-like-a-boss-with-packettotal\/","url_meta":{"origin":5854,"position":4},"title":"Handling &#038; Sharing PCAPs Like a Boss with PacketTotal","author":"hrbrmstr","date":"2019-03-17","format":false,"excerpt":"The fine folks over at @PacketTotal bequeathed an API token on me so I cranked out an R package for it to enable more dynamic investigations work (RStudio makes for an amazing incident responder investigations console given that you can script in multiple languages, code in C[++], and write documentation\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4040,"url":"https:\/\/rud.is\/b\/2016\/03\/04\/capturing-wild-widgets-with-webshot\/","url_meta":{"origin":5854,"position":5},"title":"Capturing wild widgets with webshot","author":"hrbrmstr","date":"2016-03-04","format":false,"excerpt":"NOTE: you won't need to use this function if you use the [development version](https:\/\/github.com\/yihui\/knitr) of `knitr` Winston Chang released his [`webshot`](https:\/\/github.com\/wch\/webshot) package to CRAN this past week. The package wraps the immensely useful [`phantomjs`](http:\/\/phantomjs.org\/) utility and makes it dirt simple to capture whole or partial web pages in R. 
One\u2026","rel":"","context":"In &quot;htmlwidgets&quot;","block_context":{"text":"htmlwidgets","link":"https:\/\/rud.is\/b\/category\/htmlwidgets\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/5854","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=5854"}],"version-history":[{"count":14,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/5854\/revisions"}],"predecessor-version":[{"id":8634,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/5854\/revisions\/8634"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/5866"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=5854"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=5854"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=5854"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}