

{"id":7334,"date":"2017-11-29T18:16:55","date_gmt":"2017-11-29T23:16:55","guid":{"rendered":"https:\/\/rud.is\/b\/?p=7334"},"modified":"2018-03-10T07:53:58","modified_gmt":"2018-03-10T12:53:58","slug":"sentiment-analysis-of-a-christmas-carol","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/","title":{"rendered":"Sentiment Analysis of &#8220;A Christmas Carol&#8221;"},"content":{"rendered":"<p>Our family has been reading, listening to and watching &#8220;A Christmas Carol&#8221; for just about 30 years now. I got it into my crazy noggin to perform a sentiment analysis on it the other day and tweeted out the results, but a large chunk of the R community is not on Twitter and it would be good to get a holiday-themed post or two up for the season.<\/p>\n<p>One reason I embarked on this endeavour is that @juliasilge &amp; @drob made it so gosh darn easy to do so with:<\/p>\n<p><a href=\"https:\/\/www.amazon.com\/Text-Mining-R-Tidy-Approach\/dp\/1491981652\/ref=as_li_ss_tl?ie=UTF8&amp;linkCode=sl1&amp;tag=rudisdotnet-20&amp;linkId=fb41e5bdeed11cf15a959bd9d8af2f42\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"7338\" data-permalink=\"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/51qe8dw8zkl-_sx379_bo1204203200_\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/51qE8dW8ZkL._SX379_BO1204203200_.jpg?fit=381%2C499&amp;ssl=1\" data-orig-size=\"381,499\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Tidy 
Text Mining with R\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/51qE8dW8ZkL._SX379_BO1204203200_.jpg?fit=381%2C499&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/51qE8dW8ZkL._SX379_BO1204203200_.jpg?resize=381%2C499&#038;ssl=1\" alt=\"\" width=\"381\" height=\"499\" class=\"aligncenter size-full wp-image-7338\" \/><\/a><\/p>\n<p>(btw: That makes an <em>excellent<\/em> <a href=\"https:\/\/www.amazon.com\/Text-Mining-R-Tidy-Approach\/dp\/1491981652\/ref=as_li_ss_tl?ie=UTF8&amp;linkCode=sl1&amp;tag=rudisdotnet-20&amp;linkId=fb41e5bdeed11cf15a959bd9d8af2f42\">holiday gift<\/a> for the data scientist[s] in your life.)<\/p>\n<p>Let us begin!<\/p>\n<h2>STAVE I: hrbrmstr&#8217;s Code<\/h2>\n<p>We need the text of this book to work with and thankfully it&#8217;s long been in the public domain. As @drob noted, we can use the <code>gutenbergr<\/code> package to retrieve it. We&#8217;ll use an RStudio project structure for this and cache the results locally to avoid burning bandwidth:<\/p>\n<pre id=\"stave01\"><code class=\"language-r\">library(rprojroot)\r\nlibrary(gutenbergr)\r\nlibrary(hrbrthemes)\r\nlibrary(stringi)\r\nlibrary(tidytext)\r\nlibrary(tidyverse)\r\n\r\nrt &lt;- find_rstudio_root_file()\r\n\r\ncarol_rds &lt;- file.path(rt, &quot;data&quot;, &quot;carol.rds&quot;)\r\n\r\nif (!file.exists(carol_rds)) {\r\n  carol_df &lt;- gutenberg_download(&quot;46&quot;)\r\n  write_rds(carol_df, carol_rds)\r\n} else {\r\n  carol_df &lt;- read_rds(carol_rds)\r\n}<\/code><\/pre>\n<p>How did I know to use <code>46<\/code>? 
We can use <code>gutenberg_works()<\/code> to get to that info:<\/p>\n<pre id=\"stave01a\"><code class=\"language-r\">gutenberg_works(author==&quot;Dickens, Charles&quot;)\r\n## # A tibble: 74 x 8\r\n##    gutenberg_id                                                                                    title\r\n##           &lt;int&gt;                                                                                    &lt;chr&gt;\r\n##  1           46                             A Christmas Carol in Prose; Being a Ghost Story of Christmas\r\n##  2           98                                                                     A Tale of Two Cities\r\n##  3          564                                                               The Mystery of Edwin Drood\r\n##  4          580                                                                      The Pickwick Papers\r\n##  5          588                                                                  Master Humphrey&#039;s Clock\r\n##  6          644                                                  The Haunted Man and the Ghost&#039;s Bargain\r\n##  7          650                                                                      Pictures from Italy\r\n##  8          653 &quot;The Chimes\\r\\nA Goblin Story of Some Bells That Rang an Old Year out and a New Year In&quot;\r\n##  9          675                                                                           American Notes\r\n## 10          678                                          The Cricket on the Hearth: A Fairy Tale of Home\r\n## # ... with 64 more rows, and 6 more variables: author &lt;chr&gt;, gutenberg_author_id &lt;int&gt;, language &lt;chr&gt;,\r\n## #   gutenberg_bookshelf &lt;chr&gt;, rights &lt;chr&gt;, has_text &lt;lgl&gt;<\/code><\/pre>\n<h2>STAVE II: The first of three wrangles<\/h2>\n<p>We&#8217;re eventually going to make a ggplot2 faceted chart of the sentiments by paragraphs in each stave (chapter). 
I wanted nicer titles for the facets so we&#8217;ll clean up the stave titles first:<\/p>\n<pre id=\"stave02\"><code class=\"language-r\">#&#039; Convenience only\r\ncarol_txt &lt;- carol_df$text\r\n\r\n# Just want the chapters (staves)\r\n# `1:n - 1` evaluates to 0:(n-1), so the negated index drops every line before the first stave heading\r\ncarol_txt &lt;- carol_txt[-(1:(which(grepl(&quot;STAVE I:&quot;, carol_txt)))-1)]\r\n\r\n#&#039; We&#039;ll need this later to make prettier facet titles\r\ndata_frame(\r\n  stave = 1:5,\r\n  title = sprintf(&quot;Stave %s: %s&quot;, stave, carol_txt[stri_detect_fixed(carol_txt, &quot;STAVE&quot;)] %&gt;%\r\n    stri_replace_first_regex(&quot;STAVE [[:alpha:]]{1,3}: &quot;, &quot;&quot;) %&gt;%\r\n    stri_trans_totitle())\r\n) -&gt; stave_titles<\/code><\/pre>\n<p><code>stri_trans_totitle()<\/code> is a super-handy function and all we&#8217;re doing here is extracting the stave titles and doing a small transformation. There are scads of ways to do this, so don&#8217;t get stuck on this example. Try out other ways of doing this munging.<\/p>\n<p>You&#8217;ll also see that I made sure we started at the first stave break rather than including the title bits in the analysis.<\/p>\n<p>Now, we need to prep the text for text analysis.<\/p>\n<h2>STAVE III: The second of three wrangles<\/h2>\n<p>There are other text mining packages and processes in R. I&#8217;m using <code>tidytext<\/code> because it takes care of so many details for you and does so elegantly. I was also at the rOpenSci Unconf where the idea was spawned &amp; worked on and I&#8217;m glad it blossomed into such a great package and a book!<\/p>\n<p>Since we (I) want to do the analysis by stave &amp; paragraph, let&#8217;s break the text into those chunks. 
Note that I&#8217;m doing an extra break by sentence in the event folks out there want to replicate this work but do so on a more granular level.<\/p>\n<pre id=\"stave03\"><code class=\"language-r\">#&#039; Break the text up into chapters, paragraphs, sentences, and words,\r\n#&#039; preserving the hierarchy so we can use it later.\r\ndata_frame(txt = carol_txt) %&gt;%\r\n  unnest_tokens(chapter, txt, token=&quot;regex&quot;, pattern=&quot;STAVE [[:alpha:]]{1,3}: [[:alpha:] [:punct:]]+&quot;) %&gt;%\r\n  mutate(stave = 1:n()) %&gt;%\r\n  unnest_tokens(paragraph, chapter, token = &quot;paragraphs&quot;) %&gt;% \r\n  group_by(stave) %&gt;%\r\n  mutate(para = 1:n()) %&gt;% \r\n  ungroup() %&gt;%\r\n  unnest_tokens(sentence, paragraph, token=&quot;sentences&quot;) %&gt;% \r\n  group_by(stave, para) %&gt;%\r\n  mutate(sent = 1:n()) %&gt;% \r\n  ungroup() %&gt;%\r\n  unnest_tokens(word, sentence) -&gt; carol_tokens\r\n\r\ncarol_tokens\r\n## # A tibble: 28,710 x 4\r\n##    stave  para  sent   word\r\n##    &lt;int&gt; &lt;int&gt; &lt;int&gt;  &lt;chr&gt;\r\n##  1     1     1     1 marley\r\n##  2     1     1     1    was\r\n##  3     1     1     1   dead\r\n##  4     1     1     1     to\r\n##  5     1     1     1  begin\r\n##  6     1     1     1   with\r\n##  7     1     1     1  there\r\n##  8     1     1     1     is\r\n##  9     1     1     1     no\r\n## 10     1     1     1  doubt\r\n## # ... with 28,700 more rows<\/code><\/pre>\n<p>By indexing each hierarchy level, we have the flexibility to do all sorts of structured analyses just by choosing grouping combinations.<\/p>\n<h2>STAVE IV: The third of three wrangles<\/h2>\n<p>Now, we need to layer in some sentiments and do some basic sentiment calculations. Many of these sentiment-al posts (including this one) take a naive approach, doing basic word matching and looking only at 1-grams. One reason I didn&#8217;t go further was to make the code accessible to new R folk (since I primarily blog for new R folk :-). 
I&#8217;m prepping some 2018 posts with more involved text analysis themes and will likely add some complexity then with other texts.<\/p>\n<pre id=\"stave04\"><code class=\"language-r\">#&#039; Retrieve sentiments and compute them.\r\n#&#039;\r\n#&#039; I named the column `index` rather than just using `para` since that makes it easier to reuse\r\n#&#039; this block (which I&#039;m not doing but thought I might).\r\ninner_join(carol_tokens, get_sentiments(&quot;nrc&quot;), &quot;word&quot;) %&gt;%\r\n  count(stave, index = para, sentiment) %&gt;%\r\n  spread(sentiment, n, fill = 0) %&gt;%\r\n  mutate(sentiment = positive - negative) %&gt;%\r\n  left_join(stave_titles, &quot;stave&quot;) -&gt; carol_with_sent<\/code><\/pre>\n<h2>STAVE V: The end of it<\/h2>\n<p>Now, we just need to do some really basic ggplot-ing to get to our desired result:<\/p>\n<pre id=\"stave05\"><code class=\"language-r\">ggplot(carol_with_sent) +\r\n  geom_segment(aes(index, sentiment, xend=index, yend=0, color=title), size=0.33) +\r\n  scale_x_comma(limits=range(carol_with_sent$index)) +\r\n  scale_y_comma() +\r\n  scale_color_ipsum() +\r\n  facet_wrap(~title, scales=&quot;free_x&quot;, ncol=5) +\r\n  labs(x=NULL, y=&quot;Sentiment&quot;,\r\n       title=&quot;Sentiment Analysis of A Christmas Carol&quot;,\r\n       subtitle=&quot;By stave &amp; \u00b6&quot;,\r\n       caption=&quot;Humbug!&quot;) +\r\n  theme_ipsum_rc(grid=&quot;Y&quot;, axis_text_size = 8, strip_text_face = &quot;italic&quot;, strip_text_size = 10.5) +\r\n  theme(legend.position=&quot;none&quot;)<\/code><\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"7343\" data-permalink=\"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/plot_zoom_png-3-2\/\" 
data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&amp;ssl=1\" data-orig-size=\"2932,876\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"plot_zoom_png-3\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=510%2C152&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?resize=510%2C152&#038;ssl=1\" alt=\"\" width=\"510\" height=\"152\" class=\"aligncenter size-full wp-image-7343\" \/><\/a><\/p>\n<p>You&#8217;ll want to tap\/click on that to make it bigger.<\/p>\n<p>Despite using a naive analysis, I think it tracks pretty well with the flow of the book.<\/p>\n<p><strong>Stave one<\/strong> is quite bleak. Marley is morose and frightening. There is no joy apart from Fred&#8217;s brief appearance.<\/p>\n<p>The truly terrible (-10 sentiment) paragraph also makes sense:<\/p>\n<blockquote><p>\n  <em>Marley\u2019s face. It was not in impenetrable shadow as the other objects in the yard were, but had a dismal light about it, like a bad lobster in a dark cellar. It was not angry or ferocious, but looked at Scrooge as Marley used to look: with ghostly spectacles turned up on its ghostly forehead. The hair was curiously stirred, as if by breath or hot air; and, though the eyes were wide open, they were perfectly motionless. 
That, and its livid colour, made it horrible; but its horror seemed to be in spite of the face and beyond its control, rather than a part of its own expression.<\/em>\n<\/p><\/blockquote>\n<p>(I got to that via this snippet which you can use as a template for finding the other significant sentiment points:)<\/p>\n<pre id=\"stave05a\"><code class=\"language-r\">filter(\r\n  carol_tokens, stave == 1,\r\n  para == filter(carol_with_sent, stave==1) %&gt;% \r\n    filter(sentiment == min(sentiment)) %&gt;% \r\n    pull(index)\r\n)<\/code><\/pre>\n<p><strong>Stave two<\/strong> (Christmas past) is all about Scrooge&#8217;s youth and includes details about Fezziwig&#8217;s party so the mostly-positive tone also makes sense.<\/p>\n<p><strong>Stave three<\/strong> (Christmas present) has the highest:<\/p>\n<blockquote><p>\n  <em>The Grocers\u2019! oh, the Grocers\u2019! nearly closed, with perhaps two shutters down, or one; but through those gaps such glimpses! It was not alone that the scales descending on the counter made a merry sound, or that the twine and roller parted company so briskly, or that the canisters were rattled up and down like juggling tricks, or even that the blended scents of tea and coffee were so grateful to the nose, or even that the raisins were so plentiful and rare, the almonds so extremely white, the sticks of cinnamon so long and straight, the other spices so delicious, the candied fruits so caked and spotted with molten sugar as to make the coldest lookers-on feel faint and subsequently bilious. 
Nor was it that the figs were moist and pulpy, or that the French plums blushed in modest tartness from their highly-decorated boxes, or that everything was good to eat and in its Christmas dress; but the customers were all so hurried and so eager in the hopeful promise of the day, that they tumbled up against each other at the door, crashing their wicker baskets wildly, and left their purchases upon the counter, and came running back to fetch them, and committed hundreds of the like mistakes, in the best humour possible; while the Grocer and his people were so frank and fresh that the polished hearts with which they fastened their aprons behind might have been their own, worn outside for general inspection, and for Christmas daws to peck at if they chose.<\/em>\n<\/p><\/blockquote>\n<p>and lowest (sentiment) points of the entire book:<\/p>\n<blockquote><p>\n  <em>And now, without a word of warning from the Ghost, they stood upon a bleak and desert moor, where monstrous masses of rude stone were cast about, as though it were the burial-place of giants; and water spread itself wheresoever it listed, or would have done so, but for the frost that held it prisoner; and nothing grew but moss and furze, and coarse rank grass. Down in the west the setting sun had left a streak of fiery red, which glared upon the desolation for an instant, like a sullen eye, and frowning lower, lower, lower yet, was lost in the thick gloom of darkest night.<\/em>\n<\/p><\/blockquote>\n<p><strong>Stave four<\/strong> (Christmas yet to come) is fairly middling. I had expected to see lower marks here. The standout negative sentiment paragraph (and the one that follows) are pretty dark, though:<\/p>\n<blockquote><p>\n  <em>They left the busy scene, and went into an obscure part of the town, where Scrooge had never penetrated before, although he recognised its situation, and its bad repute. 
The ways were foul and narrow; the shops and houses wretched; the people half-naked, drunken, slipshod, ugly. Alleys and archways, like so many cesspools, disgorged their offences of smell, and dirt, and life, upon the straggling streets; and the whole quarter reeked with crime, with filth, and misery.<\/em>\n<\/p><\/blockquote>\n<p>Finally, <strong>Stave five<\/strong> is both short and positive (<em>whew<\/em>!), which I heartily agree with!<\/p>\n<h2>FIN<\/h2>\n<p>The code is up on <a href=\"https:\/\/github.com\/hrbrmstr\/tidyscrooge\">GitHub<\/a> and I hope that it will inspire more folks to experiment with this fun (&amp; useful!) aspect of data science.<\/p>\n<p>Make sure to send links to anything you create and shoot over PRs for anything you think I got wrong.<\/p>\n<p>For those who celebrate Christmas, I hope you keep Christmas as well as or even better than old Scrooge. <em>&#8220;May that be truly said of us, and all of us! And so, as Tiny Tim observed, God bless Us, Every One!&#8221;<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our family has been reading, listening to and watching &#8220;A Christmas Carol&#8221; for just about 30 years now. 
I got it into my crazy noggin to perform a sentiment analysis on it the other day and tweeted out the results, but a large chunk of the R community is not on Twitter and it would [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":7343,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[764,91,804],"tags":[810],"class_list":["post-7334","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-wrangling","category-r","category-text-mining","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Sentiment Analysis of &quot;A Christmas Carol&quot; - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Sentiment Analysis of &quot;A Christmas Carol&quot; - rud.is\" \/>\n<meta property=\"og:description\" content=\"Our family has been reading, listening to and watching &#8220;A Christmas Carol&#8221; for just abt 30 years now. 
I got it into my crazy noggin to perform a sentiment analysis on it the other day and tweeted out the results, but a large chunk of the R community is not on Twitter and it would [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2017-11-29T23:16:55+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-10T12:53:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"2932\" \/>\n\t<meta property=\"og:image:height\" content=\"876\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Sentiment Analysis of &#8220;A Christmas Carol&#8221;\",\"datePublished\":\"2017-11-29T23:16:55+00:00\",\"dateModified\":\"2018-03-10T12:53:58+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/\"},\"wordCount\":1385,\"commentCount\":7,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/11\\\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1\",\"keywords\":[\"post\"],\"articleSection\":[\"data wrangling\",\"R\",\"text-mining\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/\",\"name\":\"Sentiment Analysis of \\\"A Christmas Carol\\\" - 
rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/11\\\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1\",\"datePublished\":\"2017-11-29T23:16:55+00:00\",\"dateModified\":\"2018-03-10T12:53:58+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/11\\\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/11\\\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1\",\"width\":2932,\"height\":876},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/11\\\/29\\\/sentiment-analysis-of-a-christmas-carol\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Sentiment Analysis of &#8220;A Christmas Carol&#8221;\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Sentiment Analysis of \"A Christmas Carol\" - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/","og_locale":"en_US","og_type":"article","og_title":"Sentiment Analysis of \"A Christmas Carol\" - rud.is","og_description":"Our family has been reading, listening to and watching &#8220;A Christmas Carol&#8221; for just abt 30 years now. I got it into my crazy noggin to perform a sentiment analysis on it the other day and tweeted out the results, but a large chunk of the R community is not on Twitter and it would [&hellip;]","og_url":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/","og_site_name":"rud.is","article_published_time":"2017-11-29T23:16:55+00:00","article_modified_time":"2018-03-10T12:53:58+00:00","og_image":[{"width":2932,"height":876,"url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","type":"image\/png"}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Sentiment Analysis of &#8220;A Christmas Carol&#8221;","datePublished":"2017-11-29T23:16:55+00:00","dateModified":"2018-03-10T12:53:58+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/"},"wordCount":1385,"commentCount":7,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","keywords":["post"],"articleSection":["data wrangling","R","text-mining"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/","url":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/","name":"Sentiment Analysis of \"A Christmas Carol\" - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","datePublished":"2017-11-29T23:16:55+00:00","dateModified":"2018-03-10T12:53:58+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","width":2932,"height":876},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2017\/11\/29\/sentiment-analysis-of-a-christmas-carol\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Sentiment Analysis of &#8220;A Christmas Carol&#8221;"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/11\/plot_zoom_png-3-1.png?fit=2932%2C876&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1Ui","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":2933,"url":"https:\/\/rud.is\/b\/2014\/02\/20\/using-twitter-as-a-data-source-for-monitoring-password-dumps\/","url_meta":{"origin":7334,"position":0},"title":"Using Twitter as a Data Source For Monitoring Password Dumps","author":"hrbrmstr","date":"2014-02-20","format":false,"excerpt":"I shot a quick post over at the [Data Driven Security blog](http:\/\/bit.ly\/1hyqJiT) explaining how to separate Twitter data gathering from R code via the Ruby `t` ([github repo](https:\/\/github.com\/sferik\/t)) command. Using `t` frees R code from having to be a Twitter processor and lets the analyst focus on analysis and visualization,\u2026","rel":"","context":"In &quot;Data Analysis&quot;","block_context":{"text":"Data Analysis","link":"https:\/\/rud.is\/b\/category\/data-analysis-2\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2782,"url":"https:\/\/rud.is\/b\/2013\/11\/13\/visual-anatomy-of-r-packages-used-in-data-driven-security\/","url_meta":{"origin":7334,"position":1},"title":"Visual Anatomy Of R Packages Used in Data Driven Security","author":"hrbrmstr","date":"2013-11-13","format":false,"excerpt":"Since @jayjacobs & I are down to the home stretch on Data Driven Security, I thought it would be interesting to do some post-writing pseudo-analyses of the book itself. 
I won't have exact page or word counts for a bit, but I wanted to see how many R packages we\u2026","rel":"","context":"In &quot;Data Analysis&quot;","block_context":{"text":"Data Analysis","link":"https:\/\/rud.is\/b\/category\/data-analysis-2\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2190,"url":"https:\/\/rud.is\/b\/2013\/02\/27\/follow-upresources-grc-t18-data-analysis-and-visualization-for-security-professionals-rsac\/","url_meta":{"origin":7334,"position":2},"title":"Follow up\/Resources :: GRC-T18 \u2013 Data Analysis and Visualization for Security Professionals #RSAC","author":"hrbrmstr","date":"2013-02-27","format":false,"excerpt":"Many thanks to all who attended the talk @jayjacobs & I gave at RSA on Tuesday, February 26, 2013. It was really great to be able to talk to so many of you afterwards as well. We've enumerated quite a bit of non-slide-but-in-presentation information that we wanted to aggregate into\u2026","rel":"","context":"In &quot;Charts &amp; Graphs&quot;","block_context":{"text":"Charts &amp; Graphs","link":"https:\/\/rud.is\/b\/category\/charts-graphs\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12510,"url":"https:\/\/rud.is\/b\/2019\/09\/14\/twitter-account-analysis-in-r\/","url_meta":{"origin":7334,"position":3},"title":"Twitter &#8220;Account Analysis&#8221; in R","author":"hrbrmstr","date":"2019-09-14","format":false,"excerpt":"This past week @propublica linked to a really spiffy resource for getting an overview of a Twitter user's profile and activity called accountanalysis. It has a beautiful interface that works as well on mobile as it does in a real browser. 
It also is fully interactive and supports cross-filtering (zoom\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":13120,"url":"https:\/\/rud.is\/b\/2021\/07\/20\/packet-maze-solving-a-cyberdefenders-pcap-puzzle-with-r-zeek-and-tshark\/","url_meta":{"origin":7334,"position":4},"title":"Packet Maze: Solving a CyberDefenders PCAP Puzzle with R, Zeek, and tshark","author":"hrbrmstr","date":"2021-07-20","format":false,"excerpt":"It was a rainy weekend in southern Maine and I really didn't feel like doing chores, so I was skimming through RSS feeds and noticed a link to a PacketMaze challenge in the latest This Week In 4n6. Since it's also been a while since I've done any serious content\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4460,"url":"https:\/\/rud.is\/b\/2016\/06\/19\/a-call-to-armslist-data-analysis\/","url_meta":{"origin":7334,"position":5},"title":"A Call to Arms[list] Data Analysis!","author":"hrbrmstr","date":"2016-06-19","format":false,"excerpt":"The NPR vis team contributed to a recent [story](http:\/\/n.pr\/1USSliN) about Armslist, a \"craigslist for guns\". Now, I'm neither pro-\"gun\" or anti-\"gun\" since this subject, like most heated ones, has more than two sides. What I _am_ is pro-*data*, and the U.S. 
Congress is so [deep in the pockets of the\u2026","rel":"","context":"In &quot;Charts &amp; Graphs&quot;","block_context":{"text":"Charts &amp; Graphs","link":"https:\/\/rud.is\/b\/category\/charts-graphs\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/per-day-1.png?fit=1200%2C750&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/per-day-1.png?fit=1200%2C750&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/per-day-1.png?fit=1200%2C750&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/per-day-1.png?fit=1200%2C750&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/per-day-1.png?fit=1200%2C750&ssl=1&resize=1050%2C600 3x"},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/7334","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=7334"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/7334\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/7343"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=7334"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=7334"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=7334"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}