

{"id":3642,"date":"2015-08-24T12:51:05","date_gmt":"2015-08-24T17:51:05","guid":{"rendered":"http:\/\/rud.is\/b\/?p=3642"},"modified":"2018-03-07T16:43:28","modified_gmt":"2018-03-07T21:43:28","slug":"new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/","title":{"rendered":"New Package &#8220;docxtractr&#8221; &#8211; Easily Extract Tables From Microsoft Word Docs"},"content":{"rendered":"<p><strong>UPDATE<\/strong>: `docxtractr` is now [on CRAN](https:\/\/cran.rstudio.com\/web\/packages\/docxtractr\/index.html)<\/p>\n<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/p>\n<p>This is more of a follow-up from [yesterday&#8217;s post](http:\/\/rud.is\/b\/2015\/08\/23\/using-r-to-get-data-out-of-word-docs\/). The hack and function in said post were fine, but they were limited to uniform tables and made you do more work than you had to. So, there&#8217;s now a `devtools`-installable package [on GitHub](https:\/\/github.com\/hrbrmstr\/docxtractr) that makes it way easier to get information about the tables in a Word document and extract them&mdash;uniform or not.<\/p>\n<p>There are plenty of examples in the GitHub README and also in the package examples, but I will show the basic functionality here.<\/p>\n<p>The package ships with four example Word documents, but we&#8217;ll work with the last one: `complex.docx`. 
It has five tables; the last two have varying columns and rows and look like this:<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"3643\" data-permalink=\"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/complex\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png?fit=401%2C333&amp;ssl=1\" data-orig-size=\"401,333\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"complex\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png?fit=401%2C333&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png?resize=401%2C333&#038;ssl=1\" alt=\"complex\" width=\"401\" height=\"333\" class=\"aligncenter size-full wp-image-3643\" \/><\/p>\n<p>Let&#8217;s read those two in:<\/p>\n<pre id=\"prism-r-code\"><code class=\"language-r\">library(docxtractr)\r\n\r\ncomplx &lt;- read_docx(system.file(&quot;examples\/complex.docx&quot;, package=&quot;docxtractr&quot;))\r\n\r\ndocx_tbl_count(complx)\r\n#&gt; [1] 5\r\n\r\ndocx_describe_tbls(complx)\r\n#&gt; Word document [\/Library\/Frameworks\/R.framework\/Versions\/3.2\/Resources\/library\/docxtractr\/examples\/complex.docx]\r\n#&gt; \r\n#&gt; Table 1\r\n#&gt;   total cells: 16\r\n#&gt;   row count  : 4\r\n#&gt;   uniform    : likely!\r\n#&gt;   has header : likely! 
=&gt; possibly [This, Is, A, Column]\r\n#&gt; \r\n#&gt; Table 2\r\n#&gt;   total cells: 12\r\n#&gt;   row count  : 4\r\n#&gt;   uniform    : likely!\r\n#&gt;   has header : likely! =&gt; possibly [Foo, Bar, Baz]\r\n#&gt; \r\n#&gt; Table 3\r\n#&gt;   total cells: 14\r\n#&gt;   row count  : 7\r\n#&gt;   uniform    : likely!\r\n#&gt;   has header : likely! =&gt; possibly [Foo, Bar]\r\n#&gt; \r\n#&gt; Table 4\r\n#&gt;   total cells: 11\r\n#&gt;   row count  : 4\r\n#&gt;   uniform    : unlikely =&gt; found differing cell counts (3, 2) across some rows \r\n#&gt;   has header : likely! =&gt; possibly [Foo, Bar, Baz]\r\n#&gt; \r\n#&gt; Table 5\r\n#&gt;   total cells: 21\r\n#&gt;   row count  : 7\r\n#&gt;   uniform    : likely!\r\n#&gt;   has header : unlikely\r\n\r\n\r\ndocx_extract_tbl(complx, 4, header=TRUE)\r\n#&gt; Source: local data frame [3 x 3]\r\n#&gt; \r\n#&gt;   Foo  Bar Baz\r\n#&gt; 1  Aa BbCc  NA\r\n#&gt; 2  Dd   Ee  Ff\r\n#&gt; 3  Gg   Hh  ii\r\n\r\ndocx_extract_tbl(complx, 5, header=TRUE)\r\n#&gt; Source: local data frame [6 x 3]\r\n#&gt; \r\n#&gt;    Foo Bar Baz\r\n#&gt; 1   Aa  Bb  Cc\r\n#&gt; 2   Dd  Ee  Ff\r\n#&gt; 3   Gg  Hh  Ii\r\n#&gt; 4 Jj88  Kk  Ll\r\n#&gt; 5       Uu  Ii\r\n#&gt; 6   Hh  Ii   h<\/code><\/pre>\n<p>It reads in &#8220;uniform&#8221; tables properly and will warn you if there is a header marked in Word but not asked for in the extraction.<\/p>\n<p>Next steps are to both allow specifying column types and try to guess column types (`readr` has some nice functions for this) and perhaps return more metadata (if possible).<\/p>\n<p>Feature requests &#038; bug reports are most welcome [on GitHub](https:\/\/github.com\/hrbrmstr\/docxtractr\/issues).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>UPDATE: `docxtractr` is now [on CRAN](https:\/\/cran.rstudio.com\/web\/packages\/docxtractr\/index.html) &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; This is more of a follow-up from [yesterday&#8217;s 
post](http:\/\/rud.is\/b\/2015\/08\/23\/using-r-to-get-data-out-of-word-docs\/). The hack and function in said post was fine, but it was limited to uniform tables and made you do more work than you had to. So, there&#8217;s now a `devtools`-installable package [on github](https:\/\/github.com\/hrbrmstr\/docxtractr) that makes it way easier [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[91,732],"tags":[810],"class_list":["post-3642","post","type-post","status-publish","format-standard","hentry","category-r","category-xml","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>New Pacakge &quot;docxtractr&quot; - Easily Extract Tables From Microsoft Word Docs - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"New Pacakge &quot;docxtractr&quot; - Easily Extract Tables From Microsoft Word Docs - rud.is\" \/>\n<meta property=\"og:description\" content=\"UPDATE: `docxtractr` is now [on 
CRAN](https:\/\/cran.rstudio.com\/web\/packages\/docxtractr\/index.html) &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; This is more of a follow-up from [yesterday&#8217;s post](http:\/\/rud.is\/b\/2015\/08\/23\/using-r-to-get-data-out-of-word-docs\/). The hack and function in said post was fine, but it was limited to uniform tables and made you do more work than you had to. So, there&#8217;s now a `devtools`-installable package [on github](https:\/\/github.com\/hrbrmstr\/docxtractr) that makes it way easier [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2015-08-24T17:51:05+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T21:43:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"New Pacakge &#8220;docxtractr&#8221; &#8211; Easily Extract Tables From Microsoft Word Docs\",\"datePublished\":\"2015-08-24T17:51:05+00:00\",\"dateModified\":\"2018-03-07T21:43:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/\"},\"wordCount\":242,\"commentCount\":4,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2015\\\/08\\\/complex.png\",\"keywords\":[\"post\"],\"articleSection\":[\"R\",\"xml\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-mic
rosoft-word-docs\\\/\",\"name\":\"New Pacakge \\\"docxtractr\\\" - Easily Extract Tables From Microsoft Word Docs - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2015\\\/08\\\/complex.png\",\"datePublished\":\"2015-08-24T17:51:05+00:00\",\"dateModified\":\"2018-03-07T21:43:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2015\\\/08\\\/complex.png?fit=401%2C333&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2015\\\/08\\\/complex.png?fit=401%2C333&ssl=1\",\"width\":401,\"height\":333},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2015\\\/08\\\/24\\\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"New Pacakge &#8220;docxtractr&#8221; &#8211; Easily Extract Tables 
From Microsoft Word Docs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"New Pacakge \"docxtractr\" - Easily Extract Tables From Microsoft Word Docs - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/","og_locale":"en_US","og_type":"article","og_title":"New Pacakge \"docxtractr\" - Easily Extract Tables From Microsoft Word Docs - rud.is","og_description":"UPDATE: `docxtractr` is now [on CRAN](https:\/\/cran.rstudio.com\/web\/packages\/docxtractr\/index.html) &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; This is more of a follow-up from [yesterday&#8217;s post](http:\/\/rud.is\/b\/2015\/08\/23\/using-r-to-get-data-out-of-word-docs\/). The hack and function in said post was fine, but it was limited to uniform tables and made you do more work than you had to. So, there&#8217;s now a `devtools`-installable package [on github](https:\/\/github.com\/hrbrmstr\/docxtractr) that makes it way easier [&hellip;]","og_url":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/","og_site_name":"rud.is","article_published_time":"2015-08-24T17:51:05+00:00","article_modified_time":"2018-03-07T21:43:28+00:00","og_image":[{"url":"https:\/\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png","type":"","width":"","height":""}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"New Pacakge &#8220;docxtractr&#8221; &#8211; Easily Extract Tables From Microsoft Word Docs","datePublished":"2015-08-24T17:51:05+00:00","dateModified":"2018-03-07T21:43:28+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/"},"wordCount":242,"commentCount":4,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png","keywords":["post"],"articleSection":["R","xml"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/","url":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/","name":"New Pacakge \"docxtractr\" - Easily Extract Tables From Microsoft Word Docs - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png","datePublished":"2015-08-24T17:51:05+00:00","dateModified":"2018-03-07T21:43:28+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png?fit=401%2C333&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2015\/08\/complex.png?fit=401%2C333&ssl=1","width":401,"height":333},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2015\/08\/24\/new-pacakge-docxtractr-easily-extract-tables-from-microsoft-word-docs\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"New Pacakge &#8220;docxtractr&#8221; &#8211; Easily Extract Tables From Microsoft Word Docs"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-WK","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":3633,"url":"https:\/\/rud.is\/b\/2015\/08\/23\/using-r-to-get-data-out-of-word-docs\/","url_meta":{"origin":3642,"position":0},"title":"Using R To Get Data *Out Of* Word Docs","author":"hrbrmstr","date":"2015-08-23","format":false,"excerpt":"NOTE: after reading this post head on over to this new one as it has wrapped this functionality (and more!) into a package. 
Also: docxtractr is now on CRAN This was asked on twitter recently: Is it possible to import data entered in MS Word into R - I have\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4547,"url":"https:\/\/rud.is\/b\/2016\/07\/24\/mid-year-r-packages-update-summary\/","url_meta":{"origin":3642,"position":1},"title":"Mid-year R Packages Update Summary","author":"hrbrmstr","date":"2016-07-24","format":false,"excerpt":"I been updating some existing packages and github-releasing new ones (before a CRAN push). Most are \"cyber\"-related, but there are some general purpose ones. Here's a quick overview: docxtractr (CRAN, now, v0.2.0) was initially designed to make it easy to get data tables out of MS Word (docx) documents. The\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6491,"url":"https:\/\/rud.is\/b\/2017\/09\/28\/sodd-stackoverflow-driven-development\/","url_meta":{"origin":3642,"position":2},"title":"SODD \u2014 StackOverflow Driven-Development","author":"hrbrmstr","date":"2017-09-28","format":false,"excerpt":"I occasionally hang out on StackOverflow and often use an answer as an opportunity to fill a package void for a particular need. docxtractr and qrencoder are two (of many) packages that were birthed from SO answers. 
I usually try to answer with inline code first then expand the functionality\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11746,"url":"https:\/\/rud.is\/b\/2019\/01\/10\/waffle-geoms-other-miscellaneous-in-development-package-updates\/","url_meta":{"origin":3642,"position":3},"title":"Waffle Geoms &#038; Other Miscellaneous In-Development Package Updates","author":"hrbrmstr","date":"2019-01-10","format":false,"excerpt":"More than just sergeant has been hacked on recently, so here's a run-down of various ? updates: waffle The square pie chart generating waffle? package now contains a nascent geom_waffle() so you can do things like this: library(hrbrthemes) library(waffle) library(tidyverse) tibble( parts = factor(rep(month.abb[1:3], 3), levels=month.abb[1:3]), values = c(10, 20,\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/fwg.png?fit=900%2C1200&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/fwg.png?fit=900%2C1200&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/fwg.png?fit=900%2C1200&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2019\/01\/fwg.png?fit=900%2C1200&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":8230,"url":"https:\/\/rud.is\/b\/2018\/02\/16\/pym-js-library-vulnerability-in-widgetframe-package\/","url_meta":{"origin":3642,"position":4},"title":"Pym.js Library Vulnerability in widgetframe Package","author":"hrbrmstr","date":"2018-02-16","format":false,"excerpt":"What's Up? 
The NPR Visuals Team created and maintains a javascript library that makes it super easy to embed iframes on web pages and have said documents still be responsive. The widgetframe R htmlwidget uses pym.js to bring this (much needed) functionality into widgets and (eventually) shiny apps. NPR reported\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3460,"url":"https:\/\/rud.is\/b\/2015\/06\/15\/metricsgraphics-0-8-5-is-now-on-cran\/","url_meta":{"origin":3642,"position":5},"title":"metricsgraphics 0.8.5 is now on CRAN!","author":"hrbrmstr","date":"2015-06-15","format":false,"excerpt":"I'm super-pleased to announce that the Benevolent CRAN Overlords [accepted the metricsgraphics package](http:\/\/cran.r-project.org\/web\/packages\/metricsgraphics\/index.html) into CRAN over the weekend. Now, you no longer need to rely on github\/devtools to use [MetricsGraphics.js](http:\/\/metricsgraphicsjs.org\/) charts from your R scripts. 
If you're not familiar with `htmlwidgets`, take a look at [the official site for them](http:\/\/www.htmlwidgets.org\/).\u2026","rel":"","context":"In &quot;d3&quot;","block_context":{"text":"d3","link":"https:\/\/rud.is\/b\/category\/d3\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3642","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=3642"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/3642\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=3642"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=3642"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=3642"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}