

{"id":4942,"date":"2017-01-26T07:37:21","date_gmt":"2017-01-26T12:37:21","guid":{"rendered":"https:\/\/rud.is\/b\/?p=4942"},"modified":"2018-03-10T07:54:35","modified_gmt":"2018-03-10T12:54:35","slug":"one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/","title":{"rendered":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer)"},"content":{"rendered":"<p>Dear Leader has <a href=\"https:\/\/www.reuters.com\/article\/us-usa-trump-immigration-exclusive-idUSKBN1582XQ\">made good<\/a> on his campaign promise to &#8220;crack down&#8221; on immigration from &#8220;dangerous&#8221; countries. I wanted to both see one side of the impact of that decree \u2014 how many potential immigrants per year might this be impacting \u2014 and toss up some code that shows how to free data from PDF documents using the @rOpenSci <code>tabulizer<\/code> package \u2014 authored by (@thosjleeper) \u2014 (since knowing how to find, free and validate the veracity of U.S. 
gov data is kinda ++paramount now).<\/p>\n<p>This is just one view and I encourage others to find, grab and blog other visa-related data and other government data in general.<\/p>\n<p>So, the data is locked up in <a href=\"https:\/\/travel.state.gov\/content\/dam\/visas\/Statistics\/AnnualReports\/FY2016AnnualReport\/FY16AnnualReport-TableIII.pdf\">this PDF document<\/a>:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and_FY16AnnualReport-TableIII_pdf__page_2_of_6_.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"4953\" data-permalink=\"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/cursor_and_fy16annualreport-tableiii_pdf__page_2_of_6_\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and_FY16AnnualReport-TableIII_pdf__page_2_of_6_.png?fit=944%2C1214&amp;ssl=1\" data-orig-size=\"944,1214\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Cursor_and_FY16AnnualReport-TableIII_pdf__page_2_of_6_\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and_FY16AnnualReport-TableIII_pdf__page_2_of_6_.png?fit=510%2C656&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and_FY16AnnualReport-TableIII_pdf__page_2_of_6_.png?resize=510%2C656&#038;ssl=1\" alt=\"\" width=\"510\" height=\"656\" class=\"aligncenter size-full wp-image-4953\" \/><\/a><\/p>\n<p>As 
PDF documents go, it&#8217;s not horribad since the tables are fairly regular. But <em>I&#8217;m<\/em> not transcribing that, and traditional PDF text-extraction tools on the command line or in R would also require writing more code than I have time for right now.<\/p>\n<p>Enter: <a href=\"https:\/\/ropensci.org\/tutorials\/tabulizer_tutorial\/\"><code>tabulizer<\/code><\/a> \u2014 an R package that wraps <a href=\"https:\/\/github.com\/tabulapdf\/tabula-java\/\"><code>tabula<\/code><\/a> Java functions and makes them simple to use. I&#8217;m only showing one aspect of it here and you should check out the aforelinked tutorial to see all the features.<\/p>\n<p>First, we need to set up our environment, download the PDF and extract the tables with <code>tabulizer<\/code>:<\/p>\n<pre id=\"tb-01-set\"><code class=\"language-r\">library(tabulizer)\r\nlibrary(hrbrmisc)\r\nlibrary(ggalt)\r\nlibrary(stringi)\r\nlibrary(tidyverse)\r\n\r\nURL &lt;- &quot;https:\/\/travel.state.gov\/content\/dam\/visas\/Statistics\/AnnualReports\/FY2016AnnualReport\/FY16AnnualReport-TableIII.pdf&quot;\r\nfil &lt;- basename(URL)\r\n\r\n# download once; mode=&quot;wb&quot; keeps the PDF intact on Windows\r\nif (!file.exists(fil)) download.file(URL, fil, mode=&quot;wb&quot;)\r\n\r\ntabs &lt;- tabulizer::extract_tables(fil)<\/code><\/pre>\n<p>You should run <code>str(tabs)<\/code> in your R session. It found all our data, but put it into a list with 7 elements. You actually need to peruse this list to see where it misaligned columns. In the &#8220;old days&#8221;, reading this in and cleaning it up would have taken the form of splitting &amp; replacing elements in character vectors. 
Now, after our inspection, we can exclude rows we don&#8217;t want, move columns around and get a nice tidy data frame with very little effort:<\/p>\n<pre id=\"tb-02-set\"><code class=\"language-r\">bind_rows(\r\n  tbl_df(tabs[[1]][-1,]),\r\n  tbl_df(tabs[[2]][-c(12,13),]),\r\n  tbl_df(tabs[[3]][-c(7, 10:11), -2]),\r\n  tbl_df(tabs[[4]][-21,]),\r\n  tbl_df(tabs[[5]]),\r\n  tbl_df(tabs[[6]][-c(6:7, 30:32),]),\r\n  tbl_df(tabs[[7]][-c(11:12, 25:27),])\r\n) %&gt;%\r\n  setNames(c(&quot;foreign_state&quot;, &quot;immediate_relatives&quot;,  &quot;special_immigrants&quot;,\r\n             &quot;family_preference&quot;, &quot;employment_preference&quot;, &quot;diversity_immigrants&quot;,&quot;total&quot;)) %&gt;% \r\n  mutate_each(funs(make_numeric), -foreign_state) %&gt;%\r\n  mutate(foreign_state=trimws(foreign_state)) -&gt; total_visas_2016<\/code><\/pre>\n<p>I&#8217;ve cleaned up PDFs before and that code was a <em>joy<\/em> to write compared to previous efforts. No use of <code>purrr<\/code> since I was referencing the <code>list<\/code> structure in the console as I typed in the matrix coordinates of the rows and columns to drop.<\/p>\n<p>Finally, we can extract the target &#8220;bad&#8221; countries and see how many human beings could be impacted this year by referencing immigration stats for last year:<\/p>\n<pre id=\"tb-03-set\"><code class=\"language-r\">filter(total_visas_2016, foreign_state %in% c(&quot;Iran&quot;, &quot;Iraq&quot;, &quot;Libya&quot;, &quot;Somalia&quot;, &quot;Sudan&quot;, &quot;Syria&quot;, &quot;Yemen&quot;)) %&gt;%\r\n  gather(preference, value, -foreign_state) %&gt;%\r\n  mutate(preference=stri_replace_all_fixed(preference, &quot;_&quot;, &quot; &quot; )) %&gt;%\r\n  mutate(preference=stri_trans_totitle(preference)) -&gt; banned_visas\r\n\r\nggplot(banned_visas, aes(foreign_state, value)) +\r\n  geom_col(width=0.65) +\r\n  scale_y_continuous(expand=c(0,5), label=scales::comma) +\r\n  facet_wrap(~preference, scales=&quot;free_y&quot;) +\r\n  labs(x=&quot;# 
Visas&quot;, y=NULL, title=&quot;Immigrant Visas Issued (2016)&quot;,\r\n       subtitle=&quot;By Foreign State of Chargeability or Place of Birth; Fiscal Year 2016; [Total n=31,804] \u2014 Note free Y scales&quot;,\r\n       caption=&quot;Visa types explanation: https:\/\/travel.state.gov\/content\/visas\/en\/general\/all-visa-categories.html\\nSource: https:\/\/travel.state.gov\/content\/visas\/en\/law-and-policy\/statistics\/annual-reports\/report-of-the-visa-office-2016.html&quot;) +\r\n  theme_hrbrmstr_msc(grid=&quot;Y&quot;) +\r\n  theme(axis.text=element_text(size=12))<\/code><\/pre>\n<p><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"4943\" data-permalink=\"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/rstudio-14\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&amp;ssl=1\" data-orig-size=\"2518,1112\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"RStudio\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=510%2C225&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?resize=510%2C225&#038;ssl=1\" alt=\"\" width=\"510\" height=\"225\" class=\"aligncenter size-full wp-image-4943\" \/><\/a><\/p>\n<p>~32,000 human beings potentially impacted, many who will 
remain separated from family (&#8220;family preference&#8221;); plus, the business impact of losing access to skilled labor (&#8220;employment preference&#8221;).<\/p>\n<p>Go forth and find more US gov data to free (before it disappears)!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dear Leader has made good on his campaign promise to &#8220;crack down&#8221; on immigration from &#8220;dangerous&#8221; countries. I wanted to both see one side of the impact of that decree \u2014 how many potential immigrants per year might this be impacting \u2014 and show toss up some code that shows how to free data from [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4943,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[91],"tags":[810],"class_list":["post-4942","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-r","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta 
property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - rud.is\" \/>\n<meta property=\"og:description\" content=\"Dear Leader has made good on his campaign promise to &#8220;crack down&#8221; on immigration from &#8220;dangerous&#8221; countries. I wanted to both see one side of the impact of that decree \u2014 how many potential immigrants per year might this be impacting \u2014 and show toss up some code that shows how to free data from [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2017-01-26T12:37:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-10T12:54:35+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"2518\" \/>\n\t<meta property=\"og:image:height\" content=\"1112\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer)\",\"datePublished\":\"2017-01-26T12:37:21+00:00\",\"dateModified\":\"2018-03-10T12:54:35+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/\"},\"wordCount\":430,\"commentCount\":10,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/RStudio.png?fit=2518%2C1112&ssl=1\",\"keywords\":[\"post\"],\"articleSection\":[\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/\",\"url\":\"https:\
\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/\",\"name\":\"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/RStudio.png?fit=2518%2C1112&ssl=1\",\"datePublished\":\"2017-01-26T12:37:21+00:00\",\"dateModified\":\"2018-03-10T12:54:35+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/RStudio.png?fit=2518%2C1112&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/RStudio.png?fit=2518%2C1112&ssl=1\",\"width\":2518,\"height\":1112},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2017\\\/01\\\/26\\\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\\\/#breadcrumb\",\"itemListElem
ent\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/","og_locale":"en_US","og_type":"article","og_title":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - rud.is","og_description":"Dear Leader has made good on his campaign promise to &#8220;crack down&#8221; on immigration from &#8220;dangerous&#8221; countries. I wanted to both see one side of the impact of that decree \u2014 how many potential immigrants per year might this be impacting \u2014 and show toss up some code that shows how to free data from [&hellip;]","og_url":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/","og_site_name":"rud.is","article_published_time":"2017-01-26T12:37:21+00:00","article_modified_time":"2018-03-10T12:54:35+00:00","og_image":[{"width":2518,"height":1112,"url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","type":"image\/png"}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer)","datePublished":"2017-01-26T12:37:21+00:00","dateModified":"2018-03-10T12:54:35+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/"},"wordCount":430,"commentCount":10,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","keywords":["post"],"articleSection":["R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/","url":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/","name":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer) - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","datePublished":"2017-01-26T12:37:21+00:00","dateModified":"2018-03-10T12:54:35+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","width":2518,"height":1112},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2017\/01\/26\/one-view-of-the-impact-of-the-new-immigration-ban-freeing-pdf-data-with-tabulizer\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"One View of the Impact of the New Immigration Ban (+ freeing PDF data with tabulizer)"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/RStudio.png?fit=2518%2C1112&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1hI","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":589,"url":"https:\/\/rud.is\/b\/2011\/06\/14\/weis-2011-session-2-identity-social-networks-personalized-advertising-privacy-controls\/","url_meta":{"origin":4942,"position":0},"title":"WEIS 2011 :: Session 2 :: Identity :: Social Networks, Personalized Advertising &#038; Privacy Controls","author":"hrbrmstr","date":"2011-06-14","format":false,"excerpt":"Catherine Tucker Presentation [PDF] Catherine's talk was really good. She handled questions well and is a very dynamic speaker. I'm looking forward to the paper. Twitter transcript #weis2011 Premise of the study was to see what impact privacy controls enablement\/usage have on advertising. It's an empirical study #data! #weis2011 click\u2026","rel":"","context":"In &quot;Information Security&quot;","block_context":{"text":"Information Security","link":"https:\/\/rud.is\/b\/category\/information-security\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":10966,"url":"https:\/\/rud.is\/b\/2018\/07\/02\/freeing-pdf-data-to-account-for-the-unaccounted\/","url_meta":{"origin":4942,"position":1},"title":"Freeing PDF Data to Account for the Unaccounted","author":"hrbrmstr","date":"2018-07-02","format":false,"excerpt":"I've mentioned @stiles before on the blog but for those new to my blatherings, Matt is a top-notch data journalist with the @latimes and currently stationed in South Korea. 
I can only imagine how much busier his life has gotten since that fateful, awful November 2016 Tuesday, but I'm truly\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/07\/plot_zoom_png.png?fit=1200%2C681&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/07\/plot_zoom_png.png?fit=1200%2C681&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/07\/plot_zoom_png.png?fit=1200%2C681&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/07\/plot_zoom_png.png?fit=1200%2C681&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/07\/plot_zoom_png.png?fit=1200%2C681&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":12630,"url":"https:\/\/rud.is\/b\/2020\/01\/21\/davos-2020-world-economic-forum-2020-global-risk-report-cyber-cliffs-notes\/","url_meta":{"origin":4942,"position":2},"title":"Davos 2020 World Economic Forum 2020 Global Risk Report Cyber Cliffs Notes","author":"hrbrmstr","date":"2020-01-21","format":false,"excerpt":"Each year the World Economic Forum releases their Global Risk Report around the time of the annual Davos conference. This year's report is out and below are notes on the \"cyber\" content to help others speed-read through those sections (in the event you don't read the whole thing). 
Their expert\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":568,"url":"https:\/\/rud.is\/b\/2011\/06\/14\/weis-2011-session-1-attacks-the-impact-of-immediate-disclosure-on-attack-diffusion-volume\/","url_meta":{"origin":4942,"position":3},"title":"WEIS 2011 :: Session 1 :: Attacks :: The Impact of Immediate Disclosure on Attack Diffusion &#038; Volume","author":"hrbrmstr","date":"2011-06-14","format":false,"excerpt":"Sam Ransbotham Sabayasachi Mitra Presentation [PDF] Twitter transcript #weis2011 Does immediate disclosure of vulns affect exploitation attempts? Looking at impact on risk\/diffusion\/volume #weis2011 speaker is presenting standard attack process & security processes timelines (slides will be in the blog post) #weis2011 the fundamental question is when from the vulnerability discovery\u2026","rel":"","context":"In &quot;Information Security&quot;","block_context":{"text":"Information Security","link":"https:\/\/rud.is\/b\/category\/information-security\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4886,"url":"https:\/\/rud.is\/b\/2017\/01\/16\/the-devils-in-the-davos-details-a-quick-look-at-this-years-wef-global-risks-report\/","url_meta":{"origin":4942,"position":4},"title":"The Devil&#8217;s in the [Davos] Details \u2014 A quick look at this year&#8217;s WEF Global Risks Report","author":"hrbrmstr","date":"2017-01-16","format":false,"excerpt":"It's Davos time again. Each year the World Economic Forum (WEF) gathers the global elite together to discuss how they're going to shape our collective future. WEF also releases their annual Global Risks Report at the same time. 
I read it every year and have, in the past, borrowed some\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and___Development_devils_in_the_davos_-_RStudio-4.png?fit=1200%2C536&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and___Development_devils_in_the_davos_-_RStudio-4.png?fit=1200%2C536&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and___Development_devils_in_the_davos_-_RStudio-4.png?fit=1200%2C536&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and___Development_devils_in_the_davos_-_RStudio-4.png?fit=1200%2C536&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2017\/01\/Cursor_and___Development_devils_in_the_davos_-_RStudio-4.png?fit=1200%2C536&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":5801,"url":"https:\/\/rud.is\/b\/2017\/04\/13\/come-fly-with-me-well-not-really-comparing-involuntary-disembarking-rates-across-u-s-airlines-in-r\/","url_meta":{"origin":4942,"position":5},"title":"Come Fly With Me (well, not really) \u2014 Comparing Involuntary Disembarking Rates Across U.S. Airlines in R","author":"hrbrmstr","date":"2017-04-13","format":false,"excerpt":"By now, word of the forcible deplanement of a medical professional by United has reached even the remotest of outposts in the #rstats universe. Since the news brought this practice to global attention, I found some aggregate U.S. 
Gov data made a quick, annual, aggregate look at this soon after\u2026","rel":"","context":"In &quot;data wrangling&quot;","block_context":{"text":"data wrangling","link":"https:\/\/rud.is\/b\/category\/data-wrangling\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/4942","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=4942"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/4942\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/4943"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=4942"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=4942"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=4942"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}