

{"id":10121,"date":"2018-04-20T10:44:29","date_gmt":"2018-04-20T15:44:29","guid":{"rendered":"https:\/\/rud.is\/b\/?p=10121"},"modified":"2018-10-05T11:02:31","modified_gmt":"2018-10-05T16:02:31","slug":"painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","title":{"rendered":"Painless ODBC  + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc"},"content":{"rendered":"<p>I spent some time this morning upgrading the JDBC driver (and changing up some supporting code to account for changes to it) for my <a href=\"https:\/\/github.com\/hrbrmstr\/metis\"><code>metis<\/code> package?<\/a> which connects R up to Amazon Athena via RJDBC. I&#8217;m used to JDBC and have to deal with Java separately from R so I&#8217;m also comfortable with Java, JDBC and keeping R working with Java. I notified the <a rel=\"tag\" class=\"hashtag u-tag u-category\" href=\"https:\/\/rud.is\/b\/tag\/rstats\/\">#rstats<\/a> Twitterverse about it and it started this thread (click on the embed to go to it &#8212; and, yes, this means Twitter is tracking you via this post unless you&#8217;ve blocked their JavaScript):<\/p>\n<blockquote class=\"twitter-tweet\" data-lang=\"en\">\n<p lang=\"en\" dir=\"ltr\">The (GitHub only for now) <a href=\"https:\/\/twitter.com\/hashtag\/rstats?src=hash&amp;ref_src=twsrc%5Etfw\">#rstats<\/a> metis package for wiring up R to @amazonathema via RJDBC now uses &amp; includes the new Simba Athena JDBC Driver 2.0.2 JAR <a href=\"https:\/\/t.co\/wvwV6IxCNd\">https:\/\/t.co\/wvwV6IxCNd<\/a> (cc: <a href=\"https:\/\/twitter.com\/dabdine?ref_src=twsrc%5Etfw\">@dabdine<\/a>)<\/p>\n<p>&mdash; hrbrmstr (@hrbrmstr) <a href=\"https:\/\/twitter.com\/hrbrmstr\/status\/987279127299358720?ref_src=twsrc%5Etfw\">April 20, 2018<\/a><\/p><\/blockquote>\n<p><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<p>If you do scroll through the thread you&#8217;ll see @hadleywickham suggested using the <code>odbc<\/code> package with the ODBC driver for Athena.<\/p>\n<p>I, and others, have noted that ODBC on macOS (and &#8212; for me, at least &#8212; Linux) never really played well together for us. Given that I&#8217;m familiar with JDBC, I just gravitated towards using it after trying it out with raw Java and it worked fine in R.<\/p>\n<p>Never one to discount advice from Hadley, I quickly <a href=\"https:\/\/docs.aws.amazon.com\/athena\/latest\/ug\/connect-with-odbc.html\">grabbed the Athena ODBC driver<\/a> and installed it and wired up an <code>odbc<\/code> + <code>dplyr<\/code> connection almost instantly:<\/p>\n<pre id=\"athenadrillodbc01\"><code class=\"language-r\">library(odbc)\r\nlibrary(tidyverse)\r\n\r\nDBI::dbConnect(\r\n  odbc::odbc(), \r\n  driver = &quot;\/Library\/simba\/athenaodbc\/lib\/libathenaodbc_sbu.dylib&quot;, \r\n  Schema = &quot;sampledb&quot;,\r\n  AwsRegion = &quot;us-east-1&quot;,\r\n  AuthenticationType = &quot;Default Credentials&quot;,\r\n  S3OutputLocation = &quot;s3:\/\/aws-athena-query-results-redacted&quot;\r\n) -&gt; con\r\n\r\nsome_tbl &lt;- tbl(con, &quot;elb_logs&quot;)\r\n\r\nsome_tbl\r\n## # Source:   table&lt;elb_logs&gt; [?? x 16]\r\n## # Database: Amazon Athena 01.00.0000[@Amazon Athena\/AwsDataCatalog]\r\n##    timestamp    elbname requestip  requestport backendip backendport\r\n##    &lt;chr&gt;        &lt;chr&gt;   &lt;chr&gt;            &lt;int&gt; &lt;chr&gt;           &lt;int&gt;\r\n##  1 2014-09-26T\u2026 lb-demo 249.6.80.\u2026        5123 249.6.80\u2026        8888\r\n##  2 2014-09-26T\u2026 lb-demo 246.22.15\u2026        5123 248.178.\u2026        8888\r\n##  3 2014-09-26T\u2026 lb-demo 248.179.3\u2026       45667 254.70.2\u2026         443\r\n##  4 2014-09-26T\u2026 lb-demo 243.2.127\u2026       14496 248.178.\u2026          80\r\n##  5 2014-09-26T\u2026 lb-demo 247.76.18\u2026        6887 252.0.81\u2026        8888\r\n##  6 2014-09-26T\u2026 lb-demo 254.110.3\u2026       22052 248.178.\u2026        8888\r\n##  7 2014-09-26T\u2026 lb-demo 249.113.2\u2026       24902 245.241.\u2026        8888\r\n##  8 2014-09-26T\u2026 lb-demo 246.128.7\u2026        5123 244.202.\u2026        8888\r\n##  9 2014-09-26T\u2026 lb-demo 249.6.80.\u2026       24902 255.226.\u2026        8888\r\n## 10 2014-09-26T\u2026 lb-demo 253.102.6\u2026        6887 246.22.1\u2026        8888\r\n## # ... with more rows, and 10 more variables:\r\n## #   requestprocessingtime &lt;dbl&gt;, backendprocessingtime &lt;dbl&gt;,\r\n## #   clientresponsetime &lt;dbl&gt;, elbresponsecode &lt;chr&gt;,\r\n## #   backendresponsecode &lt;chr&gt;, receivedbytes &lt;S3: integer64&gt;,\r\n## #   sentbytes &lt;S3: integer64&gt;, requestverb &lt;chr&gt;, url &lt;chr&gt;,\r\n## #   protocol &lt;chr&gt;## <\/code><\/pre>\n<p>The TLDR is that I can now use 100% <code>dplyr<\/code> idioms with Athena vs add one to the RJDBC driver I made for <code>metis<\/code>. The <code>metis<\/code> package will still be around to support JDBC on systems that do have issues with ODBC and to add other methods that work with the AWS Athena API (managing Athena vs the interactive queries part).<\/p>\n<p>The downside is that I&#8217;m now even more likely to run up the AWS bill ;-)<\/p>\n<h3>What About Drill?<\/h3>\n<p>I also maintain the <a href=\"https:\/\/github.com\/hrbrmstr\/sergeant\"><code>sergeant<\/code> package?<\/a> which provides REST API and REST query access to Apache Drill along with a REST API <code>DBI<\/code> driver and an RJDBC interface for Drill. I remember trying to get the MapR ODBC client working with R a few years ago so I made the package (which was also a great learning experience).<\/p>\n<p>I noticed there was a <a href=\"http:\/\/package.mapr.com\/tools\/MapR-ODBC\/MapR_Drill\/MapRDrill_odbc_v1.3.16.1049\/\">very recent MapR Drill ODBC driver released<\/a>. Since I was on a roll, I figured why not try it one more time, especially since the RStudio team has <a href=\"https:\/\/db.rstudio.com\/\">made it dead simple<\/a> to work with ODBC from R.<\/p>\n<pre id=\"athenadrillodbc02\"><code class=\"language-r\">library(odbc)\r\nlibrary(tidyverse)\r\n\r\nDBI::dbConnect(\r\n  odbc::odbc(), \r\n  driver = &quot;\/Library\/mapr\/drill\/lib\/libdrillodbc_sbu.dylib&quot;,\r\n  ConnectionType = &quot;Zookeeper&quot;,\r\n  AuthenticationType = &quot;No Authentication&quot;,\r\n  ZKCLusterID = &quot;CLUSTERID&quot;,\r\n  ZkQuorum = &quot;HOST:2181&quot;,\r\n  AdvancedProperties = &quot;CastAnyToVarchar=true;HandshakeTimeout=30;QueryTimeout=180;TimestampTZDisplayTimezone=utc;\r\nExcludedSchemas=sys,INFORMATION_SCHEMA;NumberOfPrefetchBuffers=5;&quot;\r\n) -&gt; drill_con\r\n\r\n(employee &lt;- tbl(drill_con, sql(&quot;SELECT * FROM cp.`employee.json`&quot;)))\r\n## # Source:   SQL [?? x 16]\r\n## # Database: Drill 01.13.0000[@Apache Drill Server\/DRILL]\r\n##    employee_id   full_name    first_name last_name position_id   position_title   store_id  \r\n##    &lt;S3: integer&gt; &lt;chr&gt;        &lt;chr&gt;      &lt;chr&gt;     &lt;S3: integer&gt; &lt;chr&gt;            &lt;S3: inte&gt;\r\n##  1 1             Sheri Nowmer Sheri      Nowmer    1             President        0         \r\n##  2 2             Derrick Whe\u2026 Derrick    Whelply   2             VP Country Mana\u2026 0         \r\n##  3 4             Michael Spe\u2026 Michael    Spence    2             VP Country Mana\u2026 0         \r\n##  4 5             Maya Gutier\u2026 Maya       Gutierrez 2             VP Country Mana\u2026 0         \r\n##  5 6             Roberta Dam\u2026 Roberta    Damstra   3             VP Information \u2026 0         \r\n##  6 7             Rebecca Kan\u2026 Rebecca    Kanagaki  4             VP Human Resour\u2026 0         \r\n##  7 8             Kim Brunner  Kim        Brunner   11            Store Manager    9         \r\n##  8 9             Brenda Blum\u2026 Brenda     Blumberg  11            Store Manager    21        \r\n##  9 10            Darren Stanz Darren     Stanz     5             VP Finance       0         \r\n## 10 11            Jonathan Mu\u2026 Jonathan   Murraiin  11            Store Manager    1         \r\n## # ... with more rows, and 9 more variables: department_id &lt;S3: integer64&gt;, birth_date &lt;chr&gt;,\r\n## #   hire_date &lt;chr&gt;, salary &lt;dbl&gt;, supervisor_id &lt;S3: integer64&gt;, education_level &lt;chr&gt;,\r\n## #   marital_status &lt;chr&gt;, gender &lt;chr&gt;, management_role &lt;chr&gt;## \r\n\r\ncount(employee, position_title, sort=TRUE)\r\n## # Source:     lazy query [?? x 2]\r\n## # Database:   Drill 01.13.0000[@Apache Drill Server\/DRILL]\r\n## # Ordered by: desc(n)\r\n##    position_title            n              \r\n##    &lt;chr&gt;                     &lt;S3: integer64&gt;\r\n##  1 Store Temporary Checker   268            \r\n##  2 Store Temporary Stocker   264            \r\n##  3 Store Permanent Checker   226            \r\n##  4 Store Permanent Stocker   222            \r\n##  5 Store Shift Supervisor    52             \r\n##  6 Store Permanent Butcher   32             \r\n##  7 Store Manager             24             \r\n##  8 Store Assistant Manager   24             \r\n##  9 Store Information Systems 16             \r\n## 10 HQ Finance and Accounting 8              \r\n## # ... with more rows##<\/code><\/pre>\n<p>Apart from having to do that <code>sql(\u2026)<\/code> to make the table connection work, it was pretty painless and I had both Athena and Drill working with <code>dplyr<\/code> verbs in under ten minutes (total).<\/p>\n<p>You can head on over to the main Apache Drill site to learn all about the <a href=\"https:\/\/drill.apache.org\/docs\/installing-the-odbc-driver\/\">ODBC driver configuration parameters<\/a> and I&#8217;ve updated my ongoing <a href=\"https:\/\/rud.is\/books\/drill-sergeant-rstats\/\">Using Apache Drill with R e-book<\/a> to <a href=\"https:\/\/rud.is\/books\/drill-sergeant-rstats\/wiring-up-drill-and-r-odbc-style.html\">include this information<\/a>. I will also keep maintaining the existing <code>sergeant<\/code> package but also be including some additional methods provide ODBC usage guidance and potentially other helpers if there are any &#8220;gotchas&#8221; that arise.<\/p>\n<h3>FIN<\/h3>\n<p>The <code>odbc<\/code> package is super-slick and it&#8217;s refreshing to be able to use <code>dplyr<\/code> verbs with Athena vs gosh-awful SQL. <em>However<\/em>, for some of our needs the hand-crafted queries will still be necessary as they are far more optimized than what would likely get pieced together via the <code>dplyr<\/code> verbs. However, those queries can also be put right into <code>sql()<\/code> with the Athena ODBC driver connection and used via the same <code>dplyr<\/code> verb magic afterwards.<\/p>\n<p>Today is, indeed, a good day to query!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I spent some time this morning upgrading the JDBC driver (and changing up some supporting code to account for changes to it) for my metis package? which connects R up to Amazon Athena via RJDBC. I&#8217;m used to JDBC and have to deal with Java separately from R so I&#8217;m also comfortable with Java, JDBC [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":10123,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[819,818,781,91],"tags":[],"class_list":["post-10121","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-apache-drill","category-athena","category-drill","category-r"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &amp; odbc - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &amp; odbc - rud.is\" \/>\n<meta property=\"og:description\" content=\"I spent some time this morning upgrading the JDBC driver (and changing up some supporting code to account for changes to it) for my metis package? which connects R up to Amazon Athena via RJDBC. I&#8217;m used to JDBC and have to deal with Java separately from R so I&#8217;m also comfortable with Java, JDBC [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-20T15:44:29+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-10-05T16:02:31+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1\" \/>\n\t<meta property=\"og:image:width\" content=\"700\" \/>\n\t<meta property=\"og:image:height\" content=\"535\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc\",\"datePublished\":\"2018-04-20T15:44:29+00:00\",\"dateModified\":\"2018-10-05T16:02:31+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/\"},\"wordCount\":609,\"commentCount\":8,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1\",\"articleSection\":[\"Apache Drill\",\"athena\",\"drill\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/\",\"name\":\"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R & odbc - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1\",\"datePublished\":\"2018-04-20T15:44:29+00:00\",\"dateModified\":\"2018-10-05T16:02:31+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2018\\\/04\\\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1\",\"width\":\"700\",\"height\":\"535\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/04\\\/20\\\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R & odbc - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","og_locale":"en_US","og_type":"article","og_title":"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R & odbc - rud.is","og_description":"I spent some time this morning upgrading the JDBC driver (and changing up some supporting code to account for changes to it) for my metis package? which connects R up to Amazon Athena via RJDBC. I&#8217;m used to JDBC and have to deal with Java separately from R so I&#8217;m also comfortable with Java, JDBC [&hellip;]","og_url":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","og_site_name":"rud.is","article_published_time":"2018-04-20T15:44:29+00:00","article_modified_time":"2018-10-05T16:02:31+00:00","og_image":[{"width":700,"height":535,"url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","type":"image\/jpeg"}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc","datePublished":"2018-04-20T15:44:29+00:00","dateModified":"2018-10-05T16:02:31+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/"},"wordCount":609,"commentCount":8,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","articleSection":["Apache Drill","athena","drill","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","url":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","name":"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R & odbc - rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","datePublished":"2018-04-20T15:44:29+00:00","dateModified":"2018-10-05T16:02:31+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","width":"700","height":"535"},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Painless ODBC + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p23idr-2Df","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":11921,"url":"https:\/\/rud.is\/b\/2019\/02\/17\/conquering-caffeinated-amazon-athena-with-the-metis-trio-of-packages\/","url_meta":{"origin":10121,"position":0},"title":"Conquering Caffeinated Amazon Athena with the metis Trio of Packages","author":"hrbrmstr","date":"2019-02-17","format":false,"excerpt":"I must preface this post with the posit that if you're doing anything interactive() with Amazon Athena you should seriously consider just using their free ODBC drivers as it's the easiest way to wire them up to R DBI- and tidyverse-wise. I've said as much in previous posts. Drop a\u2026","rel":"","context":"In &quot;athena&quot;","block_context":{"text":"athena","link":"https:\/\/rud.is\/b\/category\/athena\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11070,"url":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","url_meta":{"origin":10121,"position":1},"title":"Alleviating AWS Athena Aggravation with Asynchronous Assistance","author":"hrbrmstr","date":"2018-07-14","format":false,"excerpt":"I've blogged about how to use Amazon Athena with R before and if you are a regular Athena user, you've likely run into a situation where you prepare a dplyr chain, fire off a collect() and then wait. And, wait. And, wait. And, wait. Queries that take significant processing time\u2026","rel":"","context":"In &quot;athena&quot;","block_context":{"text":"athena","link":"https:\/\/rud.is\/b\/category\/athena\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":5954,"url":"https:\/\/rud.is\/b\/2017\/05\/16\/r%e2%81%b6-using-r-with-amazon-athena-awas-temporary-security-credentials\/","url_meta":{"origin":10121,"position":2},"title":"R\u2076 \u2014 Using R With Amazon Athena &#038; AWS Temporary Security Credentials","author":"hrbrmstr","date":"2017-05-16","format":false,"excerpt":"Most of the examples of working with most of the AWS services show basic username & password authentication. That's all well-and-good, but many shops use the AWS Security Token Service to provide temporary credentials and session tokens to limit exposure and provide more uniform multi-factor authentication. At my workplace, Frank\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12855,"url":"https:\/\/rud.is\/b\/2020\/11\/20\/updated-apache-drill-r-jdbc-interface-package-sergeant-caffeinated-with-dbplyr-2-x-compatibility\/","url_meta":{"origin":10121,"position":3},"title":"Updated Apache Drill R JDBC Interface Package {sergeant.caffeinated} With {dbplyr} 2.x Compatibility","author":"hrbrmstr","date":"2020-11-20","format":false,"excerpt":"While the future of the Apache Drill ecosystem is somewhat in-play (MapR \u2014 a major sponsoring org for the project \u2014 is kinda dead), I still use it almost daily (on my local home office cluster) to avoid handing over any more money to Amazon than I\/we already do. The\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11369,"url":"https:\/\/rud.is\/b\/2018\/08\/11\/connecting-apache-zeppelin-and-apache-drill-postgresql-etc\/","url_meta":{"origin":10121,"position":4},"title":"Connecting Apache Zeppelin and Apache Drill, PostgreSQL, etc.","author":"hrbrmstr","date":"2018-08-11","format":false,"excerpt":"A previous post showed how to use a different authentication provider to wire up Apache Zeppelin and Amazon Athena. As noted in that post, Zeppelin is a \"notebook\" alternative to Jupyter (and other) notebooks. Unlike Jupyter, I can tolerate Zeppelin and it's got some nifty features like plug-and-play JDBC access.\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/z-drill-2.png?fit=1200%2C542&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/z-drill-2.png?fit=1200%2C542&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/z-drill-2.png?fit=1200%2C542&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/z-drill-2.png?fit=1200%2C542&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/z-drill-2.png?fit=1200%2C542&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":11346,"url":"https:\/\/rud.is\/b\/2018\/08\/11\/connecting-apache-zeppelin-up-to-amazon-athena-with-an-iam-profile-name\/","url_meta":{"origin":10121,"position":5},"title":"Connecting Apache Zeppelin Up to Amazon Athena with an IAM Profile Name","author":"hrbrmstr","date":"2018-08-11","format":false,"excerpt":"Apache Zeppelin is a \"notebook\" alternative to Jupyter (and other) notebooks. It supports a plethora of kernels\/interpreters and can do a ton of things that this post isn't going to discuss (perhaps future ones will, especially since it's the first \"notebook\" environment I've been able to tolerate for longer than\u2026","rel":"","context":"In &quot;athena&quot;","block_context":{"text":"athena","link":"https:\/\/rud.is\/b\/category\/athena\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=1050%2C600 3x"},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/10121","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=10121"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/10121\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media\/10123"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=10121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=10121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=10121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}