

{"id":11070,"date":"2018-07-14T19:39:37","date_gmt":"2018-07-15T00:39:37","guid":{"rendered":"https:\/\/rud.is\/b\/?p=11070"},"modified":"2018-07-14T19:39:37","modified_gmt":"2018-07-15T00:39:37","slug":"alleviating-aws-athena-aggravation-with-asynchronous-assistance","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","title":{"rendered":"Alleviating AWS Athena Aggravation with Asynchronous Assistance"},"content":{"rendered":"<p>I&#8217;ve blogged about <a href=\"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/\">how to use Amazon Athena with R<\/a> before and if you are a regular Athena user, you&#8217;ve likely run into a situation where you prepare a <code>dplyr<\/code> chain, fire off a <code>collect()<\/code> and then wait.<\/p>\n<p>And, wait.<\/p>\n<p>And, wait.<\/p>\n<p>And, wait.<\/p>\n<p>Queries that take significant processing time or have large result sets do not play nicely with the provided ODBC and JDBC drivers. This means &#8220;hung&#8221; R sessions and severe frustration, especially when you can log in to the AWS Athena console and see that the results are <em>right there<\/em>!!<\/p>\n<p>I&#8217;ve been writing SQL by hand or calling <code>sql_render()<\/code> manually to avoid this (when I remember to) but finally felt sufficient frustration to craft a better way, provided you can install and run <code>rJava<\/code>-based code (it&#8217;s 2018 and that is still not a given on many systems, unfortunately).<\/p>\n<p>There are two functions below:<\/p>\n<ul>\n<li><code>collect_async()<\/code>, and<\/li>\n<li><code>gather_results()<\/code><\/li>\n<\/ul>\n<p>The <code>collect_async()<\/code> function is designed to be used like <code>collect()<\/code> but uses Athena components from the AWS SDK for Java to execute the SQL query behind the <code>dplyr<\/code> chain <em>asynchronously<\/em>. 
The companion function <code>gather_results()<\/code> takes the object created by <code>collect_async()<\/code> and checks to see if the results are ready. If they are, it uses the <code>aws.s3<\/code> package to download them. Personally, I&#8217;d just <code>aws s3 sync ...<\/code> from the command line instead of using the <code>aws.s3<\/code> package but that&#8217;s not everyone&#8217;s cup of tea.<\/p>\n<p>Once I figure out the best package API for this I&#8217;ll add it to the <code>metis<\/code> package. There are many AWS idiosyncrasies that need to be accounted for and I&#8217;d rather ship this current set of functions via the blog so folks can use it (and tweak it to their needs) rather than waiting for perfection.<\/p>\n<p>Here&#8217;s the code:<\/p>\n<pre><code class=\"language-r\">library(rJava)\nlibrary(awsjavasdk)\nlibrary(aws.signature)\nlibrary(aws.s3)\nlibrary(odbc)\nlibrary(tidyverse)\nlibrary(dbplyr)\n\n#' Collect Amazon Athena query results asynchronously\n#' \n#' Long running Athena queries and Athena queries with large result\n#' sets can seriously stall a `dplyr` processing chain due to poorly\n#' implemented ODBC and JDBC drivers. The AWS SDK for Athena has \n#' methods that support submitting a query asynchronously for \"batch\"\n#' processing. All Athena results are stored in CSV files in S3 and it's\n#' easy to use the R `aws.s3` package to grab these or perform an\n#' `aws s3 sync ...` operation on the command line.\n#' \n#' @md\n#' @param obj the `dplyr` chain\n#' @param schema Athena schema (usually matches the `Schema` parameter to the \n#'        Simba ODBC connection)\n#' @param region Your AWS region. 
All lower case with dashes (usually matches\n#'        the `AwsRegion` parameter to the Simba ODBC connection)\n#' @param results_bucket the S3 results bucket where query results are stored \n#'        (usually matches the `S3OutputLocation` parameter to the Simba ODBC\n#'        connection)\n#' @return a `list` with the query execution ID and the S3 bucket. This object\n#'         is designed to be passed to the companion `gather_results()` if you\n#'         want to use the `aws.s3` package to retrieve the results. Otherwise,\n#'         sync the file however you want using the query execution id.\n#' @note You may need to change up the authentication provider depending on how \n#'       you use credentials with Athena\ncollect_async <- function(obj, schema, region, results_bucket) {\n\n  ugly_query <- as.character(sql_render(obj))\n\n  region <- toupper(region)\n  region <- gsub(\"-\", \"_\", region, fixed=TRUE)\n\n  regions <- J(\"com.amazonaws.regions.Regions\")\n\n  available_regions <- grep(\"^[[:upper:][:digit:]_]+$\", names(regions), value=TRUE)\n  if (!region %in% available_regions) stop(\"Invalid region.\", call.=FALSE)\n\n  switch(\n    region,\n    \"GovCloud\" = regions$GovCloud,\n    \"US_EAST_1\" = regions$US_EAST_1,\n    \"US_EAST_2\" = regions$US_EAST_2,\n    \"US_WEST_1\" = regions$US_WEST_1,\n    \"US_WEST_2\" = regions$US_WEST_2,\n    \"EU_WEST_1\" = regions$EU_WEST_1,\n    \"EU_WEST_2\" = regions$EU_WEST_2,\n    \"EU_WEST_3\" = regions$EU_WEST_3,\n    \"EU_CENTRAL_1\" = regions$EU_CENTRAL_1,\n    \"AP_SOUTH_1\" = regions$AP_SOUTH_1,\n    \"AP_SOUTHEAST_1\" = regions$AP_SOUTHEAST_1,\n    \"AP_SOUTHEAST_2\" = regions$AP_SOUTHEAST_2,\n    \"AP_NORTHEAST_1\" = regions$AP_NORTHEAST_1,\n    \"AP_NORTHEAST_2\" = regions$AP_NORTHEAST_2,\n    \"SA_EAST_1\" = regions$SA_EAST_1,\n    \"CN_NORTH_1\" = regions$CN_NORTH_1,\n    \"CN_NORTHWEST_1\" = regions$CN_NORTHWEST_1,\n    \"CA_CENTRAL_1\" = regions$CA_CENTRAL_1,\n    \"DEFAULT_REGION\" = 
regions$DEFAULT_REGION\n  ) -> region\n\n  provider <- J(\"com.amazonaws.auth.DefaultAWSCredentialsProviderChain\")\n  client <- J(\"com.amazonaws.services.athena.AmazonAthenaAsyncClientBuilder\")\n\n  my_client <- client$standard()\n  my_client <- my_client$withRegion(region)\n  my_client <- my_client$withCredentials(provider$getInstance())\n  my_client <- my_client$build()\n\n  queryExecutionContext <- .jnew(\"com.amazonaws.services.athena.model.QueryExecutionContext\")\n  context <- queryExecutionContext$withDatabase(schema)\n  result <- .jnew(\"com.amazonaws.services.athena.model.ResultConfiguration\")\n  result$setOutputLocation(results_bucket)\n\n  startQueryExecutionRequest <- .jnew(\"com.amazonaws.services.athena.model.StartQueryExecutionRequest\")\n  startQueryExecutionRequest$setQueryString(ugly_query)\n  startQueryExecutionRequest$setQueryExecutionContext(context)\n  startQueryExecutionRequest$setResultConfiguration(result)\n\n  res <- my_client$startQueryExecutionAsync(startQueryExecutionRequest)\n\n  r <- res$get()\n  qex_id <- r$getQueryExecutionId()\n\n  list(\n    qex_id = qex_id,\n    results_bucket = results_bucket\n  )\n\n}\n\n#' Gather the results of an asynchronous query\n#'\n#' @md\n#' @param async_result the result of a call to `collect_async()`\n#' @return a data frame (tibble) or `NULL` if the query results are not ready yet\ngather_results <- function(async_result) {\n  # Athena writes the results to `<results_bucket>\/<query execution id>.csv`\n  results_csv <- sprintf(\"%s\/%s.csv\", async_result$results_bucket, async_result$qex_id)\n  # check for the results object itself; the bucket exists whether or not the query has finished\n  if (object_exists(results_csv)) {\n    readr::read_csv(get_object(results_csv))\n  } else {\n    message(\"Results are not in the designated bucket.\")\n    return(NULL)\n  }\n}<\/code><\/pre>\n<p>Now, we give it a go:<\/p>\n<pre><code class=\"language-r\"># Set up the credentials you're using\nuse_credentials(\"personal\")\n\n# load the AWS Java SDK classes\nawsjavasdk::load_sdk()\n\n# necessary for Simba ODBC and the async query 
ops\naws_region <- \"us-east-1\"\nathena_schema <- \"sampledb\"\nathena_results_bucket <- \"s3:\/\/aws-athena-query-results-redacted\"\n\n# connect to Athena and the sample database\nDBI::dbConnect(\n  odbc::odbc(),\n  driver = \"\/Library\/simba\/athenaodbc\/lib\/libathenaodbc_sbu.dylib\",\n  Schema = athena_schema,\n  AwsRegion = aws_region,\n  AuthenticationType = \"IAM Profile\",\n  AwsProfile = \"personal\",\n  S3OutputLocation = athena_results_bucket\n) -> con\n\n# the sample table in the sample db\/schema\nelb_logs <- tbl(con, \"elb_logs\")\n\n# create your dplyr chain. This one is small so I don't incur charges\n# collect_async() MUST be the LAST item in the dplyr chain.\nelb_logs %>%\n  filter(requestip == \"253.89.30.138\") %>%\n  collect_async(\n    schema = athena_schema,\n    region = aws_region,\n    results_bucket = athena_results_bucket\n  ) -> async_result\n\nasync_result\n## $qex_id\n## [1] \"d5fe7754-919b-47c5-bd7d-3ccdb1a3a414\"\n## \n## $results_bucket\n## [1] \"s3:\/\/aws-athena-query-results-redacted\"\n\n# For long queries we can wait a bit but the function will tell us if the results\n# are there or not.\n\ngather_results(async_result)\n## Parsed with column specification:\n## cols(\n##   timestamp = col_datetime(format = \"\"),\n##   elbname = col_character(),\n##   requestip = col_character(),\n##   requestport = col_integer(),\n##   backendip = col_character(),\n##   backendport = col_integer(),\n##   requestprocessingtime = col_double(),\n##   backendprocessingtime = col_double(),\n##   clientresponsetime = col_double(),\n##   elbresponsecode = col_integer(),\n##   backendresponsecode = col_integer(),\n##   receivedbytes = col_integer(),\n##   sentbytes = col_integer(),\n##   requestverb = col_character(),\n##   url = col_character(),\n##   protocol = col_character()\n## )\n## # A tibble: 1 x 16\n##   timestamp           elbname requestip     requestport backendip     backendport\n##   <dttm>              <chr>   <chr>               
<int> <chr>               <int>\n## 1 2014-09-29 03:24:38 lb-demo 253.89.30.138       20159 253.89.30.138        8888\n## # ... with 10 more variables: requestprocessingtime <dbl>, backendprocessingtime <dbl>,\n## #   clientresponsetime <dbl>, elbresponsecode <int>, backendresponsecode <int>,\n## #   receivedbytes <int>, sentbytes <int>, requestverb <chr>, url <chr>, protocol <chr><\/code><\/pre>\n<p>If you do try this out and end up needing to tweak it, feedback on what you had to do (via the comments) would be greatly appreciated.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve blogged about how to use Amazon Athena with R before and if you are a regular Athena user, you&#8217;ve likely run into a situation where you prepare a dplyr chain, fire off a collect() and then wait. And, wait. And, wait. And, wait. Queries that take significant processing time or have large result sets [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[818,91],"tags":[],"class_list":["post-11070","post","type-post","status-publish","format-standard","hentry","category-athena","category-r"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Alleviating AWS Athena Aggravation with Asynchronous Assistance - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, 
max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Alleviating AWS Athena Aggravation with Asynchronous Assistance - rud.is\" \/>\n<meta property=\"og:description\" content=\"I&#8217;ve blogged about how to use Amazon Athena with R before and if you are a regular Athena user, you&#8217;ve likely run into a situation where you prepare a dplyr chain, fire off a collect() and then wait. And, wait. And, wait. And, wait. Queries that take significant processing time or have large result sets [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2018-07-15T00:39:37+00:00\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Alleviating AWS Athena Aggravation with Asynchronous Assistance\",\"datePublished\":\"2018-07-15T00:39:37+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/\"},\"wordCount\":325,\"commentCount\":2,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"articleSection\":[\"athena\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/\",\"name\":\"Alleviating AWS Athena Aggravation with Asynchronous Assistance - 
rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"datePublished\":\"2018-07-15T00:39:37+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2018\\\/07\\\/14\\\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Alleviating AWS Athena Aggravation with Asynchronous Assistance\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Alleviating AWS Athena Aggravation with Asynchronous Assistance - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","og_locale":"en_US","og_type":"article","og_title":"Alleviating AWS Athena Aggravation with Asynchronous Assistance - rud.is","og_description":"I&#8217;ve blogged about how to use Amazon Athena with R before and if you are a regular Athena user, you&#8217;ve likely run into a situation where you prepare a dplyr chain, fire off a collect() and then wait. And, wait. And, wait. And, wait. Queries that take significant processing time or have large result sets [&hellip;]","og_url":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","og_site_name":"rud.is","article_published_time":"2018-07-15T00:39:37+00:00","author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Alleviating AWS Athena Aggravation with Asynchronous Assistance","datePublished":"2018-07-15T00:39:37+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/"},"wordCount":325,"commentCount":2,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"articleSection":["athena","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","url":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/","name":"Alleviating AWS Athena Aggravation with Asynchronous Assistance - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"datePublished":"2018-07-15T00:39:37+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2018\/07\/14\/alleviating-aws-athena-aggravation-with-asynchronous-assistance\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Alleviating AWS Athena Aggravation with Asynchronous Assistance"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't
 look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-2Sy","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":4696,"url":"https:\/\/rud.is\/b\/2016\/12\/05\/interacting-with-amazon-athena-from-r\/","url_meta":{"origin":11070,"position":0},"title":"Interacting With Amazon Athena from R","author":"hrbrmstr","date":"2016-12-05","format":false,"excerpt":"This is a short post for those looking to test out Amazon Athena with R. Amazon makes Athena available via JDBC, so you can use RJDBC to query data. All you need is their JAR file and some setup information. Here's how to get the JAR file to the current\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11077,"url":"https:\/\/rud.is\/b\/2018\/07\/20\/a-new-boto3-amazon-athena-client-wrapper-with-dplyr-async-query-support\/","url_meta":{"origin":11070,"position":1},"title":"A new &#8216;boto3&#8217; Amazon Athena client wrapper with dplyr async query support","author":"hrbrmstr","date":"2018-07-20","format":false,"excerpt":"A previous post explored how to deal with Amazon Athena queries asynchronously. The function presented is a beast, though it is on purpose (to provide options for folks). 
In reality, nobody really wants to use rJava wrappers much anymore and dealing with icky Python library calls directly just feels wrong,\u2026","rel":"","context":"In &quot;athena&quot;","block_context":{"text":"athena","link":"https:\/\/rud.is\/b\/category\/athena\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":10121,"url":"https:\/\/rud.is\/b\/2018\/04\/20\/painless-odbc-dplyr-connections-to-amazon-athena-and-apache-drill-with-r-odbc\/","url_meta":{"origin":11070,"position":2},"title":"Painless ODBC  + dplyr Connections to Amazon Athena and Apache Drill with R &#038; odbc","author":"hrbrmstr","date":"2018-04-20","format":false,"excerpt":"I spent some time this morning upgrading the JDBC driver (and changing up some supporting code to account for changes to it) for my metis package? which connects R up to Amazon Athena via RJDBC. I'm used to JDBC and have to deal with Java separately from R so I'm\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/04\/today-is-a-good-day-to-query.jpg?fit=700%2C535&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":5954,"url":"https:\/\/rud.is\/b\/2017\/05\/16\/r%e2%81%b6-using-r-with-amazon-athena-awas-temporary-security-credentials\/","url_meta":{"origin":11070,"position":3},"title":"R\u2076 \u2014 Using R With Amazon Athena &#038; AWS Temporary Security 
Credentials","author":"hrbrmstr","date":"2017-05-16","format":false,"excerpt":"Most of the examples of working with most of the AWS services show basic username & password authentication. That's all well-and-good, but many shops use the AWS Security Token Service to provide temporary credentials and session tokens to limit exposure and provide more uniform multi-factor authentication. At my workplace, Frank\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11978,"url":"https:\/\/rud.is\/b\/2019\/02\/22\/cloudy-with-a-chance-of-caffeinated-query-orchestration-new-rjava-wrappers-for-aws-athena-sdk-for-java\/","url_meta":{"origin":11070,"position":4},"title":"Cloudy with a chance of Caffeinated Query Orchestration &#8211; New rJava Wrappers for AWS Athena SDK for Java","author":"hrbrmstr","date":"2019-02-22","format":false,"excerpt":"There are two fledgling rJava-based R packages that enable working with the AWS SDK for Athena: awsathena | GL| GH awsathenajars | GL| GH They're both needed to conform with the way CRAN like rJava-based packages submitted that also have large JAR dependencies. The goal is to eventually have wrappers\u2026","rel":"","context":"In &quot;Java&quot;","block_context":{"text":"Java","link":"https:\/\/rud.is\/b\/category\/java\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11346,"url":"https:\/\/rud.is\/b\/2018\/08\/11\/connecting-apache-zeppelin-up-to-amazon-athena-with-an-iam-profile-name\/","url_meta":{"origin":11070,"position":5},"title":"Connecting Apache Zeppelin Up to Amazon Athena with an IAM Profile Name","author":"hrbrmstr","date":"2018-08-11","format":false,"excerpt":"Apache Zeppelin is a \"notebook\" alternative to Jupyter (and other) notebooks. 
It supports a plethora of kernels\/interpreters and can do a ton of things that this post isn't going to discuss (perhaps future ones will, especially since it's the first \"notebook\" environment I've been able to tolerate for longer than\u2026","rel":"","context":"In &quot;athena&quot;","block_context":{"text":"athena","link":"https:\/\/rud.is\/b\/category\/athena\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2018\/08\/athena-example-1.png?fit=1200%2C704&ssl=1&resize=1050%2C600 
3x"},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/11070","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=11070"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/11070\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=11070"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=11070"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=11070"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}