Recipe 4 Wiring Up Drill and R (RJDBC-style)

4.1 Problem

You have Drill installed and want to work with Drill from R using RJDBC idioms.

4.2 Solution

Install, load and use the sergeant package.

4.3 Discussion

Working with Drill in an RJDBC context is pretty straightforward. Assuming you have drill running you need to:

  • connect to the database
  • issue a query

The Drill query interface (http://localhost:8047/query) provides an example query we can use for testing if your setup is working:

SELECT * FROM cp.`employee.json` LIMIT 20

Let’s take a look at the employee.json table using R & RJDBC.

## <JDBCConnection>
## 'data.frame':    20 obs. of  16 variables:
##  $ employee_id    : chr  "1" "2" "4" "5" ...
##  $ full_name      : chr  "Sheri Nowmer" "Derrick Whelply" "Michael Spence" "Maya Gutierrez" ...
##  $ first_name     : chr  "Sheri" "Derrick" "Michael" "Maya" ...
##  $ last_name      : chr  "Nowmer" "Whelply" "Spence" "Gutierrez" ...
##  $ position_id    : chr  "1" "2" "2" "2" ...
##  $ position_title : chr  "President" "VP Country Manager" "VP Country Manager" "VP Country Manager" ...
##  $ store_id       : chr  "0" "0" "0" "0" ...
##  $ department_id  : chr  "1" "1" "1" "1" ...
##  $ birth_date     : chr  "1961-08-26" "1915-07-03" "1969-06-20" "1951-05-10" ...
##  $ hire_date      : chr  "1994-12-01 00:00:00.0" "1994-12-01 00:00:00.0" "1998-01-01 00:00:00.0" "1998-01-01 00:00:00.0" ...
##  $ salary         : chr  "80000.0000" "40000.0000" "40000.0000" "35000.0000" ...
##  $ supervisor_id  : chr  "0" "1" "1" "1" ...
##  $ education_level: chr  "Graduate Degree" "Graduate Degree" "Graduate Degree" "Bachelors Degree" ...
##  $ marital_status : chr  "S" "M" "S" "M" ...
##  $ gender         : chr  "F" "M" "M" "F" ...
##  $ management_role: chr  "Senior Management" "Senior Management" "Senior Management" "Senior Management" ...

The R manual page for drill_jdbc() informs you that you need to have DRILL_JDBC_JAR setup in your environment and for Drill 1.13.0 on a Linux-ish system that should become an entry in your ~/.Renviron file as so:

DRILL_JDBC_JAR=/usr/local/drill/jars/jdbc-driver/drill-jdbc-all-1.13.0.jar

On Windows that will be TBD.

Odds are that if you’re in this section of the book you’re familiar with RJDBC operations. You’ll also be disappointed that most of the RJDBC interface has not been fully implemented since there have been virtually no requests for it. File an issue if you would like more than just the ability to perform queries.

NOTE: You’re really better off using drill_connection() and drill_query() vs go through the RJDBC machinations since you get the same thing without the overhead of Java.