While the future of the Apache Drill ecosystem is somewhat in-play (MapR — a major sponsoring org for the project — is kinda dead), I still use it almost daily (on my local home office cluster) to avoid handing over any more money to Amazon than I/we already do. The latest (yet-to-be-released) v1.18.0 has some… Continue reading
Post Category → Apache Drill
{sergeant} 0.9.0 Is On Its Way to CRAN Mirrors!
'Tis been a long time coming, but a minor change to default S3 parameters in tibbles finally caused a push of {sergeant} (the R package that lets you use the Apache Drill REST API via {DBI}, {dplyr}, or directly) to CRAN. The CRAN automatic processing system approved the release just under 19 minutes from… Continue reading
On the Road to 0.8.0 — Some Additional New Features Coming in the sergeant Package
It was probably not difficult to discern from my previous Drill-themed post that I’m fairly excited about the Apache Drill 1.15.0 release. I’ve rounded out most of the existing corners for it in preparation for a long-overdue CRAN update and have been concentrating on two helper features: configuring & launching Drill embedded Docker containers and… Continue reading
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata
Apache Drill is an innovative distributed SQL engine designed to enable data exploration and analytics on non-relational datastores […] without having to create and manage schemas. […] It has a schema-free JSON document model similar to MongoDB and Elasticsearch; [a plethora of APIs, including] ANSI SQL, ODBC/JDBC, and HTTP[S] REST; [is] extremely user and developer… Continue reading
Driving Drill Dynamically with Docker and Updating Storage Configurations On-the-fly with sergeant
The sergeant package has a minor update that adds REST API coverage for two “new” storage endpoints that make it possible to add, update and remove storage configurations on-the-fly without using the GUI or manually updating a config file. This is an especially handy feature when paired with Drill’s new, official Docker container since that… Continue reading
In-brief: Using Bro connection logs with Apache Drill
If you’ve got a directory full of Bro NSM logs, it’s easy to work with them in Apache Drill since they’re just tab-separated values (TSV) files by default. The most tedious part is mapping the columns to proper types and hopefully this saves at least one person from typing it out manually: SELECT TO_TIMESTAMP(CAST(columns[0] AS… Continue reading
Updates to the sergeant (Apache Drill connector) Package & a look at Apache Drill 1.14.0 release
Apache Drill 1.14.0 was recently released, bringing with it many new features and a temporary incompatibility with the current rev of the MapR ODBC drivers. The Drill community expects new ODBC drivers to arrive shortly. The sergeant package is an alternative to ODBC for R users as it provides a dplyr interface to the REST API… Continue reading
Connecting Apache Zeppelin and Apache Drill, PostgreSQL, etc.
A previous post showed how to use a different authentication provider to wire up Apache Zeppelin and Amazon Athena. As noted in that post, Zeppelin is a “notebook” alternative to Jupyter (and other) notebooks. Unlike Jupyter, I can tolerate Zeppelin and it’s got some nifty features like plug-and-play JDBC access. Plus it can do some… Continue reading