Like more posts than I care to admit, this one starts innocently enough with a tweet by @gshotwell: Is there a reference document somewhere of which dplyr commands work on various database backends? #rstats — Gordon Shotwell (@gshotwell) April 9, 2019 Since I use at least 4 different d[b]plyr backends every week, this same question… Continue reading
Post Category → dplyr
Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types & Mounds of New Metadata
Apache Drill is an innovative distributed SQL engine designed to enable data exploration and analytics on non-relational datastores […] without having to create and manage schemas. […] It has a schema-free JSON document model similar to MongoDB and Elasticsearch; [a plethora of APIs, including] ANSI SQL, ODBC/JDBC, and HTTP[S] REST; [is] extremely user and developer… Continue reading
Engaging the tidyverse Clean Slate Protocol
I caught the 0.7.0 release of dplyr on my home CRAN server early Friday morning and immediately set out to install it since I’m eager to finish up my sergeant package and get it on CRAN. “Tidyverse” upgrades aren’t trivial for me as I tinker quite a bit with the tidyverse and create packages that… Continue reading
All-in on R⁶ : Progress [bars] on first post
@eddelbuettel’s idea is a good one. (it’s a quick read…jump there and come back). We’ll avoid confusion and call it R⁶ over here. Feel free to don the superclass. I often wait for a complete example or new package announcement to blog something when a briefly explained snippet might have sufficient utility for many R… Continue reading
Making a Case for case_when
This is a brief (and likely obvious, for some folks) post on the dplyr::case_when() function. Part of my work-work is dealing with data from internet scans. When we’re performing a deeper inspection of a particular internet protocol or service we try to capture as much system and service metadata as possible. Sifting through said metadata… Continue reading
2017-01 Authored Package Updates
The rest of the month is going to be super-hectic and it’s unlikely I’ll be able to do any more to help the push to CRAN 10K, so here’s a breakdown of CRAN and GitHub new packages & package updates that I felt were worth raising awareness on: epidata I mentioned this one last week… Continue reading
sergeant : An R Boot Camp for Apache Drill
I recently mentioned that I’ve been working on a development version of an Apache Drill R package called sergeant. Here’s a lifted “TLDR” on Drill: Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A… Continue reading