I couldn’t let this stand unchallenged: “The new Rasmussen Poll, one of the most accurate in the 2016 Election, just out with a Trump 50% Approval Rating.That's higher than O's #'s!” — Donald J. Trump (@realDonaldTrump) June 18, 2017. Rasmussen makes their Presidential polling data available for both ? & O. Why not compare their… Continue reading
Replicating the Apache Drill ‘Yelp’ Academic Dataset Analysis with sergeant
The Apache Drill folks have a nice walk-through tutorial on how to analyze the Yelp Academic Dataset with Drill. It’s a bit out of date (the current Yelp data set structure is different enough that the tutorial will error out at various points), but it’s a great example of how to work with large, nested… Continue reading
Keeping Users Safe While Collecting Data
I caught a mention of this project by Pete Warden on Four Short Links today. If his name sounds familiar, he’s the creator of the DSTK and an O’Reilly author, and he now works at Google. A decidedly clever and decent chap. The project goal is noble: crowdsource and make a repository of open speech data for… Continue reading
Engaging the tidyverse Clean Slate Protocol
I caught the 0.7.0 release of dplyr on my home CRAN server early Friday morning and immediately set out to install it since I’m eager to finish up my sergeant package and get it on CRAN. “Tidyverse” upgrades aren’t trivial for me as I tinker quite a bit with the tidyverse and create packages that… Continue reading
R⁶ — Scraping Images To PDFs
I’ve been doing intermittent prep work for a follow-up to an earlier post on store closings and came across this CNN Money “article” on it. Said “article” is a deliberately obfuscated or lazily crafted series of GIF images that contain all the impending Radio Shack store closings. It’s the most comprehensive list I’ve found, but… Continue reading
Drilling Into CSVs — Teaser Trailer
I used reading a directory of CSVs as the foundational example in my recent post on idioms. During my exchange with Matt, Hadley and a few others — in the crazy Twitter thread that spawned said post — I mentioned that I’d personally “just use Drill”. I’ll use this post as a bit of a… Continue reading
L.A. Unconf-idential : a.k.a. an rOpenSci #runconf17 Retrospective
Last year, I was able to sit back and lazily “RT” Julia Silge’s excellent retrospective on her 2016 @rOpenSci “unconference” experience. Since Julia was not there this year, and the unconference experience is still in primary storage (LMD v2.0 was a success!), I thought this would be the perfect time for a mindful look-back. And… Continue reading
R⁶ — Idiomatic (for the People)
NOTE: I’ll do my best to ensure the next post will have nothing to do with Twitter, and this post might not completely meet my R⁶ criteria. A single, altruistic, nigh exuberant R tweet about slurping up a directory of CSVs devolved quickly — at least in my opinion, and partly (sadly) with my aid… Continue reading