Despite being in cybersecurity nigh forever (a career that quickly turns you into a determined skeptic if you’re doing your job correctly), I have often trusted various (not to be named) news sources, reports and data sources to provide honest and as-unbiased-as-possible information. The debacle in the U.S. in late 2016 has proven (to me)… Continue reading
Post Category → Data Analysis
Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based ‘IPv4-in-CIDR’ lookups in R)
The insanely productive elf-lord, @quominus, put together a small package ([`triebeard`](https://github.com/ironholds/triebeard)) that exposes an API for [radix/prefix tries](https://en.wikipedia.org/wiki/Trie) at both the R and Rcpp levels. I know he had some personal needs for this, and we both kinda need these to augment some functions in our `iptools` package. Despite `triebeard` having both a vignette and… Continue reading
52Vis Week 2 (2016 Week #14) – Honing in on the Homeless
>UPDATE: Since I put in a “pull request” requirement, I intended to put in a link to getting started with GitHub. Dr. Jenny Bryan’s @stat545 has a great [section on git](https://stat545-ubc.github.io/git00_index.html) that should hopefully make it a bit less painful. ### Why 52Vis? In case folks are wondering why I’m doing this, it’s pretty simple…. Continue reading
Visualizing Survey Data : Comparison Between Observations
Cybersecurity is a domain that really likes surveys, or at the very least it has many folks within it that like to conduct and report on surveys. One recent survey on threat intelligence is in its second year, so it sets about comparing answers across years. Rather than go into the many technical/statistical issues with… Continue reading
Less Drama, More Encoding
Junk Charts [adeptly noted and fixed](http://junkcharts.typepad.com/junk_charts/2015/10/is-it-worth-the-drama.html) this excessively stylized chart from the WSJ this week: Their take on it does reduce the ZOMGOSH WE ARE DOOMED! look and feel of the WSJ chart: But, we can further reduce the drama by using a more neutral color encoding _and_ encode both the # of outbreaks and… Continue reading
cdcfluview – On The Way to “CRAN 7K”
I like to turn coincidence into convergence whenever possible. This weekend, a user of [cdcfluview](http://github.com/hrbrmstr/cdcfluview) had a question that caused me to notice a difference in behaviour in how the package was interacting with the CDC FluView API, so I updated the package to accommodate the change and the user. Around the same time, @recology_ tweeted: we're… Continue reading
Scraping jQuery DataTable Programmatic JSON with R
School of Data had a recent post on how to copy “every item” from a multi-page list. While their post did provide a neat hack, their “words of warning” are definitely missing some items and the overall methodology can be improved upon with some basic R scripting. First, the technique they outlined relies heavily on how… Continue reading
More Airline Crashes via the Hadleyverse
I saw a fly-by `#rstats` mention of more airplane accident data on — of all places — LinkedIn (email) today which took me to a [GitHub repo](https://github.com/philjette/CrashData) by @philjette. It seems there’s a [web site](http://www.planecrashinfo.com/) (run by what seems to be a single human) that tracks plane crashes. Here’s a tweet from @philjette announcing it:… Continue reading