Category Archives: Cybersecurity

insert(post, "{ 'standard_disclaimer' : 'My opinion, not my employer\'s' }")

This is a post about the fictional company FredCo. If the context or details presented by the post seem familiar, it’s purely coincidental. This is, again, a fictional story.

Let’s say FredCo had a pretty big breach that (fictionally) garnered media, Twitterverse, tech-world and Government-level attention and that we have some spurious details that let us sit back in our armchairs to opine about. What might have helped create the debacle at FredCo?

Despite (fictional) endless mainstream media coverage and a good chunk of ‘on background’ infosec-media clandestine blatherings we know very little about the breach itself (though it’s been fictionally, officially blamed on failure to patch Apache Struts). We know even less (fictionally officially) about the internal reach of the breach (apart from the limited consumer impact official disclosures). We know even less than that (fictionally officially) about how FredCo operates internally (process-wise).

But, I’ve (fictionally) seen:

  • a detailed breakdown of the number of domains, subdomains, and hosts FredCo “manages”.
  • the open port/service configurations of the public components of those domains
  • public information from individuals who are more willing to (fictionally) violate the CFAA than I am to get more than just port configuration information
  • a 2012/3 SAS 1 Type II report about FredCo controls
  • testimonies from FredCo execs regarding efficacy of $SECURITY_TECHNOLOGY and 3 videos purporting to be indicative of expert opinion on how to use BIIGG DATERZ to achieve cybersecurity success
  • the board & management structure + senior management bonus structures, complete with incentive-based objectives they were graded on

so, I’m going to blather a bit about how this fictional event should finally tear down the Potemkin village that is the combination of the Regulatory+Audit Industrial Complex and the Cybersecurity Industrial Complex.

“Tear down” with respect to the goal being to help individuals understand that a significant portion of organizations you entrust with your data are not incentivized or equipped to protect your data and that these same conditions exist in more critical areas — such as transportation, health care, and critical infrastructure — and you should expect a failure on the scale of FredCo — only with real, harmful impact — if nothing ends up changing soon.

From the top

There is boilerplate mention of “security” in the objectives of the senior executives between 2015 & 2016 14A filings:

  • CEO: “Employing advanced analytics and technology to help drive client growth, security, efficiency and profitability.”
  • CFO: “Continuing to advance and execute global enterprise risk management processes, including directing increased investment in data security, disaster recovery and regulatory compliance capabilities.”
  • CLO: “Continuing to refine and build out the Company’s global security organization.”
  • President, Workforce Solutions: None
  • CHRO: None
  • President – US Information Services: None

You’ll be happy to know that they all received either “Distinguished” or “Exceeds” on their appraisals and received a multiplier of their bonus & compensation targets as a result.

Furthermore, there is no one in the make-up of FredCo’s board of directors who has shown an interest or specialization in cybersecurity.

From the camera-positioned 50-yard line on instant replay, the board and shareholders of FredCo did not think protection of your identity and extremely personal information was important enough to include in the directives and performance measures of three top executives, and gave it little more than boilerplate mention for the others. Investigators who look into FredCo’s breach should dig deep into the last decade of the detailed measures for these objectives. I have first-hand experience with how these types of HR processes are managed in large orgs, which is why I’m encouraging this area for investigation.

“Security” is a terrible term, and it only works when it is an emergent property of the business processes of an organization. That means it must be contextual for every worker. Some colleagues suggest individual workers should not have to care about cybersecurity when making decisions or doing work, but even minimum-wage retail and grocery store clerks are educated about shoplifting risks and are given tools, tips and techniques to prevent loss. When your HR organization is not incentivized to help create and maintain a cybersecurity-aware culture from the top you’re going to have problems, and when there are no cybersecurity-oriented targets for the CIO or even business process owners, don’t expect your holey screen door to keep out predators.

Awwwdit, Part I

NOTE: I’m not calling out any particular audit organization as I’ve only seen one fictional official report.

The Regulatory+Audit Industrial Complex is a lucrative business cabal. Governments and large business meta-agencies create structures where processes can be measured, verified and given a big green ✅. This validation exercise is generally done in one or more ways:

  • simple questionnaire, very high level questions, no veracity validation
  • more detailed questionnaire, mid-level questions, usually some in-person lightweight checking
  • detailed questionnaire, but with topics that can be sliced-and-diced by the legal+technical professions to mean literally anything, measured in-person by (usually) extremely junior reviewers with little-to-no domain expertise who follow review playbooks, get overwhelmed with log entries and scope-refinement+reduction and who end up being steered towards “important” but non-material findings

Sure, there are good audits and good auditors, but I will posit they are the rare diamonds in a bucket of zirconia.

We need to cover some technical ground before covering this further, though.

Shocking Struts

We’ll take the stated breach cause at face value: failure to patch a remotely accessible vulnerability in Apache Struts. This was presented as the singular issue enabling attackers to walk (with crutches) away with scads of identity-theft-enabling personal data, administrator passwords, database passwords, and the recipe for the winning entry in the macaroni salad competition at last year’s HR annual picnic. Who knew one Java library had so much power!

We don’t know the architecture of all the web apps at FredCo. However, your security posture should not be a Jenga game tower, easily destroyed by removing one peg. These are all (generally) components of externally-facing applications at the scale of FredCo:

  • routers
  • switches
  • firewalls
  • load balancers
  • operating systems
  • application servers
  • middleware servers
  • database servers
  • customized code

These are mimicked (to varying levels of efficacy) across:

  • development
  • test
  • staging
  • production

environments.

They may coexist (in various layers of the network) with:

  • HR systems
  • Finance systems
  • Intranet servers
  • Active Directory
  • General user workstations
  • Executive workstations
  • Developer workstations
  • Mobile devices
  • Remote access infrastructure (i.e. VPNs)

A properly incentivized organization ensures there is logical and physical separation between (and isolation of) “stuff that matters” and that varying levels of authentication & authorization are applied to ensure access is restricted.

Keeping all that “secure” requires:

  • managing thousands of devices (servers, network components, laptops, desktops, mobile devices)
  • managing thousands of identities
  • managing thousands of configurations across systems, networks and devices
  • managing hundreds to thousands of connections between internal and external networks
  • managing thousands of rules
  • managing thousands of vulnerabilities (as they become known)
  • managing a secure development life cycle across hundreds or thousands of applications

Remember, though, that FredCo ostensibly managed all of that well and the data loss was solely due to one Java library.

If your executives (all of them) and workers (all of them) are not incentivized with that list in mind, you will have problems, but let’s talk about the security challenges back in the context of the audit role.

Awwwdit, Part II

The post is already long, so we’ll make this quick.

If I dropped you off — yes, you, because you’re likely as capable as the auditors mentioned in the previous section on audit — into that environment once a year, do you think you’d be able to ferret out issues based on convoluted network diagrams, poorly documented firewall rules and source code, and non-standard checklists of user access management processes?

Let’s say I dropped you in months before the known Struts vulnerability and re-answer the question.

The burden placed on internal and — especially — external auditors is great and they are pretty much set up for failure from engagement number one.

Couple IT complexity with the fact that many orgs like FredCo aren’t required to do more than ensure financial reporting processes are ✅.

But, even if there were more technical, security-oriented audits performed, you’d likely have ten different report findings by as many firms or auditors, especially if they were point-in-time audits. Furthermore, FredCo has had decades of point-in-time audits by hundreds of auditors and dozens of firms. The conditions of the breach were likely not net-new, so how did decades of systemic IT failures go unnoticed by this cabal?

IT audit functions are a multi-billion dollar business. FredCo is partially the result of the built-in cracks in the way verification is performed in orgs. In other words, I posit the Regulatory+Audit Industrial Complex bears some of the responsibility for FredCo’s breach.

Divisive Devices

From the (now removed) testimonials & videos, it was clear there may have been a “blinky light” problem in the mindset of those responsible for cybersecurity at FredCo. Relying solely on the capabilities of one or more devices (they are usually appliances with blinky lights) and thinking that storing petabytes of log data is going to stop “bad guys” is a great recipe for a breach parfait.

But, the Cybersecurity Industrial Complex continues to dole out LED-laden boxes with the fervor of a U.S. doctor handing out opioids. Sure, they are just giving orgs what they want, but it doesn’t make it responsible behaviour. Just like the opioid problem, the “device” issue is likely causing cyber-sickness in more organizations than you’d like to admit. You may even know someone who works at an org with a box-addiction.

I posit the Cybersecurity Industrial Complex bears some of the responsibility for FredCo’s breach, especially when you consider the hundreds of marketing e-mails I’ve seen post-FredCo breach telling me how CyberBox XJ9-11 would have stopped FredCo’s attackers cold.

A Matter of Trust

If removing a Struts peg from FredCo’s IT Jenga board caused the fictional tower to crash:

  • What do you think the B2B infrastructure looks like?
  • How do you think endpoints are managed?
  • What isolation, segmentation and access controls really exist?
  • How effective do you think their security awareness program is?
  • How many apps are architected & managed as poorly as the breached one?
  • How many shadow IT deployments exist in the ☁️ with your data in it?
  • How can you trust FredCo with anything of importance?

Fictional FIN

In this fictional world I’ve created one ending is:

  • all B2B connections to FredCo have been severed
  • lawyers at a thousand firms are working on language for filings to cancel all B2B contracts with FredCo
  • FredCo was de-listed from exchanges
  • FredCo executives are defending against a slew of criminal and civil charges
  • The U.S. Congress and U.K. Parliament have come together to undertake a joint review of regulatory and audit practices spanning both countries (since it impacted both countries and the Reg+Audit cabal spans both countries they decided to save time and money) resulting in sweeping changes
  • The SEC has mandated detailed cybersecurity objectives be placed on all senior management executives at all public companies and has forced the results of those objective assessments to be part of a new filing requirement.
  • The SEC has also mandated that at least one voting board member of public companies must have demonstrated experience with cybersecurity
  • The FTC creates and enforces standards on cybersecurity product advertising practices
  • You have understood that nobody has your back when it comes to managing your sensitive, personal data and that you must become an active participant in helping to ensure your elected representatives hold all organizations accountable when it comes to taking their responsibilities seriously.

but, another is:

  • FredCo’s stock bounces back
  • FredCo loses no business partners
  • FredCo’s current & former execs faced no civil or criminal charges
  • Congress makes a bit of opportunistic, temporary bluster for the sake of 2018 elections but doesn’t do anything more than berate FredCo publicly
  • You’re so tired of all these breaches and data loss that you go back to playing “Clash of Clans” on your mobile phone and do nothing.

I was about to embark on setting up a background task to sift through R package PDFs for traces of functions that “omit NA values” as a surprise present for Colin Fay and Sir Tierney when I got distracted by a PDF in the CRAN doc/contrib directory: Short-refcard.pdf. I’m not a big reference card user but students really like them and after seeing what it was I remembered having seen the document ages ago, but never associated it with CRAN before.

I saw:

by Tom Short, EPRI PEAC, tshort@epri-peac.com 2004-11-07 Granted to the public domain. See www. Rpad. org for the source and latest version. Includes material from R for Beginners by Emmanuel Paradis (with permission).

at the top of the card. The link (which I’ve made unclickable for reasons you’ll see in a sec — don’t visit that URL) was clickable and I tapped it as I wanted to see if it had changed since 2004.

You can open that image in a new tab to see the full, rendered site and take a moment to see if you can find the section that links to objectionable — and, potentially malicious — content. It’s easy to spot.

I made a likely correct assumption that Tom Short had nothing to do with this and wanted to dig into it a bit further. So, don your bestest deerstalker and follow along as we see when this may have happened.

Digging In Domain Land

We’ll need some helpers to poke around this data in a safe manner:

library(wayback) # devtools::install_github("hrbrmstr/wayback")
library(ggTimeSeries) # devtools::install_github("AtherEnergy/ggTimeSeries")
library(splashr) # devtools::install_github("hrbrmstr/splashr")
library(passivetotal) # devtools::install_github("hrbrmstr/passivetotal")
library(cymruservices)
library(magick)
library(hrbrthemes) # provides theme_ipsum_rc() used in the plots
library(tidyverse)

(You’ll need to get a RiskIQ PassiveTotal key to use those functions. Also, please donate to Archive.org if you use the wayback package.)

Now, let’s see if the main Rpad content URL is in the wayback machine:

glimpse(archive_available("http://www.rpad.org/Rpad/"))
## Observations: 1
## Variables: 5
## $ url        <chr> "http://www.rpad.org/Rpad/"
## $ available  <lgl> TRUE
## $ closet_url <chr> "http://web.archive.org/web/20170813053454/http://ww...
## $ timestamp  <dttm> 2017-08-13
## $ status     <chr> "200"

It is! Let’s see how many versions of it are in the archive:

x <- cdx_basic_query("http://www.rpad.org/Rpad/")

ts_range <- range(x$timestamp)

count(x, timestamp) %>%
  ggplot(aes(timestamp, n)) +
  geom_segment(aes(xend=timestamp, yend=0)) +
  labs(x=NULL, y="# changes in year", title="rpad.org Wayback Change Timeline") +
  theme_ipsum_rc(grid="Y")

count(x, timestamp) %>%
  mutate(Year = lubridate::year(timestamp)) %>%
  complete(timestamp=seq(ts_range[1], ts_range[2], "1 day"))  %>%
  filter(!is.na(timestamp), !is.na(Year)) %>%
  ggplot(aes(date = timestamp, fill = n)) +
  stat_calendar_heatmap() +
  viridis::scale_fill_viridis(na.value="white", option = "magma") +
  facet_wrap(~Year, ncol=1) +
  labs(x=NULL, y=NULL, title="rpad.org Wayback Change Timeline") +
  theme_ipsum_rc(grid="") +
  theme(axis.text=element_blank()) +
  theme(panel.spacing = grid::unit(0.5, "lines"))

There’s a big span between 2008/9 and 2016/17. Let’s poke around there a bit. First 2016:

tm <- get_timemap("http://www.rpad.org/Rpad/")

(rurl <- filter(tm, lubridate::year(anytime::anydate(datetime)) == 2016))
## # A tibble: 1 x 5
##       rel                                                                   link  type
##     <chr>                                                                  <chr> <chr>
## 1 memento http://web.archive.org/web/20160629104907/http://www.rpad.org:80/Rpad/  <NA>
## # ... with 2 more variables: from <chr>, datetime <chr>

(p2016 <- render_png(url = rurl$link))

Hrm. Could be server or network errors.

Let’s go back to 2009.

(rurl <- filter(tm, lubridate::year(anytime::anydate(datetime)) == 2009))
## # A tibble: 4 x 5
##       rel                                                                  link  type
##     <chr>                                                                 <chr> <chr>
## 1 memento     http://web.archive.org/web/20090219192601/http://rpad.org:80/Rpad  <NA>
## 2 memento http://web.archive.org/web/20090322163146/http://www.rpad.org:80/Rpad  <NA>
## 3 memento http://web.archive.org/web/20090422082321/http://www.rpad.org:80/Rpad  <NA>
## 4 memento http://web.archive.org/web/20090524155658/http://www.rpad.org:80/Rpad  <NA>
## # ... with 2 more variables: from <chr>, datetime <chr>

(p2009 <- render_png(url = rurl$link[4]))

If you poke around that, it looks like the original Rpad content, so it was “safe” back then.

(rurl <- filter(tm, lubridate::year(anytime::anydate(datetime)) == 2017))
## # A tibble: 6 x 5
##       rel                                                                link  type
##     <chr>                                                               <chr> <chr>
## 1 memento  http://web.archive.org/web/20170323222705/http://www.rpad.org/Rpad  <NA>
## 2 memento http://web.archive.org/web/20170331042213/http://www.rpad.org/Rpad/  <NA>
## 3 memento http://web.archive.org/web/20170412070515/http://www.rpad.org/Rpad/  <NA>
## 4 memento http://web.archive.org/web/20170518023345/http://www.rpad.org/Rpad/  <NA>
## 5 memento http://web.archive.org/web/20170702130918/http://www.rpad.org/Rpad/  <NA>
## 6 memento http://web.archive.org/web/20170813053454/http://www.rpad.org/Rpad/  <NA>
## # ... with 2 more variables: from <chr>, datetime <chr>

(p2017 <- render_png(url = rurl$link[1]))

I won’t break your browser and add another giant image, but that one has the icky content. So, it’s a relatively recent takeover and it’s likely that whoever added the icky content links did so to try to ensure those domains and URLs have both good SEO and a positive reputation.
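If you want to put a number on that gap, the capture times are embedded right in the memento URLs. Here’s a minimal base R sketch that pulls them out of three of the links shown above (the last 2009 capture, the lone 2016 one and the first 2017 one) and computes the days between them. It only uses those three links, so treat it as an illustration and not a full timeline:

```r
# Memento links copied from the timemap output shown in this post
mementos <- c(
  "http://web.archive.org/web/20090524155658/http://www.rpad.org:80/Rpad",
  "http://web.archive.org/web/20160629104907/http://www.rpad.org:80/Rpad/",
  "http://web.archive.org/web/20170323222705/http://www.rpad.org/Rpad"
)

# Each Wayback URL carries a 14-digit YYYYMMDDhhmmss capture time
stamp <- sub("^.*/web/([0-9]{14})/.*$", "\\1", mementos)
captured <- as.POSIXct(stamp, format = "%Y%m%d%H%M%S", tz = "UTC")

# Days elapsed between consecutive captures
gaps <- data.frame(
  from = as.Date(head(captured, -1)),
  to   = as.Date(tail(captured, -1))
)
gaps$days <- as.integer(gaps$to - gaps$from)
gaps
```

The first gap is roughly seven years, which is plenty of time for a dead domain to be noticed and repurposed.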

Let’s see if they were dumb enough to make their info public:

rwho <- passive_whois("rpad.org")
str(rwho, 1)
## List of 18
##  $ registryUpdatedAt: chr "2016-10-05"
##  $ admin            :List of 10
##  $ domain           : chr "rpad.org"
##  $ registrant       :List of 10
##  $ telephone        : chr "5078365503"
##  $ organization     : chr "WhoisGuard, Inc."
##  $ billing          : Named list()
##  $ lastLoadedAt     : chr "2017-03-14"
##  $ nameServers      : chr [1:2] "ns-1147.awsdns-15.org" "ns-781.awsdns-33.net"
##  $ whoisServer      : chr "whois.publicinterestregistry.net"
##  $ registered       : chr "2004-06-15"
##  $ contactEmail     : chr "411233718f2a4cad96274be88d39e804.protect@whoisguard.com"
##  $ name             : chr "WhoisGuard Protected"
##  $ expiresAt        : chr "2018-06-15"
##  $ registrar        : chr "eNom, Inc."
##  $ compact          :List of 10
##  $ zone             : Named list()
##  $ tech             :List of 10

Nope. #sigh

Is this site considered “malicious”?

(rclass <- passive_classification("rpad.org"))
## $everCompromised
## NULL

Nope. #sigh

What’s the hosting history for the site?

rdns <- passive_dns("rpad.org")
rorig <- bulk_origin(rdns$results$resolve)

as_tibble(rdns$results) %>%
  type_convert() %>%
  select(firstSeen, resolve) %>%
  left_join(select(rorig, resolve=ip, as_name), by="resolve") %>%
  arrange(firstSeen) %>%
  print(n=100)
## # A tibble: 88 x 3
##              firstSeen        resolve                                              as_name
##                 <dttm>          <chr>                                                <chr>
##  1 2009-12-18 11:15:20  144.58.240.79      EPRI-PA - Electric Power Research Institute, US
##  2 2016-06-19 00:00:00 208.91.197.132 CONFLUENCE-NETWORK-INC - Confluence Networks Inc, VG
##  3 2016-07-29 00:00:00  208.91.197.27 CONFLUENCE-NETWORK-INC - Confluence Networks Inc, VG
##  4 2016-08-12 20:46:15  54.230.14.253                     AMAZON-02 - Amazon.com, Inc., US
##  5 2016-08-16 14:21:17  54.230.94.206                     AMAZON-02 - Amazon.com, Inc., US
##  6 2016-08-19 20:57:04  54.230.95.249                     AMAZON-02 - Amazon.com, Inc., US
##  7 2016-08-26 20:54:02 54.192.197.200                     AMAZON-02 - Amazon.com, Inc., US
##  8 2016-09-12 10:35:41   52.84.40.164                     AMAZON-02 - Amazon.com, Inc., US
##  9 2016-09-17 07:43:03  54.230.11.212                     AMAZON-02 - Amazon.com, Inc., US
## 10 2016-09-23 18:17:50 54.230.202.223                     AMAZON-02 - Amazon.com, Inc., US
## 11 2016-09-30 19:47:31 52.222.174.253                     AMAZON-02 - Amazon.com, Inc., US
## 12 2016-10-24 17:44:38  52.85.112.250                     AMAZON-02 - Amazon.com, Inc., US
## 13 2016-10-28 18:14:16 52.222.174.231                     AMAZON-02 - Amazon.com, Inc., US
## 14 2016-11-11 10:44:22 54.240.162.201                     AMAZON-02 - Amazon.com, Inc., US
## 15 2016-11-17 04:34:15 54.192.197.242                     AMAZON-02 - Amazon.com, Inc., US
## 16 2016-12-16 17:49:29   52.84.32.234                     AMAZON-02 - Amazon.com, Inc., US
## 17 2016-12-19 02:34:32 54.230.141.240                     AMAZON-02 - Amazon.com, Inc., US
## 18 2016-12-23 14:25:32  54.192.37.182                     AMAZON-02 - Amazon.com, Inc., US
## 19 2017-01-20 17:26:28  52.84.126.252                     AMAZON-02 - Amazon.com, Inc., US
## 20 2017-02-03 15:28:24   52.85.94.225                     AMAZON-02 - Amazon.com, Inc., US
## 21 2017-02-10 19:06:07   52.85.94.252                     AMAZON-02 - Amazon.com, Inc., US
## 22 2017-02-17 21:37:21   52.85.63.229                     AMAZON-02 - Amazon.com, Inc., US
## 23 2017-02-24 21:43:45   52.85.63.225                     AMAZON-02 - Amazon.com, Inc., US
## 24 2017-03-05 12:06:32  54.192.19.242                     AMAZON-02 - Amazon.com, Inc., US
## 25 2017-04-01 00:41:07 54.192.203.223                     AMAZON-02 - Amazon.com, Inc., US
## 26 2017-05-19 00:00:00   13.32.246.44                     AMAZON-02 - Amazon.com, Inc., US
## 27 2017-05-28 00:00:00    52.84.74.38                     AMAZON-02 - Amazon.com, Inc., US
## 28 2017-06-07 08:10:32  54.230.15.154                     AMAZON-02 - Amazon.com, Inc., US
## 29 2017-06-07 08:10:32  54.230.15.142                     AMAZON-02 - Amazon.com, Inc., US
## 30 2017-06-07 08:10:32  54.230.15.168                     AMAZON-02 - Amazon.com, Inc., US
## 31 2017-06-07 08:10:32   54.230.15.57                     AMAZON-02 - Amazon.com, Inc., US
## 32 2017-06-07 08:10:32   54.230.15.36                     AMAZON-02 - Amazon.com, Inc., US
## 33 2017-06-07 08:10:32  54.230.15.129                     AMAZON-02 - Amazon.com, Inc., US
## 34 2017-06-07 08:10:32   54.230.15.61                     AMAZON-02 - Amazon.com, Inc., US
## 35 2017-06-07 08:10:32   54.230.15.51                     AMAZON-02 - Amazon.com, Inc., US
## 36 2017-07-16 09:51:12 54.230.187.155                     AMAZON-02 - Amazon.com, Inc., US
## 37 2017-07-16 09:51:12 54.230.187.184                     AMAZON-02 - Amazon.com, Inc., US
## 38 2017-07-16 09:51:12 54.230.187.125                     AMAZON-02 - Amazon.com, Inc., US
## 39 2017-07-16 09:51:12  54.230.187.91                     AMAZON-02 - Amazon.com, Inc., US
## 40 2017-07-16 09:51:12  54.230.187.74                     AMAZON-02 - Amazon.com, Inc., US
## 41 2017-07-16 09:51:12  54.230.187.36                     AMAZON-02 - Amazon.com, Inc., US
## 42 2017-07-16 09:51:12 54.230.187.197                     AMAZON-02 - Amazon.com, Inc., US
## 43 2017-07-16 09:51:12 54.230.187.185                     AMAZON-02 - Amazon.com, Inc., US
## 44 2017-07-17 13:10:13 54.239.168.225                     AMAZON-02 - Amazon.com, Inc., US
## 45 2017-08-06 01:14:07  52.222.149.75                     AMAZON-02 - Amazon.com, Inc., US
## 46 2017-08-06 01:14:07 52.222.149.172                     AMAZON-02 - Amazon.com, Inc., US
## 47 2017-08-06 01:14:07 52.222.149.245                     AMAZON-02 - Amazon.com, Inc., US
## 48 2017-08-06 01:14:07  52.222.149.41                     AMAZON-02 - Amazon.com, Inc., US
## 49 2017-08-06 01:14:07  52.222.149.38                     AMAZON-02 - Amazon.com, Inc., US
## 50 2017-08-06 01:14:07 52.222.149.141                     AMAZON-02 - Amazon.com, Inc., US
## 51 2017-08-06 01:14:07 52.222.149.163                     AMAZON-02 - Amazon.com, Inc., US
## 52 2017-08-06 01:14:07  52.222.149.26                     AMAZON-02 - Amazon.com, Inc., US
## 53 2017-08-11 19:11:08 216.137.61.247                     AMAZON-02 - Amazon.com, Inc., US
## 54 2017-08-21 20:44:52  13.32.253.116                     AMAZON-02 - Amazon.com, Inc., US
## 55 2017-08-21 20:44:52  13.32.253.247                     AMAZON-02 - Amazon.com, Inc., US
## 56 2017-08-21 20:44:52  13.32.253.117                     AMAZON-02 - Amazon.com, Inc., US
## 57 2017-08-21 20:44:52  13.32.253.112                     AMAZON-02 - Amazon.com, Inc., US
## 58 2017-08-21 20:44:52   13.32.253.42                     AMAZON-02 - Amazon.com, Inc., US
## 59 2017-08-21 20:44:52  13.32.253.162                     AMAZON-02 - Amazon.com, Inc., US
## 60 2017-08-21 20:44:52  13.32.253.233                     AMAZON-02 - Amazon.com, Inc., US
## 61 2017-08-21 20:44:52   13.32.253.29                     AMAZON-02 - Amazon.com, Inc., US
## 62 2017-08-23 14:24:15 216.137.61.164                     AMAZON-02 - Amazon.com, Inc., US
## 63 2017-08-23 14:24:15 216.137.61.146                     AMAZON-02 - Amazon.com, Inc., US
## 64 2017-08-23 14:24:15  216.137.61.21                     AMAZON-02 - Amazon.com, Inc., US
## 65 2017-08-23 14:24:15 216.137.61.154                     AMAZON-02 - Amazon.com, Inc., US
## 66 2017-08-23 14:24:15 216.137.61.250                     AMAZON-02 - Amazon.com, Inc., US
## 67 2017-08-23 14:24:15 216.137.61.217                     AMAZON-02 - Amazon.com, Inc., US
## 68 2017-08-23 14:24:15  216.137.61.54                     AMAZON-02 - Amazon.com, Inc., US
## 69 2017-08-25 19:21:58  13.32.218.245                     AMAZON-02 - Amazon.com, Inc., US
## 70 2017-08-26 09:41:34   52.85.173.67                     AMAZON-02 - Amazon.com, Inc., US
## 71 2017-08-26 09:41:34  52.85.173.186                     AMAZON-02 - Amazon.com, Inc., US
## 72 2017-08-26 09:41:34  52.85.173.131                     AMAZON-02 - Amazon.com, Inc., US
## 73 2017-08-26 09:41:34   52.85.173.18                     AMAZON-02 - Amazon.com, Inc., US
## 74 2017-08-26 09:41:34   52.85.173.91                     AMAZON-02 - Amazon.com, Inc., US
## 75 2017-08-26 09:41:34  52.85.173.174                     AMAZON-02 - Amazon.com, Inc., US
## 76 2017-08-26 09:41:34  52.85.173.210                     AMAZON-02 - Amazon.com, Inc., US
## 77 2017-08-26 09:41:34   52.85.173.88                     AMAZON-02 - Amazon.com, Inc., US
## 78 2017-08-27 22:02:41  13.32.253.169                     AMAZON-02 - Amazon.com, Inc., US
## 79 2017-08-27 22:02:41  13.32.253.203                     AMAZON-02 - Amazon.com, Inc., US
## 80 2017-08-27 22:02:41  13.32.253.209                     AMAZON-02 - Amazon.com, Inc., US
## 81 2017-08-29 13:17:37 54.230.141.201                     AMAZON-02 - Amazon.com, Inc., US
## 82 2017-08-29 13:17:37  54.230.141.83                     AMAZON-02 - Amazon.com, Inc., US
## 83 2017-08-29 13:17:37  54.230.141.30                     AMAZON-02 - Amazon.com, Inc., US
## 84 2017-08-29 13:17:37 54.230.141.193                     AMAZON-02 - Amazon.com, Inc., US
## 85 2017-08-29 13:17:37 54.230.141.152                     AMAZON-02 - Amazon.com, Inc., US
## 86 2017-08-29 13:17:37 54.230.141.161                     AMAZON-02 - Amazon.com, Inc., US
## 87 2017-08-29 13:17:37  54.230.141.38                     AMAZON-02 - Amazon.com, Inc., US
## 88 2017-08-29 13:17:37 54.230.141.151                     AMAZON-02 - Amazon.com, Inc., US

Unfortunately, I expected this. The owner keeps moving it around on AWS infrastructure.
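To make the pattern in that listing easier to see, here’s a rough base R sketch that collapses a handful of rows (copied from the output above, not the full 88) into one line per network operator. The shortened `as_name` labels and the `hosting` data frame are my own stand-ins for illustration:

```r
# A few rows copied from the passive DNS listing above (not the full 88)
hosting <- data.frame(
  firstSeen = as.Date(c("2009-12-18", "2016-06-19", "2016-07-29",
                        "2016-08-12", "2016-08-16", "2017-08-29")),
  as_name = c("EPRI-PA", "CONFLUENCE-NETWORK-INC", "CONFLUENCE-NETWORK-INC",
              "AMAZON-02", "AMAZON-02", "AMAZON-02"),
  stringsAsFactors = FALSE
)

# Group the first-seen dates by network operator
by_as <- split(hosting$firstSeen, hosting$as_name)

# Earliest and latest sighting per operator, plus how many sampled rows each has
summary_tbl <- data.frame(
  as_name = names(by_as),
  first   = as.Date(vapply(by_as, function(d) format(min(d)), character(1))),
  last    = as.Date(vapply(by_as, function(d) format(max(d)), character(1))),
  n_rows  = lengths(by_as),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Order by first appearance so the hand-offs read top to bottom
summary_tbl <- summary_tbl[order(summary_tbl$first), ]
summary_tbl
```

Read top to bottom, the summary shows the original host first, then the 2016 change of hands, then the ongoing AWS hopping.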

So What?

This was an innocent link in a document on CRAN that went to a site that looked legit. A clever individual or organization found the dead domain and saw an opportunity to legitimize some fairly nasty stuff.

Now, I realize nobody is likely using “Rpad” anymore, but this type of situation can happen to any registered domain. If this individual or organization were doing more than trying to make objectionable content legit, they likely could have succeeded, especially if they enticed you with a shiny new devtools::install_…() link with promises of statistically sound animated cat emoji gif creation tools. They did an eerily good job of making this particular site still seem legit.

There’s nothing most folks can do to “fix” that site or have it removed. I’m not sure CRAN should remove the helpful PDF, but with a clickable link, it might be a good thing to suggest.

You’ll see that I used the splashr package (which has been submitted to CRAN but is not there yet). It’s a good way to work with potentially malicious web content since you can “see” it and mine content from it without putting your own system at risk.

After going through this, I’ll see what I can do to put some bows on some of the devel-only packages and get them into CRAN so there’s a bit more assurance around using them.

I’m an army of one when it comes to fielding R-related security issues, but if you do come across suspicious items (like this or icky/malicious in other ways) don’t hesitate to drop me an @ or DM on Twitter.

I listen to @NPR throughout the day (on most days) and a story on Ohmconnect piqued my interest (it aired 5 days prior to this post). The TLDR on Ohmconnect is that it ostensibly helps you save energy by making you more aware of consumption and can be enabled to control various bits of IoT you have in your abode to curtail wanton power usage.

OK. So…?

Such a service requires access to (possibly many) accounts and devices to facilitate said awareness and control. Now, it’s 2017 and there’s this thing called OAuth that makes giving such access quite a bit safer than it was in the “old days” when you pretty much had to give your main username and password out to “connect” things.
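For anyone who hasn’t seen it up close, here’s a minimal sketch of the first step of a typical OAuth 2.0 authorization code flow: the app sends you to the provider with a URL like the one built below, you log in there, and the app only ever receives a scoped, revocable token. The endpoint, client ID, redirect URI and scope names are all made up for illustration and don’t describe any real provider (or Ohmconnect):

```r
# Hypothetical values for illustration; none of these belong to a real provider
auth_endpoint <- "https://thermostat.example.com/oauth/authorize"

params <- c(
  response_type = "code",                        # ask for an authorization code
  client_id     = "energy-saver-app",            # identifies the app, not the user
  redirect_uri  = "https://app.example.org/callback",
  scope         = "thermostat:read thermostat:write",
  # Random value the app checks on return to guard against forged callbacks;
  # sample() is fine for a demo but is not a cryptographic random source
  state         = paste(sample(c(letters, 0:9), 24, replace = TRUE), collapse = "")
)

# Percent-encode each value and join everything into a query string
encode <- function(x) utils::URLencode(x, reserved = TRUE)
query <- paste(names(params), vapply(params, encode, character(1)),
               sep = "=", collapse = "&")

auth_url <- paste0(auth_endpoint, "?", query)
auth_url
```

Notice there’s no username or password anywhere in that request. The user types those in at the provider’s own site, and the app never sees them.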

It — apparently — is not 2017 wherever Ohmconnect developers reside since they ask for your credentials to every service and integration you want enabled. Don’t believe me? Take a look:

That’s just from (mostly) the non-thermostat integrations. They ask for your credentials for all services. That’s insane.

I can understand that they may need power company credentials since such industries are usually far behind the curve when it comes to internet-enablement. That doesn’t mean it’s A Good Thing to provide said credentials, but it’s a necessary evil when a service provider has no support for OAuth and you really want to use some integration to their portal.

Virtually all of the possible Ohmconnect-supported service integrations have OAuth support. Here’s a list of the ones that do/don’t:

OAuth Support:

Appears to have no OAuth Support:

  • Lennox
  • Lutron
  • Radio Thermostat (Filtrete)
  • Revolv
  • WeMo

NOTE: The ones labeled as having no OAuth support may have either commercial OAuth support or hidden OAuth support. I’ll gladly modify the post if you leave a comment with official documentation showing they have OAuth support.

On the plus side, Ohmconnect developers now have some links they can follow to learn about OAuth and fix their woefully insecure service.

Why Are Credentials Bad?

Ohmconnect has to store your credentials for other services either in the clear or in some way that’s easy for them to reverse/decode. That means when criminals breach their servers (yes, when) they’ll get access to all the credentials you’ve entered on all those sites. Even if you’re one of the few who don’t use the same password everywhere and manage credentials in an app like @1Password, it’s still a pain to change them, and you’ll be at risk for the entire period between breach and detection (which can be a very long time).

In the highly unlikely event they are doing the OAuth in the background for you (a complete violation of OAuth principles) they still take and process (and, likely store) your credentials for that transaction.

Either way, the request for and use of credentials is either (at best) a naive attempt at simplifying the user experience or (at worst) a brazen disregard for accepted norms for modern user-service integration for non-obvious reasons.

NOTE: I say “when” above as this would be a lovely target of choice for thieves given the types of data it can collect and the demographic that’s likely to use it.

What Can You Do?

Well, if you’re a current Ohmconnect user you can cancel your account and change all the credentials for the services you connected. Yes, I’m being serious. If you really like their service, contact customer support and provide the above links and demand that they use OAuth instead.

You should absolutely not connect the devices/services that are on the “Appears to have no OAuth Support” list above to any third-party service if that service needs your credentials to make the connection. There’s no excuse for a cloud-based service to not support OAuth and there are plenty of choices for home/device control. Pick another brand and use it instead.

If you aren’t an Ohmconnect user, I would not sign up until they support OAuth. By defaulting to the “easy” use of username & password they are showing they really don’t take your security & privacy seriously and that means they really don’t deserve your business.

FIN

It is my firm belief that @NPR should either remove the story or issue guidance along with it in text and in audio form. They showcased this company and have all but directly encouraged listeners to use it. Such recommendations should come after much more research, especially security-focused research (they can ask for help with that from many orgs that will give them good advice for free).

In case you’re wondering, I did poke them about this on Twitter immediately after the NPR story and my initial signup attempt but they ignored said poke.

I’m also not providing any links to them given their lax security practices.

First it was OpenDNS selling their souls (and, [y]our data) to Cisco (whom I don’t trust at all with my data).

Now, it’s Dyn doing something even worse (purely my own opinion).

I’m currently evaluating offerings by [FoolDNS](http://www.fooldns.com/fooldns-community/english-version/) & [GreenTeam](http://members.greentm.co.uk/) as alternatives and I’ll post updates as I review & test them.

I’m also in search of an open source, RPi-able DNS server with regularly updated Squid-like categorical lists and the ability to white list domains (suggestions welcome in the comments).

I’m a cybersecurity data scientist who knows just what can be done with this type of data when handed to `$BIGCORP`, and I’m far more concerned with Oracle than Cisco, but I’d rather work with a smaller company who has more reason to not sell me out.

The insanely productive elf-lord, @quominus put together a small package ([`triebeard`](https://github.com/ironholds/triebeard)) that exposes an API for [radix/prefix tries](https://en.wikipedia.org/wiki/Trie) at both the R and Rcpp levels. I know he had some personal needs for this and we both kinda need these to augment some functions in our `iptools` package. Despite `triebeard` having both a vignette and function-level examples, I thought it might be good to show a real-world use of the package (at least in the cyber real world): fast determination of which [autonomous system](https://en.wikipedia.org/wiki/Autonomous_system_(Internet)) an IPv4 address is in (if it’s in one at all).

I’m not going to delve too deep into routing (you can find a good primer [here](http://www.kixtart.org/forums/ubbthreads.php?ubb=showflat&Number=81619&site_id=1#import) and one that puts routing in the context of radix tries [here](http://www.juniper.net/documentation/en_US/junos14.1/topics/usage-guidelines/policy-configuring-route-lists-for-use-in-routing-policy-match-conditions.html)) but there exist, essentially, abbreviated tables of which IP addresses belong to a particular network. These tables are in routers on your local networks and across the internet. Groups of these networks (on the internet) are composed into those autonomous systems I mentioned earlier and these tables are used to get the packets that make up the cat videos you watch routed to you as efficiently as possible.

When dealing with cybersecurity data science, it’s often useful to know which autonomous system an IP address belongs in. The world is indeed full of peril and in it there are many dark places. It’s a dangerous business, going out on the internet and we sometimes find it possible to identify unusually malicious autonomous systems by looking up suspicious IP addresses en masse. These mappings look something like this:

CIDR            ASN
1.0.0.0/24      47872
1.0.4.0/24      56203
1.0.5.0/24      56203
1.0.6.0/24      56203
1.0.7.0/24      38803
1.0.48.0/20     49597
1.0.64.0/18     18144

Each CIDR has a start and end IP address which can ultimately be converted to integers. Now, one _could_ just sequentially compare start and end ranges to see which CIDR an IP address belongs in, but there are (as of the day of this post) `647,563` CIDRs to compare against, which—in the worst case—would mean having to traverse through the entire list to find the match (or discover there is no match). There are some trivial ways to slightly optimize this, but the search times could still be fairly long, especially when you’re trying to match a billion IPv4 addresses to ASNs.
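To make the sequential approach concrete, here is a minimal base R sketch of it, using a handful of rows from the table above. The helper names (`ip_to_num`, `cidr_range`, `linear_lookup`) are made up for illustration and are not part of `iptools`. It also returns the first containing range rather than the most specific one, which is fine for these non-overlapping rows but would not be for the full table.

```r
# Convert a dotted-quad IPv4 string to a number
# (a double, since values above 2^31 - 1 overflow R integers).
ip_to_num <- function(ip) {
  parts <- as.numeric(strsplit(ip, ".", fixed = TRUE)[[1]])
  sum(parts * 256^(3:0))
}

# Turn "a.b.c.d/len" into the first and last address of the block.
# Assumes the address part is already aligned to the block boundary.
cidr_range <- function(cidr) {
  pieces <- strsplit(cidr, "/", fixed = TRUE)[[1]]
  start <- ip_to_num(pieces[1])
  size <- 2^(32 - as.integer(pieces[2]))
  c(start = start, end = start + size - 1)
}

# A few rows from the table above.
cidrs <- c("1.0.0.0/24", "1.0.4.0/24", "1.0.48.0/20", "1.0.64.0/18")
asns  <- c(47872, 56203, 49597, 18144)
ranges <- t(vapply(cidrs, cidr_range, numeric(2)))

# Walk every range in order until one contains the address.
linear_lookup <- function(ip) {
  x <- ip_to_num(ip)
  for (i in seq_len(nrow(ranges))) {
    if (x >= ranges[i, "start"] && x <= ranges[i, "end"]) return(asns[i])
  }
  NA_real_
}

linear_lookup("1.0.50.7")  # falls inside 1.0.48.0/20
linear_lookup("9.9.9.9")   # not covered by any of the four rows
```

Every miss costs a full pass over the table, which is the worst case described above.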

By storing each CIDR’s network prefix (the leading bits of the IP address, as many as the mask length specified after the `/`) in binary form (strings of 1’s and 0’s) as keys for the trie, we get much faster lookups (a walk of at most 32 bits in the worst case vs up to 647,563 range comparisons).
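If the idea of walking bits is fuzzy, the following toy binary trie (built from nested R environments) shows the longest-match behavior. This is a sketch for intuition only: it is not how `triebeard` is implemented, and `ip_bits`, `trie_insert` and `trie_longest_match` are hypothetical names I made up for this example.

```r
# Binary string for a dotted-quad IPv4 address (base R, one address at a time).
ip_bits <- function(ip) {
  octets <- as.integer(strsplit(ip, ".", fixed = TRUE)[[1]])
  paste(vapply(octets, function(o) {
    # intToBits() is least-significant-bit first, so keep 8 bits and reverse.
    paste(rev(as.integer(intToBits(o))[1:8]), collapse = "")
  }, character(1)), collapse = "")
}

# Each node is an environment with optional children "0" and "1" and an optional "value".
trie_insert <- function(root, key, value) {
  node <- root
  for (bit in strsplit(key, "")[[1]]) {
    if (is.null(node[[bit]])) node[[bit]] <- new.env(parent = emptyenv())
    node <- node[[bit]]
  }
  node[["value"]] <- value
  invisible(root)
}

# Follow the key bit by bit, remembering the deepest value seen on the way.
trie_longest_match <- function(root, key) {
  node <- root
  best <- NA_character_
  for (bit in strsplit(key, "")[[1]]) {
    node <- node[[bit]]
    if (is.null(node)) break
    if (!is.null(node[["value"]])) best <- node[["value"]]
  }
  best
}

root <- new.env(parent = emptyenv())
trie_insert(root, substr(ip_bits("1.0.128.0"), 1, 17), "1.0.128.0/17|9737")
trie_insert(root, substr(ip_bits("1.0.128.0"), 1, 19), "1.0.128.0/19|9737")
trie_insert(root, substr(ip_bits("1.0.64.0"),  1, 18), "1.0.64.0/18|18144")

trie_longest_match(root, ip_bits("1.0.130.1"))  # inside both /17 and /19; /19 wins
trie_longest_match(root, ip_bits("1.0.200.1"))  # inside /17 only
trie_longest_match(root, ip_bits("8.8.8.8"))    # no stored prefix matches
```

The number of steps depends on the length of the key (32 bits at most for IPv4), not on how many prefixes are stored.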

I made an initial, naïve, mostly straight R, implementation as a precursor to a more low-level implementation in Rcpp in our `iptools` package and to illustrate this use of the `triebeard` package.

One thing we’ll need is a function to convert an IPv4 address (in long integer form) into a binary character string. We _could_ do this with base R, but it’ll be super-slow and it doesn’t take much effort to create it with an Rcpp inline function:

library(Rcpp)
library(inline)
library(iptools) # provides ip_to_numeric(), used in the example call below

ip_to_binary_string <- rcpp(signature(x="integer"), "
  NumericVector xx(x);

  std::vector<double> X(xx.begin(),xx.end());
  std::vector<std::string> output(X.size());

  for (unsigned int i=0; i<X.size(); i++){

    if ((i % 10000) == 0) Rcpp::checkUserInterrupt();

    output[i] = std::bitset<32>(X[i]).to_string();

  }

  return(Rcpp::wrap(output));
", includes="#include <bitset>") # explicitly pull in the header for std::bitset

ip_to_binary_string(ip_to_numeric("192.168.1.1"))
## [1] "11000000101010000000000100000001"

We take a vector from R and use some C++ standard library functions to convert them to bits. I vectorized this in C++ for speed (which is just a fancy way to say I used a `for` loop). In this case, our short cut will not make for a long delay.
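For anyone without a compiler handy, a base R stand-in is sketched below. It is the slow route mentioned earlier and is only meant to show what the C++ does; `ip_to_binary_string_r` is a name I made up for this example, and it expects the same long-integer (numeric) input.

```r
# Base R equivalent: 32-character binary string for each numeric IPv4 value.
ip_to_binary_string_r <- function(x) {
  vapply(x, function(v) {
    # Peel off the 32 bits, most significant first, with double-safe arithmetic
    # (bitwAnd() would not work here because values can exceed 2^31 - 1).
    bits <- floor(v / 2^(31:0)) %% 2
    paste(bits, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# 3232235777 is 192.168.1.1 in long-integer form.
ip_to_binary_string_r(3232235777)
## [1] "11000000101010000000000100000001"
```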

Now, we’ll need a CIDR file. There are [historical ones](http://data.4tu.nl/repository/uuid:d4d23b8e-2077-4592-8b47-cb476ad16e12) available, and I use one that I generated the day of this post (referenced in the code block below). You can use [`pyasn`](https://github.com/hadiasghari/pyasn) to make new ones daily (relegating mindless, automated, menial data retrieval tasks to the python goblins, like one should).

library(iptools)
library(stringi)
library(dplyr)
library(purrr)
library(readr)
library(tidyr)

asn_dat_url <- "http://rud.is/dl/asn-20160712.1600.dat.gz"
asn_dat_fil <- basename(asn_dat_url)
if (!file.exists(asn_dat_fil)) download.file(asn_dat_url, asn_dat_fil)

rip <- read_tsv(asn_dat_fil, comment=";", col_names=c("cidr", "asn"))
rip %>%
  separate(cidr, c("ip", "mask"), "/") %>%
  mutate(prefix=stri_sub(ip_to_binary_string(ip_to_numeric(ip)), 1, mask)) -> rip_df

rip_df
## # A tibble: 647,557 x 4
##           ip  mask   asn                   prefix
##        <chr> <chr> <int>                    <chr>
## 1    1.0.0.0    24 47872 000000010000000000000000
## 2    1.0.4.0    24 56203 000000010000000000000100
## 3    1.0.5.0    24 56203 000000010000000000000101
## 4    1.0.6.0    24 56203 000000010000000000000110
## 5    1.0.7.0    24 38803 000000010000000000000111
## 6   1.0.48.0    20 49597     00000001000000000011
## 7   1.0.64.0    18 18144       000000010000000001
## 8  1.0.128.0    17  9737        00000001000000001
## 9  1.0.128.0    18  9737       000000010000000010
## 10 1.0.128.0    19  9737      0000000100000000100
## # ... with 647,547 more rows

You can save off that `data_frame` to an R data file to pull in later (but it’s pretty fast to regenerate).

Now, we create the trie, using the prefix we calculated and a value we’ll piece together for this example:

library(triebeard)

rip_trie <- trie(rip_df$prefix, sprintf("%s/%s|%s", rip_df$ip, rip_df$mask, rip_df$asn))

Yep, that’s it. If you ran this yourself, it should have taken less than 2s on most modern systems to create the nigh 700,000 element trie.

Now, we’ll generate a million random IP addresses and look them up:

set.seed(1492)
data_frame(lkp=ip_random(1000000),
           lkp_bin=ip_to_binary_string(ip_to_numeric(lkp)),
           long=longest_match(rip_trie, lkp_bin)) -> lkp_df

lkp_df
## # A tibble: 1,000,000 x 3
##               lkp                          lkp_bin                long
##             <chr>                            <chr>               <chr>
## 1   35.251.195.57 00100011111110111100001100111001  35.248.0.0/13|4323
## 2     28.57.78.42 00011100001110010100111000101010                <NA>
## 3   24.60.146.202 00011000001111001001001011001010   24.60.0.0/14|7922
## 4    14.236.36.53 00001110111011000010010000110101                <NA>
## 5   7.146.253.182 00000111100100101111110110110110                <NA>
## 6     2.9.228.172 00000010000010011110010010101100     2.9.0.0/16|3215
## 7  108.111.124.79 01101100011011110111110001001111 108.111.0.0/16|3651
## 8    65.78.24.214 01000001010011100001100011010110   65.78.0.0/19|6079
## 9   50.48.151.239 00110010001100001001011111101111   50.48.0.0/13|5650
## 10  97.231.13.131 01100001111001110000110110000011   97.128.0.0/9|6167
## # ... with 999,990 more rows

On most modern systems, that should have taken less than 3s.

The `NA` values are not busted lookups. Many IP networks are assigned but not accessible (see [this](https://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_address_blocks) for more info). You can validate this with `cymruservices::bulk_origin()` on your own, too.
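Since the trie values in this example pack the CIDR and ASN into a single `cidr|asn` string, you will probably want them back as separate columns at some point. A minimal base R sketch follows, using three values copied from the output above (the variable names are just for illustration):

```r
# Three lookup results in the "cidr|asn" format used for the trie values.
matches <- c("35.248.0.0/13|4323", NA, "24.60.0.0/14|7922")

# Split on the literal pipe; NA inputs stay NA.
parts <- strsplit(matches, "|", fixed = TRUE)
cidr  <- vapply(parts, function(p) p[1], character(1))
asn   <- as.integer(vapply(parts, function(p) p[2], character(1)))

data.frame(cidr = cidr, asn = asn, stringsAsFactors = FALSE)
```

The unmatched lookups simply come through as `NA` in both columns.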

The trie structure for these CIDRs takes up approximately 9MB of RAM, a small price to pay for speedy lookups (and, memory really is not what the heart desires, anyway). Hopefully the `triebeard` package will help you speed up your own lookups. Stay tuned for a new version of `iptools` with some new and enhanced functions.

Google recently [announced](https://developers.google.com/speed/public-dns/docs/dns-over-https) their DNS-over-HTTPS API, which _“enhances privacy and security between a client and a recursive resolver, and complements DNSSEC to provide end-to-end authenticated DNS lookups”_. The REST API they provided was pretty simple to [wrap into a package](https://github.com/hrbrmstr/gdns) and I tossed in some [SPF](http://www.openspf.org/SPF_Record_Syntax) functions that I had lying around to bulk it up a bit.

### Why DNS-over-HTTPS?

DNS machinations usually happen over UDP (and sometimes TCP). Unless you’re using some fairly modern DNS augmentations, these exchanges happen in cleartext, meaning your query and the response are exposed during transport (and they are already exposed to the server you’re querying for a response).

DNS queries over HTTPS will be harder to [spoof](http://www.veracode.com/security/spoofing-attack) and the query + response will be encrypted, so you gain transport privacy when, say, you’re at Starbucks or connecting through your DSL, FiOS, Gogo Inflight, or cable internet provider (yes, they all snoop on your DNS queries).

You end up trusting Google quite a bit with this API, but if you were currently using `8.8.8.8` or `8.8.4.4` (or their IPv6 equivalents) you were already trusting Google (and it’s likely Google knows what you’re doing on the internet anyway given all the trackers and especially if you’re using Chrome).

One additional item you gain using this API is more control over [`EDNS0`](https://tools.ietf.org/html/draft-vandergaast-edns-client-ip-00) settings. `EDNS0` is a DNS protocol extension that, for example, enables content delivery networks to pick the “closest” server farm to ensure speedy delivery of your streaming Game of Thrones binge watch. They get to know a piece of your IP address so they can make this decision, but you end up giving away a bit of privacy (though you lose that privacy in the end since the target CDN servers know precisely where you are).

Right now, there’s no way for most clients to use DNS-over-HTTPS directly, but the API can be used in a programmatic fashion, which may be helpful in situations where you need to do some DNS spelunking but UDP is blocked or you’re on a platform that can’t build the [`resolv`](https://github.com/hrbrmstr/resolv) package.
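To give a feel for what “programmatic” means here, the sketch below just builds the kind of request URL the REST API expects, without making any network call. The endpoint and the `name`, `type` and `edns_client_subnet` parameters reflect my reading of Google’s documentation at the time of writing, so treat them as assumptions and check the current docs; `build_doh_url` is a hypothetical helper, not a `gdns` function.

```r
# Assemble a DNS-over-HTTPS query URL (no request is sent).
# The default endpoint and parameter names are assumptions taken from Google's docs.
build_doh_url <- function(name, type = "A", edns_client_subnet = "0.0.0.0/0",
                          endpoint = "https://dns.google.com/resolve") {
  params <- c(name = name, type = type, edns_client_subnet = edns_client_subnet)
  # Percent-encode each value, including reserved characters such as "/".
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query <- paste(names(params), encoded, sep = "=", collapse = "&")
  paste0(endpoint, "?", query)
}

build_doh_url("apple.com")
build_doh_url("17.172.224.47", type = "PTR")
```

Any HTTPS client can fetch such a URL and parse the JSON that comes back, which is why this works where UDP is blocked.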

You can learn a bit more about DNS and privacy in this [IETF paper](https://www.ietf.org/mail-archive/web/dns-privacy/current/pdfWqAIUmEl47.pdf) [PDF].

### Mining DNS with `gdns`

The `gdns` package is pretty straightforward. Use the `query()` function to get DNS info for a single entity:

library(gdns)
 
query("apple.com")
## $Status  
## [1] 0          # NOERROR - Standard DNS response code (32 bit integer)
## 
## $TC
## [1] FALSE      # Whether the response is truncated
## 
## $RD
## [1] TRUE       # Should always be true for Google Public DNS
## 
## $RA
## [1] TRUE       # Should always be true for Google Public DNS
## 
## $AD
## [1] FALSE      # Whether all response data was validated with DNSSEC
## 
## $CD
## [1] FALSE      # Whether the client asked to disable DNSSEC
## 
## $Question
##         name type
## 1 apple.com.    1
## 
## $Answer
##         name type  TTL          data
## 1 apple.com.    1 1547 17.172.224.47
## 2 apple.com.    1 1547  17.178.96.59
## 3 apple.com.    1 1547 17.142.160.59
## 
## $Additional
## list()
## 
## $edns_client_subnet
## [1] "0.0.0.0/0"

The `gdns` lookup functions are set to use an `edns_client_subnet` of `0.0.0.0/0`, meaning your local IP address or subnet is not leaked outside of your connection to Google (you can override this behavior).

You can do reverse lookups as well (i.e. query IP addresses):

query("17.172.224.47", "PTR")
## $Status
## [1] 0
## 
## $TC
## [1] FALSE
## 
## $RD
## [1] TRUE
## 
## $RA
## [1] TRUE
## 
## $AD
## [1] FALSE
## 
## $CD
## [1] FALSE
## 
## $Question
##                          name type
## 1 47.224.172.17.in-addr.arpa.   12
## 
## $Answer
##                            name type  TTL                           data
## 1   47.224.172.17.in-addr.arpa.   12 1073               webobjects.info.
## 2   47.224.172.17.in-addr.arpa.   12 1073                   yessql.info.
## 3   47.224.172.17.in-addr.arpa.   12 1073                 apples-msk.ru.
## 4   47.224.172.17.in-addr.arpa.   12 1073                     icloud.se.
## 5   47.224.172.17.in-addr.arpa.   12 1073                     icloud.es.
## 6   47.224.172.17.in-addr.arpa.   12 1073                     icloud.om.
## 7   47.224.172.17.in-addr.arpa.   12 1073                   icloudo.com.
## 8   47.224.172.17.in-addr.arpa.   12 1073                     icloud.ch.
## 9   47.224.172.17.in-addr.arpa.   12 1073                     icloud.fr.
## 10  47.224.172.17.in-addr.arpa.   12 1073                   icloude.com.
## 11  47.224.172.17.in-addr.arpa.   12 1073          camelspaceeffect.com.
## 12  47.224.172.17.in-addr.arpa.   12 1073                 camelphat.com.
## 13  47.224.172.17.in-addr.arpa.   12 1073              alchemysynth.com.
## 14  47.224.172.17.in-addr.arpa.   12 1073                    openni.org.
## 15  47.224.172.17.in-addr.arpa.   12 1073                      swell.am.
## 16  47.224.172.17.in-addr.arpa.   12 1073                  appleweb.net.
## 17  47.224.172.17.in-addr.arpa.   12 1073       appleipodsettlement.com.
## 18  47.224.172.17.in-addr.arpa.   12 1073                    earpod.net.
## 19  47.224.172.17.in-addr.arpa.   12 1073                 yourapple.com.
## 20  47.224.172.17.in-addr.arpa.   12 1073                    xserve.net.
## 21  47.224.172.17.in-addr.arpa.   12 1073                    xserve.com.
## 22  47.224.172.17.in-addr.arpa.   12 1073            velocityengine.com.
## 23  47.224.172.17.in-addr.arpa.   12 1073           velocity-engine.com.
## 24  47.224.172.17.in-addr.arpa.   12 1073            universityarts.com.
## 25  47.224.172.17.in-addr.arpa.   12 1073            thinkdifferent.com.
## 26  47.224.172.17.in-addr.arpa.   12 1073               theatremode.com.
## 27  47.224.172.17.in-addr.arpa.   12 1073               theatermode.com.
## 28  47.224.172.17.in-addr.arpa.   12 1073           streamquicktime.net.
## 29  47.224.172.17.in-addr.arpa.   12 1073           streamquicktime.com.
## 30  47.224.172.17.in-addr.arpa.   12 1073                ripmixburn.com.
## 31  47.224.172.17.in-addr.arpa.   12 1073              rip-mix-burn.com.
## 32  47.224.172.17.in-addr.arpa.   12 1073        quicktimestreaming.net.
## 33  47.224.172.17.in-addr.arpa.   12 1073        quicktimestreaming.com.
## 34  47.224.172.17.in-addr.arpa.   12 1073                  quicktime.cc.
## 35  47.224.172.17.in-addr.arpa.   12 1073                      qttv.net.
## 36  47.224.172.17.in-addr.arpa.   12 1073                      qtml.com.
## 37  47.224.172.17.in-addr.arpa.   12 1073                     qt-tv.net.
## 38  47.224.172.17.in-addr.arpa.   12 1073          publishingsurvey.org.
## 39  47.224.172.17.in-addr.arpa.   12 1073          publishingsurvey.com.
## 40  47.224.172.17.in-addr.arpa.   12 1073        publishingresearch.org.
## 41  47.224.172.17.in-addr.arpa.   12 1073        publishingresearch.com.
## 42  47.224.172.17.in-addr.arpa.   12 1073         publishing-survey.org.
## 43  47.224.172.17.in-addr.arpa.   12 1073         publishing-survey.com.
## 44  47.224.172.17.in-addr.arpa.   12 1073       publishing-research.org.
## 45  47.224.172.17.in-addr.arpa.   12 1073       publishing-research.com.
## 46  47.224.172.17.in-addr.arpa.   12 1073                  powerbook.cc.
## 47  47.224.172.17.in-addr.arpa.   12 1073             playquicktime.net.
## 48  47.224.172.17.in-addr.arpa.   12 1073             playquicktime.com.
## 49  47.224.172.17.in-addr.arpa.   12 1073           nwk-apple.apple.com.
## 50  47.224.172.17.in-addr.arpa.   12 1073                   myapple.net.
## 51  47.224.172.17.in-addr.arpa.   12 1073                  macreach.net.
## 52  47.224.172.17.in-addr.arpa.   12 1073                  macreach.com.
## 53  47.224.172.17.in-addr.arpa.   12 1073                   macmate.com.
## 54  47.224.172.17.in-addr.arpa.   12 1073         macintoshsoftware.com.
## 55  47.224.172.17.in-addr.arpa.   12 1073                    machos.net.
## 56  47.224.172.17.in-addr.arpa.   12 1073                   mach-os.net.
## 57  47.224.172.17.in-addr.arpa.   12 1073                   mach-os.com.
## 58  47.224.172.17.in-addr.arpa.   12 1073                   ischool.com.
## 59  47.224.172.17.in-addr.arpa.   12 1073           insidemacintosh.com.
## 60  47.224.172.17.in-addr.arpa.   12 1073             imovietheater.com.
## 61  47.224.172.17.in-addr.arpa.   12 1073               imoviestage.com.
## 62  47.224.172.17.in-addr.arpa.   12 1073             imoviegallery.com.
## 63  47.224.172.17.in-addr.arpa.   12 1073               imacsources.com.
## 64  47.224.172.17.in-addr.arpa.   12 1073        imac-applecomputer.com.
## 65  47.224.172.17.in-addr.arpa.   12 1073                imac-apple.com.
## 66  47.224.172.17.in-addr.arpa.   12 1073                     ikids.com.
## 67  47.224.172.17.in-addr.arpa.   12 1073              ibookpartner.com.
## 68  47.224.172.17.in-addr.arpa.   12 1073                   geoport.com.
## 69  47.224.172.17.in-addr.arpa.   12 1073                   firewire.cl.
## 70  47.224.172.17.in-addr.arpa.   12 1073               expertapple.com.
## 71  47.224.172.17.in-addr.arpa.   12 1073              edu-research.org.
## 72  47.224.172.17.in-addr.arpa.   12 1073               dvdstudiopro.us.
## 73  47.224.172.17.in-addr.arpa.   12 1073              dvdstudiopro.org.
## 74  47.224.172.17.in-addr.arpa.   12 1073              dvdstudiopro.net.
## 75  47.224.172.17.in-addr.arpa.   12 1073             dvdstudiopro.info.
## 76  47.224.172.17.in-addr.arpa.   12 1073              dvdstudiopro.com.
## 77  47.224.172.17.in-addr.arpa.   12 1073              dvdstudiopro.biz.
## 78  47.224.172.17.in-addr.arpa.   12 1073          developercentral.com.
## 79  47.224.172.17.in-addr.arpa.   12 1073             desktopmovies.org.
## 80  47.224.172.17.in-addr.arpa.   12 1073             desktopmovies.net.
## 81  47.224.172.17.in-addr.arpa.   12 1073              desktopmovie.org.
## 82  47.224.172.17.in-addr.arpa.   12 1073              desktopmovie.net.
## 83  47.224.172.17.in-addr.arpa.   12 1073              desktopmovie.com.
## 84  47.224.172.17.in-addr.arpa.   12 1073          darwinsourcecode.com.
## 85  47.224.172.17.in-addr.arpa.   12 1073              darwinsource.org.
## 86  47.224.172.17.in-addr.arpa.   12 1073              darwinsource.com.
## 87  47.224.172.17.in-addr.arpa.   12 1073                darwincode.com.
## 88  47.224.172.17.in-addr.arpa.   12 1073                carbontest.com.
## 89  47.224.172.17.in-addr.arpa.   12 1073              carbondating.com.
## 90  47.224.172.17.in-addr.arpa.   12 1073                 carbonapi.com.
## 91  47.224.172.17.in-addr.arpa.   12 1073           braeburncapital.com.
## 92  47.224.172.17.in-addr.arpa.   12 1073                  applexpo.net.
## 93  47.224.172.17.in-addr.arpa.   12 1073                  applexpo.com.
## 94  47.224.172.17.in-addr.arpa.   12 1073                applereach.net.
## 95  47.224.172.17.in-addr.arpa.   12 1073                applereach.com.
## 96  47.224.172.17.in-addr.arpa.   12 1073            appleiservices.com.
## 97  47.224.172.17.in-addr.arpa.   12 1073     applefinalcutproworld.org.
## 98  47.224.172.17.in-addr.arpa.   12 1073     applefinalcutproworld.net.
## 99  47.224.172.17.in-addr.arpa.   12 1073     applefinalcutproworld.com.
## 100 47.224.172.17.in-addr.arpa.   12 1073            applefilmmaker.com.
## 101 47.224.172.17.in-addr.arpa.   12 1073             applefilmaker.com.
## 102 47.224.172.17.in-addr.arpa.   12 1073                appleenews.com.
## 103 47.224.172.17.in-addr.arpa.   12 1073               appledarwin.org.
## 104 47.224.172.17.in-addr.arpa.   12 1073               appledarwin.net.
## 105 47.224.172.17.in-addr.arpa.   12 1073               appledarwin.com.
## 106 47.224.172.17.in-addr.arpa.   12 1073         applecomputerimac.com.
## 107 47.224.172.17.in-addr.arpa.   12 1073        applecomputer-imac.com.
## 108 47.224.172.17.in-addr.arpa.   12 1073                  applecare.cc.
## 109 47.224.172.17.in-addr.arpa.   12 1073               applecarbon.com.
## 110 47.224.172.17.in-addr.arpa.   12 1073                 apple-inc.net.
## 111 47.224.172.17.in-addr.arpa.   12 1073               apple-enews.com.
## 112 47.224.172.17.in-addr.arpa.   12 1073              apple-darwin.org.
## 113 47.224.172.17.in-addr.arpa.   12 1073              apple-darwin.net.
## 114 47.224.172.17.in-addr.arpa.   12 1073              apple-darwin.com.
## 115 47.224.172.17.in-addr.arpa.   12 1073                  mobileme.com.
## 116 47.224.172.17.in-addr.arpa.   12 1073                ipa-iphone.net.
## 117 47.224.172.17.in-addr.arpa.   12 1073               jetfuelapps.com.
## 118 47.224.172.17.in-addr.arpa.   12 1073                jetfuelapp.com.
## 119 47.224.172.17.in-addr.arpa.   12 1073                   burstly.net.
## 120 47.224.172.17.in-addr.arpa.   12 1073             appmediagroup.com.
## 121 47.224.172.17.in-addr.arpa.   12 1073             airsupportapp.com.
## 122 47.224.172.17.in-addr.arpa.   12 1073            burstlyrewards.com.
## 123 47.224.172.17.in-addr.arpa.   12 1073        surveys-temp.apple.com.
## 124 47.224.172.17.in-addr.arpa.   12 1073               appleiphone.com.
## 125 47.224.172.17.in-addr.arpa.   12 1073                       asto.re.
## 126 47.224.172.17.in-addr.arpa.   12 1073                 itunesops.com.
## 127 47.224.172.17.in-addr.arpa.   12 1073                     apple.com.
## 128 47.224.172.17.in-addr.arpa.   12 1073     st11p01ww-apple.apple.com.
## 129 47.224.172.17.in-addr.arpa.   12 1073                      apple.by.
## 130 47.224.172.17.in-addr.arpa.   12 1073                 airtunes.info.
## 131 47.224.172.17.in-addr.arpa.   12 1073              applecentre.info.
## 132 47.224.172.17.in-addr.arpa.   12 1073         applecomputerinc.info.
## 133 47.224.172.17.in-addr.arpa.   12 1073                appleexpo.info.
## 134 47.224.172.17.in-addr.arpa.   12 1073             applemasters.info.
## 135 47.224.172.17.in-addr.arpa.   12 1073                 applepay.info.
## 136 47.224.172.17.in-addr.arpa.   12 1073 applepaymerchantsupplies.info.
## 137 47.224.172.17.in-addr.arpa.   12 1073         applepaysupplies.info.
## 138 47.224.172.17.in-addr.arpa.   12 1073              applescript.info.
## 139 47.224.172.17.in-addr.arpa.   12 1073               appleshare.info.
## 140 47.224.172.17.in-addr.arpa.   12 1073                   macosx.info.
## 141 47.224.172.17.in-addr.arpa.   12 1073                powerbook.info.
## 142 47.224.172.17.in-addr.arpa.   12 1073                 powermac.info.
## 143 47.224.172.17.in-addr.arpa.   12 1073            quicktimelive.info.
## 144 47.224.172.17.in-addr.arpa.   12 1073              quicktimetv.info.
## 145 47.224.172.17.in-addr.arpa.   12 1073                 sherlock.info.
## 146 47.224.172.17.in-addr.arpa.   12 1073            shopdifferent.info.
## 147 47.224.172.17.in-addr.arpa.   12 1073                 skyvines.info.
## 148 47.224.172.17.in-addr.arpa.   12 1073                     ubnw.info.
## 
## $Additional
## list()
## 
## $edns_client_subnet
## [1] "0.0.0.0/0"
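Note the question name in that output: a `PTR` lookup for an IPv4 address is really a query for the octets in reverse order under `in-addr.arpa.`. If you ever need to build that name yourself, a one-function base R sketch will do (`ptr_name` is a made-up helper for illustration, not part of `gdns`):

```r
# Build the reverse-lookup (PTR) question name for an IPv4 address.
ptr_name <- function(ip) {
  octets <- strsplit(ip, ".", fixed = TRUE)[[1]]
  paste0(paste(rev(octets), collapse = "."), ".in-addr.arpa.")
}

ptr_name("17.172.224.47")
## [1] "47.224.172.17.in-addr.arpa."
```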

And, you can go “easter egg” hunting:

cat(query("google-public-dns-a.google.com", "TXT")$Answer$data)
## "http://xkcd.com/1361/"

Note that Google DNS-over-HTTPS supports [all the RR types](http://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-4).

If you have more than a few domains to look up and are querying for the same RR type, you can use the `bulk_query()` function:

hosts <- c("rud.is", "dds.ec", "r-project.org", "rstudio.com", "apple.com")
bulk_query(hosts)
## Source: local data frame [7 x 4]
## 
##             name  type   TTL            data
##            (chr) (int) (int)           (chr)
## 1        rud.is.     1  3599 104.236.112.222
## 2        dds.ec.     1   299   162.243.111.4
## 3 r-project.org.     1  3601   137.208.57.37
## 4   rstudio.com.     1  3599    45.79.156.36
## 5     apple.com.     1  1088   17.172.224.47
## 6     apple.com.     1  1088    17.178.96.59
## 7     apple.com.     1  1088   17.142.160.59

Note that this function only returns a `data_frame` (none of the status fields).

### More DNSpelunking with `gdns`

DNS records contain a treasure trove of data (at least for cybersecurity researchers). Say you have a list of base, primary domains for the Fortune 1000:

library(readr)
library(urltools)
 
URL <- "https://gist.githubusercontent.com/hrbrmstr/ae574201af3de035c684/raw/2d21bb4132b77b38f2992dfaab99649397f238e9/f1000.csv"
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
 
f1k <- read_csv(fil)
 
doms1k <- suffix_extract(domain(f1k$website))
doms1k <- paste(doms1k$domain, doms1k$suffix, sep=".")
 
head(doms1k)
## [1] "walmart.com"           "exxonmobil.com"       
## [3] "chevron.com"           "berkshirehathaway.com"
## [5] "apple.com"             "gm.com"

We can get all the `TXT` records for them:

library(parallel)
library(doParallel) # parallel ops will make this go faster
library(foreach)
library(dplyr)
library(ggplot2)
library(grid)
library(hrbrmrkdn)
 
cl <- makePSOCKcluster(4)
registerDoParallel(cl)
 
f1k_l <- foreach(dom=doms1k) %dopar% gdns::bulk_query(dom, "TXT")
f1k <- bind_rows(f1k_l)
 
length(unique(f1k$name))
## [1] 858
 
df <- count(count(f1k, name), `Number of TXT records`=n)
df <- bind_rows(df, data_frame(`Number of TXT records`=0, n=142))
 
gg <- ggplot(df, aes(`Number of TXT records`, n))
gg <- gg + geom_bar(stat="identity", width=0.75)
gg <- gg + scale_x_continuous(expand=c(0,0), breaks=0:13)
gg <- gg + scale_y_continuous(expand=c(0,0))
gg <- gg + labs(y="# Orgs", 
                title="TXT record count per Fortune 1000 Org")
gg <- gg + theme_hrbrmstr(grid="Y", axis="xy")
gg <- gg + theme(axis.title.x=element_text(margin=margin(t=-22)))
gg <- gg + theme(axis.title.y=element_text(angle=0, vjust=1, 
                                           margin=margin(r=-49)))
gg <- gg + theme(plot.margin=margin(t=10, l=30, b=30, r=10))
gg <- gg + theme(plot.title=element_text(margin=margin(b=20)))
gg

[Figure: bar chart, “TXT record count per Fortune 1000 Org”]

We can see that 858 of the Fortune 1000 have `TXT` records and more than a few have between 2 and 5 of them. Why look at `TXT` records? Well, they can tell us things like who uses cloud e-mail services, such as Outlook365:

sort(f1k$name[which(grepl("(MS=|outlook)", spf_includes(f1k$data), ignore.case=TRUE))])
##   [1] "21cf.com."                  "77nrg.com."                 "abbott.com."                "acuitybrands.com."         
##   [5] "adm.com."                   "adobe.com."                 "alaskaair.com."             "aleris.com."               
##   [9] "allergan.com."              "altria.com."                "amark.com."                 "ameren.com."               
##  [13] "americantower.com."         "ametek.com."                "amkor.com."                 "amphenol.com."             
##  [17] "amwater.com."               "analog.com."                "anixter.com."               "apachecorp.com."           
##  [21] "archrock.com."              "archrock.com."              "armstrong.com."             "aschulman.com."            
##  [25] "assurant.com."              "autonation.com."            "autozone.com."              "axiall.com."               
##  [29] "bd.com."                    "belk.com."                  "biglots.com."               "bio-rad.com."              
##  [33] "biomet.com."                "bloominbrands.com."         "bms.com."                   "borgwarner.com."           
##  [37] "boydgaming.com."            "brinks.com."                "brocade.com."               "brunswick.com."            
##  [41] "cabotog.com."               "caleres.com."               "campbellsoupcompany.com."   "carefusion.com."           
##  [45] "carlyle.com."               "cartech.com."               "cbrands.com."               "cbre.com."                 
##  [49] "chemtura.com."              "chipotle.com."              "chiquita.com."              "churchdwight.com."         
##  [53] "cinemark.com."              "cintas.com."                "cmc.com."                   "cmsenergy.com."            
##  [57] "cognizant.com."             "colfaxcorp.com."            "columbia.com."              "commscope.com."            
##  [61] "con-way.com."               "convergys.com."             "couche-tard.com."           "crestwoodlp.com."          
##  [65] "crowncastle.com."           "crowncork.com."             "csx.com."                   "cummins.com."              
##  [69] "cunamutual.com."            "dana.com."                  "darlingii.com."             "deanfoods.com."            
##  [73] "dentsplysirona.com."        "discoverfinancial.com."     "disney.com."                "donaldson.com."            
##  [77] "drhorton.com."              "dupont.com."                "dyn-intl.com."              "dynegy.com."               
##  [81] "ea.com."                    "eastman.com."               "ecolab.com."                "edgewell.com."             
##  [85] "edwards.com."               "emc.com."                   "enablemidstream.com."       "energyfutureholdings.com." 
##  [89] "energytransfer.com."        "eogresources.com."          "equinix.com."               "expeditors.com."           
##  [93] "express.com."               "fastenal.com."              "ferrellgas.com."            "fisglobal.com."            
##  [97] "flowserve.com."             "fmglobal.com."              "fnf.com."                   "g-iii.com."                
## [101] "genpt.com."                 "ggp.com."                   "gilead.com."                "goodyear.com."             
## [105] "grainger.com."              "graphicpkg.com."            "hanes.com."                 "hanover.com."              
## [109] "harley-davidson.com."       "harsco.com."                "hasbro.com."                "hbfuller.com."             
## [113] "hei.com."                   "hhgregg.com."               "hnicorp.com."               "homedepot.com."            
## [117] "hpinc.com."                 "hubgroup.com."              "iac.com."                   "igt.com."                  
## [121] "iheartmedia.com."           "insperity.com."             "itt.com."                   "itw.com."                  
## [125] "jarden.com."                "jcpenney.com."              "jll.com."                   "joyglobal.com."            
## [129] "juniper.net."               "kellyservices.com."         "kennametal.com."            "kiewit.com."               
## [133] "kindermorgan.com."          "kindredhealthcare.com."     "kodak.com."                 "lamresearch.com."          
## [137] "lansingtradegroup.com."     "lennar.com."                "levistrauss.com."           "lithia.com."               
## [141] "manitowoc.com."             "manpowergroup.com."         "marathonoil.com."           "marathonpetroleum.com."    
## [145] "mastec.com."                "mastercard.com."            "mattel.com."                "maximintegrated.com."      
## [149] "mednax.com."                "mercuryinsurance.com."      "mgmresorts.com."            "micron.com."               
## [153] "mohawkind.com."             "molsoncoors.com."           "mosaicco.com."              "motorolasolutions.com."    
## [157] "mpgdriven.com."             "mscdirect.com."             "mtb.com."                   "murphyoilcorp.com."        
## [161] "mutualofomaha.com."         "mwv.com."                   "navistar.com."              "nbty.com."                 
## [165] "newellrubbermaid.com."      "nexeosolutions.com."        "nike.com."                  "nobleenergyinc.com."       
## [169] "o-i.com."                   "oge.com."                   "olin.com."                  "omnicomgroup.com."         
## [173] "onsemi.com."                "owens-minor.com."           "paychex.com."               "peabodyenergy.com."        
## [177] "pepboys.com."               "pmi.com."                   "pnkinc.com."                "polaris.com."              
## [181] "polyone.com."               "postholdings.com."          "ppg.com."                   "prudential.com."           
## [185] "qg.com."                    "quantaservices.com."        "quintiles.com."             "rcscapital.com."           
## [189] "rexnord.com."               "roberthalf.com."            "rushenterprises.com."       "ryland.com."               
## [193] "sandisk.com."               "sands.com."                 "scansource.com."            "sempra.com."               
## [197] "sonoco.com."                "spiritaero.com."            "sprouts.com."               "stanleyblackanddecker.com."
## [201] "starwoodhotels.com."        "steelcase.com."             "stryker.com."               "sunedison.com."            
## [205] "sunpower.com."              "supervalu.com."             "swifttrans.com."            "synnex.com."               
## [209] "taylormorrison.com."        "techdata.com."              "tegna.com."                 "tempursealy.com."          
## [213] "tetratech.com."             "theice.com."                "thermofisher.com."          "tjx.com."                  
## [217] "trueblue.com."              "ufpi.com."                  "ulta.com."                  "unfi.com."                 
## [221] "unifiedgrocers.com."        "universalcorp.com."         "vishay.com."                "visteon.com."              
## [225] "vwr.com."                   "westarenergy.com."          "westernunion.com."          "westrock.com."             
## [229] "wfscorp.com."               "whitewave.com."             "wpxenergy.com."             "wyndhamworldwide.com."     
## [233] "xilinx.com."                "xpo.com."                   "yum.com."                   "zimmerbiomet.com."

That’s 236 of them outsourcing some part of e-mail services to Microsoft.
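If you don't have `gdns` handy, here's a rough base R sketch of the idea: split each SPF record into its terms and keep the `include:` targets. The helper name `spf_include_targets()` and the sample records are made up for illustration, and I'm not claiming this is how `spf_includes()` works internally.

```r
# Hypothetical helper (not part of gdns): pull include: targets out of TXT strings
spf_include_targets <- function(txt) {
  lapply(txt, function(rec) {
    rec <- gsub("\"", "", rec)  # TXT data often arrives wrapped in quotes
    # Only SPF records are of interest; anything else yields no includes
    if (!grepl("^v=spf1", rec, ignore.case = TRUE)) return(character(0))
    terms <- strsplit(rec, "[[:space:]]+")[[1]]
    hits <- grep("^[+?~-]?include:", terms, ignore.case = TRUE, value = TRUE)
    sub("^[+?~-]?include:", "", hits, ignore.case = TRUE)
  })
}

# Made-up records, purely for illustration
recs <- c(
  "v=spf1 include:spf.protection.outlook.com include:_spf.example.net -all",
  "MS=ms12345678",
  "v=spf1 ip4:192.0.2.0/24 ~all"
)

incs <- spf_include_targets(recs)

# Flag the records whose includes point at an "outlook" host
uses_outlook <- vapply(incs, function(x) any(grepl("outlook", x, ignore.case = TRUE)), logical(1))
uses_outlook
```

Only the first sample record has an `include:` that mentions Outlook, so it is the only one flagged.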

We can also see which ones have terrible mail configs (`+all` or `all` passing):

f1k[which(passes_all(f1k$data)),]$name
## [1] "wfscorp.com."      "dupont.com."       "group1auto.com."   "uhsinc.com."      
## [5] "bigheartpet.com."  "pcconnection.com."

or are configured for Exchange federation services:

sort(f1k$name[which(grepl("==", f1k$data))])
##   [1] "21cf.com."                 "aarons.com."               "abbott.com."               "abbvie.com."              
##   [5] "actavis.com."              "activisionblizzard.com."   "acuitybrands.com."         "adm.com."                 
##   [9] "adobe.com."                "adt.com."                  "advanceautoparts.com."     "aecom.com."               
##  [13] "aetna.com."                "agilent.com."              "airproducts.com."          "alcoa.com."               
##  [17] "aleris.com."               "allergan.com."             "alliancedata.com."         "amcnetworks.com."         
##  [21] "amd.com."                  "americantower.com."        "amfam.com."                "amgen.com."               
##  [25] "amtrustgroup.com."         "amtrustgroup.com."         "amtrustgroup.com."         "amtrustgroup.com."        
##  [29] "anadarko.com."             "analog.com."               "apachecorp.com."           "applied.com."             
##  [33] "aptar.com."                "aramark.com."              "aramark.com."              "arcb.com."                
##  [37] "archcoal.com."             "armstrong.com."            "armstrong.com."            "arrow.com."               
##  [41] "asburyauto.com."           "autonation.com."           "avnet.com."                "ball.com."                
##  [45] "bankofamerica.com."        "baxter.com."               "bc.com."                   "bd.com."                  
##  [49] "bd.com."                   "bd.com."                   "belden.com."               "bemis.com."               
##  [53] "bestbuy.com."              "biogen.com."               "biomet.com."               "bloominbrands.com."       
##  [57] "bms.com."                  "boeing.com."               "bonton.com."               "borgwarner.com."          
##  [61] "brinks.com."               "brocade.com."              "brunswick.com."            "c-a-m.com."               
##  [65] "ca.com."                   "cabelas.com."              "cabotog.com."              "caleres.com."             
##  [69] "caleres.com."              "caleres.com."              "calpine.com."              "capitalone.com."          
##  [73] "cardinal.com."             "carlyle.com."              "carlyle.com."              "cartech.com."             
##  [77] "cbre.com."                 "celgene.com."              "centene.com."              "centurylink.com."         
##  [81] "cerner.com."               "cerner.com."               "cfindustries.com."         "ch2m.com."                
##  [85] "chevron.com."              "chipotle.com."             "chiquita.com."             "chk.com."                 
##  [89] "chrobinson.com."           "chs.net."                  "chsinc.com."               "chubb.com."               
##  [93] "ciena.com."                "cigna.com."                "cinemark.com."             "cit.com."                 
##  [97] "cmc.com."                  "cmegroup.com."             "coach.com."                "cognizant.com."           
## [101] "cokecce.com."              "colfaxcorp.com."           "columbia.com."             "commscope.com."           
## [105] "con-way.com."              "conagrafoods.com."         "conocophillips.com."       "coopertire.com."          
## [109] "core-mark.com."            "crbard.com."               "crestwoodlp.com."          "crowncastle.com."         
## [113] "crowncork.com."            "csx.com."                  "danaher.com."              "darden.com."              
## [117] "darlingii.com."            "davita.com."               "davita.com."               "davita.com."              
## [121] "dentsplysirona.com."       "diebold.com."              "diplomat.is."              "dish.com."                
## [125] "disney.com."               "donaldson.com."            "dresser-rand.com."         "dstsystems.com."          
## [129] "dupont.com."               "dupont.com."               "dyn-intl.com."             "dyn-intl.com."            
## [133] "dynegy.com."               "ea.com."                   "ea.com."                   "eastman.com."             
## [137] "ebay.com."                 "echostar.com."             "ecolab.com."               "edmc.edu."                
## [141] "edwards.com."              "elcompanies.com."          "emc.com."                  "emerson.com."             
## [145] "energyfutureholdings.com." "energytransfer.com."       "eogresources.com."         "equinix.com."             
## [149] "essendant.com."            "esterline.com."            "evhc.net."                 "exelisinc.com."           
## [153] "exeloncorp.com."           "express-scripts.com."      "express.com."              "express.com."             
## [157] "exxonmobil.com."           "familydollar.com."         "fanniemae.com."            "fastenal.com."            
## [161] "fbhs.com."                 "ferrellgas.com."           "firstenergycorp.com."      "firstsolar.com."          
## [165] "fiserv.com."               "flowserve.com."            "fmc.com."                  "fmglobal.com."            
## [169] "fnf.com."                  "freddiemac.com."           "ge.com."                   "genpt.com."               
## [173] "genworth.com."             "ggp.com."                  "grace.com."                "grainger.com."            
## [177] "graphicpkg.com."           "graybar.com."              "guess.com."                "hain.com."                
## [181] "halliburton.com."          "hanes.com."                "hanes.com."                "harley-davidson.com."     
## [185] "harman.com."               "harris.com."               "harsco.com."               "hasbro.com."              
## [189] "hcahealthcare.com."        "hcc.com."                  "hei.com."                  "henryschein.com."         
## [193] "hess.com."                 "hhgregg.com."              "hnicorp.com."              "hollyfrontier.com."       
## [197] "hologic.com."              "honeywell.com."            "hospira.com."              "hp.com."                  
## [201] "hpinc.com."                "hrblock.com."              "iac.com."                  "igt.com."                 
## [205] "iheartmedia.com."          "imshealth.com."            "ingrammicro.com."          "intel.com."               
## [209] "interpublic.com."          "intuit.com."               "ironmountain.com."         "jacobs.com."              
## [213] "jarden.com."               "jcpenney.com."             "jll.com."                  "johndeere.com."           
## [217] "johndeere.com."            "joyglobal.com."            "juniper.net."              "karauctionservices.com."  
## [221] "kbhome.com."               "kemper.com."               "keurig.com."               "khov.com."                
## [225] "kindredhealthcare.com."    "kkr.com."                  "kla-tencor.com."           "labcorp.com."             
## [229] "labcorp.com."              "lamresearch.com."          "lamresearch.com."          "landolakesinc.com."       
## [233] "lansingtradegroup.com."    "lear.com."                 "leggmason.com."            "leidos.com."              
## [237] "level3.com."               "libertymutual.com."        "lilly.com."                "lithia.com."              
## [241] "livenation.com."           "lkqcorp.com."              "loews.com."                "magellanhealth.com."      
## [245] "manitowoc.com."            "marathonoil.com."          "marathonpetroleum.com."    "markelcorp.com."          
## [249] "markwest.com."             "marriott.com."             "martinmarietta.com."       "masco.com."               
## [253] "massmutual.com."           "mastec.com."               "mastercard.com."           "mattel.com."              
## [257] "maximintegrated.com."      "mckesson.com."             "mercuryinsurance.com."     "meritor.com."             
## [261] "metlife.com."              "mgmresorts.com."           "micron.com."               "microsoft.com."           
## [265] "mohawkind.com."            "molsoncoors.com."          "monsanto.com."             "mosaicco.com."            
## [269] "motorolasolutions.com."    "mscdirect.com."            "murphyoilcorp.com."        "nasdaqomx.com."           
## [273] "navistar.com."             "nbty.com."                 "ncr.com."                  "netapp.com."              
## [277] "newfield.com."             "newscorp.com."             "nike.com."                 "nov.com."                 
## [281] "nrgenergy.com."            "ntenergy.com."             "nucor.com."                "nustarenergy.com."        
## [285] "o-i.com."                  "oaktreecapital.com."       "ocwen.com."                "omnicare.com."            
## [289] "oneok.com."                "oneok.com."                "onsemi.com."               "outerwall.com."           
## [293] "owens-minor.com."          "owens-minor.com."          "oxy.com."                  "packagingcorp.com."       
## [297] "pall.com."                 "parexel.com."              "paychex.com."              "pcconnection.com."        
## [301] "penskeautomotive.com."     "pepsico.com."              "pfizer.com."               "pg.com."                  
## [305] "polaris.com."              "polyone.com."              "pplweb.com."               "principal.com."           
## [309] "protective.com."           "publix.com."               "qg.com."                   "questdiagnostics.com."    
## [313] "quintiles.com."            "rcscapital.com."           "realogy.com."              "regmovies.com."           
## [317] "rentacenter.com."          "republicservices.com."     "rexnord.com."              "reynoldsamerican.com."    
## [321] "reynoldsamerican.com."     "rgare.com."                "roberthalf.com."           "rpc.net."                 
## [325] "rushenterprises.com."      "safeway.com."              "saic.com."                 "sandisk.com."             
## [329] "scana.com."                "scansource.com."           "seaboardcorp.com."         "selective.com."           
## [333] "selective.com."            "sempra.com."               "servicemaster.com."        "servicemaster.com."       
## [337] "servicemaster.com."        "sjm.com."                  "sm-energy.com."            "spectraenergy.com."       
## [341] "spiritaero.com."           "sprouts.com."              "spx.com."                  "staples.com."             
## [345] "starbucks.com."            "starwoodhotels.com."       "statestreet.com."          "steelcase.com."           
## [349] "steeldynamics.com."        "stericycle.com."           "stifel.com."               "stryker.com."             
## [353] "sunedison.com."            "sungard.com."              "supervalu.com."            "symantec.com."            
## [357] "symantec.com."             "synnex.com."               "synopsys.com."             "taylormorrison.com."      
## [361] "tdsinc.com."               "teamhealth.com."           "techdata.com."             "teledyne.com."            
## [365] "tempursealy.com."          "tenethealth.com."          "teradata.com."             "tetratech.com."           
## [369] "textron.com."              "thermofisher.com."         "tiaa-cref.org."            "tiffany.com."             
## [373] "timewarner.com."           "towerswatson.com."         "treehousefoods.com."       "tribunemedia.com."        
## [377] "trimble.com."              "trinet.com."               "trueblue.com."             "ugicorp.com."             
## [381] "uhsinc.com."               "ulta.com."                 "unifiedgrocers.com."       "unisys.com."              
## [385] "unum.com."                 "usfoods.com."              "varian.com."               "verizon.com."             
## [389] "vfc.com."                  "viacom.com."               "visa.com."                 "vishay.com."              
## [393] "visteon.com."              "wabtec.com."               "walmart.com."              "wecenergygroup.com."      
## [397] "wecenergygroup.com."       "west.com."                 "westarenergy.com."         "westlake.com."            
## [401] "westrock.com."             "weyerhaeuser.com."         "wholefoodsmarket.com."     "williams.com."            
## [405] "wm.com."                   "wnr.com."                  "wpxenergy.com."            "wyndhamworldwide.com."    
## [409] "xerox.com."                "xpo.com."                  "yrcw.com."                 "yum.com."                 
## [413] "zoetis.com."
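Circling back to the `+all` check for a moment: in SPF, an `all` term with no qualifier defaults to "pass", so both `+all` and a bare `all` let any sender through. Here's a minimal base R sketch of that test. The helper name `spf_passes_all()` and the sample records are assumptions of mine, not the `gdns` implementation.

```r
# Hypothetical helper (not the gdns implementation): TRUE when an SPF record
# passes every sender via `+all` or a bare `all`
spf_passes_all <- function(txt) {
  txt <- gsub("\"", "", txt)
  is_spf <- grepl("^v=spf1([[:space:]]|$)", txt, ignore.case = TRUE)
  # `all` must be its own term, optionally prefixed with `+`;
  # `-all`, `~all` and `?all` deliberately do not match
  has_pass_all <- grepl("(^|[[:space:]])\\+?all([[:space:]]|$)", txt, ignore.case = TRUE)
  is_spf & has_pass_all
}

res <- spf_passes_all(c(
  "v=spf1 include:_spf.example.net +all",  # explicit pass
  "v=spf1 mx all",                         # implicit pass
  "v=spf1 mx -all",                        # hard fail
  "v=spf1 mx ~all",                        # soft fail
  "some-verification-token=all"            # not an SPF record
))
res
```

Only the first two sample records come back `TRUE`.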

We can even go so far as to see which third-party mail services are the most popular:

incl <- suffix_extract(sort(unlist(spf_includes(f1k$data))))
incl <- data.frame(table(paste(incl$domain, incl$suffix, sep=".")), stringsAsFactors=FALSE)
incl <- head(arrange(incl, desc(Freq)), 20)
incl <- mutate(incl, Var1=factor(Var1, Var1))
incl <- rename(incl, Service=Var1, Count=Freq)
 
gg <- ggplot(incl, aes(Service, Count))
gg <- gg + geom_bar(stat="identity", width=0.75)
gg <- gg + scale_x_discrete(expand=c(0,0))
gg <- gg + scale_y_continuous(expand=c(0,0), limits=c(0, 250))
gg <- gg + coord_flip()
gg <- gg + labs(x=NULL, y=NULL, 
                title="Most popular services used by the F1000",
                subtitle="As determined by SPF record configuration")
gg <- gg + theme_hrbrmstr(grid="X", axis="y")
gg <- gg + theme(plot.margin=margin(t=10, l=10, b=20, r=10))
gg

[Figure: bar chart of the most popular services used by the F1000, as determined by SPF record configuration]

### Fin

There are more `TXT` records to play with than just SPF ones and many other hidden easter eggs. I need to add a few more functions into `gdns` before shipping it off to CRAN, so if you have any feature requests, now’s the time to file a [GitHub issue](https://github.com/hrbrmstr/gdns/issues).

The [`iptools` package](https://github.com/hrbrmstr/iptools)—a toolkit for manipulating, validating and testing IP addresses and ranges, along with datasets relating to IP addresses—is flying through the internets and hitting a CRAN mirror near you, soon.

### What’s fixed?

[Tim Smith](https://github.com/tdsmith) fixed [a bug](https://github.com/hrbrmstr/iptools/issues/26) in `ip_in_range()` that occurred when the netmask was `/32` (thanks, Tim!).

### What’s new?

The `range_boundaries()` function now returns three new fields (`min_numeric`, `max_numeric` and `range`) that are pretty obvious once you see them in action:

range_boundaries("172.18.0.0/28")
##   minimum_ip  maximum_ip min_numeric max_numeric         range
## 1 172.18.0.0 172.18.0.15  2886860800  2886860815 172.18.0.0/28

They are tacked on the end, so if you were using positional or named columns previously, you’re still good to go.
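If you're curious where `min_numeric` and `max_numeric` come from, the arithmetic is easy to sketch in base R. The helper `cidr_bounds()` below is hypothetical and is not how `iptools` does it (that work happens in C++). It uses doubles because R's 32-bit signed integers can't hold the upper half of the IPv4 space.

```r
# Hypothetical helper for illustration; not an iptools function
cidr_bounds <- function(cidr) {
  parts  <- strsplit(cidr, "/", fixed = TRUE)[[1]]
  octets <- as.numeric(strsplit(parts[1], ".", fixed = TRUE)[[1]])
  size   <- 2^(32 - as.integer(parts[2]))  # number of addresses in the block
  base   <- sum(octets * 256^(3:0))        # dotted quad -> number
  lo     <- floor(base / size) * size      # align down to the start of the block
  c(min_numeric = lo, max_numeric = lo + size - 1)
}

b <- cidr_bounds("172.18.0.0/28")
b  # min_numeric is 2886860800 and max_numeric is 2886860815, matching the output above
```

A `/28` holds 16 addresses, which is why the two values are 15 apart.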

We’ve added a new `country_ranges()` function to return all “assigned” CIDR blocks in a country. You just give it a character vector of one or more ISO 3166-1 alpha-2 codes and you get back the CIDRs:

country_ranges("TO")
## $TO
## [1] "43.255.148.0/22"  "103.239.160.0/22" "103.242.126.0/23" "103.245.160.0/22" "175.176.144.0/21" "202.43.8.0/21"   
## [7] "202.134.24.0/21"

This data is updated daily and there’s some session caching built into the function to speed up subsequent calls if you forgot to save the output. You can flush the session cache with `flush_country_cidrs()` and query it with `cached_country_cidrs()`.
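For the curious, a session cache like this can be sketched with a plain R environment. Everything below (`cached_lookup()`, `cached_keys()`, `flush_cache()` and the fake fetcher) is a made-up stand-in to show the pattern; I'm not claiming it mirrors the package internals.

```r
# Hypothetical session cache; names and the fetcher are illustrative only
.cidr_cache <- new.env(parent = emptyenv())

cached_lookup <- function(cc, fetch) {
  cc <- toupper(cc)
  if (!exists(cc, envir = .cidr_cache, inherits = FALSE)) {
    assign(cc, fetch(cc), envir = .cidr_cache)  # only hit the source on a miss
  }
  get(cc, envir = .cidr_cache, inherits = FALSE)
}

cached_keys <- function() sort(ls(.cidr_cache))
flush_cache <- function() rm(list = ls(.cidr_cache), envir = .cidr_cache)

# Stand-in for the real download so the example runs offline
calls <- 0
fake_fetch <- function(cc) {
  calls <<- calls + 1
  c("192.0.2.0/24", "198.51.100.0/24")
}

cached_lookup("to", fake_fetch)
cached_lookup("TO", fake_fetch)  # served from the cache; fake_fetch is not called again
calls                            # 1
cached_keys()                    # "TO"
flush_cache()
```

The cache lives only as long as the R session, which is the whole point: a fresh session always gets fresh data.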

### What’s next?

We’re waiting until the R 3.3.0 Windows toolchain is stable to add in MaxMind ASN lookups. If there are any IP-related functions you need added, drop us an [issue](https://github.com/hrbrmstr/iptools/issues). We’re at nearly 1,700 downloads from the RStudio mirror, which (IMO) is kinda cool for such a niche package. Many thanks to all our users and one more thank you to Dirk for the `AsioHeaders` package.

### Fin

If you want some “bad” IP addresses to play around with in `iptools`, check out the [`blocklist`](https://github.com/hrbrmstr/blocklist) package, which provides an interface to a subset of the [blocklist.de](http://www.blocklist.de/en/index.html) API.

`iptools` is a set of tools for working with IP addresses. Not just work, but work _fast_. It’s backed by `Rcpp` and now uses the [AsioHeaders](http://dirk.eddelbuettel.com/blog/2016/01/07/#asioheaders_1.11.0-1) package by Dirk Eddelbuettel, which means it no longer needs to _link_ against the monolithic Boost libraries and *works on Windows*!

What can you do with it? One thing you can do is take a vector of domain names and turn them into IP addresses:

library(iptools)
 
hostname_to_ip(c("rud.is", "dds.ec", "ironholds.org", "google.com"))
 
## [[1]]
## [1] "104.236.112.222"
## 
## [[2]]
## [1] "162.243.111.4"
## 
## [[3]]
## [1] "104.131.2.226"
## 
## [[4]]
##  [1] "2607:f8b0:400b:80a::100e" "74.125.226.101"           "74.125.226.102"          
##  [4] "74.125.226.100"           "74.125.226.96"            "74.125.226.104"          
##  [7] "74.125.226.99"            "74.125.226.103"           "74.125.226.105"          
## [10] "74.125.226.98"            "74.125.226.97"            "74.125.226.110"

That means you can pump a bunch of domain names from logs into `iptools` and get current IP address allocations out for them.

You can also do the reverse:

library(magrittr)
library(purrr)
library(iptools)
 
hostname_to_ip(c("rud.is", "dds.ec", "ironholds.org", "google.com")) %>% 
  flatten_chr() %>% 
  ip_to_hostname() %>% 
  flatten_chr()
 
##  [1] "104.236.112.222"           "dds.ec"                    "104.131.2.226"            
##  [4] "yyz08s13-in-x0e.1e100.net" "yyz08s13-in-f5.1e100.net"  "yyz08s13-in-f6.1e100.net" 
##  [7] "yyz08s13-in-f4.1e100.net"  "yyz08s13-in-f0.1e100.net"  "yyz08s13-in-f8.1e100.net" 
## [10] "yyz08s13-in-f3.1e100.net"  "yyz08s13-in-f7.1e100.net"  "yyz08s13-in-f9.1e100.net" 
## [13] "yyz08s13-in-f2.1e100.net"  "yyz08s13-in-f1.1e100.net"  "yyz08s13-in-f14.1e100.net"

Notice that it handled IPv6 addresses and also cases where no reverse mapping existed for an IP address.

You can convert IPv4 addresses to and from long integer format (the dotted, four-octet form of an IPv4 address exists primarily to make it easier for humans to grok), generate random IP addresses for testing, test IP addresses for validity and type, and reference data sets with registered assignments (so you can see allocated IP groups). Plus, it includes `xff_extract()`, which can help identify an actual IP address (helpful when connections come from behind proxies).
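To make the "long integer format" bit concrete, here's a minimal base R round trip. The helpers `ipv4_to_long()` and `long_to_ipv4()` are hypothetical names for this sketch and do no input validation; the package's own converters are the ones to use for real work.

```r
# Hypothetical helpers for illustration; not the iptools functions
ipv4_to_long <- function(ip) {
  vapply(strsplit(ip, ".", fixed = TRUE),
         function(o) sum(as.numeric(o) * 256^(3:0)), numeric(1))
}

long_to_ipv4 <- function(n) {
  vapply(n, function(x) {
    # Peel off each octet: integer-divide by its place value, then wrap at 256
    paste((x %/% 256^(3:0)) %% 256, collapse = ".")
  }, character(1))
}

n <- ipv4_to_long(c("104.236.112.222", "74.125.226.101"))
long_to_ipv4(n)  # back to the original dotted quads
```

Each octet is just a base-256 digit, so the conversion is a weighted sum one way and division with remainder the other.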

We can’t thank Dirk enough for cranking out `AsioHeaders` since it means there will be many more network/“cyber” packages coming for R and available on every platform.

You can find `iptools` version `0.3.0` [on CRAN](https://cran.r-project.org/web/packages/iptools/) now (it may take your mirror a bit to catch up), grab the source [release](https://github.com/hrbrmstr/iptools/releases/tag/v0.3.0) on GitHub or check out the [repo](https://github.com/hrbrmstr/iptools/), poke around, submit issues and/or contribute!

Isn’t it great when an R package can help you with resolutions in the new year?