Skip navigation

Category Archives: Information Security

(A reminder to folks expecting “R”/”data science” content: the feed for that is at https://rud.is/b/category/r/feed/ if you don’t want to see the occasional non-R/datasci posts.)


Over at the $WORK blog we posted some research into the fairly horrible Cisco RV320/RV325 router vulnerability. The work blog is the work blog and this blog is my blog (i.e. opinions are my own, yada, yada, yada) and I felt compelled to post a cautionary take to vendors and organizations in general on how security issues can creep into your environment as a result of acquisitions and supply chains.

Looking purely at the evidence gathered from internet scans — which include SSL certificate info — and following the trail in the historical web archive one can make an informed, speculative claim that the weakness described in CVE-2019-1653 existed well before the final company logo ended up on the product.

It appears that NetKlass was at least producing the boards for this class of SMB VPN router and ultimately ended up supplying them to Linksys. A certain giant organization bought that company (and subsequently sold it off again) and it’s very likely this vulnerability ended up in said behemoth’s lap due to both poor supply chain management — in that Linksys seems to have done no security testing on the sourced parts — and, due to the acquisition, which caused those security issues to end up in a major brand’s product inventory.

This can happen to any organization involved in sourcing hardware/software from a third party and/or involved in acquiring another company. Receiving compliance-driven checkbox forms on the efficacy of the target security programs (or sw/hw) is not sufficient but is all too common a practice. Real due diligence involves kinda-trusting then verifying that the claims are accurate.

Rigorous product testing on the part of the original sourcing organization and follow-up assurance testing at the point of acquisition would have very likely caught this issue before it became a responsibly disclosed vulnerability.

Phishing is [still] the primary way attackers either commit a primary criminal act (i.e. phish a target to, say, install ransomware) or is the initial vehicle used to gain a foothold in an organization so they can perform other criminal operations to achieve some goal. As such, security teams, vendors and active members of the cybersecurity community work diligently to neutralize phishing campaigns as quickly as possible.

One popular community tool/resource in this pursuit is PhishTank which is a collaborative clearing house for data and information about phishing on the Internet. Also, PhishTank provides an open API for developers and researchers to integrate anti-phishing data into their applications at no charge.

While the PhishTank API is useful for real-time anti-phishing operations the data is also useful for security researchers as we work to understand the ebb, flow and evolution of these attacks. One avenue of research is to track the various features associated with phishing campaigns which include (amongst many other elements) network (internet) location of the phishing site, industry being targeted, domain names being used, what type of sites are being cloned/copied and a feature we’ll be looking at in this post: what percentage of new phishing sites use SSL encryption and — of these — which type of SSL certificates are “en vogue”.

Phishing sites are increasingly using and relying on SSL certificates because we in the information security industry spent a decade instructing the general internet surfing population to trust sites with the green lock icon near the location bar. Initially, phishers worked to compromise existing, encryption-enabled web properties to install phishing sites/pages since they could leech off of the “trusted” status of the associated SSL certificates. However, the advent of services like Let’s Encrypt have made it possible for attacker to setup their own phishing domains that look legitimate to current-generation internet browsers and prey upon the decade’s old “trust the lock icon” mantra that most internet users still believe. We’ll table that path of discussion (since it’s fraught with peril if you don’t support the internet-do-gooder-consequences-be-darned cabal’s personal agendas) and just focus on how to work with PhishTank data in R and take a look at the most prevalent SSL certs used in the past week (you can extend the provided example to go back as far as you like provided the phishing sites are still online).

Accessing PhishTank From R

You can use the aquarium package [GL|GH] to gain access to the data provided by PhishTank’s API (you need to sign up for access and put you API key into the PHISHTANK_API_KEY environment variable which is best done via your ~/.Renviron file).

Let’s setup all the packages we’ll need and cache a current copy of the PhishTank data. The package forces you to utilize your own caching strategy since it doesn’t make sense for it to decide that for you. I’d suggest either using the time-stamped approach below or using some type of database system (or, say, Apache Drill) to actually manage the data.

Here are the packages we’ll need:

library(psl) # git[la|hu]b/hrbrmstr/psl
library(curlparse) # git[la|hu]b/hrbrmstr/curlparse
library(aquarium) # git[la|hu]b/hrbrmstr/aquarium
library(gt) # github/rstudio/gt
library(furrr)
library(stringi)
library(openssl)
library(tidyverse)

NOTE: The psl and curlparse packages are optional. Windows users will find it difficult to get them working and it may be easier to review the functions provided by the urlparse package and substitute equivalents for the domain() and apex_domain() functions used below. Now, we get a copy of the current PhishTank dataset & cache it:

if (!file.exists("~/Data/2018-12-23-fishtank.rds")) {
  xdf <- pt_read_db()
  saveRDS(xdf, "~/Data/2018-12-23-fishtank.rds")
} else {
  xdf <- readRDS("~/Data/2018-12-23-fishtank.rds")
}

Let’s take a look:

glimpse(xdf)
## Observations: 16,446
## Variables: 9
## $ phish_id          <chr> "5884184", "5884138", "5884136", "5884135", ...
## $ url               <chr> "http://internetbanking-bancointer.com.br/lo...
## $ phish_detail_url  <chr> "http://www.phishtank.com/phish_detail.php?p...
## $ submission_time   <dttm> 2018-12-22 20:45:09, 2018-12-22 18:40:24, 2...
## $ verified          <chr> "yes", "yes", "yes", "yes", "yes", "yes", "y...
## $ verification_time <dttm> 2018-12-22 20:45:52, 2018-12-22 21:26:49, 2...
## $ online            <chr> "yes", "yes", "yes", "yes", "yes", "yes", "y...
## $ details           <list> [<209.132.252.7, 209.132.252.0/24, 7296 468...
## $ target            <chr> "Other", "Other", "Other", "PayPal", "Other"...

The data is really straightforward. We have unique ids for each site/campaign the URL of the site along with a URL to extra descriptive info PhishTank has on the site/campaign. We also know when the site was submitted/discovered and other details, such as the network/internet space the site is in:

glimpse(xdf$details[1])
## List of 1
##  $ :'data.frame':    1 obs. of  6 variables:
##   ..$ ip_address        : chr "209.132.252.7"
##   ..$ cidr_block        : chr "209.132.252.0/24"
##   ..$ announcing_network: chr "7296 468"
##   ..$ rir               : chr "arin"
##   ..$ country           : chr "US"
##   ..$ detail_time       : chr "2018-12-23T01:46:16+00:00"

We’re going to focus on recent phishing sites (in this case, ones that are less than a week old) and those that use SSL certificates:

filter(xdf, verified == "yes") %>%
  filter(online == "yes") %>%
  mutate(diff = as.numeric(difftime(Sys.Date(), verification_time), "days")) %>%
  filter(diff <= 7) %>%
  { all_ct <<- nrow(.) ; . } %>%
  filter(grepl("^https", url)) %>%
  { ssl_ct <<- nrow(.) ; . } %>%
  mutate(
    domain = domain(url),
    apex = apex_domain(domain)
  ) -> recent

Let’s ee how many are using SSL:

(ssl_ct)
## [1] 383

(pct_ssl <- ssl_ct / all_ct)
## [1] 0.2919207

This percentage is lower than a recent “50% of all phishing sites use encryption” statistic going around of late. There are many reasons for the difference:

  • PhishTank doesn’t have all phishing sites in it
  • We just looked at a week of examples
  • Some sites were offline at the time of access attempt
  • Diverse attacker groups with varying degrees of competence engage in phishing attacks

Despite the 20% deviation, 30% is still a decent percentage, and a green, “everything’s ??” icon is a still a valued prize so we shall pursue our investigation.

Now we need to retrieve all those certs. This can be a slow operation that so we’ll grab them in parallel. It’s also quite possible the “online”status above data frame glimpse is inaccurate (sites can go offline quickly) so we’ll catch certificate request failures with safely() and cache the results:

cert_dl <- purrr::safely(openssl::download_ssl_cert)

plan(multiprocess)

if (!file.exists("~/Data/recent.rds")) {

  recent <- mutate(recent, cert = future_map(domain, cert_dl))
  saveRDS(recent, "~/Data/recent.rds")

} else {
  recent <- readRDS("~/Data/recent.rds")
}

Let see how many request failures we had:

(failed <- sum(map_lgl(recent$cert, ~is.null(.x$result))))
## [1] 25

(failed / nrow(recent))
## [1] 0.06527415

As noted in the introduction to the blog, when attackers want to use SSL for the lock icon ruse they can either try to piggyback off of legitimate domains or rely on Let’s Encrypt to help them commit crimes. Let’s see what the top p”apex” domains](https://help.github.com/articles/about-supported-custom-domains/#apex-domains) were in use in the past week:

count(recent, apex, sort = TRUE)
## # A tibble: 255 x 2
##    apex                              n
##    <chr>                         <int>
##  1 000webhostapp.com                42
##  2 google.com                       17
##  3 umbler.net                        8
##  4 sharepoint.com                    6
##  5 com-fl.cz                         5
##  6 lbcpzonasegurabeta-viabcp.com     4
##  7 windows.net                       4
##  8 ashaaudio.net                     3
##  9 brijprints.com                    3
## 10 portaleisp.com                    3
## # ... with 245 more rows

We can see that a large hosting provider (000webhostapp.com) bore a decent number of these sites, but Google Sites (which is what the full domain represented by the google.com apex domain here is usually pointing to) Microsoft SharePoint (sharepoint.com) and Microsoft forums (windows.net) are in active use as well (which is smart give the pervasive trust associated with those properties). There are 241 distinct apex domains in this 1-week set so what is the SSL cert diversity across these pages/campaigns?

We ultimately used openssl::download_ssl_cert to retrieve the SSL certs of each site that was online, so let’s get the issuer and intermediary certs from them and look at the prevalence of each. We’ll extract the fields from the issuer component returned by openssl::download_ssl_cert then just do some basic maths:

filter(recent, map_lgl(cert, ~!is.null(.x$result))) %>%
  mutate(issuers = map(cert, ~map_chr(.x$result, ~.x$issuer))) %>%
  mutate(
    inter = map_chr(issuers, ~.x[1]), # the order is not guaranteed here but the goal of the exercise is
    root = map_chr(issuers, ~.x[2])   # to get you working with the data vs build a 100% complete solution
  ) %>%
  mutate(
    inter = stri_replace_all_regex(inter, ",([[:alpha:]])+=", ";;;$1=") %>%
      stri_split_fixed(";;;") %>% # there are parswers for the cert info fields but this hack is quick and works
      map(stri_split_fixed, "=", 2, simplify = TRUE) %>%
      map(~setNames(as.list(.x[,2]), .x[,1])) %>%
      map(bind_cols),
    root = stri_replace_all_regex(root, ",([[:alpha:]])+=", ";;;$1=") %>%
      stri_split_fixed(";;;") %>%
      map(stri_split_fixed, "=", 2, simplify = TRUE) %>%
      map(~setNames(as.list(.x[,2]), .x[,1])) %>%
      map(bind_cols)
  ) -> recent

Let’s take a look at roots:

unnest(recent, root) %>%
  distinct(phish_id, apex, CN) %>%
  count(CN, sort = TRUE) %>%
  mutate(pct = n/sum(n)) %>%
  gt::gt() %>%
  gt::fmt_number("n", decimals = 0) %>%
  gt::fmt_percent("pct")

CN n pct
DST Root CA X3 96 26.82%
COMODO RSA Certification Authority 93 25.98%
DigiCert Global Root G2 45 12.57%
Baltimore CyberTrust Root 30 8.38%
GlobalSign 27 7.54%
DigiCert Global Root CA 15 4.19%
Go Daddy Root Certificate Authority – G2 14 3.91%
COMODO ECC Certification Authority 11 3.07%
Actalis Authentication Root CA 9 2.51%
GlobalSign Root CA 4 1.12%
Amazon Root CA 1 3 0.84%
Let’s Encrypt Authority X3 3 0.84%
AddTrust External CA Root 2 0.56%
DigiCert High Assurance EV Root CA 2 0.56%
USERTrust RSA Certification Authority 2 0.56%
GeoTrust Global CA 1 0.28%
SecureTrust CA 1 0.28%

DST Root CA X3 is (wait for it) Let’s Encrypt! Now, Comodo is not far behind and indeed surpasses LE if we combine the extra-special “enhanced” versions they provide and it’s important for you to read the comments near the lines of code making assumptions about order of returned issuer information above. Now, let’s take a look at intermediaries:

unnest(recent, inter) %>%
  distinct(phish_id, apex, CN) %>%
  count(CN, sort = TRUE) %>%
  mutate(pct = n/sum(n)) %>%
  gt::gt() %>%
  gt::fmt_number("n", decimals = 0) %>%
  gt::fmt_percent("pct")

CN n pct
Let’s Encrypt Authority X3 99 27.65%
cPanel\, Inc. Certification Authority 75 20.95%
RapidSSL TLS RSA CA G1 45 12.57%
Google Internet Authority G3 24 6.70%
COMODO RSA Domain Validation Secure Server CA 20 5.59%
CloudFlare Inc ECC CA-2 18 5.03%
Go Daddy Secure Certificate Authority – G2 14 3.91%
COMODO ECC Domain Validation Secure Server CA 2 11 3.07%
Actalis Domain Validation Server CA G1 9 2.51%
RapidSSL RSA CA 2018 9 2.51%
Microsoft IT TLS CA 1 6 1.68%
Microsoft IT TLS CA 5 6 1.68%
DigiCert SHA2 Secure Server CA 5 1.40%
Amazon 3 0.84%
GlobalSign CloudSSL CA – SHA256 – G3 2 0.56%
GTS CA 1O1 2 0.56%
AlphaSSL CA – SHA256 – G2 1 0.28%
DigiCert SHA2 Extended Validation Server CA 1 0.28%
DigiCert SHA2 High Assurance Server CA 1 0.28%
Don Dominio / MrDomain RSA DV CA 1 0.28%
GlobalSign Extended Validation CA – SHA256 – G3 1 0.28%
GlobalSign Organization Validation CA – SHA256 – G2 1 0.28%
RapidSSL SHA256 CA 1 0.28%
TrustAsia TLS RSA CA 1 0.28%
USERTrust RSA Domain Validation Secure Server CA 1 0.28%
NA 1 0.28%

LE is number one again! But, it’s important to note that these issuer CommonNames can roll up into a single issuing organization given just how messed up integrity and encryption capability is when it comes to web site certs, so the raw results could do with a bit of post-processing for a more complete picture (an exercise left to intrepid readers).

FIN

There are tons of avenues to explore with this data, so I hope this post whet your collective appetites sufficiently for you to dig into it, especially if you have some dowm-time coming.

Let me also take this opportunity to resissue guidance I and many others have uttered this holiday season: be super careful about what you click on, which sites you even just visit, and just how much you really trust the site, provider and entity behind the form about to enter your personal information and credit card info into.

I pen this mini-tome on “GDPR Enforcement Day”. The spirit of GDPR is great, but it’s just going to be another Potempkin Village in most organizations much like PCI or SOX. For now, the only thing GDPR has done is made GDPR consulting companies rich, increased the use of javascript on web sites so they can pop-up useless banners we keep telling users not to click on and increase the size of email messages to include mandatory postscripts (that should really be at the beginning of the message, but, hey, faux privacy is faux privacy).

Those are just a few of the “unintended consequences” of GDPR. Just like Let’s Encrypt & “HTTPS Everywhere” turned into “Let’s Enable Criminals and Hurt Real People With Successful Phishing Attacks”, GDPR is going to cause a great deal of downstream issues that either the designers never thought of or decided — in their infinite, superior wisdom — were completely acceptable to make themselves feel better.

Today’s installment of “GDPR Unintended Consequences” is WordPress.

WordPress “powers” a substantial part of the internet. As such, it is a perma-target of attackers.

Since the GDPR Intelligentsia provided a far-too-long lead-time on both the inaugural and mandated enforcement dates for GDPR and also created far more confusion with the regulations than clarity, WordPress owners are flocking to “single button install” solutions to make them magically GDPR compliant (#protip that’s not “a thing”). Here’s a short list of plugins and active installation counts (no links since I’m not going to encourage attack surface expansion):

  • WP GDPR Compliance : 50,000+ active installs
  • GDPR : 10,000+ active installs
  • The GDPR Framework : 6,000+ installs
  • GDPR Cookie Compliance : 10,000+ active installs
  • GDPR Cookie Consent : 200,000+ active installs
  • WP GDPR : 4,000 active installs
  • Cookiebot | GDPR Compliant Cookie Consent and Notice : 10,000+ active installations
  • GDPR Tools : 500+ active installs
  • Surbma — GDPR Proof Cookies : 400+ installs
  • Social Media Share Buttons & Social Sharing Icons (which “enhanced” GDPR compatibility) : 100,000+ active installs
  • iubenda Cookie Solution for GDPR : 10,000+ active installs
  • Cookie Consent : 100,000+ active installs

I’m somewhat confident that a fraction of those publishers follow secure coding guidelines (it may be a small fraction). But, if I was an attacker, I’d be poking pretty hard at a few of those with six-figure installs to see if I could find a usable exploit.

GDPR just gave attackers a huge footprint of homogeneous resources to attempt at-scale exploits. They will very likely succeed (over-and-over-and-over again). This means that GDPR just increased the likelihood of losing your data privacy…the complete opposite of the intent of the regulation.

There are more unintended consequences and I’ll pepper the blog with them as the year and pain progresses.

RIPE 76 is going on this week and — as usual — there are scads of great talks. The selected ones below are just my (slightly) thinner slice at what may have broader appeal outside pure networking circles.

Do not read anything more into the order than the end-number of the “Main URL” since this was auto-generated from a script that processed my Firefox tab URLs.

Artyom Gavrichenkov – Memcache Amplification DDoS: Lessons Learned

Erik Bais – Why Do We Still See Amplification DDOS Traffic

Jordi Palet Martinez – A New Internet Intro to HTTP/2, QUIC, DOH and DNS over QUIC

Sara Dickinson – DNS Privacy BCP

Jordi Palet Martinez – Email Servers on IPv6

Martin Winter – Real-Time BGP Toolkit: A New BGP Monitor Service

Job Snijders – Practical Data Sources For BGP Routing Security

Charles Eckel – Combining Open Source and Open Standards

Kostas Zorbadelos – Towards IPv6 Only: A large scale lw4o6 deployment (rfc7596) for broadband users @AS6799

Louis Poinsignon – Internet Noise (Announcing 1.1.1.0/24)

Filiz Yilmaz – Current Policy Topics – Global Policy Proposals

Geoff Huston – Measuring ATR

Moritz Muller, SIDN – DNSSEC Rollovers

Anand Buddhdev – DNS Status Report

Victoria Risk – A Survey on DNS Privacy

Baptiste Jonglez – High-Performance DNS over TCP

Sara Dickinson – Latest Measurements on DNS Privacy

Willem Toorop – Sunrise DNS-over-TLS! Sunset DNSSEC – Who Needs Reasons, When You’ve Got Heroes

Laurenz Wagner – A Modern Chatbot Approach for Accessing the RIPE Database

I’ve blogged a bit about robots.txt — the rules file that documents a sites “robots exclusion” standard that instructs web crawlers what they can and cannot do (and how frequently they should do things when they are allowed to). This is a well-known and well-defined standard, but it’s not mandatory and often ignored by crawlers and content owners alike.

There’s an emerging IETF draft for a different type of site metadata that content owners should absolutely consider adopting. This one defines “web security policies” for a given site and has much in common with robots exclusion standard, including the name (security.txt) and format (policy directives are defined with simple syntax — see Chapter 5 of the Debian Policy Manual).

One core difference is that this file is intended for humans. If you are are a general user and visit a site and notice something “off” (security-wise) or if you are an honest, honorable security researcher who found a vulnerability or weakness on a site, this security.txt file should make it easier to contact the appropriate folks at the site to help them identify and resolve security issues. The IETF abstract summarizes the intent well:

When security risks in web services are discovered by independent security researchers who understand the severity of the risk, they often lack the channels to properly disclose them. As a result, security issues may be left unreported. Security.txt defines a standard to help organizations define the process for security researchers to securely disclose security vulnerabilities.

A big change from robots.txt is where the security.txt file goes. The IETF standard is still in draft state so the location may change, but the current thinking is to have it go into /.well-known/security.txt vs being placed in the top level root (i.e. it’s not supposed to be in /security.txt). If you aren’t familiar with the .well-known directory, give RFC 5785 a read.

You can visit the general information site to find out more and install a development version of a Chrome extension that will make it easier for pull up this info in your browser if you find an issue.

Here’s the security.txt for my site:

Contact: bob@rud.is
Encryption: https://keybase.io/hrbrmstr/pgp_keys.asc?fingerprint=e5388172b81c210906f5e5605879179645de9399
Disclosure: Full

With that info, you know where to contact me, have the ability to encrypt your message and know that I’ll give you credit and will disclose the bugs openly.

So, Why the [R] tag?

Ah, yes. This post is in the R RSS category feed for a reason. I do at-scale analysis of the web for a living and will be tracking the adoption of security.txt across the internet (initially with the Umbrella Top 1m and a choice list of sites with more categorical data associated with them) over time. My esteemed colleague @jhartftw is handling the crawling part, but I needed a way to speedily read in these files for a broader analysis. So, I made an R package: securitytxt?.

It’s pretty easy to use. Here’s how to install it and use one of the functions to generate a security.txt target URL for a site:

devtools::install_github("hrbrmstr/securitytxt")

library(securitytxt)

(xurl <- sectxt_url("https://rud.is/b"))
## [1] "https://rud.is/.well-known/security.txt"

This is how you read in and parse a security.txt file:

(x <- sectxt(url(xurl)))
## <Web Security Policies Object>
## Contact: bob@rud.is
## Encryption: https://keybase.io/hrbrmstr/pgp_keys.asc?fingerprint=e5388172b81c210906f5e5605879179645de9399
## Disclosure: Full

And, this is how you turn that into a usable data frame:

sectxt_info(x)
##          key                                                                                         value
## 1    contact                                                                                    bob@rud.is
## 2 encryption https://keybase.io/hrbrmstr/pgp_keys.asc?fingerprint=e5388172b81c210906f5e5605879179645de9399
## 3 disclosure                                                                                          Full

There’s also a function to validate that the keys are within the current IETF standard. That will become more useful once the standard moves out of draft status.

FIN

So, definitely adopt the standard and feel invited to kick the tyres on the package. Don’t hesitate to jump on board if you have ideas for how you’d like to extend the package, and drop a note in the comments if you have questions on it or on adopting the standard for your site.

insert(post, "{ 'standard_disclaimer' : 'My opinion, not my employer\'s' }")

This is a post about the fictional company FredCo. If the context or details presented by the post seem familiar, it’s purely coincidental. This is, again, a fictional story.

Let’s say FredCo had a pretty big breach that (fictionally) garnered media, Twitterverse, tech-world and Government-level attention and that we have some spurious details that let us sit back in our armchairs to opine about. What might have helped create the debacle at FredCo?

Despite (fictional) endless mainstream media coverage and a good chunk of ‘on background’ infosec-media clandestine blatherings we know very little about the breach itself (though it’s been fictionally, officially blamed on failure to patch Apache Struts). We know even less (fictionally officially) about the internal reach of the breach (apart from the limited consumer impact official disclosures). We know even less than that (fictionally officially) about how FredCo operates internally (process-wise).

But, I’ve (fictionally) seen:

  • a detailed breakdown of the number of domains, subdomains, and hosts FredCo “manages”.
  • the open port/service configurations of the public components of those domains
  • public information from individuals who are more willing to (fictionally) violate the CFAA than I am to get more than just port configuration information
  • a 2012/3 SAS 1 Type II report about FredCo controls
  • testimonies from FredCo execs regarding efficacy of $SECURITY_TECHOLOGY and 3 videos purporting to be indicative of expert opine on how to use BIIGG DATERZ to achieve cybersecurity success
  • the board & management structure + senior management bonus structures, complete with incentive-based objectives they were graded on

so, I’m going to blather a bit about how this fictional event should finally tear down the Potemkin village that is the combination of the Regulatory+Audit Industrial Complex and the Cybersecurity Industrial Complex.

“Tear down” with respect to the goal being to help individuals understand that a significant portion of organizations you entrust with your data are not incentivized or equipped to protect your data and that these same conditions exist in more critical areas — such as transportation, health care, and critical infrastructure — and you should expect a failure on the scale of FredCo — only with real, harmful impact — if nothing ends up changing soon.

From the top

There is boilerplate mention of “security” in the objectives of the senior executives between 2015 & 2016 14A filings:

  • CEO: “Employing advanced analytics and technology to help drive client growth, security, efficiency and profitability.”
  • CFO: “Continuing to advance and execute global enterprise risk management processes, including directing increased investment in data security, disaster recovery and regulatory compliance capabilities.”
  • CLO: “Continuing to refine and build out the Company’s global security organization.”
  • President, Workforce Solutions: None
  • CHRO: None
  • President – US Information Services: None

You’ll be happy to know that they all received either “Distinguished” or “Exceeds” on their appraisals and received a multiplier of their bonus & compensation targets as a result.

Furthermore, there is no one in the make-up of FredCo’s board of directors who has shown an interest or specialization in cybersecurity.

From the camera-positioned 50-yard line on instant replay, the board and shareholders of FredCo did not think protection of your identity and extremely personal information was important enough to include on three top executive directives and performance measure and was given little more than boilerplate mention for others. Investigators who look into FredCo’s breach should dig deep into the last decade of the detailed measures for these objectives. I have first-hand experience how these types of HR processes are managed in large orgs, which is why I’m encouraging this area for investigation.

“Security” is a terrible term, but it only works when it is an emergent property of the business processes of an organization. That means it must be contextual for every worker. Some colleagues suggest individual workers should not have to care about cybersecurity when making decisions or doing work, but even minimum-wage retail and grocery store clerks are educated about shoplifting risks and are given tools, tips and techniques to prevent loss. When your HR organizations is not incentivized to help create and maintain a cybersecurity-aware culture from the top you’re going to have problems, and when there are no cyberscurity-oriented targets for the CIO or even business process owners, don’t expect your holey screen door to keep out predators.

Awwwdit, Part I

NOTE: I’m not calling out any particular audit organization as I’ve only seen one fictional official report.

The Regulatory+Audit Industrial Complex is a lucrative business cabal. Governments and large business meta-agencies create structures where processes can be measured, verified and given a big green ✅. This validation exercise is generally done in one or more ways:

  • simple questionnaire, very high level questions, no veracity validation
  • more detailed questionnaire, mid-level questions, usually some in-person lightweight checking
  • detailed questionnaire, but with topics that can be sliced-and-diced by the legal+technical professions to mean literally anything, measured in-person by (usually) extremely junior reviewers with little-to-no domain expertise who follow review playbooks, get overwhelmed with log entries and scope-refinement+reduction and who end up being steered towards “important” but non-material findings

Sure, there are good audits and good auditors, but I will posit they are the rare diamonds in a bucket of zirconia.

We need to cover some technical ground before covering this further, though.

Shocking Struts

We’ll take the stated breach cause at face-value: failure to patch an remote-accessible vulnerability with Apache Struts. This was presented as the singular issue enabling attackers to walk (with crutches) away with scads of identify-theft-enabling personal data, administrator passwords, database passwords, and the recipe for the winning entry in the macaroni salad competition at last year’s HR annual picnic. Who knew one Java library had so much power!

We don’t know the architecture of all the web apps at FredCo. However, your security posture should not be a Jenga game tower, easily destroyed by removing one peg. These are all (generally) components of externally-facing applications at the scale of FredCo:

  • routers
  • switches
  • firewalls
  • load balancers
  • operating systems
  • application servers
  • middleware servers
  • database servers
  • customized code

These are mimicked (to varying levels of efficacy) across:

  • development
  • test
  • staging
  • production

environments.

They may coexist (in various layers of the network) with:

  • HR systems
  • Finance systems
  • Intranet servers
  • Active Directory
  • General user workstations
  • Executive workstations
  • Developer workstations
  • Mobile devices
  • Remote access infrastructure (i.e. VPNs)

A properly incentivized organization ensures there are logical and physical separation between/isolation of “stuff that matters” and that varying levels of authentication & authorization are applied to ensure access is restricted.

Keeping all that “secure” requires:

  • managing thousands of devices (servers, network components, laptops, desktops, mobile devices)
  • managing thousands of identities
  • managing thousands of configurations across systems, networks and devices
  • managing hundreds to thousands of connections between internal and external networks
  • managing thousands of rules
  • managing thousands of vulnerabilities (as they become known)
  • managing a secure development life cycle across hundreds or thousands of applications

Remember, though, that FredCo ostensibly managed all of that well and the data loss was solely due to one Java library.

If your executives (all of them) and workers (all of them) are not incentivized with that list in mind, you will have problems, but let’s talk about the security challenges back in the context of the audit role.

Awwwdit, Part II

The post is already long, so we’ll make this quick.

If I dropped you off — yes, you, because you’re likely as capable as the auditors mentioned in the previous section on audit — into that environment once a year, do you think you’d be able to ferret out issues based on convoluted network diagrams, poorly documented firewall rules and source code, non-standard checklists of user access management processes?

Let’s say I dropped you in months before the known Struts vulnerability and re-answer the question.

The burden placed on internal and — especially — external auditors is great and they are pretty much set up for failure from engagement number one.

Couple IT complexity with the fact that many orgs like FredCo aren’t required to do more than ensure financial reporting processes are ?.

But, even if there were more technical, security-oriented audits performed, you’d likely have ten different report findings by as many firms or auditors, especially if they were point-in-time audits. Furthermore, FredCo has has decades of point-in-time audits but hundreds of auditors and dozens of firms. The conditions of the breach were likely not net-new, so how did decades of systemic IT failures go unnoticed by this cabal?

IT audit functions are a multi-billion dollar business. FredCo is partially the result of the built-in cracks in the way verification is performed in orgs. In other words, I posit the Regulatory+Audit Industrial Complex bears some of the responsibility for FredCo’s breach.

Divisive Devices

From the (now removed) testimonials & videos, it was clear there may have been a “blinky light” problem in the mindset of those responsible for cybersecurity at FredCo. Relying solely on the capabilities of one or more devices (they are usually appliances with blinky lights) and thinking that storing petabytes of log data are going to stop “bad guys’ is a great recipe for a breach parfait.

But, the Cybersecurity Industrial Complex continues to dole out LED-laden boxes with the fervor of a U.S. doctor handing out opioids. Sure, they are just giving orgs what they want, but it doesn’t make it responsible behaviour. Just like the opioid problem, the “device” issue is likely causing cyber-sickness in more organizations that you’d like to admit. You may even know someone who works at an org with a box-addition.

I posit the Cybersecurity Industrial Complex bears some of the responsibility for FredCo’s breach, especially when you consider the hundreds of marketing e-mails I’ve seen post-FredCo breach telling me how CyberBox XJ9-11 would have stopped FredCo’s attackers cold.

A Matter of Trust

If removing a Struts peg from FredCo’s IT Jenga board caused the fictional tower to crash:

  • What do you think the B2B infrastructure looks like?
  • How do you think endpoints are managed?
  • What isolation, segmentation and access controls really exist?
  • How effective do you think their security awareness program is?
  • How many apps are architected & managed as poorly as the breached one?
  • How many shadow IT deployments exist in the ☁️ with your data in it?
  • How can you trust FredCo with anything of importance?

Fictional FIN

In this fictional world I’ve created one ending is:

  • all B2B connections to FredCo have been severed
  • lawyers at a thousand firms are working on language for filings to cancel all B2B contracts with FredCo
  • FredCo was de-listed from exchanges
  • FredCo executives are defending against a slew of criminal and civil charges
  • The U.S. Congress and U.K. Parliament have come together to undertake a joint review of regulatory and audit practices spanning both countries (since it impacted both countries and the Reg+Audit cabal spans both countries they decided to save time and money) resulting in sweeping changes
  • The SEC has mandated detailed cybersecurity objectives be placed on all senior management executives at all public companies and have forced results of those objectives assessments to be part of a new filing requirement.
  • The SEC has also mandated that at least one voting board member of public companies must have demonstrated experience with cybersecurity
  • The FTC creates and enforces standards on cybersecurity product advertising practices
  • You have understood that nobody has your back when it comes to managing your sensitive, personal data and that you must become an active participant in helping to ensure your elected representatives hold all organizations accountable when it comes to taking their responsibilities seriously.

but, another is:

  • FredCo’s stock bounces back
  • FredCo loses no business partners
  • FredCo’s current & former execs faced no civil or criminal charges
  • Congress makes a bit of opportunistic, temporary bluster for the sake of 2018 elections but doesn’t do anything more than berate FredCo publicly
  • You’re so tired of all these breaches and data loss that you go back to playing “Clash of Clans” on your mobile phone and do nothing.

I caught a mention of this project by Pete Warden on Four Short Links today. If his name sounds familiar, he’s the creator of the DSTK, an O’Reilly author, and now works at Google. A decidedly clever and decent chap.

The project goal is noble: crowdsource and make a repository of open speech data for researchers to make a better world. Said sourcing is done by asking folks to record themselves saying “Yes”, “No” and other short words.

As I meandered over the blog post I looked in horror on the URL for the application that did the recording: https://open-speech-commands.appspot.com/.

Why would the goal of the project combined with that URL give pause? Read on!

You’ve Got Scams!

Picking up the phone and saying something as simple as ‘Yes’ has been a major scam this year. By recording your voice, attackers can replay it on phone prompts and because it’s your voice it makes it harder to refute the evidence and can foil recognition systems that look for your actual voice.

As the chart above shows, the Better Business Bureau has logged over 5,000 of these scams this year (searching for ‘phishing’ and ‘yes’). You can play with the data (a bit — the package needs work) in R with scamtracker.

Now, these are “analog” attacks (i.e. a human spends time socially engineering a human). Bookmark this as you peruse section 2.

Integrity Challenges in 2017

I “trust” Pete’s intentions, but I sure don’t trust open-speech-commands.appspot.com (and, you shouldn’t either). Why? Go visit https://totally-harmless-app.appspot.com. It’s a Google App Engine app I made for this post. Anyone can make an appspot app and the https is meaningless as far as integrity & authenticity goes since I’m running on google’s infrastructure but I’m not google.

You can’t really trust most SSL/TLS sessions as far as site integrity goes anyway. Let’s Encrypt put the final nail in the coffin with their Certs Gone Wild! initiative. With super-recent browser updates you can almost trust your eyes again when it comes to URLs, but you should be very wary of entering your info — especially uploading voice, prints or eye/face images — into any input box on any site if you aren’t 100% sure it’s a legit site that you trust.

Tracking the Trackers

If you don’t know that you’re being tracked 100% of the time on the internet then you really need to read up on the modern internet.

In many cases your IP address can directly identify you. In most cases your device & browser profile — which most commercial sites log — can directly identify you. So, just visiting a web site means that it’s highly likely that web site can know that you are both not a dog and are in fact you.

Still Waiting for the “So, What?”

Many states and municipalities have engaged in awareness campaigns to warn citizens about the “Say ‘Yes'” scam. Asking someone to record themselves saying ‘Yes’ into a random web site pretty much negates that advice.

Folks like me regularly warn about trust on the internet. I could have cloned the functionality of the original site to open-speech-commmands.appspot.com. Did you even catch the 3rd ‘m’ there? Even without that, it’s an appspot.com domain. Anyone can set one up.

Even if the site doesn’t ask for your name or other info and just asks for your ‘Yes’, it can know who you are. In fact, when you’re enabling the microphone to do the recording, it could even take a picture of you if it wanted to (and you’d likely not know or not object since it’s for SCIENCE!).

So, in the worst case scenario a malicious entity could be asking you for your ‘Yes’, tying it right to you and then executing the post-scam attacks that were being performed in the analog version.

But, go so far as to assume this is a legit site with good intentions. Do you really know what’s being logged when you commit your voice info? If the data was mishandled, it would be just as easy to tie the voice files back to you (assuming a certain level of data logging).

The “so what” is not really a warning to users but a message to researchers: You need to threat model your experiments and research initiatives, especially when innocent end users are potentially being put at risk. Data is the new gold, diamonds and other precious bits that attackers are after. You may think you’re not putting folks at risk and aren’t even a hacker target, but how you design data gathering can reinforce good or bad behaviour on the part of users. It can solidify solid security messages or tear them down. And, you and your data may be more of a target than you really know.

Reach out to interdisciplinary colleagues to help threat model your data collection, storage and dissemination methods to ensure you aren’t putting yourself or others at risk.

FIN

Pete did the right thing:

and, I’m sure the site will be on a “proper” domain soon. When it is, I’ll be one of the first in line to help make a much-needed open data set for research purposes.

Tagging this as #rstats-related since many R coders use Travis-CI to automate package builds (and other things). Security researcher Ivan Vyshnevskyi did some ++gd responsible disclosure to the Travis-CI folks letting them know they were leaking the contents of “secure” environment variables in the build logs.

The TL;DR on “secure” environment variables is that they let you store secrets — such as OAuth keys or API tokens — ostensibly “securely” (they have to be decrypted to be used so someone/something has they keys to do that so it’s not really “secure”). That is, they should not leak them in build logs. Except that they did…for a bit.

As mentioned, this flaw was reported and is now fixed. Regen your “secrets” and keep an eye on Travis security announcements moving forward.