
Category Archives: R

Researching “the internet” (i.e. $DAYJOB) means having to deal with a ton of “unique” (I’m being kind) data formats. This is ultimately a tale of how I performed full-text searches across one of them.

It all started off innocently enough. This past week I needed to do full-text searches across metadata about who is using which parts of the internet. Normally I don’t need to do that at scale and can just go to RIPE’s excellent resource and manage to find what I need on the first page. However, this time I needed all the resultant info and noticed an interesting foible in that full text search interface. To reproduce it, enter something like “domino's” (for the record, I’m not researching Domino’s Pizza — nor would I ever consume it — but a Twitter ad happened to fly by for Domino’s and I just typed it for kicks) into the field and page around, keeping an eye on the results. I think they still use Solr for indexing/searching and aren’t passing in all they need to keep session context or something. Anyway, suffice it to say it was fairly useless (I filed a bug report, so I’m not just complaining, and I wish more sites had the same easy error-report filing capability the RIPE folks do).

If it were just searching for precise data in one field, that’s not really an issue since we have ALL THE WHOIS IP THINGS in Parquet. But:

  • I really hate giving Amazon money (even if it’s $WORK money) for Athena queries
  • Full text search across all columns is not one of Parquet’s strengths
  • This is a third bullet b/c I feel compelled to have a minimum of three points in bullet lists likely thanks to an overbearing middle-school English teacher

Since I have a modest analytics server setup at home, I figured I’d take the opportunity to re-brush-up on either Elasticsearch or Couchbase since both are pretty great at free-text searching JSON data. Except…this isn’t JSON data. It’s records formatted like this:

#
# The contents of this file are subject to 
# RIPE Database Terms and Conditions
#
# http://www.ripe.net/db/support/db-terms-conditions.pdf
#

as-block:       AS7 - AS7
descr:          RIPE NCC ASN block
remarks:        These AS Numbers are assigned to network operators in the RIPE NCC service region.
mnt-by:         RIPE-NCC-HM-MNT
created:        2018-11-22T15:27:05Z
last-modified:  2018-11-22T15:27:05Z
source:         RIPE
remarks:        ****************************
remarks:        * THIS OBJECT IS MODIFIED
remarks:        * Please note that all data that is generally regarded as personal
remarks:        * data has been removed from this object.
remarks:        * To view the original object, please query the RIPE Database at:
remarks:        * http://www.ripe.net/whois
remarks:        ****************************

as-block:       AS28 - AS28
descr:          RIPE NCC ASN block
remarks:        These AS Numbers are assigned to network operators in the RIPE NCC service region.
mnt-by:         RIPE-NCC-HM-MNT
created:        2018-11-22T15:27:05Z
last-modified:  2018-11-22T15:27:05Z
source:         RIPE
remarks:        ****************************
remarks:        * THIS OBJECT IS MODIFIED
remarks:        * Please note that all data that is generally regarded as personal
remarks:        * data has been removed from this object.
remarks:        * To view the original object, please query the RIPE Database at:
remarks:        * http://www.ripe.net/whois
remarks:        ****************************

The “keys” (the colon-ified line prefixes) vary, and there are other record types (which I don’t need) that have other prefixes in them, plus those #-prefixed comments are not necessarily only at the top. But, after judicious use of stringi::stri_enc_toutf8(), stringi::stri_split_regex() and some vectorized record targeting, they’re pretty easily converted to lovely ndjson data like this (a random selection from further along in the conversion; a minimal sketch of the conversion follows the sample records):

{"descr":"Reseau Teleinformatique de l'Education Nationale Educational and research network for Luxembourg","admin_c":"DUMY-RIPE","as_set":"AS-RESTENA","members":"AS2602, AS42909, AS51966, AS49624","mnt_by":"AS2602-MNT","notify":"noc@restena.lu","tech_c":"DUMY-RIPE"}
{"descr":"CWIX ASes announced to EBONE","admin_c":"DUMY-RIPE","as_set":"AS-TMPEBONECWIX","members":"AS3727, AS4445, AS4610, AS4624, AS4637, AS4654, AS4655, AS4656, AS4659 AS4681, AS4696, AS4714, AS4849, AS5089, AS5090, AS5532, AS5551, AS5559 AS5655, AS6081, AS6255, AS6292, AS6618, AS6639","mnt_by":"EBONE-MNT","notify":"staff@ebone.net","tech_c":"DUMY-RIPE"}
{"descr":"ASs accepted by DFN from the University of Cologne","admin_c":"DUMY-RIPE","as_set":"AS-DFNFROMCOLOGNE","members":"AS5520 AS6733","mnt_by":"DFN-MNT","tech_c":"DUMY-RIPE"}
{"descr":"NetMatters UK","admin_c":"DUMY-RIPE","as_set":"AS-NETMATTERS","members":"AS6765 AS3344","mnt_by":"AS8407-MNT","tech_c":"DUMY-RIPE"}

I went with Couchbase since it handles ndjson import by default and — as you know since you read the comparison in the aforelinked article — it can easily index all fields by default without you having to do virtually anything. Plus, Couchbase has been around long enough that it generally installs without pain and has a fairly decent web admin panel. Here’s a snapshot of the final import:

and here’s the config for the “all” full text index:

{
  "type": "fulltext-index",
  "name": "all",
  "uuid": "481bc7ed642dddfb",
  "sourceType": "couchbase",
  "sourceName": "ripe",
  "sourceUUID": "3ffbbe0c0923f233ffe0fc96c652262d",
  "planParams": {
    "maxPartitionsPerPIndex": 171
  },
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "type_field",
      "type_field": "type"
    },
    "mapping": {
      "analysis": {},
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": true,
        "enabled": true
      },
      "default_type": "_default",
      "docvalues_dynamic": true,
      "index_dynamic": true,
      "store_dynamic": false,
      "type_field": "_type"
    },
    "store": {
      "indexType": "scorch",
      "kvStoreName": ""
    }
  },
  "sourceParams": {}
}

You Said This Is A Post With R Code

Very true! We’ll get to that in a minute.

Going with Couchbase introduced a different problem: there’s almost no R support for Couchbase. Sure, Couchbase has a gnarly, two-year-old, raw httr::-prefixed bit of a tutorial post but that’s not really as cool as if there were a library(couchbase). I mean, you can check GitUgh or CRAN or a more general search yourself if you’d like but it’s going to come up bupkis.

If you were expecting a big reveal, right now, that I’ve got a feature-packed, full R Couchbase package ready to roll…you didn’t actually read the title of the post. What I do have is a set of functions that — given server/connection metadata, a bucket, a full text index, and a query — will return all matching documents (I still do not like that term for “record”) for said set of parameters:

# function code is in: https://paste.sr.ht/~hrbrmstr/051f5d5400644952a3ad2cf8664b84e2cbb9ac6b

cb_fts("domino's", "all", "ripe")
## # A tibble: 120 x 9
##    admin_c   country descr                      inetnum                  mnt_by      netname  status    tech_c  notify         
##    <chr>     <chr>   <chr>                      <chr>                    <chr>       <chr>    <chr>     <chr>   <chr>          
##  1 DUMY-RIPE FR      OPEN IP DOMINO'S PIZZA     79.141.8.44 - 79.141.8.… ALPHALINK-… OPEN-IP  ASSIGNED… DUMY-R… NA             
##  2 DUMY-RIPE NL      Domino's Pizza TILBURG     62.21.176.160 - 62.21.1… AS286-MNT   OTS2634… ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  3 DUMY-RIPE NL      Domino's Pizza EINDHOVEN   62.132.252.168 - 62.132… AS286-MNT   OTS2270… ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  4 DUMY-RIPE NL      Domino's Pizza SPYKENISSE  194.123.233.232 - 194.1… AS286-MNT   OTS69259 ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  5 DUMY-RIPE NL      Domino's AMSTERDAM         37.74.38.188 - 37.74.38… AS286-MNT   OTS6103… ASSIGNED… DUMY-R… kpn-ip-office@…
##  6 DUMY-RIPE NL      Domino's Pizza VOORSCHOTEN 92.66.116.136 - 92.66.1… AS286-MNT   OTS1914… ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  7 DUMY-RIPE NL      Domino's Pizza Doetinchem… 212.241.42.136 - 212.24… AS286-MNT   OTS2301… ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  8 DUMY-RIPE NL      Domino's Pizza AMSTERDAM   194.120.45.224 - 194.12… AS286-MNT   OTS82906 ASSIGNED… DUMY-R… ip-reg@kpn.net 
##  9 DUMY-RIPE NL      Domino's Pizza [Woerden] … 62.41.228.80 - 62.41.22… AS286-MNT   OTS2024… ASSIGNED… DUMY-R… ip-reg@kpn.net 
## 10 DUMY-RIPE NL      Domino's Pizza GRONINGEN   188.203.128.0 - 188.203… AS286-MNT   OTS3767… ASSIGNED… DUMY-R… kpn-ip-office@…
## # … with 110 more rows

It’s not fancy.

It meets the needs of a narrow use case.

It’s not in a standalone package (which is triggering my R code OCD something fierce).

But it’s seriously fast, it got me back to “work mode” with a minimum of hassle, and now there’s some google-able Couchbase R code that isn’t just bare httr calls, which may help someone else who’s on a quest for how to work with Couchbase in R.

The first function, cb_fts(), uses the /api/index/{index-name}/query API endpoint to paginate through the results of the full text search and retrieve all matching document ids, then calls the second function, cb_get_records_from_keys(), which uses the /query/service API endpoint to issue a SELECT * FROM {bucket} USE KEYS {keys} query with all of the found document (record) ids and return the result set. Nothing fancier than that.
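
If you just want the gist without clicking through to the paste, here’s a stripped-down sketch of those two calls. It assumes Couchbase’s default FTS (8094) and query (8093) ports and basic auth pulled from environment variables, and it skips the pagination the real cb_fts() does:

library(httr)
library(jsonlite)

cb_auth <- authenticate(Sys.getenv("COUCHBASE_USER"), Sys.getenv("COUCHBASE_PASS"))

# full text search: return the matching document ids
cb_fts_ids <- function(query, index, host = "localhost", size = 1000) {
  res <- POST(
    url = sprintf("http://%s:8094/api/index/%s/query", host, index),
    cb_auth, encode = "json",
    body = list(query = list(query = query), size = size, from = 0)
  )
  stop_for_status(res)
  vapply(content(res)$hits, `[[`, character(1), "id")
}

# N1QL: fetch the full documents for those ids
cb_get_records_from_keys <- function(keys, bucket, host = "localhost") {
  res <- POST(
    url = sprintf("http://%s:8093/query/service", host),
    cb_auth, encode = "json",
    body = list(statement = sprintf("SELECT * FROM `%s` USE KEYS %s", bucket, toJSON(keys)))
  )
  stop_for_status(res)
  # each result comes back wrapped under the bucket name
  fromJSON(content(res, as = "text", encoding = "UTF-8"))$results
}

The real functions add pagination and error handling and tidy the results into the tibble you see above.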

FIN

While I do not have these functions in a standalone, Couchbase-focused package, I do have them in the package associated with this particular project. If you do know of a Couchbase R package (please don’t link to JDBC/ODBC drivers as I’m not going to buy them), please link to it in the comments.

If you have other strategies for how to deal with these “un-packages”, please blog about it and post a link as well! I’m curious how others balance the package/not-a-package/un-package tension, especially when you may need to depend on a series of functions across projects.

@ted_dunning recently updated the t-Digest algorithm he created back in 2013. What is this “t-digest”? Fundamentally, it is a probabilistic data structure for estimating any percentile of distributed/streaming data. Ted explains it quite elegantly in this short video:

Said video has a full transcript as well.

T-digests have been baked into many “big data” analytics ecosystems for a while but I hadn’t seen any R packages for them (reference any in a comment if you do know of some), so I wrapped one of the low-level implementation libraries by ajwerner into a diminutive R package boringly, but appropriately, named tdigest:

There are wrappers for the low-level accumulators and quantile/value extractors along with vectorised functions for creating t-digest objects and retrieving quantiles from them (including a tdigest S3 method for stats::quantile()).

This:

install.packages("tdigest", repos="https://cinc.rud.is/")

will install from source or binaries onto your system(s).

Basic Ops

The low-level interface is more useful in “streaming” operations (i.e. accumulating input over time):

set.seed(2019-04-03)

td <- td_create()

for (i in 1:100000) {
  td_add(td, sample(100, 1), 1)
}

quantile(td)
## [1]   1.00000  25.62222  53.09883  74.75522 100.00000

More R-like Ops

Vectorisation is the name of the game in R and we can use tdigest() to work in a vectorised manner:

set.seed(2019-04-03)

x <- sample(100, 1000000, replace=TRUE)

td <- tdigest(x)

quantile(td)
## [1]   1.00000  25.91914  50.79468  74.76439 100.00000

Need for Speed

The t-digest algorithm was designed for both streaming operations and speed. It’s pretty darned fast:

microbenchmark::microbenchmark(
  tdigest = tquantile(td, c(0, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99, 1)),
  r_quantile = quantile(x, c(0, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99, 1))
)
## Unit: microseconds
##        expr      min         lq        mean    median       uq       max neval
##     tdigest    22.81    26.6525    48.70123    53.355    63.31    151.29   100
##  r_quantile 57675.34 59118.4070 62992.56817 60488.932 64731.23 160130.50   100

Note that “accurate” is not the same thing as “precise”, so regular quantile ops in R will be close to what t-digest computes, but not always exactly the same.
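
If you want to eyeball that difference yourself, a quick side-by-side comparison (re-using the x and td objects from above) does the trick:

probs <- c(0, 0.25, 0.5, 0.75, 1)

round(cbind(
  tdigest = tquantile(td, probs),              # t-digest approximation
  exact   = quantile(x, probs, names = FALSE)  # classic (exact) quantiles
), 2)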

FIN

This was a quick (but complete) wrapper and could use some tyre kicking. I’ve a mind to add serialization to the C implementation so I can then enable [de]serialization on the R side since that would (IMO) make t-digest ops more useful in an R context, especially since you can merge two different t-digests.

As always, code/PR where you want to and file issues with any desired functionality/enhancements.

Also, whoever started the braces notation for package names (e.g. {ggplot2}): brilliant!

I saw a second post on turning htmlwidgets into interactive Twitter Player cards and felt somewhat compelled to make creating said entities a bit easier, so I posited the following:

I figured 40+ 💙 could not be wrong, and thus begat widgetcard:

To make this post as short as possible, the TLDR is that you just pass in an htmlwidget and some required parameters and you get back a deployable interactive Twitter Player card as an archive file and local directory. The example code is almost as short since we’re cheating and using the immensely helpful plotly package to turn a ggplot2 vis into something interactive.

First, make the vis:

library(ssh)
library(plotly)
library(ggplot2)
library(widgetcard)

ggplot(mtcars, aes(wt, mpg)) +
  geom_point() -> gg

Now, we create a local preview image for the plot we just made since we need one for the card:

preview <- gg_preview(gg)

NOTE that you can use any image you want. This function streamlines the process for plotly plots created from ggplot2 plots. There are links to image sizing guidelines in the package help files.

Now, we convert our ggplot2 object to a plotly object and create the Twitter Player card. Note that Twitter really doesn’t like standalone widgets being used as Twitter Player card links due to their heavyweight size. Therefore, card_widget() creates a non-standalone widget but bundles everything up into a single directory and deployable archive.

ggplotly(gg) %>% 
  card_widget(
    output_dir = "~/widgets/tc",
    name_prefix = "tc",
    preview_img = preview,
    html_title = "A way better title",
    card_twitter_handle = "@hrbrmstr",
    card_title = "Basic ggplot2 example",
    card_description = "This is a sample caRd demonstrating card_widget()",
    card_image_url_prefix = "https://rud.is/vis/tc/",
    card_player_url_prefix = "https://rud.is/vis/tc/",
    card_player_width = 480,
    card_player_height = 480
  ) -> arch_fil

Here’s what the resulting directory structure looks like:

tc
├── tc.html
├── tc.png
└── tc_files
    ├── crosstalk-1.0.0
    │   ├── css
    │   │   └── crosstalk.css
    │   └── js
    │       ├── crosstalk.js
    │       ├── crosstalk.js.map
    │       ├── crosstalk.min.js
    │       └── crosstalk.min.js.map
    ├── htmlwidgets-1.3
    │   └── htmlwidgets.js
    ├── jquery-1.11.3
    │   ├── jquery-AUTHORS.txt
    │   ├── jquery.js
    │   ├── jquery.min.js
    │   └── jquery.min.map
    ├── plotly-binding-4.8.0
    │   └── plotly.js
    ├── plotly-htmlwidgets-css-1.39.2
    │   └── plotly-htmlwidgets.css
    ├── plotly-main-1.39.2
    │   └── plotly-latest.min.js
    ├── pymjs-1.3.2
    │   ├── pym.v1.js
    │   └── pym.v1.min.js
    └── typedarray-0.1
        └── typedarray.min.js

(There’s also a tc.tgz at the same level as the tc directory.)

The widget is iframe’d using widgetframe and then saved out using htmlwidgets::saveWidget().
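
Under the hood it’s conceptually similar to this sketch (not the actual card_widget() source, just the frameable-widget-plus-save step it builds on; the output path is illustrative):

library(widgetframe)
library(htmlwidgets)

w <- ggplotly(gg)

saveWidget(
  frameableWidget(w),     # wraps the widget so it plays nicely inside an <iframe> (adds pym.js)
  file = "~/widgets/tc/tc.html",
  selfcontained = FALSE   # dependencies land in tc_files/ instead of being inlined
)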

Now, for deploying this to a web server, one could use a method like this to scp the deployable archive:

sess <- ssh_connect(Sys.getenv("SSH_HOST"))

invisible(scp_upload(
  sess, files = arch_fil, Sys.getenv("REMOTE_VIS_DIR"), verbose = FALSE
))

ssh_exec_wait(
  sess,
  command = c(
    sprintf("cd %s", Sys.getenv("REMOTE_VIS_DIR")),
    sprintf("tar -xzf %s", basename(arch_fil))
  )
)

Alternatively, you can use other workflows to transfer and expand the archive or copy output to your static blog host.
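
For example, if your blog is a locally built static site, something as simple as this (the destination path is, obviously, yours to change) gets the job done:

# copy the already-expanded widget directory into the static site's output tree
file.copy("~/widgets/tc", "~/blog/static/vis/", recursive = TRUE)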

Make sure to test anything you build with Twitter’s validator before tweeting it out.

FIN

This works but is super nascent and could use some serious IRL tyre kicking and brutal feedback. Pick the least offensive social coding site you prefer and file issues & PRs at will.

There’s been a lot of talk about “dependencies” in the R universe of late. This is not really a post about that but more of a “really, don’t do this” if you decide you want to poke the dependency bear by trying to build a deeply flawed model off of CRAN package metadata.

CRAN packages undergo checks. Here’s one for akima (I :heart: me some gridded interpolation functions, plus this package is not in any hot-button R tribe right now):

Flavor                              Version  Tinstall  Tcheck  Ttotal  Status  Flags
r-devel-linux-x86_64-debian-clang   0.6-2        9.83   32.67   42.50  OK
r-devel-linux-x86_64-debian-gcc     0.6-2        8.53   26.56   35.09  OK
r-devel-linux-x86_64-fedora-clang   0.6-2                       53.33  NOTE
r-devel-linux-x86_64-fedora-gcc     0.6-2                       51.03  NOTE
r-devel-windows-ix86+x86_64         0.6-2       53.00   76.00  129.00  OK
r-patched-linux-x86_64              0.6-2        9.26   28.32   37.58  OK
r-patched-solaris-x86               0.6-2                       66.20  OK
r-release-linux-x86_64              0.6-2        8.59   28.25   36.84  OK
r-release-windows-ix86+x86_64       0.6-2       39.00   69.00  108.00  OK
r-release-osx-x86_64                0.6-2                              OK
r-oldrel-windows-ix86+x86_64        0.6-2       28.00   71.00   99.00  OK
r-oldrel-osx-x86_64                 0.6-2                              OK
Check Details
Version: 0.6-2
  Check: compiled code
 Result: NOTE

    File ‘akima/libs/akima.so’:
      Found no call to: ‘R_useDynamicSymbols’

    It is good practice to register native routines and to disable symbol search.

    See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.

The Status field can be “OK”, “NOTE”, “WARN[ING]”, “ERROR”, or “FAIL”.

You’ll also note that there are checks for a future, even cooler R (“devel”), spiffy R (“release”/”patched”) and :yawn: R (“oldrel”). Remember those, they are important.

Now, let’s say you wanted to perform an honest appraisal of whether packages with more dependencies are more likely to have one or more “bad” CRAN check conditions. You’d likely lump “NOTE” with “OK” and not mark that particular check against the package. That leaves “WARN[ING]” (the reason for the [ING] is that different check RDS files include/forego the [ING]…yay consistency?), “ERROR”, and “FAIL”. Obviously we can just use those without further concern, right?
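
(Concretely, that lumping, applied to the det check-details tibble we load in a moment, boils down to something like this:)

mutate(det, status_grp = case_when(
  Status %in% c("OK", "NOTE") ~ "OK-ish",   # NOTEs don't count against a package here
  grepl("^WARN", Status)      ~ "WARNING",  # harmonize WARN / WARNING
  TRUE                        ~ Status      # ERROR / FAIL pass through
)) %>%
  count(status_grp, sort = TRUE)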

WARN[ING] Will Robinson!

You can get a copy of the check details at https://cran.r-project.org/web/checks/check_details.rds. I happen to have a local copy (I used the “Mar 18 02:47” version) now that my CRAN mirror is humming along nicely again. Let’s make sure it’s being read in OK:
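
(If you don’t run a mirror, a plain download.file() on that URL gets you the same snapshot to read in:)

download.file(
  "https://cran.r-project.org/web/checks/check_details.rds",
  destfile = "check_details.rds",
  mode = "wb"
)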

library(tidyverse)

det <- as_tibble(readRDS("check_details.rds")) # it's got tons of classes & I like readable data frame prints

nrow(distinct(det, Package))
## [1] 13094

OK, that number tracks with the count as of the last rsync. So, what do these WARN[ING]s look like?

filter(det, Status == "WARNING") %>% 
  select(Output)
## # A tibble: 2,299 x 1
##    Output                                                             
##    <chr>                                                              
##  1 "Found the following significant warnings:\n  Warning: unable to r…
##  2 "Found the following significant warnings:\n  Warning: unable to r…
##  3 "Found the following significant warnings:\n  Warning: unable to r…
##  4 "Warning in parse(file = files, n = -1L) :\n  invalid input found …
##  5 "Warning in parse(file = files, n = -1L) :\n  invalid input found …
##  6 "Found the following significant warnings:\n  Warning: unable to r…
##  7 "Found the following significant warnings:\n  Warning: unable to r…
##  8 "Found the following significant warnings:\n  Warning: unable to r…
##  9 "Found the following significant warnings:\n  track_methods.cpp:22…
## 10 "Found the following significant warnings:\n  Warning: unable to r…
## # … with 2,289 more rows

EEK! Well, actually not really. The checks are automated and we can use some substring machinations to try to get better groups:

filter(det, Status == "WARNING") %>% 
  mutate(bits = substring(trimws(Output), 1, 30)) %>% 
  count(bits, sort=TRUE) %>% 
  mutate(pct = n/sum(n))
## # A tibble: 29 x 3
##    bits                                  n     pct
##    <chr>                             <int>   <dbl>
##  1 Found the following significan     1316 0.572  
##  2 Error in re-building vignettes      702 0.305  
##  3 Error(s) in re-building vignet       90 0.0391 
##  4 Output from running autoreconf       56 0.0244 
##  5 Warning in parse(file = files,       37 0.0161 
##  6 Missing link or links in docum       15 0.00652
##  7 "network.dyadcount:\n  function("    10 0.00435
##  8 Found the following executable        9 0.00391
##  9 dyld: Library not loaded: /Bui        8 0.00348
## 10 Errors in running code in vign        8 0.00348
## # … with 19 more rows

OK, so some “eeking” is warranted for those “significant” ones, but about a third of these findings are about vignettes. Sure, vignettes are important and ideally they build fine, but there are tons of reasons they don’t on CRAN’s ever-changing infrastructure. I say they need to be excluded. Drop a note in the comments if you have a different opinion, since this is one analyst’s opinion. But I happen to know CRAN really well and would seriously suggest that, in the context of the question regarding high-dependency package efficacy, these should be ignored unless investigated individually.
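
Excluding them is a one-liner; the pattern below matches both vignette-rebuild groupings from the table above:

filter(det, Status == "WARNING") %>% 
  filter(!grepl("in re-building vignet", Output))   # drops the vignette-rebuild findings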

So, where are these “significant” WARN[ING]s?

filter(det, Status == "WARNING") %>% 
  filter(grepl("significant", Output, ignore.case = TRUE)) %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  count(flavor_flav, sort=TRUE) %>% 
  mutate(pct = n/sum(n))
## # A tibble: 3 x 3
##   flavor_flav     n   pct
##   <chr>       <int> <dbl>
## 1 devel         904 0.686
## 2 current       280 0.213
## 3 oldrel        133 0.101

I posit that if the goal really is to create a model to help decide whether you should take on the risk of packages with multiple++ dependencies, you cannot include “devel”. Nobody sane runs “devel” in production, and that’s the real goal: a safe production environment. So you literally have to throw out ~69% of these, too (some folks are stuck on :yawn: “oldrel” R in orgs with draconian IT practices or fragile workflow systems). We’re not at “0” yet, so what are some of these issues?

filter(det, Status == "WARNING") %>% 
  filter(grepl("significant", Output, ignore.case = TRUE)) %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(flavor_flav != "devel") %>% 
  mutate(Output = gsub("Found the following significant warnings:\n  ", "", trimws(Output))) %>% 
  mutate(bits = substring(trimws(Output), 1, 50)) %>% 
  count(bits, sort=TRUE) %>% 
  mutate(pct = n/sum(n))
## # A tibble: 110 x 3
##    bits                                                      n     pct
##    <chr>                                                 <int>   <dbl>
##  1 "Warning: S3 methods '[.fun_list', '[.grouped_df', "    158 0.383  
##  2 Warning: 'rgl_init' failed, running with rgl.useNU       60 0.145  
##  3 "Found the following significant warnings:\n\n  Warn…     9 0.0218 
##  4 Warning: package ‘dplyr’ was built under R version        9 0.0218 
##  5 "bgc_hmm.c:241:31: warning: ‘, ’ directive writing "      4 0.00969
##  6 driver.c:381:26: warning: cast from pointer to int        4 0.00969
##  7 hash.c:144:5: warning: ‘strncpy’ specified bound 2        4 0.00969
##  8 RngStream.c:347:4: warning: ‘strncpy’ specified bo        4 0.00969
##  9 Warning: ‘__var_1_mmb.offset’ is used uninitialize        4 0.00969
## 10 /home/hornik/tmp/R.check/r-patched-gcc/Work/build/        3 0.00726
## # … with 100 more rows

Fun fact: a re-run of this with a 2019-03-19 RDS pulled from CRAN shows 79 vs 158 (and those 79 packages weren’t magically re-submitted). This is usually a CRAN check “hiccup” on Windows:

filter(det, Status == "WARNING") %>% 
  filter(grepl("significant", Output, ignore.case = TRUE)) %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(flavor_flav != "devel") %>% 
  mutate(Output = gsub("Found the following significant warnings:\n  ", "", trimws(Output))) %>% 
  mutate(bits = substring(trimws(Output), 1, 50)) %>% 
  filter(grepl("S3 methods", bits)) %>% 
  count(bits, Flavor, sort=TRUE) %>% 
  mutate(pct = n/sum(n))
## # A tibble: 6 x 4
##   bits                               Flavor                  n     pct
##   <chr>                              <chr>               <int>   <dbl>
## 1 "Warning: S3 methods '[.fun_list'… r-oldrel-windows-i…    79 0.485  
## 2 "Warning: S3 methods '[.fun_list'… r-release-windows-…    79 0.485  
## 3 Warning: S3 methods 'as_mapper.ch… r-oldrel-windows-i…     2 0.0123 
## 4 Warning: S3 methods '.DollarNames… r-oldrel-windows-i…     1 0.00613
## 5 Warning: S3 methods 'as.promise.F… r-release-windows-…     1 0.00613
## 6 Warning: S3 methods 'format.stati… r-oldrel-windows-i…     1 0.00613

Yep! So, we really have to ignore some portion of these but not many (remember, these are test counts, not package counts).

Perhaps we’ll have better luck ginning up the analysis focusing on “ERROR”!

To ERROR Is Definitely Human When Assumptions Are Flawed

Let’s see about these ERRORs:

filter(det, Status == "ERROR") %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(flavor_flav != "devel") %>% 
  mutate(bits = substring(trimws(Output), 1, 20)) %>% 
  count(bits, sort=TRUE) %>% 
  print(n=66)
## # A tibble: 66 x 2
##    bits                        n
##    <chr>                   <int>
##  1 Installation failed.      437
##  2 "Running examples in "    424
##  3 Package required but      213
##  4 Running ‘testthat.R’      150
##  5 Packages required bu      102
##  6 Running 'testthat.R'       58
##  7 Package required and       31
##  8 Running ‘test-all.R’       17
##  9 Packages required an       12
## 10 Errors in running co        7
## 11 Running ‘spelling.R’        6
## 12 Running ‘test-that.R        5
## 13 Running 'activate_te        4
## 14 Running 'test-all.R'        4
## 15 Running ‘activate_te        3
## 16 Running 'Bernstein-E        2
## 17 Running 'Class+Meth.        2
## 18 Running 'dist_matrix        2
## 19 Running 'Frechet-tes        2
## 20 Running 'spelling.R'        2
## 21 Running 'test-as-dgC        2
## 22 Running 'valued_fit.        2
## 23 Running ‘allier.R’ [        2
## 24 Running ‘aunitizer.R        2
## 25 Running ‘autoprint.R        2
## 26 "Running ‘bdstest.R’ "      2
## 27 Running ‘build-tools        2
## 28 Running ‘exporting-m        2
## 29 Running ‘restfulr_un        2
## 30 Running ‘run_test.R’        2
## 31 Running ‘test_change        2
## 32 Running ‘test-as-dgC        2
## 33 Running ‘testGBHProc        2
## 34 Running ‘tests-nlgam        2
## 35 Running ‘testthat-pr        2
## 36 Running ‘TimeIn_Data        2
## 37 Running '000.session        1
## 38 Running '001.setupEx        1
## 39 Running 'bug1.R' [1s        1
## 40 "Running 'failure.R' "      1
## 41 Running 'fold.R' [5s        1
## 42 Running 'Rgui.R' [3s        1
## 43 Running 'Rgui.R' [4s        1
## 44 Running 'SP500-ex.R'        1
## 45 Running ‘aggregate.R        1
## 46 Running ‘as.edgelist        1
## 47 "Running ‘bdstest.R’\n"     1
## 48 Running ‘Class+Meth.        1
## 49 "Running ‘config.R’\n "     1
## 50 Running ‘cX-ui-funct        1
## 51 Running ‘DevEvalFile        1
## 52 "Running ‘emTests.R’\n"     1
## 53 "Running ‘group01.R’ "      1
## 54 Running ‘loop_genera        1
## 55 Running ‘LTS-special        1
## 56 Running ‘rsolr_unit_        1
## 57 "Running ‘run-all.R’\n"     1
## 58 Running ‘runTests.R’        1
## 59 "Running ‘sleep.R’\n  "     1
## 60 Running ‘SP500-ex.R’        1
## 61 Running ‘test_bccaq.        1
## 62 Running ‘test_scs.R’        1
## 63 Running ‘test-cluste        1
## 64 "Running ‘test.R’\nRun"     1
## 65 Running ‘tests.R’ [4        1
## 66 Running ‘testthat.r’        1

We need to investigate more so let’s make some groups:

filter(det, Status == "ERROR") %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(flavor_flav != "devel") %>% 
  mutate(Output = trimws(Output)) %>% 
  mutate(output_grp = case_when(
    grepl("^Running ", Output) ~ "Test/example run",
    grepl("^Installation fail", Output) ~ "Install failed",
    grepl("Package[s]* requir", Output) ~ "Missing pacakge(s)",
    grepl("Errors in running code in vig", Output) ~ "Vignette issue",
    TRUE ~ Output
  )) %>% 
  count(output_grp, sort=TRUE)
## # A tibble: 4 x 2
##   output_grp             n
##   <chr>              <int>
## 1 Test/example run     743
## 2 Install failed       437
## 3 Missing package(s)   358
## 4 Vignette issue         7

Much better. Now, let’s see where those are:

filter(det, Status == "ERROR") %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(flavor_flav != "devel") %>% 
  mutate(Output = trimws(Output)) %>% 
  mutate(output_grp = case_when(
    grepl("^Running ", Output) ~ "Test/example run",
    grepl("^Installation fail", Output) ~ "Install failed",
    grepl("Package[s]* requir", Output) ~ "Missing package(s)",
    grepl("Errors in running code in vig", Output) ~ "Vignette issue",
    TRUE ~ Output
  )) %>% 
  filter(!grepl("solaris", Flavor)) %>%  # I love ya Solaris, but you're not relevant anymore
  count(output_grp, Flavor, sort=TRUE) %>% 
  mutate(pct = n/sum(n))
## # A tibble: 16 x 4
##    output_grp         Flavor                            n     pct
##    <chr>              <chr>                         <int>   <dbl>
##  1 Install failed     r-oldrel-osx-x86_64             256 0.197  
##  2 Test/example run   r-oldrel-windows-ix86+x86_64    218 0.168  
##  3 Missing package(s) r-oldrel-osx-x86_64             156 0.120  
##  4 Missing package(s) r-release-osx-x86_64            133 0.102  
##  5 Test/example run   r-oldrel-osx-x86_64             110 0.0847 
##  6 Test/example run   r-release-osx-x86_64             80 0.0616 
##  7 Install failed     r-release-windows-ix86+x86_64    73 0.0562 
##  8 Test/example run   r-patched-linux-x86_64           57 0.0439 
##  9 Test/example run   r-release-linux-x86_64           57 0.0439 
## 10 Install failed     r-oldrel-windows-ix86+x86_64     46 0.0354 
## 11 Test/example run   r-release-windows-ix86+x86_64    46 0.0354 
## 12 Install failed     r-release-osx-x86_64             36 0.0277 
## 13 Missing package(s) r-oldrel-windows-ix86+x86_64     19 0.0146 
## 14 Missing package(s) r-release-windows-ix86+x86_64     5 0.00385
## 15 Vignette issue     r-release-osx-x86_64              4 0.00308
## 16 Vignette issue     r-oldrel-osx-x86_64               3 0.00231

Let’s poke a bit more, but let’s also be aware of the fact that some (many) ERRORs on “oldrel” are due to conditions like this, where the package specifies that it can only be used in release++ versions of R. So we kinda have to go all Columbo on every ERROR or exclude “oldrel” (we’ll do the latter since this post is already long), and we should also ignore the missing-packages ones since those are more than likely a CRAN issue.

filter(det, Status == "ERROR") %>%
  mutate(flavor_flav = case_when(
    grepl("devel", Flavor) ~ "devel",
    grepl("oldrel", Flavor) ~ "oldrel",
    TRUE ~ "current"
  )) %>% 
  filter(!(flavor_flav %in% c("oldrel", "devel"))) %>% 
  filter(!grepl("solaris", Flavor)) %>%  
  mutate(Output = trimws(Output)) %>% 
  mutate(output_grp = case_when(
    grepl("^Running ", Output) ~ "Test/example run",
    grepl("^Installation fail", Output) ~ "Install failed",
    grepl("Package[s]* requir", Output) ~ "Missing package(s)",
    grepl("Errors in running code in vig", Output) ~ "Vignette issue",
    TRUE ~ Output
  )) %>% 
  filter(output_grp != "Missing package(s)") %>% 
  distinct(Package)
## # A tibble: 254 x 1
##    Package          
##    <chr>            
##  1 AER              
##  2 archdata         
##  3 atlantistools    
##  4 BAMBI            
##  5 biglmm           
##  6 biglm            
##  7 BIOMASS          
##  8 blockingChallenge
##  9 broom            
## 10 clusternomics    
## # … with 244 more rows

Now we have a target package list (+ ~26 in “FAIL”) that can very likely legitimately have issues. We’ll let more practical data scientists than I am figure out the dependency tree member count for them and then determine proper features and model selection to come up with a far more legitimate “risk metric”.
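
(For completeness, pulling that “FAIL” list is the same kind of one-liner:)

filter(det, Status == "FAIL") %>% 
  distinct(Package)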

Just One More Thing

Did you read the bit about the 03-18 RDS having some serious differences from the 03-19 one? Yeah, so perhaps any model needs to be run a few times, or the data collected over the course of time, to ensure we’re working with as clean a dataset as possible. Y’know, ask practical data science questions like:

  • What data is available to me?
  • Will it help me solve the problem?
  • Is it enough?
  • Is the data quality good enough?

FIN

Never contrive an analysis just to fit your preferred message.

Assumptions matter. Analysis setup matters. Domain expertise matters. Dataset knowledge matters.

I can sum that up as mindfulness matters, and if you approach a project that way then go forth and LIBRARY ALL THE THINGS you need to accomplish your goals.

The fine folks over at @PacketTotal bequeathed an API token upon me, so I cranked out an R package for it to enable more dynamic investigations work (RStudio makes for an amazing incident-responder investigations console given that you can script in multiple languages, code in C[++], and write documentation all at the same time using R ‘projects’ with full source code control).

Since I used the DT package, my usual “just copy and paste the markdown into WordPress” approach wasn’t going to work, and I wasn’t going to do two saveWidget()s and force two iframes on y’all just for an introductory post, so the inline iframe for the R markdown output is below and can be frame-busted as well.

You can also find the source for the R code used in the R markdown document here.