(General reminder abt “R⁶” posts in that they are heavy on code-examples, minimal on expository. I try to design them with 2-3 “nuggets” embedded for those who take the time to walk through the code examples on their systems. I’ll always provide further expository if requested in a comment, so don’t hesitate to ask if something is confusing.)
I had to check something on the macOS systems across the abode today and — on a lark — decided to do all the “shell” scripting in R vs bash
for a change. After performing the tasks, it occurred to me that not all R users on macOS realize there are hidden gems of information spread across the “boring” parts of the filesystem in SQLite databases. So, I put together a small example that:
- identifies all the top-level apps in
/Applications
- extracts code signing information from them into a tibble
- grabs the Gatekeeper whitelist database
- uses an internal SQLite transformation function inside a
dplyr
chain - merges the info and gives you a basis to explore your apps
- shows off the wicked cool
processx
package by @gaborcsardi that you should be using in place ofsystem()
/system2()
A quick note about this Gatekeeper database: If an app is not already recognized by macOS and you allow it to run your local system is updated to “whitelist” this app so you don’t get the notification in the future. Apple maintains a large signature list that gets updated throughout the year on your system and your own list is merged with it.
A second quick note (I guess I’m doing more expository than promised :-) about QUOTE()
: In certain dplyr
/ dbplyr
contexts, what looks like local function calls are actually passed over to the SQL side vs used locally. Even though quote()
is an internal R function, the use of the lowercase version of it would still go over to the SQL side vs be eval’d locally. However, to avoid confusing others, I try to uppercase conflicts like this when they occur.
library(processx)
library(stringi)
library(docxtractr) # install_github("hrbrmstr/docxtractr")
library(tidyverse)
list.files("/Applications", pattern = "app$", full.names = TRUE) %>%
map_df(~{
message(.x)
res <- run("codesign", args = c("-dvvv", .x), error_on_status = FALSE)
if (any(grepl("not signed at all", res$stderr))) {
list(Executable = .x)
} else {
stri_split_lines(res$stderr)[[1]] %>%
keep(~grepl("=", .)) %>%
stri_split_fixed("=", 2, simplify = TRUE) -> res
as.list(set_names(res[,2], res[,1]))
}
}) %>%
mcga() %>%
mutate(short_name = stri_replace_last_fixed(basename(executable), ".app", "")) -> my_apps
glimpse(my_apps)
## Observations: 102
## Variables: 20
## $ executable <chr> "/Applications/1Password 6.app/Con...
## $ identifier <chr> "com.agilebits.onepassword4", "com...
## $ format <chr> "app bundle with Mach-O thin (x86_...
## $ codedirectory_v <chr> "20200 size=29594 flags=0x0(none) ...
## $ hash_type <chr> "sha256 size=32", "sha256 size=32"...
## $ candidatecdhash_sha1 <chr> "e21ac2a66473feec6276b448fc518678d...
## $ candidatecdhash_sha256 <chr> "8cf4cb4bdbea3b4d4f9e293e1aee1edb7...
## $ hash_choices <chr> "sha1,sha256", "sha1,sha256", "sha...
## $ cdhash <chr> "8cf4cb4bdbea3b4d4f9e293e1aee1edb7...
## $ signature_size <chr> "8915", "8936", "4610", "8528", "4...
## $ authority <chr> "Apple Root CA", "Apple Root CA", ...
## $ timestamp <chr> "Jul 17, 2017, 10:18:00 AM", "Jul ...
## $ info_plist_entries <chr> "32", "31", "30", "17", "30", "18"...
## $ teamidentifier <chr> "2BUA8C4S2C", "XZZXE9SED4", "94KV3...
## $ sealed_resources_version <chr> "2 rules=12 files=2440", "2 rules=...
## $ internal_requirements_count <chr> "1 size=220", "1 size=192", "1 siz...
## $ signed_time <chr> NA, NA, "Jul 13, 2017, 6:40:13 PM"...
## $ library_validation_warning <chr> NA, NA, NA, "OS X SDK version befo...
## $ platform_identifier <chr> NA, NA, NA, NA, "2", NA, NA, "2", ...
## $ short_name <chr> "1Password 6", "Alfred 3", "Amazon...
# this is macOS atekeeper whitelist db
db <- src_sqlite("/private/var/db/gkopaque.bundle/Contents/Resources/gkopaque.db")
db
## src: sqlite 3.19.3 [/private/var/db/gkopaque.bundle/Contents/Resources/gkopaque.db]
## tbls: conditions, merged, whitelist
whitelist <- tbl(db, "whitelist")
# they are binary blobs, but we need them as text, so use SQLite's "QUOTE()" function
mutate(whitelist, current = QUOTE(current)) %>%
select(current) %>%
collect() %>%
mutate(
cdhash = stri_replace_first_fixed(current, "X'", "") %>%
stri_replace_last_fixed("'", "") %>%
stri_trans_tolower()
) -> gatekeeper_whitelist_hashes
my_apps <- left_join(my_apps, gatekeeper_whitelist_hashes)
# Unsigned apps
filter(my_apps, is.na(cdhash)) %>%
select(short_name)
# you can see your own output here ;-)
# App organization
select(my_apps, identifier) %>%
mutate(
identifier = stri_split_fixed(identifier, ".", simplify = TRUE)[,2],
identifier = ifelse(identifier == "", "UNSPECIFIED", identifier)
) %>%
count(identifier, sort=TRUE)
## # A tibble: 47 x 2
## identifier n
## <chr> <int>
## 1 apple 37
## 2 UNSPECIFIED 10
## 3 microsoft 6
## 4 eclecticlight 3
## 5 google 3
## 6 amazon 2
## 7 agilebits 1
## 8 antonycourtney 1
## 9 appscape 1
## 10 audacityteam 1
## # ... with 37 more rows
# Which apps have code-signing validation warnings
filter(my_apps, !is.na(library_validation_warning)) %>%
select(short_name, library_validation_warning)
## # A tibble: 7 x 2
## short_name
## <chr>
## 1 Amazon Music
## 2 Audacity
## 3 firefox
## 4 RStudio
## 5 VLC
## 6 Webcam Settings
## 7 WordService
## # ... with 1 more variables: library_validation_warning <chr>
NOTE: One could easily extend this example to look for apps across the filesystem.