R lacks some of the more “utilitarian” features found in other scripting languages that were/are more geared—at least initially—towards systems administration. One of the most frustrating missing pieces for security data scientists is the inability to perform basic IP address manipulations, including reverse DNS resolution (even though R ships with nsl(), which is just glue to gethostbyname()!).
If you need to perform reverse resolution, the only two viable options are to (a) pre-resolve a list of IP addresses or (b) whip up something in R that takes advantage of the ability to make system calls. Yes, one could write a C/C++ library that wraps the native resolver routines, but that becomes a pain to maintain across platforms. System calls also create some cross-platform issues, but they are usually easier for the typical R user to overcome.
Assuming the dig command is available on your Linux, BSD, or Mac OS system, it’s pretty trivial to pass a list of IP addresses to a simple sapply() one-liner:
resolved <- sapply(ips, function(x) system(sprintf("dig -x %s +short", x), intern=TRUE))
That works for fairly small lists of addresses, but doesn’t scale well to hundreds or thousands of addresses. (Also, @jayjacobs kinda hates my one-liners #true.)
A better way is to generate a batch query to dig, but the results will be synchronous, which could take A Very Long Time depending on the size of the list and the types of results.
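For reference, a synchronous batch query might look something like this sketch (the batch.dig() helper and the file path are made up for illustration; dig’s -f option reads one query per line from a batch file):

```r
# sketch: a synchronous batch reverse lookup via dig's -f (batch file) option
# batch.dig() and the default query.file path are illustrative, not canonical
batch.dig <- function(ips, query.file = "/tmp/dig.batch") {
  # one "-x IP +short" reverse query per line in the batch file
  writeLines(sprintf("-x %s +short", ips), query.file)
  # dig works through the file serially, so this blocks until every query finishes
  result <- system(sprintf("dig -f %s", query.file), intern = TRUE)
  unlink(query.file)
  result
}
```

This avoids spawning one dig process per address, but every query still waits its turn in line.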
The best way (IMO) to tackle this problem is to perform an asynchronous batch query and post-process the results, which we can do with a little help from adns (which Homebrew users can install with a quick “brew install adns”).
Once adns is installed, it’s just a matter of writing out a query list, performing the asynchronous batch lookup, parsing the results and re-integrating with the original IP list (which is necessary since errant or unresponsive reverse queries will not be returned by the adns system call).
```r
# requires the plyr package for laply()
library(plyr)

# pretend this is A Very Long List of IPs
ip.list <- c("1.1.1.1", "2.3.4.99", "1.1.1.2", "2.3.4.100", "70.196.7.32",
             "146.160.21.171", "146.160.21.172", "146.160.21.186", "2.3.4.101",
             "216.167.176.93", "1.1.1.3", "2.3.4.5", "2.3.4.88", "2.3.4.9",
             "98.208.205.1", "24.34.218.80", "159.127.124.209", "70.196.198.151",
             "70.192.72.48", "173.192.34.24", "65.214.243.208", "173.45.242.179",
             "184.106.97.102", "198.61.171.18", "71.184.118.37", "70.215.200.159",
             "184.107.87.105", "174.121.93.90", "172.17.96.139", "108.59.250.112",
             "24.63.14.4")

# "ips" is a character vector of IP addresses
ip.to.host <- function(ips) {

  # save out a list of IP addresses in adnshost reverse query format
  # if you're going to be using this in "production", you *might*
  # want to consider using tempfile() #justsayin
  writeLines(laply(ips, function(x) sprintf("-i%s", x)), "/tmp/ips.in")

  # call adnshost with the file
  # requires adnshost :: http://www.chiark.greenend.org.uk/~ian/adns/
  system.output <- system("cat /tmp/ips.in | adnshost -f", intern=TRUE)

  # keep our file system tidy
  unlink("/tmp/ips.in")

  # clean up the result
  cleaned.result <- gsub("\\.in-addr\\.arpa", "", system.output)

  # split the reply
  split.result <- strsplit(cleaned.result, " PTR ")

  # make a data frame of the reply
  result.df <- data.frame(do.call(rbind, lapply(split.result, rbind)))
  colnames(result.df) <- c("IP", "hostname")

  # reverse the octets in the IP address list
  result.df$IP <- sapply(as.character(result.df$IP), function(x) {
    y <- unlist(strsplit(x, "\\."))
    sprintf("%s.%s.%s.%s", y[4], y[3], y[2], y[1])
  })

  # fill errant lookups with NA
  final.result <- merge(ips, result.df, by.x="x", by.y="IP", all.x=TRUE)
  colnames(final.result) <- c("IP", "hostname")

  return(final.result)

}

resolved.df <- ip.to.host(ip.list)
head(resolved.df, n=10)
```

```
                IP                                   hostname
1          1.1.1.1                                       <NA>
2          1.1.1.2                                       <NA>
3          1.1.1.3                                       <NA>
4   108.59.250.112      vps-1068142-5314.manage.myhosting.com
5   146.160.21.171                                       <NA>
6   146.160.21.172                                       <NA>
7   146.160.21.186                                       <NA>
8  159.127.124.209                                       <NA>
9    172.17.96.139                                       <NA>
10   173.192.34.24 173.192.34.24-static.reverse.softlayer.com
```
If you wish to suppress adns error messages and any resultant R warnings, you can add an “ignore.stderr=TRUE” to the system() call and an “options(warn=-1)” to the function itself (remember to get/reset the current value). I kinda like leaving them in, though, as it shows progress is being made.
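Concretely, that might look something like this sketch (quiet.lookup() is an illustrative helper name, not part of the function above):

```r
# sketch: run a system command with adns stderr suppressed and R warnings off
# quiet.lookup() is a made-up wrapper for illustration
quiet.lookup <- function(cmd) {
  old.warn <- getOption("warn")
  options(warn = -1)                # silence R warnings for the duration
  on.exit(options(warn = old.warn)) # restore the saved setting on the way out
  system(cmd, intern = TRUE, ignore.stderr = TRUE)
}
```

The on.exit() call is the “get/reset the current value” part: the old warn option is restored even if the system call errors out.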
Whether you end up using a one-liner or the asynchronous function, it would be a spiffy idea to set up a local caching server, such as Unbound, to speed up subsequent queries (because you will undoubtedly have subsequent queries unless your R scripts are perfect on the first go-round).
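If you go the Unbound route, a minimal local caching config might look something like this sketch (the values are illustrative; check the unbound.conf documentation for your setup):

```
# sketch of a minimal unbound.conf for a local caching resolver
server:
    # listen only on localhost for local R scripts
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    # bump the caches a bit for big reverse-lookup runs
    msg-cache-size: 64m
    rrset-cache-size: 128m
```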
If you’ve solved the “efficient reverse DNS query problem” a different way in R, drop a note in the comments! I know quite a few folks who’d love to buy you a tasty beverage!
You can find similar, handy IP address and other security-oriented R code in our (me & @jayjacobs’) upcoming book on security data analysis and visualization.

AI-Proofing Your IT/Cyber Career: The Human-Only Capabilities That Matter
In the past ~4 weeks I have personally observed some irrefutable things in “AI” that are very likely going to cause massive shocks to employment models in IT, software development, systems administration, and cybersecurity. I know some folks have already seen minor shocks. They are nothing compared to what’s very probably ahead.
Nobody likely wants to hear this, but you absolutely need to make or take time this year to identify what you can do that AI cannot, and to start building those capabilities if your list is short or empty.
The weavers in the 1800s used violence to get a 20-year pseudo-reprieve before they were pushed into obsolescence. We’ve got maybe ~18 months. I’m as pushback-on-this-“AI”-thing as makes sense. I’d like the bubble to burst. Even if it does, the rulers of our clicktatorship will just fuel a quick rebuild.
Four human-only capabilities in security
In my (broad) field, I think there are some things that make humans 110% necessary. Here’s my list — and it’d be great if folks in very subdomain-specific parts of cyber would provide similar ones. I try to stay in my lane.
1. Judgment under uncertainty with real consequences
These new “AI” systems can use tools to analyze a gazillion sessions and cluster payloads, but they do not (or absolutely should not) bear responsibility for the “we’re pulling the plug on production” decision at 3am. This “weight of consequence” shapes human expertise in ways that inform intuition, risk tolerance, and the ability to act decisively with incomplete information.
Organizations will continue needing people who can own outcomes, not just produce analysis.
2. Adversarial creativity and novel problem framing
The more recent “AI” systems are actually darn good at pattern matching against known patterns and recombining existing approaches. They absolutely suck at the “genuinely novel” — the attack vector nobody has documented, the defensive technique that requires understanding how a specific organization actually operates versus how it should operate.
The best security practitioners think like attackers in ways that go beyond “here are common TTPs.”
3. Institutional knowledge and relationship capital
A yuge one.
Understanding that the finance team always ignores security warnings — especially Dave — during quarter-close. That the legacy SCADA system can’t be patched because the vendor went bankrupt in 2019. That the CISO and CTO have a long-running disagreement about cloud migration.
This context shapes what recommendations are actually actionable. Many technically correct analyses are organizationally useless.
4. The ability to build and maintain trust
The biggest one.
When a breach happens, executives don’t want a report from an “AI”. They want someone who can look them in the eye, explain what happened, and take ownership of the path forward. The human element of security leadership is absolutely not going away.
How to develop these capabilities
Develop depth in areas that require your physical presence or legal accountability. Disciplines such as incident response, compliance attestation, or security architecture for air-gapped or classified environments all have regulatory and practical barriers to full automation.
Build expertise in the seams between systems. Understanding how a given combination of legacy mainframe, cloud services, and OT environment actually interconnects requires the kind of institutional archaeology (or the powers of a sexton) that doesn’t exist in training data.
Get comfortable being the human in the loop. I know this will have some of you tapping mute or block, but you’re going to need to get comfortable being the human in the loop for “AI”-augmented workflows. The analyst who can effectively direct tools, validate outputs (b/c these things will always make stuff up), and translate findings for different audiences has a different job than before, but still a necessary one.
Learn to ask better questions. Bring your hypotheses, domain expertise, and knowing which threads are worth pulling to the table. That editorial judgment about what matters is undervalued, and is going to take a while to infuse into “AI” systems.
We’re all John Henry now
A year ago, even with long-covid brain fog, I could out-“John Henry” all of the commercial AI models at programming, cyber, and writing tasks, in both speed and quality.
Now, with the fog gone, I’m likely ~3 months away from being slower than “AI” on a substantial number of core tasks that it can absolutely do. I’ve seen it. I’ve validated the outputs. It sucks. It really really sucks. And it’s not because I’m feeble or have some other undisclosed brain condition (unlike 47). These systems are being curated to do exactly that: erase all of us John Henrys.
The folks who thrive will be those who can figure out what “AI” capabilities aren’t complete garbage and wield them with uniquely human judgment rather than competing on tasks where “AI” has clear advantages.
The pipeline problem
The very uncomfortable truth: there will be fewer entry-level positions that consist primarily of “look at alerts and escalate.” That pipeline into the field is narrowing at a frightening pace.
What concerns me most isn’t the senior practitioners. We’ll adapt and likely become that much more effective. It’s the junior folks who won’t get the years of pattern exposure that built our intuition in the first place.
That’s a pipeline problem the industry hasn’t seriously grappled with yet — and isn’t likely to b/c of the hot, thin air in the offices and boardrooms of myopic and greedy senior executives.