Chapter 8 What domain was the user connected to in packet 27300?

8.1 Objects in memory/packages loaded from preceding chapters

Objects available:

  • read_zeek_log() — helper function to read Zeek log files (Chapter 3)
  • packets — data frame of PCAP packet data (Chapter 4)
  • conn — Zeek conn.log data (Chapter 5)

Packages:

  • {tidyverse}

8.2 Question Setup

True to the name of this challenge, we have to make a few twists and turns to figure out what domain the user connected to in packet 27300. This involves selecting the packet and grabbing its destination IP, then looking that IP up in other metadata we can generate. It also helps build or refresh a common idiom in cybersecurity analyses: using multiple data sources to arrive at an answer.

8.3 Solving the quest with R and packets

We finally have an opportunity to use the hosts.txt file we generated in Chapter 2! While we could solve this quest with a few standalone tshark command-line machinations, it doesn’t make much sense to: that route takes multiple calls, we already have the data we need, and we would still need other command-line tools to truly “solve” it well with “just” tshark.

# read in our hosts/ip file
read_tsv(
  file = "maze/hosts.txt",
  col_names = c("ip", "hostname"), 
  col_types = "cc",
  skip = 3
) -> hosts

packets %>% 
  filter(packet_num == 27300) %>% 
  select(ip = dst) %>% 
  left_join(hosts)
## Joining, by = "ip"
## # A tibble: 1 x 2
##   ip             hostname    
##   <chr>          <chr>       
## 1 172.67.162.206 dfir.science

Old-school (base) R follows the same idiom; note that the native pipe, |>, requires R 4.1 or later:

packets |>
  subset(
    packet_num == 27300,
    select = dst
  ) |>
  merge(
    hosts, 
    by.x = "dst", 
    by.y = "ip"
  )
##              dst     hostname
## 1 172.67.162.206 dfir.science
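
One difference between the two versions above is worth a quick sketch: left_join() keeps the packet row even when the IP has no entry in hosts (the hostname comes back as NA), while merge() with its default arguments behaves like an inner join and silently drops that row. The example below is a minimal base R illustration under stated assumptions. The toy_packets and toy_hosts data frames are made-up stand-ins for packets and hosts (the addresses come from the reserved documentation ranges), and lookup_dst_host() is a hypothetical helper, not something defined in earlier chapters.

```r
# Toy stand-ins for `packets` and `hosts`; every value here is invented
# for illustration and does not come from the capture file.
toy_packets <- data.frame(
  packet_num = c(1L, 2L, 3L),
  dst = c("192.0.2.10", "198.51.100.7", "203.0.113.5"),
  stringsAsFactors = FALSE
)

toy_hosts <- data.frame(
  ip = c("192.0.2.10", "198.51.100.7"),
  hostname = c("one.example.com", "two.example.com"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the hostname for one packet's destination IP.
lookup_dst_host <- function(packets, hosts, num) {
  # keep only the destination column of the requested packet
  picked <- packets[packets$packet_num == num, "dst", drop = FALSE]
  # all.x = TRUE keeps the packet row even when no hostname matches,
  # filling `hostname` with NA (the same behaviour as left_join())
  merge(picked, hosts, by.x = "dst", by.y = "ip", all.x = TRUE)
}

matched   <- lookup_dst_host(toy_packets, toy_hosts, 2L)  # IP is in toy_hosts
unmatched <- lookup_dst_host(toy_packets, toy_hosts, 3L)  # IP is not in toy_hosts
```

With all.x = TRUE, an unresolved destination shows up as a row with an NA hostname rather than an empty result, which makes it easier to tell "no hostname recorded" apart from "no such packet".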