Chapter 15 What was the camera model name used to take picture 20210429_152157.jpg?

15.1 Objects in memory/packages loaded from preceding chapters

Objects available:

  • read_zeek_log() — helper function to read Zeek log files (Chapter 3)
  • packets — data frame of PCAP packet data (Chapter 4)
  • conn — Zeek conn.log data (Chapter 5)
  • hosts — tshark host/IP file (Chapter 8)
  • ftp — Zeek ftp.log data (Chapter 10)

Packages:

  • {jsonlite}
  • {stringi}
  • {glue}
  • {tidyverse}
  • {sf} (for a really fast version of converting hex strings to raw vectors)
  • {MACtools} (optional but not used in this chapter anyway)
  • {exif} (optional)

15.2 Question Setup

Our last quest is a doozy. Not only do we have to work with network data from a PCAP, but we also need to extract an image (20210429_152157.jpg) from it and then poke at the image metadata to identify the camera model name (it’s like metadata inception).

According to the FTP RFC, files are transferred via STOR[E] or RETR[IEVE], so we’ll need to figure out which one was used for that file and then do some fun data extraction. We’ll use Zeek data, a custom tshark filter/extraction command, and R to finally get out of this maze.
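
To make the STOR/RETR distinction concrete, here is a minimal base R sketch that splits FTP control-channel lines into a command and an argument and labels the transfer direction (STOR sends a file from client to server, RETR fetches one from the server). The helper name `parse_ftp_command()` and the sample lines are hypothetical illustrations, not data from the PCAP.

```r
# Hypothetical helper: split FTP control lines into command + argument
# and label which way the file travels.
parse_ftp_command <- function(lines) {
  lines <- trimws(lines)
  # The command is everything up to the first space
  command <- toupper(sub(" .*$", "", lines))
  # The argument is whatever follows; NA when the command has none
  arg <- ifelse(
    grepl(" ", lines, fixed = TRUE),
    sub("^[^ ]+ +", "", lines),
    NA_character_
  )
  direction <- ifelse(
    command == "STOR", "client -> server (upload)",
    ifelse(command == "RETR", "server -> client (download)", NA_character_)
  )
  data.frame(command, arg, direction, stringsAsFactors = FALSE)
}

# Made-up control lines, for illustration only
sample_lines <- c("STOR example.jpg", "RETR notes.txt", "PWD")
parsed <- parse_ftp_command(sample_lines)

# Keep only the commands that actually move a file
parsed[!is.na(parsed$direction), ]
```

Knowing the direction matters later on: it tells us which side of the data connection is sending the bytes we want to recover.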

15.3 Solving the quest with Zeek ftp.log, tshark filters, and R

Let’s first see if we’re looking for STOR or RETR:

library(exif)

ftp %>% 
  filter(
    command %in% c("STOR", "RETR"),
    stri_detect_fixed(arg, "20210429_152157.jpg")
  ) %>% 
  select(command, arg, mime_type)
## # A tibble: 1 x 3
##   command arg                                                        mime_type 
##   <chr>   <chr>                                                      <chr>     
## 1 STOR    ftp://192.168.1.20/home/kali/Documents/20210429_152157.jpg image/jpeg

It looks like we’re going to want STOR!

We’ll eventually use tshark to grab the JPG file, but to do that we need the TCP stream (so tshark can follow it and save the data).

(unique(
  system("tshark -Tfields -e tcp.stream  -r maze/maze.pcapng 'ftp-data.command == \"STOR 20210429_152157.jpg\"'", intern=TRUE)
) -> stream_id)
## [1] "17"
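
Because `system(..., intern = TRUE)` returns one character element per output line, it can be worth confirming that exactly one stream came back before building the next command. The following is a small sketch of such a check; `pick_stream_id()` is a hypothetical helper name, not something from a previous chapter.

```r
# Hypothetical helper: reduce tshark's output lines to a single stream id,
# failing loudly if the filter matched zero streams or more than one.
pick_stream_id <- function(ids) {
  ids <- unique(trimws(ids))
  ids <- ids[nzchar(ids)]   # drop any empty lines
  if (length(ids) != 1L) {
    stop("expected exactly one TCP stream, found ", length(ids))
  }
  if (!grepl("^[0-9]+$", ids)) {
    stop("stream id is not a non-negative integer: ", ids)
  }
  as.integer(ids)
}

# tshark prints the stream id once per matching packet, hence the repeats
pick_stream_id(c("17", "17", "17"))
## [1] 17
```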

Now we ask tshark to save that to a temporary file.

tf <- tempfile(fileext = ".jpg")

# -l tells tshark to flush standard output after each packet is printed (just to be safe)
# -q tells tshark to be quiet
# -z is a special feature to output components of packets or statistics on packets.
#    here we are asking tshark to follow the entire stream and then we redirect 
#    that to a file

system(
  glue(
    "tshark -lr maze/maze.pcapng -qz 'follow,tcp,raw,{stream_id}' > {tf}"
  )
)
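
If you would rather not hand-build a shell string, one alternative is to pass the same flags as an argument vector to base R's `system2()` and let it handle the redirection. This is only a sketch under the assumption that `tshark` is on the PATH; `follow_args()` is a hypothetical helper name.

```r
# Hypothetical helper: build the argument vector for the same tshark call.
# shQuote() protects the path and the -z expression from the shell.
follow_args <- function(pcap, stream_id) {
  c(
    "-l",
    "-r", shQuote(pcap),
    "-q",
    "-z", shQuote(sprintf("follow,tcp,raw,%d", as.integer(stream_id)))
  )
}

args <- follow_args("maze/maze.pcapng", "17")
args

# Not run here: stdout = <file path> sends tshark's output to that file
# system2("tshark", args = args, stdout = tf)
```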

jpg <- read_lines(tf)

substr(jpg[1:10], 1, 80)
##  [1] ""                                                                                
##  [2] "==================================================================="             
##  [3] "Follow: tcp,raw"                                                                 
##  [4] "Filter: tcp.stream eq 17"                                                        
##  [5] "Node 0: 192.168.1.26:47052"                                                      
##  [6] "Node 1: 192.168.1.20:34391"                                                      
##  [7] "ffd8ffe19c9d4578696600004d4d002a00000008000a011000020000000900000086011200030000"
##  [8] "0000000000000000000088888888888806e3400065140006f440004dd400053e4000726400047040"
##  [9] "00000000000000000000000000000000000000000000000000000000000000000000000000000000"
## [10] "00000000000000000000000000000000000000000000000000000000000000000000000000000000"

That file is definitely not a JPG yet. There’s some metadata up front, a useless bit of ======… at the end, and the binary data has been hex-encoded. O_o We’ll need to clean this up a bit before it’s a proper image, something R is ridiculously good at.
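
Even in hex form, the first payload line tells us we are on the right track. As a quick sanity check, the base R sketch below confirms the JPEG start-of-image marker (`ff d8`) followed by an APP1 segment marker (`ff e1`), then decodes the four bytes after the two-byte segment length. The string is copied from the start of line 7 in the output above; the variable names are just for illustration.

```r
# First 32 hex characters of line 7 in the tshark output above
first_payload <- "ffd8ffe19c9d4578696600004d4d002a"

# ff d8 = JPEG start-of-image, ff e1 = APP1 segment (where EXIF lives)
startsWith(first_payload, "ffd8ffe1")
## [1] TRUE

# Hex characters 13-20 (bytes 7-10) follow the 2-byte segment length
hex_pairs <- substring(first_payload, seq(13, 19, by = 2), seq(14, 20, by = 2))
exif_id <- rawToChar(as.raw(strtoi(hex_pairs, base = 16L)))
exif_id
## [1] "Exif"
```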

jpg[7:(length(jpg)-1)] %>% 
  stri_trim_both() %>% # just in case
  paste0(collapse = "") %>%  # mush it all together
  sf:::CPL_hex_to_raw() -> img # unexported {sf} internal; there are other ways, but this is a super fast way to go from hex to raw
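
Since that function is reached with `:::` (so it is not part of {sf}'s exported interface), here is a slower base R sketch of the same idea. It also picks out payload lines by pattern rather than by fixed line position. Both helper names are hypothetical, the `demo` vector is made-up data that only mimics the shape of the tshark output, and this version returns a plain raw vector rather than a list.

```r
# Hypothetical helper: convert hex text (one or more lines) to a raw vector
hex_to_raw <- function(hex) {
  hex <- paste0(trimws(hex), collapse = "")
  n <- nchar(hex)
  if (n == 0L) return(raw(0))
  # Every byte needs two hex digits, and nothing else may be present
  stopifnot(n %% 2L == 0L, grepl("^[0-9a-fA-F]+$", hex))
  starts <- seq(1L, n, by = 2L)
  as.raw(strtoi(substring(hex, starts, starts + 1L), base = 16L))
}

# Hypothetical helper: TRUE for lines made up of hex digits only
is_hex_line <- function(lines) grepl("^[0-9a-fA-F]+$", trimws(lines))

# Made-up stand-in for the tshark output: header, two payload lines, footer
demo <- c(
  "", "====", "Follow: tcp,raw", "Node 0: 192.168.1.26:47052",
  "ffd8ffe1", "ffd9", "===="
)

bytes <- hex_to_raw(demo[is_hex_line(demo)])
bytes
## [1] ff d8 ff e1 ff d9
```

On the real data this would be `hex_to_raw(jpg[is_hex_line(jpg)])`, and the result could be handed straight to `writeBin()`.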

Now we can even view the image:

magick::image_read(img[[1]])

And check out the metadata to get the camera model:

writeBin(img[[1]], tf)

exif::read_exif(tf)
##             make    model software bits_per_sample image_width image_height
## 1 LG Electronics LM-Q725K                        0        4160         3120
##   description orientation copyright           timestamp    origin_timestamp
## 1                       6           2021:04:29 15:21:57 2021:04:29 15:21:57
##   digitised_timestamp subsecond_timestamp exposure_time f_stop iso_speed
## 1 2021:04:29 15:21:57              673205           207    2.2        50
##   subject_distance exposure_bias used_flash metering focal_length
## 1                              0       TRUE        2        3.701
##   focal_length_35mm latitude longitude altitude lens_min_focal_length
## 1                 0        0         0        0                     0
##   lens_max_focal_length lens_min_f_stop lens_max_f_stop lens_make lens_model
## 1                     0               0               0
unlink(tf)

The model column gives us the answer: LM-Q725K (made by LG Electronics). We escaped the packet maze!