In-brief: Using Bro connection logs with Apache Drill

If you’ve got a directory full of Bro NSM logs, it’s easy to work with them in Apache Drill since they’re just tab-separated values (TSV) files by default. The most tedious part is mapping the columns to proper types and hopefully this saves at least one person from typing it out manually:

SELECT 
  TO_TIMESTAMP(CAST(columns[0] AS DOUBLE)) AS ts,
                    columns[1]             AS uid,
                    columns[2]             AS id_orig_h,
                    columns[3]             AS id_orig_p,
                    columns[4]             AS id_resp_h,
                    columns[5]             AS id_resp_p,
                    columns[6]             AS proto,
                    columns[7]             AS service,
              CAST( columns[8] AS DOUBLE)  AS duration,
              CAST( columns[9] AS INTEGER) AS orig_bytes,
              CAST(columns[10] AS INTEGER) AS resp_bytes,
                   columns[11]             AS conn_state,
                   columns[12]             AS local_orig,
                   columns[13]             AS local_resp,
              CAST(columns[14] AS INTEGER) AS missed_bytes,
                   columns[15]             AS history,
              CAST(columns[16] AS INTEGER) AS orig_packets,
              CAST(columns[17] AS INTEGER) AS orig_ip_bytes,
              CAST(columns[18] AS INTEGER) AS resp_pkts,
              CAST(columns[19] AS INTEGER) AS resp_ip_bytes,
                   columns[20]             AS tunnel_parents
FROM dfs.brologs.`/201808/*`

You can either store them all under a single workspace with a default input type or soft-link/rename them to end in .tsv (it’s unlikely you want to change all .log files to be read as TSV everywhere).

While you could just use the logs this way, consider using CTAS to move them to Parquet. The above will created typed columns and the queries will generally be much faster.

One Trackback/Pingback

By In-brief: Using Bro connection logs with Apache Drill | Atlantic Tagmata AI Security Feed on 17 Aug 2018 at 1:36 pm

[…] *** This is a Security Bloggers Network syndicated blog from rud.is authored by hrbrmstr. Read the original post at: https://rud.is/b/2018/08/17/in-brief-using-bro-connection-logs-with-apache-drill/ […]

rud.is