In-brief: Using Bro connection logs with Apache Drill

If you’ve got a directory full of Bro NSM logs, it’s easy to work with them in Apache Drill since they’re just tab-separated values (TSV) files by default. The most tedious part is mapping the columns to proper types and hopefully this saves at least one person from typing it out manually:

                    columns[1]             AS uid,
                    columns[2]             AS id_orig_h,
                    columns[3]             AS id_orig_p,
                    columns[4]             AS id_resp_h,
                    columns[5]             AS id_resp_p,
                    columns[6]             AS proto,
                    columns[7]             AS service,
              CAST( columns[8] AS DOUBLE)  AS duration,
              CAST( columns[9] AS INTEGER) AS orig_bytes,
              CAST(columns[10] AS INTEGER) AS resp_bytes,
                   columns[11]             AS conn_state,
                   columns[12]             AS local_orig,
                   columns[13]             AS local_resp,
              CAST(columns[14] AS INTEGER) AS missed_bytes,
                   columns[15]             AS history,
              CAST(columns[16] AS INTEGER) AS orig_packets,
              CAST(columns[17] AS INTEGER) AS orig_ip_bytes,
              CAST(columns[18] AS INTEGER) AS resp_pkts,
              CAST(columns[19] AS INTEGER) AS resp_ip_bytes,
                   columns[20]             AS tunnel_parents
FROM dfs.brologs.`/201808/*`

You can either store them all under a single workspace with a default input type or soft-link/rename them to end in .tsv (it’s unlikely you want to change all .log files to be read as TSV everywhere).

While you could just use the logs this way, consider using CTAS to move them to Parquet. The above will created typed columns and the queries will generally be much faster.

Cover image from Data-Driven Security
Amazon Author Page

1 Comment In-brief: Using Bro connection logs with Apache Drill

  1. Pingback: In-brief: Using Bro connection logs with Apache Drill | Atlantic Tagmata AI Security Feed

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.