

{"id":22211,"date":"2024-08-26T07:37:28","date_gmt":"2024-08-26T12:37:28","guid":{"rendered":"https:\/\/rud.is\/b\/?p=22211"},"modified":"2024-08-30T03:03:25","modified_gmt":"2024-08-30T08:03:25","slug":"reading-pcap-files-directly-with-duckdb","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/","title":{"rendered":"Reading PCAP Files (Directly) With DuckDB"},"content":{"rendered":"<div id=\"attachment_22214\" style=\"width: 1310px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-22214\" data-attachment-id=\"22214\" data-permalink=\"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/vincent-van-zalinge-mm2by4w7g2y-unsplash\/\" data-orig-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=1300%2C753&amp;ssl=1\" data-orig-size=\"1300,753\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;1&quot;}\" data-image-title=\"vincent-van-zalinge-MM2bY4W7G2Y-unsplash\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;Photo by &lt;a href=&quot;https:\/\/unsplash.com\/@vincentvanzalinge?utm_content=creditCopyText&amp;#038;utm_medium=referral&amp;#038;utm_source=unsplash&quot;&gt;Vincent van Zalinge&lt;\/a&gt; on &lt;a 
href=&quot;https:\/\/unsplash.com\/photos\/shallow-focus-photo-of-flying-goose-MM2bY4W7G2Y?utm_content=creditCopyText&amp;#038;utm_medium=referral&amp;#038;utm_source=unsplash&quot;&gt;Unsplash&lt;\/a&gt;&lt;\/p&gt;\n\" data-large-file=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=510%2C295&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?resize=510%2C295&#038;ssl=1\" alt=\"\" width=\"510\" height=\"295\" class=\"size-full wp-image-22214\" srcset=\"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?w=1300&amp;ssl=1 1300w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?resize=300%2C174&amp;ssl=1 300w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?resize=530%2C307&amp;ssl=1 530w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?resize=150%2C87&amp;ssl=1 150w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?resize=768%2C445&amp;ssl=1 768w, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?w=1020&amp;ssl=1 1020w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/a><p id=\"caption-attachment-22214\" class=\"wp-caption-text\">Photo by <a href=\"https:\/\/unsplash.com\/@vincentvanzalinge?utm_content=creditCopyText&#038;utm_medium=referral&#038;utm_source=unsplash\">Vincent van Zalinge<\/a> on <a href=\"https:\/\/unsplash.com\/photos\/shallow-focus-photo-of-flying-goose-MM2bY4W7G2Y?utm_content=creditCopyText&#038;utm_medium=referral&#038;utm_source=unsplash\">Unsplash<\/a><\/p><\/div>\n<p>2024-08-30 UPDATE:<br \/>\nBinary versions of this extension are available for amd64 
Linux (<code>linux_amd64<\/code> &amp; <code>linux_amd64_gcc4<\/code>) and Apple Silicon (<code>osx_arm64<\/code>).<\/p>\n<pre><code class=\"language-bash\">$ duckdb -unsigned\nv1.0.0 1f98600c2c\nEnter \".help\" for usage hints.\nConnected to a transient in-memory database.\nUse \".open FILENAME\" to reopen on a persistent database.\nD SET custom_extension_repository='https:\/\/w3c2.c20.e2-5.dev\/ppcap\/latest';\nD INSTALL ppcap;\nD LOAD ppcap;\n<\/code><\/pre>\n<p>2024-08-29 UPDATE: The Apple Silicon macOS and Linux AMD64 versions of the plugin now work with PCAP files that are &#8220;Raw IP&#8221; vs. just &#8220;Ethernet&#8221;.<\/p>\n<p>We generate <em>a ton<\/em> of PCAP files at <code>$DAYJOB<\/code>. Since I do not always have to work directly with them, I regularly mix up or forget the various <code>tshark<\/code>, <code>tcpdump<\/code>, etc., filters and CLI parameters. While this is less of an issue in the age of LLM\/GPTs (just ask local ollama to gen the CLI incantation, and it usually does a good job), each failed command makes me miss Apache Drill just a tad, since it had\/has a decent, albeit basic, PCAP reading capability.<\/p>\n<p>For the past few months, I&#8217;ve had an &#8220;I should build a DuckDB extension to read PCAP files&#8221; idea floating in the back of my mind. Thanks to lingering issues from long covid, I&#8217;m back in the &#8220;let&#8217;s wake him up at 0-dark-30 and not let him get back to sleep&#8221; routine, so I decided to try to scratch this itch (I was actually hoping super focused work would engender slumber, but that, too, was a big fail).<\/p>\n<p>The DuckDB folks have a spiffy <a href=\"https:\/\/github.com\/duckdb\/extension-template\">extension template<\/a> that you can use\/fork to get started. It&#8217;s been a minute since I&#8217;ve had to work in C++ land, and I&#8217;m also used to working with system-level or vendored libraries when doing said work. 
So, first I had to figure out <a href=\"https:\/\/vcpkg.io\/en\/\">vcpkg<\/a> \u2014 a C\/C++ dependency manager from (ugh) Microsoft \u2014 as the DuckDB folks strongly encourage using it (and they use it). You likely do not have to get in the weeds, since there are three lines in the <a href=\"https:\/\/github.com\/duckdb\/extension-template\">extension template<\/a> that are (pretty much) all you really need to know\/do.<\/p>\n<p>Once that was done, I added <code>libpcap<\/code> to the <a href=\"https:\/\/github.com\/hrbrmstr\/duckdb-pcap\/blob\/main\/vcpkg.json\">DuckDB vcpkg deps<\/a>. Then, a review of the structure of the example extension and the JSON, CSV, and Parquet reader extensions was in order to get a feel for how to add new functions, and return rectangular data from an entirely new file type.<\/p>\n<p>To get started, I focused on some easy fields: source\/destination IPs, timestamp, and payload length and had some oddly great success. So, of course, I had to <a href=\"https:\/\/mastodon.social\/@hrbrmstr\/113022175871245120\">start a Mastodon thread<\/a>.<\/p>\n<p>The brilliant minds at DuckDB truly made it pretty straightforward to work with list\/array columns, and write new utility functions, so I just kept adding fields and functionality until time ran out (adulting is hard).<\/p>\n<p>At present, the extension exposes the following fields from a PCAP file:<\/p>\n<ul>\n<li><code>timestamp<\/code><\/li>\n<li><code>source_ip<\/code><\/li>\n<li><code>dest_ip<\/code><\/li>\n<li><code>source_port<\/code><\/li>\n<li><code>dest_port<\/code><\/li>\n<li><code>length<\/code><\/li>\n<li><code>tcp_session<\/code><\/li>\n<li><code>source_mac<\/code><\/li>\n<li><code>dest_mac<\/code><\/li>\n<li><code>protocols<\/code><\/li>\n<li><code>payload<\/code><\/li>\n<li><code>tcp_flags<\/code><\/li>\n<li><code>tcp_seq_num<\/code><\/li>\n<\/ul>\n<p>It also has a <code>read_pcap<\/code> function that supports wildcards or an array of filenames. 
And, there are three utility functions, one that does a naive test for whether a payload is an HTTP request or response, another that extracts HTTP request headers (if present), and one more that extracts some info from ICMP packets.<\/p>\n<h3>Stop Telling Me And Show Me<\/h3>\n<p><em>Fine.<\/em><\/p>\n<p>Here&#8217;s an incantation that naively converts all HTTP request and response packets to Parquet, since it will always be faster to use Parquet than it will be to use PCAPs:<\/p>\n<pre><code class=\"language-bash\">duckdb -unsigned &lt;&lt;EOF\nLOAD ppcap;\n\nCOPY (\n  FROM \n    read_pcap('scans.pcap')\n  SELECT\n    *,\n    is_http(payload) AS is_http,\n    extract_http_request_headers(payload) AS req\n) TO 'scans.parquet' (FORMAT PARQUET);\nEOF\n\nduckdb -json -s \"FROM read_parquet('scans.parquet') WHERE is_http LIMIT 2\" | jq\n[\n  {\n    \"timestamp\": \"2024-07-23 16:31:06\",\n    \"source_ip\": \"94.156.71.207\",\n    \"dest_ip\": \"203.161.44.208\",\n    \"source_port\": 49678,\n    \"dest_port\": 80,\n    \"length\": 154,\n    \"tcp_session\": \"94.156.71.207:49678-203.161.44.208:80\",\n    \"source_mac\": \"64:64:9b:4f:37:00\",\n    \"dest_mac\": \"00:16:3c:cb:72:42\",\n    \"protocols\": \"[Ethernet, IP, TCP]\",\n    \"payload\": \"GET \/_profiler\/phpinfo HTTP\/1.1\\\\x0D\\\\x0AHost: 203.161.44.208\\\\x0D\\\\x0AUser-Agent: Web Downloader\/6.9\\\\x0D\\\\x0AAccept-Charset: utf-8\\\\x0D\\\\x0AAccept-Encoding: gzip\\\\x0D\\\\x0AConnection: close\\\\x0D\\\\x0A\\\\x0D\\\\x0A\",\n    \"tcp_flags\": \"[ACK, PSH]\",\n    \"tcp_seq_num\": \"2072884123\",\n    \"is_http\": true,\n    \"req\": \"[{'key': Host, 'value': 203.161.44.208}, {'key': User-Agent, 'value': Web Downloader\/6.9}, {'key': Accept-Charset, 'value': utf-8}, {'key': Accept-Encoding, 'value': gzip}, {'key': Connection, 'value': close}]\"\n  },\n  {\n    \"timestamp\": \"2024-07-23 16:31:06\",\n    \"source_ip\": \"203.161.44.208\",\n    \"dest_ip\": \"94.156.71.207\",\n    \"source_port\": 
80,\n    \"dest_port\": 49678,\n    \"length\": 456,\n    \"tcp_session\": \"203.161.44.208:80-94.156.71.207:49678\",\n    \"source_mac\": \"00:16:3c:cb:72:42\",\n    \"dest_mac\": \"64:64:9b:4f:37:00\",\n    \"protocols\": \"[Ethernet, IP, TCP]\",\n    \"payload\": \"HTTP\/1.1 404 Not Found\\\\x0D\\\\x0ADate: Tue, 23 Jul 2024 16:31:06 GMT\\\\x0D\\\\x0AServer: Apache\/2.4.52 (Ubuntu)\\\\x0D\\\\x0AContent-Length: 276\\\\x0D\\\\x0AConnection: close\\\\x0D\\\\x0AContent-Type: text\/html; charset=iso-8859-1\\\\x0D\\\\x0A\\\\x0D\\\\x0A&lt;!DOCTYPE HTML PUBLIC \\\\x22-\/\/IETF\/\/DTD HTML 2.0\/\/EN\\\\x22&gt;\\\\x0A&lt;html&gt;&lt;head&gt;\\\\x0A&lt;title&gt;404 Not Found&lt;\/title&gt;\\\\x0A&lt;\/head&gt;&lt;body&gt;\\\\x0A&lt;h1&gt;Not Found&lt;\/h1&gt;\\\\x0A&lt;p&gt;The requested URL was not found on this server.&lt;\/p&gt;\\\\x0A&lt;hr&gt;\\\\x0A&lt;address&gt;Apache\/2.4.52 (Ubuntu) Server at 203.161.44.208 Port 80&lt;\/address&gt;\\\\x0A&lt;\/body&gt;&lt;\/html&gt;\\\\x0A\",\n    \"tcp_flags\": \"[ACK, PSH]\",\n    \"tcp_seq_num\": \"2821588265\",\n    \"is_http\": true,\n    \"req\": null\n  }\n]\n<\/code><\/pre>\n<p>The reason for <code>ppcap<\/code> is that I was too lazy to deal with some symbol name collisions (between the extension and <code>libpcap<\/code>) in a more fancy manner. I&#8217;ll eventually figure out how to make it just <code>pcap<\/code>. PRs welcome.<\/p>\n<h3>How Do I Get This?<\/h3>\n<p>Well, for now, it&#8217;s a bit more complex than an <code>INSTALL ppcap<\/code>. My extension is not ready for prime time, so it won&#8217;t be in the DuckDB community extensions for a while. 
Which means you&#8217;ll need to install it manually, and also get used to using the <code>-unsigned<\/code> CLI flag (I&#8217;ve aliased that to <code>duckdbu<\/code>).<\/p>\n<p>NOTE: you need to be running v1.0.0+ of DuckDB for this extension to work.<\/p>\n<p>Here&#8217;s how to install it on macOS + Apple Silicon and test to see if it worked:<\/p>\n<pre><code class=\"language-bash\"># where extensions live on macOS + Apple Silicon\nmkdir -p ~\/.duckdb\/extensions\/v1.0.0\/osx_arm64\n\n# grab and \"install\" the extension\ncurl --output ~\/.duckdb\/extensions\/v1.0.0\/osx_arm64\/ppcap.duckdb_extension https:\/\/rud.is\/dl\/pcap\/darwin-arm64\/ppcap.duckdb_extension\n\n# this should not output anything if it worked\nduckdb -unsigned -s \"load ppcap\"\n<\/code><\/pre>\n<p>Linux folks can sub out <code>osx_arm64<\/code> and <code>darwin-arm64<\/code> with <code>linux_amd64<\/code> or <code>linux_amd64_gcc4<\/code>, depending on your system architecture, which you can find via <code>duckdb -s \"PRAGMA platform\"<\/code>. <code>linux_amd64_gcc4<\/code> is the architecture of the Linux amd64\/x86_64 binary offered for <a href=\"https:\/\/duckdb.org\/docs\/installation\/index?version=stable&#038;environment=cli&#038;platform=linux&#038;download_method=direct&#038;architecture=x86_64\">download from DuckDB-proper<\/a>.<\/p>\n<p>Source is, sadly, on GitHub: <a href=\"https:\/\/github.com\/hrbrmstr\/duckdb-pcap\">https:\/\/github.com\/hrbrmstr\/duckdb-pcap<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>2024-08-30 UPDATE: Binary versions of this extension are available for amd64 Linux (linux_amd64 &amp; linux_amd64_gcc4) and Apple Silicon. (osx_arm64). $ duckdb -unsigned v1.0.0 1f98600c2c Enter &#8220;.help&#8221; for usage hints. Connected to a transient in-memory database. Use &#8220;.open FILENAME&#8221; to reopen on a persistent database. 
D SET custom_extension_repository=&#8217;https:\/\/w3c2.c20.e2-5.dev\/ppcap\/latest&#8217;; D INSTALL ppcap; D LOAD ppcap; 2024-08-29 UPDATE: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"federated","footnotes":""},"categories":[890,798],"tags":[],"class_list":["post-22211","post","type-post","status-publish","format-standard","hentry","category-duckdb","category-pcap"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Reading PCAP Files (Directly) With DuckDB - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Reading PCAP Files (Directly) With DuckDB - rud.is\" \/>\n<meta property=\"og:description\" content=\"2024-08-30 UPDATE: Binary versions of this extension are available for amd64 Linux (linux_amd64 &amp; linux_amd64_gcc4) and Apple Silicon. (osx_arm64). $ duckdb -unsigned v1.0.0 1f98600c2c Enter &quot;.help&quot; for usage hints. Connected to a transient in-memory database. Use &quot;.open FILENAME&quot; to reopen on a persistent database. 
D SET custom_extension_repository=&#039;https:\/\/w3c2.c20.e2-5.dev\/ppcap\/latest&#039;; D INSTALL ppcap; D LOAD ppcap; 2024-08-29 UPDATE: [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2024-08-26T12:37:28+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-08-30T08:03:25+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Reading PCAP Files (Directly) With 
DuckDB\",\"datePublished\":\"2024-08-26T12:37:28+00:00\",\"dateModified\":\"2024-08-30T08:03:25+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/\"},\"wordCount\":724,\"commentCount\":3,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg\",\"articleSection\":[\"duckdb\",\"pcap\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/\",\"name\":\"Reading PCAP Files (Directly) With DuckDB - 
rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg\",\"datePublished\":\"2024-08-26T12:37:28+00:00\",\"dateModified\":\"2024-08-30T08:03:25+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#primaryimage\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=1300%2C753&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2024\\\/08\\\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=1300%2C753&ssl=1\",\"width\":1300,\"height\":753,\"caption\":\"Photo by Vincent van Zalinge on Unsplash\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2024\\\/08\\\/26\\\/reading-pcap-files-directly-with-duckdb\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Reading PCAP Files (Directly) With 
DuckDB\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Reading PCAP Files (Directly) With DuckDB - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/","og_locale":"en_US","og_type":"article","og_title":"Reading PCAP Files (Directly) With DuckDB - rud.is","og_description":"2024-08-30 UPDATE: Binary versions of this extension are available for amd64 Linux (linux_amd64 &amp; linux_amd64_gcc4) and Apple Silicon. (osx_arm64). $ duckdb -unsigned v1.0.0 1f98600c2c Enter \".help\" for usage hints. Connected to a transient in-memory database. Use \".open FILENAME\" to reopen on a persistent database. D SET custom_extension_repository='https:\/\/w3c2.c20.e2-5.dev\/ppcap\/latest'; D INSTALL ppcap; D LOAD ppcap; 2024-08-29 UPDATE: [&hellip;]","og_url":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/","og_site_name":"rud.is","article_published_time":"2024-08-26T12:37:28+00:00","article_modified_time":"2024-08-30T08:03:25+00:00","og_image":[{"url":"https:\/\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg","type":"","width":"","height":""}],"author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Reading PCAP Files (Directly) With DuckDB","datePublished":"2024-08-26T12:37:28+00:00","dateModified":"2024-08-30T08:03:25+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/"},"wordCount":724,"commentCount":3,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"image":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg","articleSection":["duckdb","pcap"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/","url":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/","name":"Reading PCAP Files (Directly) With DuckDB - 
rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"primaryImageOfPage":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#primaryimage"},"image":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#primaryimage"},"thumbnailUrl":"https:\/\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg","datePublished":"2024-08-26T12:37:28+00:00","dateModified":"2024-08-30T08:03:25+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#primaryimage","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=1300%2C753&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2024\/08\/vincent-van-zalinge-MM2bY4W7G2Y-unsplash.jpg?fit=1300%2C753&ssl=1","width":1300,"height":753,"caption":"Photo by Vincent van Zalinge on Unsplash"},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2024\/08\/26\/reading-pcap-files-directly-with-duckdb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Reading PCAP Files (Directly) With DuckDB"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. 
All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-5Mf","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":6127,"url":"https:\/\/rud.is\/b\/2017\/07\/27\/reading-pcap-files-with-apache-drill-and-the-sergeant-r-package\/","url_meta":{"origin":22211,"position":0},"title":"Reading PCAP Files with Apache Drill and the sergeant R Package","author":"hrbrmstr","date":"2017-07-27","format":false,"excerpt":"It's no secret that I'm a fan of Apache Drill. 
One big strength of the platform is that it normalizes the access to diverse data sources down to ANSI SQL calls, which means that I can pull data from parquet, Hie, HBase, Kudu, CSV, JSON, MongoDB and MariaDB with the\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":13120,"url":"https:\/\/rud.is\/b\/2021\/07\/20\/packet-maze-solving-a-cyberdefenders-pcap-puzzle-with-r-zeek-and-tshark\/","url_meta":{"origin":22211,"position":1},"title":"Packet Maze: Solving a CyberDefenders PCAP Puzzle with R, Zeek, and tshark","author":"hrbrmstr","date":"2021-07-20","format":false,"excerpt":"It was a rainy weekend in southern Maine and I really didn't feel like doing chores, so I was skimming through RSS feeds and noticed a link to a PacketMaze challenge in the latest This Week In 4n6. Since it's also been a while since I've done any serious content\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":13137,"url":"https:\/\/rud.is\/b\/2021\/07\/25\/acoustic-solving-a-cyberdefenders-pcap-sip-rtp-challenge-with-r-zeek-tshark-friends\/","url_meta":{"origin":22211,"position":2},"title":"Acoustic: Solving a CyberDefenders PCAP SIP\/RTP Challenge with R, Zeek, tshark (&#038; friends)","author":"hrbrmstr","date":"2021-07-25","format":false,"excerpt":"Hot on the heels of the previous CyberDefenders Challenge Solution comes this noisy installment which solves their Acoustic challenge. You can find the source Rmd on GitHub, but I'm also testing the limits of WP's markdown rendering and putting it in-stream as well. 
No longer book expository this time since\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2046,"url":"https:\/\/rud.is\/b\/2013\/02\/08\/extended-simple-example-asn-graph-visualization-example-r-to-d3\/","url_meta":{"origin":22211,"position":3},"title":"Extended (Simple) ASN Graph Visualization Example [R to D3]","author":"hrbrmstr","date":"2013-02-08","format":false,"excerpt":"The small igraph visualization in the previous post shows the basics of what you can do with the BulkOrigin & BulkPeer functions, and I thought a larger example with some basic D3 tossed in might be even more useful. Assuming you have the previous functions in your environment, the following\u2026","rel":"","context":"In &quot;d3&quot;","block_context":{"text":"d3","link":"https:\/\/rud.is\/b\/category\/d3\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11712,"url":"https:\/\/rud.is\/b\/2019\/01\/02\/apache-drill-1-15-0-sergeant-0-8-0-pcapng-support-proper-column-types-mounds-of-new-metadata\/","url_meta":{"origin":22211,"position":4},"title":"Apache Drill 1.15.0 + sergeant 0.8.0 = pcapng Support, Proper Column Types &#038; Mounds of New Metadata","author":"hrbrmstr","date":"2019-01-02","format":false,"excerpt":"Apache Drill is an innovative distributed SQL engine designed to enable data exploration and analytics on non-relational datastores [...] without having to create and manage schemas. [...] 
It has a schema-free JSON document model similar to MongoDB and Elasticsearch; [a plethora of APIs, including] ANSI SQL, ODBC\/JDBC, and HTTP[S] REST;\u2026","rel":"","context":"In &quot;Apache Drill&quot;","block_context":{"text":"Apache Drill","link":"https:\/\/rud.is\/b\/category\/apache-drill\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":12977,"url":"https:\/\/rud.is\/b\/2021\/03\/01\/brimming-with-possibilities-query-zqd-mine-logs-with-zq-from-r\/","url_meta":{"origin":22211,"position":5},"title":"Brimming With Possibilities: Query zqd &#038; Mine Logs with zq from R","author":"hrbrmstr","date":"2021-03-01","format":false,"excerpt":"Brim Security maintains a free, Electron-based desktop GUI for exploration of PCAPs and select cybersecurity logs: along with a broad ecosystem of tools which can be used independently of the GUI. The standalone or embedded zqd server, as well as the zq command line utility let analysts run ZQL (a\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2021\/03\/brimr-graph.png?fit=1200%2C667&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2021\/03\/brimr-graph.png?fit=1200%2C667&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2021\/03\/brimr-graph.png?fit=1200%2C667&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2021\/03\/brimr-graph.png?fit=1200%2C667&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2021\/03\/brimr-graph.png?fit=1200%2C667&ssl=1&resize=1050%2C600 
3x"},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/22211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=22211"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/22211\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=22211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=22211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=22211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}