

{"id":4527,"date":"2016-07-12T16:46:06","date_gmt":"2016-07-12T21:46:06","guid":{"rendered":"http:\/\/rud.is\/b\/?p=4527"},"modified":"2018-03-07T16:42:22","modified_gmt":"2018-03-07T21:42:22","slug":"slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r","status":"publish","type":"post","link":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/","title":{"rendered":"Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based &#8216;IPv4-in-CIDR&#8217; lookups in R)"},"content":{"rendered":"<p>The insanely productive elf-lord, @quominus put together a small package ([`triebeard`](https:\/\/github.com\/ironholds\/triebeard)) that exposes an API for [radix\/prefix tries](https:\/\/en.wikipedia.org\/wiki\/Trie) at both the R and Rcpp levels. I know he had some personal needs for this and we both kinda need these to augment some functions in our `iptools` package. Despite `triebeard` having both a vignette and function-level examples, I thought it might be good to show a real-world use of the package (at least in the cyber real world): fast determination of which [autonomous system](https:\/\/en.wikipedia.org\/wiki\/Autonomous_system_(Internet)) an IPv4 address is in (if it&#8217;s in one at all).<\/p>\n<p>I&#8217;m not going to delve too deep into routing (you can find a good primer [here](http:\/\/www.kixtart.org\/forums\/ubbthreads.php?ubb=showflat&#038;Number=81619&#038;site_id=1#import) and one that puts routing in the context of radix tries [here](http:\/\/www.juniper.net\/documentation\/en_US\/junos14.1\/topics\/usage-guidelines\/policy-configuring-route-lists-for-use-in-routing-policy-match-conditions.html)) but there exist, essentially, abbreviated tables of which IP addresses belong to a particular network. These tables are in routers on your local networks and across the internet. 
Groups of these networks (on the internet) are composed into those autonomous systems I mentioned earlier and these tables are used to get the packets that make up the cat videos you watch routed to you as efficiently as possible.<\/p>\n<p>When dealing with cybersecurity data science, it&#8217;s often useful to know which autonomous system an IP address belongs in. The world is indeed full of peril and in it there are many dark places. It&#8217;s a dangerous business, going out on the internet and we sometimes find it possible to identify unusually malicious autonomous systems by looking up suspicious IP addresses en masse. These mappings look something like this:<\/p>\n<pre id=\"plain-text\"><code class=\"language-text\">CIDR            ASN\r\n1.0.0.0\/24      47872\r\n1.0.4.0\/24      56203\r\n1.0.5.0\/24      56203\r\n1.0.6.0\/24      56203\r\n1.0.7.0\/24      38803\r\n1.0.48.0\/20     49597\r\n1.0.64.0\/18     18144<\/code><\/pre>\n<p>Each CIDR has a start and end IP address which can ultimately be converted to integers. Now, one _could_ just sequentially compare start and end ranges to see which CIDR an IP address belongs in, but there are (as of the day of this post) `647,563` CIDRs to compare against, which\u2014in the worst case\u2014would mean having to traverse through the entire list to find the match (or discover there is no match). 
There are some trivial ways to slightly optimize this, but the search times could still be fairly long, especially when you&#8217;re trying to match a billion IPv4 addresses to ASNs.<\/p>\n<p>By storing each CIDR&#8217;s network prefix (the leading bits of its starting IP address, with the bit count given by the mask after the `\/`) in binary form (strings of 1&#8217;s and 0&#8217;s) as keys for the trie, we get much faster lookups (at most 32 bit-by-bit steps per address in the worst case vs 647,563 range comparisons).<\/p>\n<p>I made an initial, na\u00efve, mostly straight R implementation as a precursor to a more low-level implementation in Rcpp in our `iptools` package and to illustrate this use of the `triebeard` package.<\/p>\n<p>One thing we&#8217;ll need is a function to convert an IPv4 address (in long integer form) into a binary character string. We _could_ do this with base R, but it&#8217;ll be super-slow and it doesn&#8217;t take much effort to create it with an Rcpp inline function:<\/p>\n<pre id=\"rcpp-01\"><code class=\"language-r\">library(Rcpp)\r\nlibrary(inline)\r\n\r\nip_to_binary_string &lt;- rcpp(signature(x=&quot;integer&quot;), &quot;\r\n  NumericVector xx(x);\r\n\r\n  std::vector&lt;double&gt; X(xx.begin(),xx.end());\r\n  std::vector&lt;std::string&gt; output(X.size());\r\n\r\n  for (unsigned int i=0; i&lt;X.size(); i++){\r\n\r\n    if ((i % 10000) == 0) Rcpp::checkUserInterrupt();\r\n\r\n    output[i] = std::bitset&lt;32&gt;(X[i]).to_string();\r\n\r\n  }\r\n\r\n  return(Rcpp::wrap(output));\r\n&quot;)\r\n\r\nip_to_binary_string(ip_to_numeric(&quot;192.168.1.1&quot;))\r\n## [1] &quot;11000000101010000000000100000001&quot;<\/code><\/pre>\n<p>We take a vector from R and use some C++ standard library functions to convert its elements to 32-character bit strings. I vectorized this in C++ for speed (which is just a fancy way to say I used a `for` loop). In this case, our short cut will not make for a long delay.<\/p>\n<p>Now, we&#8217;ll need a CIDR file. 
There are [historical ones](http:\/\/data.4tu.nl\/repository\/uuid:d4d23b8e-2077-4592-8b47-cb476ad16e12) available, and I use one that I generated the day of this post (referenced in the code block below). You can use [`pyasn`](https:\/\/github.com\/hadiasghari\/pyasn) to make new ones daily (relegating mindless, automated, menial data retrieval tasks to the python goblins, like one should).<\/p>\n<pre id=\"trie-01\"><code class=\"language-r\">library(iptools)\r\nlibrary(stringi)\r\nlibrary(dplyr)\r\nlibrary(purrr)\r\nlibrary(readr)\r\nlibrary(tidyr)\r\n\r\nasn_dat_url &lt;- &quot;http:\/\/rud.is\/dl\/asn-20160712.1600.dat.gz&quot;\r\nasn_dat_fil &lt;- basename(asn_dat_url)\r\nif (!file.exists(asn_dat_fil)) download.file(asn_dat_url, asn_dat_fil)\r\n\r\nrip &lt;- read_tsv(asn_dat_fil, comment=&quot;;&quot;, col_names=c(&quot;cidr&quot;, &quot;asn&quot;))\r\nrip %&gt;%\r\n  separate(cidr, c(&quot;ip&quot;, &quot;mask&quot;), &quot;\/&quot;) %&gt;%\r\n  mutate(prefix=stri_sub(ip_to_binary_string(ip_to_numeric(ip)), 1, mask)) -&gt; rip_df\r\n\r\nrip_df\r\n## # A tibble: 647,557 x 4\r\n##           ip  mask   asn                   prefix\r\n##        &lt;chr&gt; &lt;chr&gt; &lt;int&gt;                    &lt;chr&gt;\r\n## 1    1.0.0.0    24 47872 000000010000000000000000\r\n## 2    1.0.4.0    24 56203 000000010000000000000100\r\n## 3    1.0.5.0    24 56203 000000010000000000000101\r\n## 4    1.0.6.0    24 56203 000000010000000000000110\r\n## 5    1.0.7.0    24 38803 000000010000000000000111\r\n## 6   1.0.48.0    20 49597     00000001000000000011\r\n## 7   1.0.64.0    18 18144       000000010000000001\r\n## 8  1.0.128.0    17  9737        00000001000000001\r\n## 9  1.0.128.0    18  9737       000000010000000010\r\n## 10 1.0.128.0    19  9737      0000000100000000100\r\n## # ... 
with 647,547 more rows<\/code><\/pre>\n<p>You can save off that `data_frame` to an R data file to pull in later (but it&#8217;s pretty fast to regenerate).<\/p>\n<p>Now, we create the trie, using the prefix we calculated and a value we&#8217;ll piece together for this example:<\/p>\n<pre id=\"trie-02\"><code class=\"language-r\">library(triebeard)\r\n\r\nrip_trie &lt;- trie(rip_df$prefix, sprintf(&quot;%s\/%s|%s&quot;, rip_df$ip, rip_df$mask, rip_df$asn))<\/code><\/pre>\n<p>Yep, that&#8217;s it. If you ran this yourself, it should have taken less than 2s on most modern systems to create the nigh 700,000 element trie.<\/p>\n<p>Now, we&#8217;ll generate a million random IP addresses and look them up:<\/p>\n<pre id=\"trie-03\"><code class=\"language-r\">set.seed(1492)\r\ndata_frame(lkp=ip_random(1000000),\r\n           lkp_bin=ip_to_binary_string(ip_to_numeric(lkp)),\r\n           long=longest_match(rip_trie, lkp_bin)) -&gt; lkp_df\r\n\r\nlkp_df\r\n## # A tibble: 1,000,000 x 3\r\n##               lkp                          lkp_bin                long\r\n##             &lt;chr&gt;                            &lt;chr&gt;               &lt;chr&gt;\r\n## 1   35.251.195.57 00100011111110111100001100111001  35.248.0.0\/13|4323\r\n## 2     28.57.78.42 00011100001110010100111000101010                &lt;NA&gt;\r\n## 3   24.60.146.202 00011000001111001001001011001010   24.60.0.0\/14|7922\r\n## 4    14.236.36.53 00001110111011000010010000110101                &lt;NA&gt;\r\n## 5   7.146.253.182 00000111100100101111110110110110                &lt;NA&gt;\r\n## 6     2.9.228.172 00000010000010011110010010101100     2.9.0.0\/16|3215\r\n## 7  108.111.124.79 01101100011011110111110001001111 108.111.0.0\/16|3651\r\n## 8    65.78.24.214 01000001010011100001100011010110   65.78.0.0\/19|6079\r\n## 9   50.48.151.239 00110010001100001001011111101111   50.48.0.0\/13|5650\r\n## 10  97.231.13.131 01100001111001110000110110000011   97.128.0.0\/9|6167\r\n## # ... 
with 999,990 more rows<\/code><\/pre>\n<p>On most modern systems, that should have taken less than 3s.<\/p>\n<p>The `NA` values are not busted lookups. Many IP networks are assigned but not accessible (see [this](https:\/\/en.wikipedia.org\/wiki\/List_of_assigned_\/8_IPv4_address_blocks) for more info). You can validate this with `cymruservices::bulk_origin()` on your own, too.<\/p>\n<p>The trie structure for these CIDRs takes up approximately 9MB of RAM, a small price to pay for speedy lookups (and, memory really is not what the heart desires, anyway). Hopefully the `triebeard` package will help you speed up your own lookups, and stay tuned for a new version of `iptools` with some new and enhanced functions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The insanely productive elf-lord, @quominus put together a small package ([`triebeard`](https:\/\/github.com\/ironholds\/triebeard)) that exposes an API for [radix\/prefix tries](https:\/\/en.wikipedia.org\/wiki\/Trie) at both the R and Rcpp levels. I know he had some personal needs for this and we both kinda need these to augment some functions in our `iptools` package. 
Despite `triebeard` having both a vignette and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[681,677,754,764,91],"tags":[810],"class_list":["post-4527","post","type-post","status-publish","format-standard","hentry","category-cybersecurity","category-data-analysis-2","category-data-science","category-data-wrangling","category-r","tag-post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based &#039;IPv4-in-CIDR&#039; lookups in R) - rud.is<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based &#039;IPv4-in-CIDR&#039; lookups in R) - rud.is\" \/>\n<meta property=\"og:description\" content=\"The insanely productive elf-lord, @quominus put together a small package ([`triebeard`](https:\/\/github.com\/ironholds\/triebeard)) that exposes an API for [radix\/prefix tries](https:\/\/en.wikipedia.org\/wiki\/Trie) at both the R and Rcpp levels. I know he had some personal needs for this and we both kinda need these to augment some functions in our `iptools` package. Despite `triebeard` having both a vignette and [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/\" \/>\n<meta property=\"og:site_name\" content=\"rud.is\" \/>\n<meta property=\"article:published_time\" content=\"2016-07-12T21:46:06+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-07T21:42:22+00:00\" \/>\n<meta name=\"author\" content=\"hrbrmstr\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"hrbrmstr\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/\"},\"author\":{\"name\":\"hrbrmstr\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"headline\":\"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based &#8216;IPv4-in-CIDR&#8217; lookups in R)\",\"datePublished\":\"2016-07-12T21:46:06+00:00\",\"dateModified\":\"2018-03-07T21:42:22+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/\"},\"wordCount\":895,\"commentCount\":4,\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"keywords\":[\"post\"],\"articleSection\":[\"Cybersecurity\",\"Data Analysis\",\"data science\",\"data wrangling\",\"R\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/\",\"name\":\"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based 'IPv4-in-CIDR' lookups in R) - rud.is\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\"},\"datePublished\":\"2016-07-12T21:46:06+00:00\",\"dateModified\":\"2018-03-07T21:42:22+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/2016\\\/07\\\/12\\\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/rud.is\\\/b\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based &#8216;IPv4-in-CIDR&#8217; lookups in R)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#website\",\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/\",\"name\":\"rud.is\",\"description\":\"&quot;In God we trust. 
All others must bring data&quot;\",\"publisher\":{\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/rud.is\\\/b\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/rud.is\\\/b\\\/#\\\/schema\\\/person\\\/d7cb7487ab0527447f7fda5c423ff886\",\"name\":\"hrbrmstr\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"contentUrl\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\",\"width\":460,\"height\":460,\"caption\":\"hrbrmstr\"},\"logo\":{\"@id\":\"https:\\\/\\\/i0.wp.com\\\/rud.is\\\/b\\\/wp-content\\\/uploads\\\/2023\\\/10\\\/ukr-shield.png?fit=460%2C460&ssl=1\"},\"description\":\"Don't look at me\u2026I do what he does \u2014 just slower. #rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7\",\"sameAs\":[\"http:\\\/\\\/rud.is\"],\"url\":\"https:\\\/\\\/rud.is\\\/b\\\/author\\\/hrbrmstr\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based 'IPv4-in-CIDR' lookups in R) - rud.is","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/","og_locale":"en_US","og_type":"article","og_title":"Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based 'IPv4-in-CIDR' lookups in R) - rud.is","og_description":"The insanely productive elf-lord, @quominus put together a small package ([`triebeard`](https:\/\/github.com\/ironholds\/triebeard)) that exposes an API for [radix\/prefix tries](https:\/\/en.wikipedia.org\/wiki\/Trie) at both the R and Rcpp levels. I know he had some personal needs for this and we both kinda need these to augment some functions in our `iptools` package. Despite `triebeard` having both a vignette and [&hellip;]","og_url":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/","og_site_name":"rud.is","article_published_time":"2016-07-12T21:46:06+00:00","article_modified_time":"2018-03-07T21:42:22+00:00","author":"hrbrmstr","twitter_card":"summary_large_image","twitter_misc":{"Written by":"hrbrmstr","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/#article","isPartOf":{"@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/"},"author":{"name":"hrbrmstr","@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"headline":"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based &#8216;IPv4-in-CIDR&#8217; lookups in R)","datePublished":"2016-07-12T21:46:06+00:00","dateModified":"2018-03-07T21:42:22+00:00","mainEntityOfPage":{"@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/"},"wordCount":895,"commentCount":4,"publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"keywords":["post"],"articleSection":["Cybersecurity","Data Analysis","data science","data wrangling","R"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/","url":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/","name":"Slaying CIDR Orcs with Triebeard (a.k.a. fast trie-based 'IPv4-in-CIDR' lookups in R) - rud.is","isPartOf":{"@id":"https:\/\/rud.is\/b\/#website"},"datePublished":"2016-07-12T21:46:06+00:00","dateModified":"2018-03-07T21:42:22+00:00","breadcrumb":{"@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/rud.is\/b\/2016\/07\/12\/slaying-cidr-orcs-with-triebeard-a-k-a-fast-trie-based-ipv4-in-cidr-lookups-in-r\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/rud.is\/b\/"},{"@type":"ListItem","position":2,"name":"Slaying CIDR Orcs with Triebeard (a.k.a. 
fast trie-based &#8216;IPv4-in-CIDR&#8217; lookups in R)"}]},{"@type":"WebSite","@id":"https:\/\/rud.is\/b\/#website","url":"https:\/\/rud.is\/b\/","name":"rud.is","description":"&quot;In God we trust. All others must bring data&quot;","publisher":{"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/rud.is\/b\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/rud.is\/b\/#\/schema\/person\/d7cb7487ab0527447f7fda5c423ff886","name":"hrbrmstr","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","url":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","contentUrl":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1","width":460,"height":460,"caption":"hrbrmstr"},"logo":{"@id":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2023\/10\/ukr-shield.png?fit=460%2C460&ssl=1"},"description":"Don't look at me\u2026I do what he does \u2014 just slower. 
#rstats avuncular \u2022 ?Resistance Fighter \u2022 Cook \u2022 Christian \u2022 [Master] Chef des Donn\u00e9es de S\u00e9curit\u00e9 @ @rapid7","sameAs":["http:\/\/rud.is"],"url":"https:\/\/rud.is\/b\/author\/hrbrmstr\/"}]}},"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p23idr-1b1","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":4547,"url":"https:\/\/rud.is\/b\/2016\/07\/24\/mid-year-r-packages-update-summary\/","url_meta":{"origin":4527,"position":0},"title":"Mid-year R Packages Update Summary","author":"hrbrmstr","date":"2016-07-24","format":false,"excerpt":"I been updating some existing packages and github-releasing new ones (before a CRAN push). Most are \"cyber\"-related, but there are some general purpose ones. Here's a quick overview: docxtractr (CRAN, now, v0.2.0) was initially designed to make it easy to get data tables out of MS Word (docx) documents. The\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3298,"url":"https:\/\/rud.is\/b\/2015\/03\/09\/new-r-package-ipapi-ipdomain-geolocation\/","url_meta":{"origin":4527,"position":1},"title":"New R Package &#8211; ipapi (IP\/Domain Geolocation)","author":"hrbrmstr","date":"2015-03-09","format":false,"excerpt":"I noticed that the @rOpenSci folks had an interface to [ip-api.com](http:\/\/ip-api.com\/) on their [ToDo](https:\/\/github.com\/ropensci\/webservices\/wiki\/ToDo) list so I whipped up a small R package to fill said gap. Their IP Geolocation API will take an IPv4, IPv6 or FQDN and kick back a ASN, lat\/lon, address and more. 
The [ipapi package](https:\/\/github.com\/hrbrmstr\/ipapi)\u2026","rel":"","context":"In &quot;cartography&quot;","block_context":{"text":"cartography","link":"https:\/\/rud.is\/b\/category\/cartography\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4417,"url":"https:\/\/rud.is\/b\/2016\/06\/07\/new-viridis-colorbrewer-palettes-for-ipv4-heatmap\/","url_meta":{"origin":4527,"position":2},"title":"New viridis &#038; colorbrewer palettes for ipv4-heatmap","author":"hrbrmstr","date":"2016-06-07","format":false,"excerpt":"It's no seekrit that I :heart: Hilbert curve heatmaps of IPv4 space. Real-world IPv4 maps (i.e. the ones that drop dots on the Earth) have little utility, but with Hilbert curves maps of IPv4 space many different topologies can be superimposed (from ASNs to\u2014if need be\u2014geographic locations). Plus, there's more\u2026","rel":"","context":"In &quot;data driven security&quot;","block_context":{"text":"data driven security","link":"https:\/\/rud.is\/b\/category\/data-driven-security\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/06\/rdbu-inverted.png?fit=512%2C512&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":4236,"url":"https:\/\/rud.is\/b\/2016\/04\/04\/iptools-0-4-0-released-into-the-wild-i-e-is-hitting-the-cran-mirrors-today\/","url_meta":{"origin":4527,"position":3},"title":"iptools 0.4.0 released into the wild (i.e. is hitting the CRAN mirrors today)","author":"hrbrmstr","date":"2016-04-04","format":false,"excerpt":"The [`iptools` package](https:\/\/github.com\/hrbrmstr\/iptools)\u2014a toolkit for manipulating, validating and testing IP addresses and ranges, along with datasets relating to IP addresses\u2014is flying through the internets and hitting a CRAN mirror near you, soon. ### What's fixed? 
[Tim Smith](https:\/\/github.com\/tdsmith) fixed [a bug](https:\/\/github.com\/hrbrmstr\/iptools\/issues\/26) in `ip_in_range()` that occurred when the netmask was `\/32` (thanks,\u2026","rel":"","context":"In &quot;Cybersecurity&quot;","block_context":{"text":"Cybersecurity","link":"https:\/\/rud.is\/b\/category\/cybersecurity\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3898,"url":"https:\/\/rud.is\/b\/2016\/01\/13\/cobble-xpath-interactively-with-the-xmlview-package\/","url_meta":{"origin":4527,"position":4},"title":"Cobble XPath Interactively with the xmlview Package","author":"hrbrmstr","date":"2016-01-13","format":false,"excerpt":"(If you don't know what XML is, you should probably [read a primer](https:\/\/en.wikipedia.org\/wiki\/XML) before reading this post,) When working with data, one inevitably comes across things encoded in XML. I'm in the \"anti-XML\" camp, but deal with my fair share of XML in \"cyber\" and help out enough people who\u2026","rel":"","context":"In &quot;data wrangling&quot;","block_context":{"text":"data wrangling","link":"https:\/\/rud.is\/b\/category\/data-wrangling\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/01\/RStudioScreenSnapz003.png?fit=865%2C523&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/01\/RStudioScreenSnapz003.png?fit=865%2C523&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/01\/RStudioScreenSnapz003.png?fit=865%2C523&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/rud.is\/b\/wp-content\/uploads\/2016\/01\/RStudioScreenSnapz003.png?fit=865%2C523&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":10287,"url":"https:\/\/rud.is\/b\/2018\/05\/18\/lmx-ot-nosj-interchanging-classic-data-formats-with-single-blackmagic-incantations\/","url_meta":{"origin":4527,"position":5},"title":"&#8220;LMX ot NOSJ!&#8221; Interchanging Classic Data Formats 
With Single blackmagic Incantations","author":"hrbrmstr","date":"2018-05-18","format":false,"excerpt":"The D.C. Universe magic hero Zatanna used spells (i.e. incantations) to battle foes and said spells were just sentences said backwards, hence the mixed up jumble in the title. But, now I'm regretting not naming the package zatanna and reversing the function names to help ensure they're only used deliberately\u2026","rel":"","context":"In &quot;R&quot;","block_context":{"text":"R","link":"https:\/\/rud.is\/b\/category\/r\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/4527","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/comments?post=4527"}],"version-history":[{"count":0,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/posts\/4527\/revisions"}],"wp:attachment":[{"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/media?parent=4527"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/categories?post=4527"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rud.is\/b\/wp-json\/wp\/v2\/tags?post=4527"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}