Category Archives: d3

For those inclined to click, I was interviewed by Fahmida Rashid (@fahmiwrite) over at SourceForge’s HTML5 center a few weeks ago (right after the elections) about my tweets on the use of HTML5 tech over Flash. Here’s one of them:

https://twitter.com/hrbrmstr/status/266006111256207361

While a tad inaccurate (one site did use Flash with an HTML fallback and some international sites are still stuck in the 1990s), it is still a good sign of how the modern web is progressing.

I can honestly say I’ve never seen my last name used so many times in one article :-)

In the spirit of the previous example, this one shows you how to do a quick, country-based choropleth in D3/jQuery with some help from the command line, since not everyone is equipped to kick out some R and most folks I know are very handy at a terminal prompt.

I took the ZeroAccessGeoIPs.csv file and ran it through a quick *nix one-liner to get a JSON-ish associative array of country abbreviations to botnet counts in that country:

cut -f1 -d, ZeroAccessGeoIPs.csv | sort | uniq -c | sort -n | tr "[:upper:]" "[:lower:]" | while read a b ; do echo "{ \"$b\" : \"$a\" }," ; done > botcounts.js
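
For readers who would rather stay in JavaScript, the same tally can be sketched as a small function. This is only an illustration, not the code used for the map: the function name `countByCountry` and the sample rows are made up, and it assumes (as the one-liner does) that the country code is the first comma-separated field. Like the one-liner, it would also count a header row if the file has one.

```javascript
// Sketch: count how often each country code appears in the first CSV column.
function countByCountry(csvText) {
  var counts = Object.create(null);        // plain lookup table, no inherited keys
  csvText.split(/\r?\n/).forEach(function (line) {
    var code = line.split(',')[0].trim().toLowerCase();
    if (code === '') { return; }           // skip blank lines
    counts[code] = (counts[code] || 0) + 1;
  });
  return counts;
}

// Hypothetical sample rows, not taken from the real file.
var sampleCsv = 'US,38.0,-97.0\nCA,60.0,-95.0\nUS,40.7,-74.0\n';
console.log(JSON.stringify(countByCountry(sampleCsv)));
// prints {"us":2,"ca":1}
```

The result is already a country-to-count object, which is the shape the fill loop further down iterates over.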

I found a suitable SVG world map on Wikipedia that had id="ABBREV" country groupings. This is not the most #spiffy map (and, if you dig a bit more than I did, you’ll probably find a better one) but it works for this example and shows you how quickly you can find the bits you need to get a visualization together.

With that data and the SVG file pasted into the same HTML document, it’s a simple matter of generating a gradient with some d3 magic:

var color = d3.scale.log().domain([1, 47880]).range(["#FFEB38", "#F54747"]);
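
To see roughly what that scale is doing, here is a dependency-free sketch of a log-domain color ramp. It is not D3's implementation: `logColorScale` and its helpers are hypothetical names, it interpolates naively in RGB, and it clamps out-of-range values, which is a choice made for this sketch.

```javascript
// Convert "#RRGGBB" to [r, g, b].
function hexToRgb(hex) {
  var n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// Convert [r, g, b] back to a lowercase "#rrggbb" string.
function rgbToHex(rgb) {
  return '#' + rgb.map(function (c) {
    var h = Math.round(c).toString(16);
    return h.length < 2 ? '0' + h : h;
  }).join('');
}

// Map a value to a color by its position on a logarithmic domain.
function logColorScale(domain, range) {
  var lo = Math.log(domain[0]);
  var hi = Math.log(domain[1]);
  var from = hexToRgb(range[0]);
  var to = hexToRgb(range[1]);
  return function (value) {
    var t = (Math.log(value) - lo) / (hi - lo);
    t = Math.max(0, Math.min(1, t));       // clamp (an assumption of this sketch)
    return rgbToHex(from.map(function (c, i) { return c + (to[i] - c) * t; }));
  };
}

var sketchColor = logColorScale([1, 47880], ['#FFEB38', '#F54747']);
console.log(sketchColor(1), sketchColor(47880));
// prints #ffeb38 #f54747
```

The log domain matters because the counts span several orders of magnitude; on a linear scale nearly every country would end up at the yellow end.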

and, then, looping over the associative array while using said color range to fill in the country shapes:

// fill each country shape (matched by its id) with the color for its count
$.each(botcounts, function(key, value) {
  $('#' + key).css('fill', color(value));
});

we get:

You can view the full, larger example on this separate page where you can do a view-source to see the entire code. I really encourage you to do this as you’ll see that there are just a handful of lines of your own code necessary to make a compelling visualization. Sure, you’ll want to add a legend and some other styling, but the basics can be done in – literally – minutes, leaving customized details to your imagination & creativity.

The entire map could have been done in D3, but I only spent about 5 minutes on the entire exercise (including the one-liner) and am still more comfortable in jQuery than I am in D3. I did this also to show that it’s perfectly fine (as Mainers are wont to say) to do pre-processing and hard-coding when cranking out visualizations. The goal is to communicate something to your audience and there are no hard-and-fast rules governing this process. As with any coding, if you think you’ll be doing this again it is a wise idea to make the solution more generic, but there’s nothing wrong with taking valid shortcuts to get results out as quickly as possible.
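
For comparison, a D3-only version of the fill loop might look something like the sketch below. It is not the code from the example page: `fillCountriesWithD3` is a hypothetical helper, and `d3` is passed in as a parameter purely so the function has no hidden globals. It relies only on `d3.select` and `selection.style`.

```javascript
// Sketch: color each country shape using D3 selections instead of jQuery.
// counts     - object mapping country id (e.g. "us") to a count
// colorScale - function mapping a count to a CSS color string
function fillCountriesWithD3(d3, counts, colorScale) {
  Object.keys(counts).forEach(function (code) {
    d3.select('#' + code).style('fill', colorScale(counts[code]));
  });
}

// In the page this would be called as:
//   fillCountriesWithD3(d3, botcounts, color);
```

Functionally it is the same loop, which is rather the point: which library does the DOM poking matters much less than getting the picture in front of people.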

Definitely feel invited to share your creations in the comments, especially if you find a better map!

In @jayjacobs’ latest post on SSH honeypot password analysis he shows some spiffy visualizations from crunching the data with Tableau. While I’ve joked with him and called them “robocharts”, the reality is that Tableau does let you work on visualizing the answers to questions quickly without having to go into “code mode” (and that doesn’t make it wrong).

I’ve been using Jay’s honeypot data both for attack analysis and as an excuse to compare data crunching and visualization tools (so far I’ve poked at it with R and Python) in an effort to see what tools are good for exploring various types of questions.

A question that came to mind recently was “Hmmm…I wonder if there is a pattern to the timings of probes/attacks?” and I posited that a time-series view across the days would help illustrate that. To that end, I came up with the idea of breaking the attacks into one-hour chunks and building a day-stacked heatmap which could be filtered by country. Something like this:

I’ve been wanting to play with D3 and exploring this concept with it seemed to be a good fit.
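
The binning step behind a heatmap like that can be sketched in a few lines of plain JavaScript. This is an illustration under assumptions rather than the code from the working example: the record shape (`time`, `country`) and the name `hourlyHeatmap` are hypothetical, and days and hours are taken in UTC.

```javascript
// Sketch: bucket attack records into per-day arrays of 24 hourly counts.
// records - array of { time: <ISO string or ms since epoch>, country: 'cn' }
// country - optional filter; when given, only that country is counted
function hourlyHeatmap(records, country) {
  var days = {};                           // 'YYYY-MM-DD' -> 24 hourly counts
  records.forEach(function (rec) {
    if (country && rec.country !== country) { return; }
    var when = new Date(rec.time);
    var day = when.toISOString().slice(0, 10);   // UTC calendar date
    if (!days[day]) {
      days[day] = [];
      for (var h = 0; h < 24; h++) { days[day].push(0); }
    }
    days[day][when.getUTCHours()] += 1;
  });
  return days;
}

// Hypothetical records, not taken from the honeypot data.
var sampleAttacks = [
  { time: '2012-11-20T03:15:00Z', country: 'cn' },
  { time: '2012-11-20T03:45:00Z', country: 'cn' },
  { time: '2012-11-21T17:05:00Z', country: 'us' }
];
console.log(hourlyHeatmap(sampleAttacks, 'cn')['2012-11-20'][3]);
// prints 2
```

Each day becomes one row of 24 cells, so stacking the rows gives the grid and the count in each cell drives its color.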

Given that working with the real data would entail loading a ~4MB file every time someone viewed this blog post, I put the working example in a separate page where you can do a “view source” to see the code. Without the added complexity of a popup selector and loading spinner, the core code is about 50 lines, much of which could be condensed even further since it’s just chaining calls in JavaScript. I cheated a bit and used jQuery, too, plus made some of it dependent on WebKit (the legend may look weird in Firefox) due to time constraints.

The library is wicked simple to grok and makes it easy to come up with new ways to look at data (as you can see from the examples gallery on the D3 site).

Unfortunately, no real patterns emerged, but I’m going to take a stab at taking the timestamps (which are recorded at the destination of the attack) and aligning them to the origin to see if that makes a difference in the view. If that turns up anything interesting, I’ll make another quick post on it.
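
One way that realignment might go is to shift each timestamp by the origin's UTC offset before bucketing by hour. The sketch below is only that, a sketch: `hourAtOrigin` is a hypothetical helper, it assumes the timestamps have already been parsed into absolute instants, and it assumes each origin can be summarized by a single UTC offset, which is not true for countries that span several time zones.

```javascript
// Sketch: hour of day (0-23) at the origin for a given instant.
// time                 - ISO string or ms since epoch
// originUtcOffsetHours - e.g. 8 for UTC+8, -5 for UTC-5, 5.5 for UTC+5:30
function hourAtOrigin(time, originUtcOffsetHours) {
  var when = new Date(time);
  var utcMinutes = when.getUTCHours() * 60 + when.getUTCMinutes();
  var shifted = utcMinutes + originUtcOffsetHours * 60;
  var wrapped = ((shifted % 1440) + 1440) % 1440;   // keep within one day
  return Math.floor(wrapped / 60);
}

console.log(hourAtOrigin('2012-11-20T03:15:00Z', 8));
// prints 11
console.log(hourAtOrigin('2012-11-20T03:15:00Z', -5));
// prints 22
```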

Given that much of data (“big” or otherwise) analysis is domain-knowledgeable folks asking interesting questions, are there any folks out there who have questions that they’d like to see explored with this data set?