Visualizing Risky Words — Part 4 (D3 Word Trees)

This is a fourth post in my [Visualizing Risky Words](http://rud.is/b/2013/03/06/visualizing-risky-words/) series. You’ll need to read starting from that link for context if you’re just jumping in now.

I was going to create a rudimentary version of an interactive word tree for this, but the extremely talented @jasondavies (I marvel especially at his cartographic work) just posted what is probably the best online [word tree generator](https://www.jasondavies.com/wordtree/) ever made…and in D3 no less.

A word tree is a “visual interactive concordance” and was created back in 2007 by Martin Wattenberg and Fernanda Viégas. You can [read more about](http://hint.fm/projects/wordtree/) this technique on your own, but a good summary (from their site) is:

A word tree is a visual search tool for unstructured text, such as a book, article, speech or poem. It lets you pick a word or phrase and shows you all the different contexts in which it appears. The contexts are arranged in a tree-like branching structure to reveal recurrent themes and phrases.

I pasted the VZ RISK INTSUM texts into Jason’s tool so you could investigate the corpus to your heart’s content. I would suggest exploring “patch”, “vulnerability”, “adobe”, “breach” & “malware” (for starters).

Jason’s implementation is nothing short of beautiful. He uses SVG text tspans to make the individual text elements not just selectable but easily scaleable with browser window resize events.

The actual [word tree D3 javascript code](http://www.jasondavies.com/wordtree/wordtree.js?20130312.1) shows just how powerful the combination of the language and @mbostock’s library is. He has, in essence, built a completely cross-platform tokenizer and interactive visualization tool in ~340 lines of javascript. Working your way through that code through to understanding will really help improve your D3 skills.

rud.is