Author Archives: hrbrmstr

Don't look at me…I do what he does — just slower. #rstats avuncular • ?Resistance Fighter • Cook • Christian • [Master] Chef des Données de Sécurité @ @rapid7

Despite being on holiday, I had a spare hour to refactor the code (@mrshrbrmstr was joining the 1% in the hotel spa). It’s up on github and now sports a spiffy JSON-format config file. You now must execute the slopegraph.py script with a “--config FILENAME” argument. The configuration file lets you specify the “theme” as well as the input file and output format (you can only use PDF for the moment).

Here’s a sample config file included in the github push (there’s another one there too):

    {

        "font_family" : "Palatino",
        "font_size" : "20",

        "x_margin" : "20",
        "y_margin" : "30",

        "line_width" : "0.5",

        "background_color" : "DEC299",
        "label_color" : "687D64",
        "value_color" : "949258",
        "slope_color" : "61514C",

        "value_format_string" : "%2d",

        "input" : "television.csv",
        "output" : "television",
        "format" : "pdf"

    }
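
As a rough sketch of how a file like this might be checked before use (the `load_config` helper and the `REQUIRED_KEYS` / `NUMERIC_KEYS` names are my own illustration, not part of slopegraph.py, and the snippet is written for Python 3), note that the numeric settings arrive as JSON strings and need converting:

```python
import json

# Hypothetical helper and key lists; illustrative only, not from slopegraph.py.
REQUIRED_KEYS = {
    "font_family", "font_size", "x_margin", "y_margin", "line_width",
    "background_color", "label_color", "value_color", "slope_color",
    "value_format_string", "input", "output", "format",
}
NUMERIC_KEYS = ("font_size", "x_margin", "y_margin", "line_width")


def load_config(text):
    """Parse a JSON theme string and convert the numeric settings to floats."""
    config = json.loads(text)
    missing = sorted(REQUIRED_KEYS - set(config))
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    for key in NUMERIC_KEYS:
        config[key] = float(config[key])  # the file stores numbers as strings
    return config


sample = """{
  "font_family": "Palatino", "font_size": "20",
  "x_margin": "20", "y_margin": "30", "line_width": "0.5",
  "background_color": "DEC299", "label_color": "687D64",
  "value_color": "949258", "slope_color": "61514C",
  "value_format_string": "%2d",
  "input": "television.csv", "output": "television", "format": "pdf"
}"""

config = load_config(sample)
# The sprintf-like format string is applied with the % operator.
print(config["font_size"], repr(config["value_format_string"] % 7))
```

Failing early on a missing key is a design choice of this sketch; the script itself simply indexes into the parsed dictionary.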

Included in the refactor is the ability to use a sprintf-like format string for the label value output to make the slopegraphs a tad prettier. Also included with the refactor is a new limitation of the CSV file requiring a

"LABEL, VALUE, VALUE"

format in preparation for support for multiple columns. As @jayjacobs said to me, it’s easy to reformat data into the CSV file format, and, he’s right (as usual).
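
A minimal sketch of that kind of reformatting, assuming a wider source CSV with a header row. The function names, the column indexes and the sample data are all made up for illustration, and the snippet targets Python 3:

```python
import csv
import io


def to_slopegraph_rows(source_text, label_col, start_col, end_col, skip_header=True):
    """Return [label, start, end] rows picked from a wider CSV."""
    reader = csv.reader(io.StringIO(source_text))
    if skip_header:
        next(reader, None)
    rows = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        rows.append([row[label_col].strip(),
                     float(row[start_col]),
                     float(row[end_col])])
    return rows


def rows_to_csv(rows):
    """Serialize the rows back to LABEL,VALUE,VALUE text."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Invented sample data, purely for illustration.
wide = "country,a,b,c\nAlpha,1,70.5,75\nBeta,2,68,72.25\n"
rows = to_slopegraph_rows(wide, label_col=0, start_col=2, end_col=3)
print(rows_to_csv(rows))
```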

Plans for the next revision include:

  • Specifying a transparent background
  • Specifying PDF|PS|SVG|PNG format output
  • Allowing for an arbitrary number of columns for the slopegraph
  • Optional column labels as well as slopegraph title (with theming)
  • Line color change by slope up/same/down value (will most likely be pushed out, tho)

Here’s the whole source:

    import csv
    import cairo
    import argparse
    import json


    def split(input, size):
        # break a string into chunks of `size` characters ("DEC299" -> ["DE", "C2", "99"])
        return [input[start:start+size] for start in range(0, len(input), size)]


    class Slopegraph:

        SLOPEGRAPH_CANVAS_SIZE = 300

        def readCSV(self, filename):

            # 'rb' is the mode the Python 2 csv module expects
            with open(filename, 'rb') as csvFile:

                slopeReader = csv.reader(csvFile, delimiter=',', quotechar='"')

                for row in slopeReader:

                    # add chosen values (need start/end for each CSV row) to the final plotting array.

                    lab = row[0] # label
                    beg = float(row[1]) # left vals
                    end = float(row[2]) # right vals

                    self.pairs.append( (beg, end) )

                    # combine labels of common values into one string

                    if beg in self.starts:
                        self.starts[beg] = self.starts[beg] + "; " + lab
                    else:
                        self.starts[beg] = lab

                    if end in self.ends:
                        self.ends[end] = self.ends[end] + "; " + lab
                    else:
                        self.ends[end] = lab


        def sortKeys(self):

            # sort all the values (in the event the CSV wasn't) so
            # we can determine the smallest increment we need to use
            # when stacking the labels and plotting points

            self.startSorted = [(k, self.starts[k]) for k in sorted(self.starts)]
            self.endSorted = [(k, self.ends[k]) for k in sorted(self.ends)]

            # start "infinitely" large; any real gap between neighboring values is smaller
            self.delta = float("inf")

            self.startKeys = sorted(self.starts.keys())
            for i in range(len(self.startKeys) - 1):
                currDelta = float(self.startKeys[i+1]) - float(self.startKeys[i])
                if (currDelta < self.delta):
                    self.delta = currDelta

            self.endKeys = sorted(self.ends.keys())
            for i in range(len(self.endKeys) - 1):
                currDelta = float(self.endKeys[i+1]) - float(self.endKeys[i])
                if (currDelta < self.delta):
                    self.delta = currDelta


        def findExtremes(self):

            # we also need to find the absolute min & max values
            # so we know how to scale the plots

            self.lowest = min(self.startKeys)
            if (min(self.endKeys) < self.lowest) : self.lowest = min(self.endKeys)

            self.highest = max(self.startKeys)
            if (max(self.endKeys) > self.highest) : self.highest = max(self.endKeys)

            self.delta = float(self.delta)
            self.lowest = float(self.lowest)
            self.highest = float(self.highest)


        def calculateExtents(self, filename, format, valueFormatString):

            # temporary surface, only used to measure text with the chosen font

            surface = cairo.PDFSurface (filename, 8.5*72, 11*72)
            cr = cairo.Context (surface)
            cr.save()
            cr.select_font_face(self.FONT_FAMILY, cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
            cr.set_font_size(self.FONT_SIZE)
            cr.set_line_width(self.LINE_WIDTH)

            # find the *real* maximum label width (not just based on number of chars)

            maxLabelWidth = 0
            maxNumWidth = 0

            for k in sorted(self.startKeys):
                s1 = self.starts[k]
                xbearing, ybearing, self.sWidth, self.sHeight, xadvance, yadvance = (cr.text_extents(s1))
                if (self.sWidth > maxLabelWidth) : maxLabelWidth = self.sWidth
                xbearing, ybearing, self.startMaxLabelWidth, startMaxLabelHeight, xadvance, yadvance = (cr.text_extents(valueFormatString % (k)))
                if (self.startMaxLabelWidth > maxNumWidth) : maxNumWidth = self.startMaxLabelWidth

            self.sWidth = maxLabelWidth
            self.startMaxLabelWidth = maxNumWidth

            maxLabelWidth = 0
            maxNumWidth = 0

            for k in sorted(self.endKeys):
                e1 = self.ends[k]
                xbearing, ybearing, self.eWidth, eHeight, xadvance, yadvance = (cr.text_extents(e1))
                if (self.eWidth > maxLabelWidth) : maxLabelWidth = self.eWidth
                xbearing, ybearing, self.endMaxLabelWidth, endMaxLabelHeight, xadvance, yadvance = (cr.text_extents(valueFormatString % (k)))
                if (self.endMaxLabelWidth > maxNumWidth) : maxNumWidth = self.endMaxLabelWidth

            self.eWidth = maxLabelWidth
            self.endMaxLabelWidth = maxNumWidth

            cr.restore()
            cr.show_page()
            surface.finish()

            # page size: margins + labels + values + plotting area, left to right

            self.width = (self.X_MARGIN + self.sWidth + self.SPACE_WIDTH +
                          self.startMaxLabelWidth + self.SPACE_WIDTH +
                          self.SLOPEGRAPH_CANVAS_SIZE + self.SPACE_WIDTH +
                          self.endMaxLabelWidth + self.SPACE_WIDTH +
                          self.eWidth + self.X_MARGIN)
            self.height = (self.Y_MARGIN * 2) + (((self.highest - self.lowest) / self.delta) * self.LINE_HEIGHT)


        def makeSlopegraph(self, filename, config):

            # colors come in as "RRGGBB" hex strings; Cairo wants 0-1 floats

            (lab_r,lab_g,lab_b) = split(config["label_color"],2)
            (val_r,val_g,val_b) = split(config["value_color"],2)
            (line_r,line_g,line_b) = split(config["slope_color"],2)
            (bg_r,bg_g,bg_b) = split(config["background_color"],2)

            LAB_R = (int(lab_r, 16)/255.0)
            LAB_G = (int(lab_g, 16)/255.0)
            LAB_B = (int(lab_b, 16)/255.0)

            VAL_R = (int(val_r, 16)/255.0)
            VAL_G = (int(val_g, 16)/255.0)
            VAL_B = (int(val_b, 16)/255.0)

            LINE_R = (int(line_r, 16)/255.0)
            LINE_G = (int(line_g, 16)/255.0)
            LINE_B = (int(line_b, 16)/255.0)

            BG_R = (int(bg_r, 16)/255.0)
            BG_G = (int(bg_g, 16)/255.0)
            BG_B = (int(bg_b, 16)/255.0)

            surface = cairo.PDFSurface (filename, self.width, self.height)
            cr = cairo.Context(surface)

            cr.save()

            cr.select_font_face(self.FONT_FAMILY, cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
            cr.set_font_size(self.FONT_SIZE)

            cr.set_line_width(self.LINE_WIDTH)

            cr.set_source_rgb(BG_R,BG_G,BG_B)
            cr.rectangle(0,0,self.width,self.height)
            cr.fill()

            # draw start labels at the correct positions

            valueFormatString = config["value_format_string"]

            for k in sorted(self.startKeys):

                val = float(k)
                label = self.starts[k]
                xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))
                xbearing, ybearing, kWidth, kHeight, xadvance, yadvance = (cr.text_extents(valueFormatString % (val)))

                cr.set_source_rgb(LAB_R,LAB_G,LAB_B)
                cr.move_to(self.X_MARGIN + (self.sWidth - lWidth), self.Y_MARGIN + (self.highest - val) * self.LINE_HEIGHT * (1/self.delta))
                cr.show_text(label)

                cr.set_source_rgb(VAL_R,VAL_G,VAL_B)
                cr.move_to(self.X_MARGIN + self.sWidth + self.SPACE_WIDTH + (self.startMaxLabelWidth - kWidth), self.Y_MARGIN + (self.highest - val) * self.LINE_HEIGHT * (1/self.delta))
                cr.show_text(valueFormatString % (val))

                cr.stroke()

            # draw end labels at the correct positions

            for k in sorted(self.endKeys):

                val = float(k)
                label = self.ends[k]
                xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))

                cr.set_source_rgb(VAL_R,VAL_G,VAL_B)
                cr.move_to(self.width - self.X_MARGIN - self.SPACE_WIDTH - self.eWidth - self.SPACE_WIDTH - self.endMaxLabelWidth, self.Y_MARGIN + (self.highest - val) * self.LINE_HEIGHT * (1/self.delta))
                cr.show_text(valueFormatString % (val))

                cr.set_source_rgb(LAB_R,LAB_G,LAB_B)
                cr.move_to(self.width - self.X_MARGIN - self.SPACE_WIDTH - self.eWidth, self.Y_MARGIN + (self.highest - val) * self.LINE_HEIGHT * (1/self.delta))
                cr.show_text(label)

                cr.stroke()

            # do the actual plotting

            cr.set_line_width(self.LINE_WIDTH)
            cr.set_source_rgb(LINE_R, LINE_G, LINE_B)

            for s1,e1 in self.pairs:
                cr.move_to(self.X_MARGIN + self.sWidth + self.SPACE_WIDTH + self.startMaxLabelWidth + self.LINE_START_DELTA, self.Y_MARGIN + (self.highest - s1) * self.LINE_HEIGHT * (1/self.delta) - self.LINE_HEIGHT/4)
                cr.line_to(self.width - self.X_MARGIN - self.eWidth - self.SPACE_WIDTH - self.endMaxLabelWidth - self.LINE_START_DELTA, self.Y_MARGIN + (self.highest - e1) * self.LINE_HEIGHT * (1/self.delta) - self.LINE_HEIGHT/4)
                cr.stroke()

            cr.restore()
            cr.show_page()
            surface.finish()


        def __init__(self, config):

            # per-instance containers (class-level dicts/lists would be shared by every instance)

            self.starts = {} # starting "points"
            self.ends = {} # ending "points"
            self.pairs = [] # base pair array for the final plotting

            # a couple methods need these so make them local to the class

            self.FONT_FAMILY = config["font_family"]
            self.LINE_WIDTH = float(config["line_width"])
            self.X_MARGIN = float(config["x_margin"])
            self.Y_MARGIN = float(config["y_margin"])
            self.FONT_SIZE = float(config["font_size"])
            self.SPACE_WIDTH = self.FONT_SIZE / 2.0
            self.LINE_HEIGHT = self.FONT_SIZE + (self.FONT_SIZE / 2.0)
            self.LINE_START_DELTA = 1.5*self.SPACE_WIDTH

            OUTPUT_FILE = config["output"] + "." + config["format"]

            # process the values & make the slopegraph

            self.readCSV(config["input"])
            self.sortKeys()
            self.findExtremes()
            self.calculateExtents(OUTPUT_FILE, config["format"], config["value_format_string"])
            self.makeSlopegraph(OUTPUT_FILE, config)


    def main():

        parser = argparse.ArgumentParser(description="Creates a slopegraph from a CSV source")
        parser.add_argument("--config", required=True,
                            help="config file name to use for slopegraph creation")
        args = parser.parse_args()

        # --config is required, so args.config is always set here

        with open(args.config) as json_data:
            config = json.load(json_data)

        Slopegraph(config)

        return 0


    if __name__ == "__main__":
        main()
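
Stripped of the Cairo calls, the vertical layout in the source above comes down to two small calculations: find the smallest gap between neighboring values, then scale each value's distance from the top by that gap. Here is a standalone sketch (Python 3). The helper names and sample numbers are mine, and the fallback of 1.0 for a single value is an assumption rather than anything the script does:

```python
def smallest_gap(values):
    """Smallest difference between neighboring distinct values."""
    ordered = sorted(set(values))
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return min(gaps) if gaps else 1.0  # assumed fallback for a single value


def y_position(value, highest, delta, line_height, y_margin):
    """Distance from the top of the page for a given data value."""
    return y_margin + (highest - value) * line_height / delta


# Invented sample values, purely for illustration.
starts = [10.0, 12.0, 15.0]
ends = [11.0, 12.0, 18.0]

# The script checks the start column and the end column separately.
delta = min(smallest_gap(starts), smallest_gap(ends))
highest = max(starts + ends)
LINE_HEIGHT = 30.0  # font size 20 * 1.5, matching the constructor
Y_MARGIN = 30.0

for v in sorted(set(starts + ends), reverse=True):
    print(v, y_position(v, highest, delta, LINE_HEIGHT, Y_MARGIN))
```

Because one text line is reserved per smallest gap, two labels in the same column can never land closer together than `LINE_HEIGHT`.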

In the previous installment, a foundation was laid for “parameterizing” fonts, colors and overall slopegraph size. However, a big failing in all this code (up until now) was the reliance on character string length to determine label width. When working with fonts, the font metrics are more important since a lowercase ‘l’ will have a smaller font width than an uppercase ‘D’. So, while ‘llll’ is “longer” than ‘DDD’, it may not be wider (especially in a sans-serif font, but likely in any font):

To solve this problem, we need to use a temporary Cairo surface to compute font metrics for each label & value so we know the maximum width of both for the starting and ending points. It’s a simple concept & calculation, but very important to ensure everything lines up well.

    # find the *real* maximum label width (not just based on number of chars)

    maxLabelWidth = 0
    maxNumWidth = 0

    for k in sorted(startKeys):
        s1 = starts[k]
        xbearing, ybearing, sWidth, sHeight, xadvance, yadvance = (cr.text_extents(s1))
        if (sWidth > maxLabelWidth) : maxLabelWidth = sWidth
        xbearing, ybearing, startMaxLabelWidth, startMaxLabelHeight, xadvance, yadvance = (cr.text_extents(str(k)))
        if (startMaxLabelWidth > maxNumWidth) : maxNumWidth = startMaxLabelWidth

    sWidth = maxLabelWidth
    startMaxLabelWidth = maxNumWidth

    # reset both maximums before measuring the right-hand side
    maxLabelWidth = 0
    maxNumWidth = 0

    for k in sorted(endKeys):
        e1 = ends[k]
        xbearing, ybearing, eWidth, eHeight, xadvance, yadvance = (cr.text_extents(e1))
        if (eWidth > maxLabelWidth) : maxLabelWidth = eWidth
        xbearing, ybearing, endMaxLabelWidth, endMaxLabelHeight, xadvance, yadvance = (cr.text_extents(str(k)))
        if (endMaxLabelWidth > maxNumWidth) : maxNumWidth = endMaxLabelWidth

    eWidth = maxLabelWidth
    endMaxLabelWidth = maxNumWidth

I tossed some “anomalies” into the sample data set to show both how adaptable the vertical scale is as well as demonstrate the label alignments:

Updates are in github.

On the heels of last evening’s release of Slopegraphs in Python post comes some minor tweaks:

  • Complete alignment control of labels & values
  • Colors (for background, lines, labels & values) — I picked a random pattern from Adobe’s Kuler
  • A font change (to prove width calculations work)

…and a new example slopegraph:

As promised, the latest revisions are in github.

Some notes for aspiring Python/Cairo hackers:

  • RGB colors are 0-1 in Cairo, so divide your 0-255 values by 255 to get the corresponding Cairo value and make sure you’re doing float arithmetic in Python
  • It turns out simple Cairo font rendering (i.e. non-Pango) does not interpret well in Illustrator from a Cairo PDF surface, so if you do plan on post-processing the slopegraphs, use a postscript/EPS surface:

        surface = cairo.PSSurface ("slopegraph.ps", width, height)
        surface.set_eps(True)

    (I’ll be incorporating an output option as this gets closer to 1.0)

  • If you do plan on using this at all, grab the Bitstream Vera Serif font (link goes to FontSquirrel, but you can find it almost everywhere as it’s free)
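
The color note above can be sketched as a small helper (Python 3). The `hex_to_cairo_rgb` name is hypothetical and not something in the github code, and accepting a leading `#` is my own addition:

```python
def hex_to_cairo_rgb(hex_color):
    """Convert 'RRGGBB' (optionally prefixed with '#') to floats in 0..1."""
    hex_color = hex_color.lstrip("#")
    if len(hex_color) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_color)
    # Dividing by 255.0 (not 255) keeps this float arithmetic on Python 2 as well.
    return tuple(int(hex_color[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


r, g, b = hex_to_cairo_rgb("DEC299")
print(round(r, 3), round(g, 3), round(b, 3))
```

The three floats can then be handed to `cr.set_source_rgb(r, g, b)`.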

(Sing to the tune of “Fame – Remember My name” …
Here’s some YouTube background music)

They’ve been lookin’ at me, but they never did see—
no, no trace of me did they detect;
Gave me time to collect all the data at rest.
I’ve got so much in me: LUA, zlib & sqlite3–
I can infect the USB in your hand. Don’t you know who I am?
Remember my name [FLAME]

I’ve been around forever. Capturing packets on the fly. [HIGH]
My botnet is comin’ together. When researchers see me they’ll cry. [FLAME]
I even infected Lebanon. Lit up the Middle East with my FLAME. [FLAME]
I’ve been around forever. They will remember my name.

[REMEMBER, REMEMBER, REMEMBER, REMEMBER, REMEMBER]

I’m not packed up too tight (I take up 20 megabytes).
With no kill date, I’ll never stop.
Give me your mic and I’ll take all you’ve got to give.
Finding me will be tough. Too much (you’ll say ‘enough’!)
I can ride your net but not break (it). Yeah, I got what it takes.

FLAME!

I’ve been around forever. Capturing packets on the fly. [HIGH]
My botnet is comin’ together. When researchers see me they’ll cry. [FLAME]
I even infected Lebanon. Lit up the Middle East with my FLAME. [FLAME]
I’ve been around forever. They will remember my name.

[REMEMBER, REMEMBER, REMEMBER, REMEMBER, REMEMBER, REMEMBER]

FLAME!…

(NOTE: You can keep up with progress best at github, but can always search on “slopegraph” here or just hit the tag page: “slopegraph” regularly)

I’ve been a bit obsessed with slopegraphs (a.k.a “Tufte table-chart”) of late and very dissatisfied with the lack of tools to make this particular visualization tool more prevalent. While my ultimate goal is to have a user-friendly modern web app or platform app that’s as easy as a “drag & drop” of a CSV file, this first foray will require a bit (not much, really!) of elbow grease to be used.

For those who want to get right to the code, head on over to github and have a look (I’ll post all updates there). Setup, sample & source are also below.

First, you’ll need a modern Python install. I did all the development on Mac OS Mountain Lion (beta) with the stock Python 2.7 build. You’ll also need the Cairo 2D graphics library which built and installed perfectly from source, even on ML, so it should work fine for you. If you want something besides PDF rendering, you may need additional libraries, but PDF is decent for hi-res embedding, converting to jpg/png (see below) and tweaking in programs like Illustrator.

If you search for “Gender Comparisons” in the comments on this post at Tufte’s blog, you’ll see what I was trying to reproduce in this bit of skeleton code (below). By modifying the CSV file you’re using [line 21] and then which fields are relevant [lines 45-47] you should be able to make your own basic slopegraphs without much trouble.

If you catch any glitches, add some tweak or have a slopegraph “wish list”, let me know here, twitter (@hrbrmstr) or over at github.

  1. # slopegraph.py
  2. #
  3. # Author: Bob Rudis (@hrbrmstr)
  4. #
  5. # Basic Python skeleton to do simple two value slopegraphs
  6. # with output to PDF (most useful form for me...Cairo has tons of options)
  7. #
  8. # Find out more about & download Cairo here:
  9. # http://cairographics.org/
  10. #
  11. # 2012-05-28 - 0.5 - Initial github release. Still needs some polish
  12. #
  13.  
  14. import csv
  15. import cairo
  16.  
  17. # original data source: http://www.calvin.edu/~stob/data/television.csv
  18.  
  19. # get a CSV file to work with 
  20.  
  21. slopeReader = csv.reader(open('television.csv', 'rb'), delimiter=',', quotechar='"')
  22.  
  23. starts = {} # starting "points"
  24. ends = {} # ending "points"
  25.  
  26. # Need to refactor label max width into font calculations
  27. # as there's no guarantee the longest (character-wise)
  28. # label is the widest one
  29.  
  30. startLabelMaxLen = 0
  31. endLabelMaxLen = 0
  32.  
  33. # build a base pair array for the final plotting
  34. # wastes memory, but simplifies plotting
  35.  
  36. pairs = []
  37.  
  38. for row in slopeReader:
  39.  
  40. 	# add chosen values (need start/end for each CSV row)
  41. 	# to the final plotting array. Try this sample with 
  42. 	# row[1] (average life span) instead of row[5] to see some
  43. 	# of the scaling in action
  44.  
  45. 	lab = row[0] # label
  46. 	beg = row[5] # male life span
  47. 	end = row[4] # female life span
  48.  
  49. 	pairs.append( (float(beg), float(end)) )
  50.  
  51. 	# combine labels of common values into one string
  52. 	# also (as noted previously, inappropriately) find the
  53. 	# longest one
  54.  
  55. 	if beg in starts:
  56. 		starts[beg] = starts[beg] + "; " + lab
  57. 	else:
  58. 		starts[beg] = lab
  59.  
  60. 	if ((len(starts[beg]) + len(beg)) > startLabelMaxLen):
  61. 		startLabelMaxLen = len(starts[beg]) + len(beg)
  62. 		s1 = starts[beg]
  63.  
  64.  
  65. 	if end in ends:
  66. 		ends[end] = ends[end] + "; " + lab
  67. 	else:
  68. 		ends[end] = lab
  69.  
  70. 	if ((len(ends[end]) + len(end)) > endLabelMaxLen):
  71. 		endLabelMaxLen = len(ends[end]) + len(end)
  72. 		e1 = ends[end]
  73.  
  74. # sort all the values (in the event the CSV wasn't) so
  75. # we can determine the smallest increment we need to use
  76. # when stacking the labels and plotting points
  77.  
  78. startSorted = [(k, starts[k]) for k in sorted(starts)]
  79. endSorted = [(k, ends[k]) for k in sorted(ends)]
  80.  
  81. startKeys = sorted(starts.keys())
  82. delta = float("inf") # start large; any real gap between values will be smaller
  83. for i in range(len(startKeys)):
  84. 	if (i+1 <= len(startKeys)-1):
  85. 		currDelta = float(startKeys[i+1]) - float(startKeys[i])
  86. 		if (currDelta < delta):
  87. 			delta = currDelta
  88.  
  89. endKeys = sorted(ends.keys())
  90. for i in range(len(endKeys)):
  91. 	if (i+1 <= len(endKeys)-1):
  92. 		currDelta = float(endKeys[i+1]) - float(endKeys[i])
  93. 		if (currDelta < delta):
  94. 			delta = currDelta
  95.  
  96. # we also need to find the absolute min & max values
  97. # so we know how to scale the plots
  98.  
  99. lowest = min(startKeys)
  100. if (min(endKeys) < lowest) : lowest = min(endKeys)
  101.  
  102. highest = max(startKeys)
  103. if (max(endKeys) > highest) : highest = max(endKeys)
  104.  
  105. # just making sure everything's a number
  106. # probably should move some of this to the csv reader section
  107.  
  108. delta = float(delta)
  109. lowest = float(lowest)
  110. highest = float(highest)
  111. startLabelMaxLen = float(startLabelMaxLen)
  112. endLabelMaxLen = float(endLabelMaxLen)
  113.  
  114. # setup line width and font-size for the Cairo
  115. # you can change these and the constants should
  116. # scale the plots accordingly
  117.  
  118. FONT_SIZE = 9
  119. LINE_WIDTH = 0.5
  120.  
  121. # there has to be a better way to get a base "surface"
  122. # to do font calculations besides this. we're just making
  123. # this Cairo surface so we know the max pixel width 
  124. # (font extents) of the labels in order to scale the graph
  125. # accurately (since width/height are based, in part, on it)
  126.  
  127. filename = 'slopegraph.pdf'
  128. surface = cairo.PDFSurface (filename, 8.5*72, 11*72)
  129. cr = cairo.Context (surface)
  130. cr.save()
  131. cr.select_font_face("Sans", cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
  132. cr.set_font_size(FONT_SIZE)
  133. cr.set_line_width(LINE_WIDTH)
  134. xbearing, ybearing, sWidth, sHeight, xadvance, yadvance = (cr.text_extents(s1))
  135. xbearing, ybearing, eWidth, eHeight, xadvance, yadvance = (cr.text_extents(e1))
  136. xbearing, ybearing, spaceWidth, spaceHeight, xadvance, yadvance = (cr.text_extents(" "))
  137. cr.restore()
  138. cr.show_page()
  139. surface.finish()
  140.  
  141. # setup some more constants for plotting
  142. # all of these are malleable and should cascade nicely
  143.  
  144. X_MARGIN = 10
  145. Y_MARGIN = 10
  146. SLOPEGRAPH_CANVAS_SIZE = 200
  147. spaceWidth = 5
  148. LINE_HEIGHT = 15
  149. PLOT_LINE_WIDTH = 0.5
  150.  
  151. width = (X_MARGIN * 2) + sWidth + spaceWidth + SLOPEGRAPH_CANVAS_SIZE + spaceWidth + eWidth
  152. height = (Y_MARGIN * 2) + (((highest - lowest + 1) / delta) * LINE_HEIGHT)
  153.  
  154. # create the real Cairo surface/canvas
  155.  
  156. filename = 'slopegraph.pdf'
  157. surface = cairo.PDFSurface (filename, width, height)
  158. cr = cairo.Context (surface)
  159.  
  160. cr.save()
  161.  
  162. cr.select_font_face("Sans", cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
  163. cr.set_font_size(FONT_SIZE)
  164.  
  165. cr.set_line_width(LINE_WIDTH)
  166. cr.set_source_rgba (0, 0, 0) # need to make this a constant
  167.  
  168. # draw start labels at the correct positions
  169. # cheating a bit here as the code doesn't (yet) line up 
  170. # the actual data values
  171.  
  172. for k in sorted(startKeys):
  173.  
  174. 	label = starts[k]
  175. 	xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))
  176.  
  177. 	val = float(k)
  178.  
  179. 	cr.move_to(X_MARGIN + (sWidth - lWidth), Y_MARGIN + (highest - val) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  180. 	cr.show_text(label + " " + k)
  181. 	cr.stroke()
  182.  
  183. # draw end labels at the correct positions
  184. # cheating a bit here as the code doesn't (yet) line up 
  185. # the actual data values
  186.  
  187. for k in sorted(endKeys):
  188.  
  189. 	label = ends[k]
  190. 	xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))
  191.  
  192. 	val = float(k)
  193.  
  194. 	cr.move_to(width - X_MARGIN - eWidth - (4*spaceWidth), Y_MARGIN + (highest - val) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  195. 	cr.show_text(k + " " + label)
  196. 	cr.stroke()
  197.  
  198. # do the actual plotting
  199.  
  200. cr.set_line_width(PLOT_LINE_WIDTH)
  201. cr.set_source_rgba (0.75, 0.75, 0.75) # need to make this a constant
  202.  
  203. for s1,e1 in pairs:
  204. 	cr.move_to(X_MARGIN + sWidth + spaceWidth + 20, Y_MARGIN + (highest - s1) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  205. 	cr.line_to(width - X_MARGIN - eWidth - spaceWidth - 20, Y_MARGIN + (highest - e1) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  206. 	cr.stroke()
  207.  
  208. cr.restore()
  209. cr.show_page()
  210. surface.finish()
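
The label-merging step in the skeleton can also be tried on its own, without Cairo. This sketch (Python 3) differs from the skeleton in one deliberate way: it converts the values to floats before using them as dictionary keys, so that sorting is numeric rather than alphabetical. The `collect` name, the default column indexes and the sample rows are all invented for illustration:

```python
import csv
import io


def collect(rows, label_col=0, start_col=1, end_col=2):
    """Group labels that share a start (or end) value; also return the pairs."""
    starts, ends, pairs = {}, {}, []
    for row in rows:
        label = row[label_col]
        beg, end = float(row[start_col]), float(row[end_col])
        pairs.append((beg, end))
        for table, key in ((starts, beg), (ends, end)):
            # combine labels of common values into one string
            table[key] = table[key] + "; " + label if key in table else label
    return starts, ends, pairs


# Invented sample rows, purely for illustration.
data = "Alpha,9.5,10.5\nBeta,10.5,10.5\nGamma,9.5,12\n"
starts, ends, pairs = collect(csv.reader(io.StringIO(data)))

print(sorted(starts.items()))
print(sorted(ends.items()))

# String keys would sort alphabetically, putting "10.5" ahead of "9.5".
print(sorted(["9.5", "10.5"]))
```

Changing `start_col` and `end_col` plays the same role as editing the `row[...]` indexes in the skeleton.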

I posted a link to Twitter earlier on a recent discovery of the ability to clone RSA SecurID soft tokens:

https://twitter.com/hrbrmstr/status/204908233645764609

It (rightfully so) received some critical responses by @wh1t3rabbit & @wikidsystems since, apart from what the hypesters may say, this is a low-risk weakness.

Think about it. Just looking at the two most likely threat actors & actions: an insider trying to siphon off soft tokens and an external attacker using crafted malware to grab soft tokens. The former (most likely) knows your organization is using soft tokens (and probably has one herself). The latter is unlikely to just try to blanket siphon off soft tokens so they’ll have to do some research to target an organization (which costs time/money).

Once a victim (or set of victims) is identified, the cloning steps would have to be perfectly executed (and, I’m not convinced that’s a given). Let’s say that this is a given, though. Now both the insider and external agent have access to the bits to clone a token. It is easier for the insider to get that data, but the external attacker has to successfully exfiltrate it somehow (more complexity/time/cost).

To be useful, the attacker needs the user id, PIN and – in most implementations – a password. An insider would (most likely) know the user id (since she probably has one herself) but that data would require more time/effort/cost to the external attacker (think opportunistic keylogger/screenscraper with successful exfiltration). For both attackers, getting the password requires either social engineering or the use of a keylogger. Even then, there’s a time-limit of 90 days or less (since, if you’re using soft tokens, you probably have a 90 day password policy). That shrinks the amount of time the attack can be successful.

Now, both attackers need to know where this soft token can be used and have direct access to those systems. Again, probably easier for an insider and fairly costly for an external attacker.

Looking at this, there’s definitely a greater risk associated with an insider from this weakness than there is from an external party (as pointed out by the aforementioned twitter commentators). As @wikidsystems further pointed out, this also shows the inherent positives of multi-factor authentication :: you need far more component parts to execute a successful attack, making the whole thing very costly to obtain. Security economics FTW!
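
To put rough arithmetic behind the "more component parts" point: if every step has to succeed and the steps are treated as independent (a simplifying assumption), the overall chance is the product of the individual chances. Every probability in this Python sketch is an invented placeholder and not a measurement of anything:

```python
# All probabilities below are invented placeholders for illustration only.
steps = {
    "clone the soft token": 0.5,
    "exfiltrate the token data": 0.5,
    "learn the user id and PIN": 0.3,
    "capture a valid password": 0.3,
    "reach a system that accepts the token": 0.4,
}


def chain_success(probabilities):
    """Chance that every step succeeds, assuming the steps are independent."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


overall = chain_success(steps.values())
print("overall chance of success: %.4f" % overall)
```

Whatever numbers you plug in, the product is never larger than the weakest single step, which is the economic argument for adding factors.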

My comment has been that if using the TPM store for Windows-based SecurID soft token implementations negates this weakness, then why not do it? Does the added deployment & management complexity really cost that much?

In the end, I would categorize this weakness as a low risk to most organizations using soft tokens with a non-TPM storage configuration. Unless you know you’re a nation-state target (my opine for the origin of the attacker) – and, even then, you’re probably using hard tokens – far too many celestial bodies need to align for this weakness to be exploited successfully.

NOTE: This post was not meant to be a comprehensive risk assessment of the weakness and does not cover all attack scenarios. I left out many, including Windows desktop administrators and privileged script access. I was merely trying to do my part to counter whatever hype ensues from this weakness. Comments on those vectors or the analysis in general are most welcome.

I recently finished watching “Japan: Memoirs of a Secret Empire” [ Netflix | PBS | Amazon ]. It’s a three-part documentary that centers around the 16th & 17th centuries (where a vast majority of my favorite samurai movies are set).

In the third part there is a segment on the four-tier class system. It goes into some detail on how the decline of the samurai (due to a time of significant peace) and the increase in trade facilitated the erosion of societal taboos that previously prevented classes from mingling. Apart from the creation of a new merchant-inspired class and an increase in the frequenting of courtesans there was also the mention of “haiku clubs” where (quoting from the documentary) “members chose pen names to obscure their social rank. That way, the classes could mingle freely“.

In a similar way, Twitter (and many forums before it) is the great equalizer with the added similarity of brevity (and often a gravity well for some modern day haiku fanatics). I don’t have any statistics, but I do wonder if Twitter is having a more equalizing impact on societies where class and overt discrimination are still wildly prevalent (think India or Saudi Arabia).

as avatars tweet
modern words become equal
history repeats

I’ve been wanting to post this entry for a while, but I didn’t have the opportunity to compel an extra pair of hands to assist with some necessary, salient portions of it until tonight.

For those who were hoping Mountain Lion’s AirPlay would be a revolutionary step in the “your content, wherever you want it” battle, I fear you may be in for a bit of a disappointment. Quite by accident, I stumbled upon some eerie signs that location-aware video DRM will be alive and well as an integrated part of Apple’s forthcoming release of Mountain Lion.

Since I have a shiny, new 1080p Apple TV and a device capable of running Mountain Lion, I’ve been experimenting with the awesomeness that is AirPlay. Despite claims to the contrary, most of the time routing MacBook Pro Desktop video & system audio to the tiny black box of happiness works flawlessly (even more so since the recent Apple TV update). It’s been a treat to be able to play owned-backup copies of my favorite samurai videos (I should buy stock in Criterion) with subtitles via VLC.

However, on a lark I tried to play one of my Avengers : Earth’s Mightiest Heroes episodes (we subscribe via iTunes) using QuickTime Player and, much to my chagrin, discovered that there lies within the heart of the tame Mountain Lion, a DRM beast.

The easiest way to show this is with what happens when I try to take a fullscreen snapshot with Skitch as I’m playing the DRM-laden episode:

I managed to get #3 to help me record a video (albeit crappy) of what happens when I try to route the Desktop video to the living room TV via AirPlay: (youtube link)

In case it’s not obvious, the video plays fine on the Desktop (in QuickTime) prior to the AirPlay route, but goes equally as blank when the AirPlay device is chosen, yet reverts to playing when AirPlay is disabled.

This means the API hooks are there to prevent DRM-laden content from being used with AirPlay (or snapped via screen capture) and that, in turn, means your hopes of AirPlaying Hulu, Netflix and Amazon Video content may be dashed despite all of those services working now (in the betas).

Video is the last content area to understand the need to be open. Amazon & Apple sell untainted music and even Tor is going DRM free (joining #spiffy folks like O’Reilly Media).

I own that episode of The Avengers yet am not able to do with it as I please. Yes, I could have streamed it over the Internet from iCloud to the Apple TV or even routed it via the local iTunes to the Apple TV, but I wanted to use QuickTime (though, just for a test). What’s to stop Apple or other companies from requiring a special streaming license if you want the ability to use AirPlay or just disabling it altogether in favor of forcing you to use something as horrific as Google TV (full disclosure, I own a Logitech Google TV box, too)?

Combine these restrictions with the inevitable “you will only be able to use Apple-authorized apps in Mac OS” in a post-Mountain Lion release and your hopes of using VLC (or any other player that will not conform to draconian rules) to bypass this silliness will be equally as dashed as your naive AirPlay ones. If there’s no guarantee you’ll be able to get your content to the screen of your choice, why would you choose to remain legal (moral arguments notwithstanding)?

I hope, in the long run, Apple manages to figure out a sane, amenable solution to this silliness. In the meantime, I’m going to pop in a DVD and crank through some Godzilla flicks. At least I can be fairly certain that should work for quite a while longer.

Street sign photo via jbonnain