
Category Archives: Development

(NOTE: You can keep up with progress best at github, but can always search on “slopegraph” here or just hit the tag page: “slopegraph” regularly)

I’ve been a bit obsessed with slopegraphs (a.k.a. “Tufte table-chart”) of late and very dissatisfied with the lack of tools for making this particular visualization more prevalent. While my ultimate goal is to have a user-friendly modern web app or platform app that’s as easy as a “drag & drop” of a CSV file, this first foray requires a bit (not much, really!) of elbow grease to use.

For those who want to get right to the code, head on over to github and have a look (I’ll post all updates there). Setup, sample & source are also below.

First, you’ll need a modern Python install. I did all the development on OS X Mountain Lion (beta) with the stock Python 2.7 build. You’ll also need the Cairo 2D graphics library, which built and installed perfectly from source, even on ML, so it should work fine for you. If you want something besides PDF rendering, you may need additional libraries, but PDF is decent for hi-res embedding, converting to jpg/png (see below) and tweaking in programs like Illustrator.

If you search for “Gender Comparisons” in the comments on this post at Tufte’s blog, you’ll see what I was trying to reproduce in this bit of skeleton code (below). By modifying the CSV file you’re using [line 21] and then which fields are relevant [lines 45-47] you should be able to make your own basic slopegraphs without much trouble.

If you catch any glitches, add some tweak or have a slopegraph “wish list”, let me know here, on Twitter (@hrbrmstr) or over at github.

  1. # slopegraph.py
  2. #
  3. # Author: Bob Rudis (@hrbrmstr)
  4. #
  5. # Basic Python skeleton to do simple two value slopegraphs
  6. # with output to PDF (most useful form for me...Cairo has tons of options)
  7. #
  8. # Find out more about & download Cairo here:
  9. # http://cairographics.org/
  10. #
  11. # 2012-05-28 - 0.5 - Initial github release. Still needs some polish
  12. #
  13.  
  14. import csv
  15. import cairo
  16.  
  17. # original data source: http://www.calvin.edu/~stob/data/television.csv
  18.  
  19. # get a CSV file to work with 
  20.  
  21. slopeReader = csv.reader(open('television.csv', 'rb'), delimiter=',', quotechar='"')
  22.  
  23. starts = {} # starting "points"
  24. ends = {} # ending "points"
  25.  
  26. # Need to refactor label max width into font calculations
  27. # as there's no guarantee the longest (character-wise)
  28. # label is the widest one
  29.  
  30. startLabelMaxLen = 0
  31. endLabelMaxLen = 0
  32.  
  33. # build a base pair array for the final plotting
  34. # wastes memory, but simplifies plotting
  35.  
  36. pairs = []
  37.  
  38. for row in slopeReader:
  39.  
  40. 	# add chosen values (need start/end for each CSV row)
  41. 	# to the final plotting array. Try this sample with 
  42. 	# row[1] (average life span) instead of row[5] to see some
  43. 	# of the scaling in action
  44.  
  45. 	lab = row[0] # label
  46. 	beg = row[5] # male life span
  47. 	end = row[4] # female life span
  48.  
  49. 	pairs.append( (float(beg), float(end)) )
  50.  
  51. 	# combine labels of common values into one string
  52. 	# also (as noted previously, inappropriately) find the
  53. 	# longest one
  54.  
  55. 	if beg in starts:
  56. 		starts[beg] = starts[beg] + "; " + lab
  57. 	else:
  58. 		starts[beg] = lab
  59.  
  60. 	if ((len(starts[beg]) + len(beg)) > startLabelMaxLen):
  61. 		startLabelMaxLen = len(starts[beg]) + len(beg)
  62. 		s1 = starts[beg]
  63.  
  64.  
  65. 	if end in ends:
  66. 		ends[end] = ends[end] + "; " + lab
  67. 	else:
  68. 		ends[end] = lab
  69.  
  70. 	if ((len(ends[end]) + len(end)) > endLabelMaxLen):
  71. 		endLabelMaxLen = len(ends[end]) + len(end)
  72. 		e1 = ends[end]
  73.  
  74. # sort all the values (in the event the CSV wasn't) so
  75. # we can determine the smallest increment we need to use
  76. # when stacking the labels and plotting points
  77.  
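  78. startSorted = [(k, starts[k]) for k in sorted(starts, key=float)]
  79. endSorted = [(k, ends[k]) for k in sorted(ends, key=float)]
  80.  
  81. startKeys = sorted(starts.keys(), key=float) # keys are strings; sort them numerically
  82. delta = float("inf") # start high; any real gap between values is smaller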
  83. for i in range(len(startKeys)):
  84. 	if (i+1 <= len(startKeys)-1):
  85. 		currDelta = float(startKeys[i+1]) - float(startKeys[i])
  86. 		if (currDelta < delta):
  87. 			delta = currDelta
  88.  
  89. endKeys = sorted(ends.keys(), key=float) # keys are strings; sort them numerically
  90. for i in range(len(endKeys)):
  91. 	if (i+1 <= len(endKeys)-1):
  92. 		currDelta = float(endKeys[i+1]) - float(endKeys[i])
  93. 		if (currDelta < delta):
  94. 			delta = currDelta
  95.  
  96. # we also need to find the absolute min & max values
  97. # so we know how to scale the plots
  98.  
  99. # the keys are strings, so convert before comparing
  100. # (string comparison would rank "9.5" above "10.2")
  101. allKeys = [float(k) for k in startKeys + endKeys]
  102. lowest = min(allKeys)
  103. highest = max(allKeys)
  104.  
  105. # just making sure everything's a number
  106. # probably should move some of this to the csv reader section
  107.  
  108. delta = float(delta)
  109. lowest = float(lowest)
  110. highest = float(highest)
  111. startLabelMaxLen = float(startLabelMaxLen)
  112. endLabelMaxLen = float(endLabelMaxLen)
  113.  
  114. # setup line width and font-size for the Cairo
  115. # you can change these and the constants should
  116. # scale the plots accordingly
  117.  
  118. FONT_SIZE = 9
  119. LINE_WIDTH = 0.5
  120.  
  121. # there has to be a better way to get a base "surface"
  122. # to do font calculations besides this. we're just making
  123. # this Cairo surface so we know the max pixel width 
  124. # (font extents) of the labels in order to scale the graph
  125. # accurately (since width/height are based, in part, on it)
  126.  
  127. filename = 'slopegraph.pdf'
  128. surface = cairo.PDFSurface (filename, 8.5*72, 11*72)
  129. cr = cairo.Context (surface)
  130. cr.save()
  131. cr.select_font_face("Sans", cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
  132. cr.set_font_size(FONT_SIZE)
  133. cr.set_line_width(LINE_WIDTH)
  134. xbearing, ybearing, sWidth, sHeight, xadvance, yadvance = (cr.text_extents(s1))
  135. xbearing, ybearing, eWidth, eHeight, xadvance, yadvance = (cr.text_extents(e1))
  136. xbearing, ybearing, spaceWidth, spaceHeight, xadvance, yadvance = (cr.text_extents(" "))
  137. cr.restore()
  138. cr.show_page()
  139. surface.finish()
  140.  
  141. # setup some more constants for plotting
  142. # all of these are malleable and should cascade nicely
  143.  
  144. X_MARGIN = 10
  145. Y_MARGIN = 10
  146. SLOPEGRAPH_CANVAS_SIZE = 200
  147. spaceWidth = 5
  148. LINE_HEIGHT = 15
  149. PLOT_LINE_WIDTH = 0.5
  150.  
  151. width = (X_MARGIN * 2) + sWidth + spaceWidth + SLOPEGRAPH_CANVAS_SIZE + spaceWidth + eWidth
  152. height = (Y_MARGIN * 2) + (((highest - lowest + 1) / delta) * LINE_HEIGHT)
  153.  
  154. # create the real Cairo surface/canvas
  155.  
  156. filename = 'slopegraph.pdf'
  157. surface = cairo.PDFSurface (filename, width, height)
  158. cr = cairo.Context (surface)
  159.  
  160. cr.save()
  161.  
  162. cr.select_font_face("Sans", cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
  163. cr.set_font_size(FONT_SIZE)
  164.  
  165. cr.set_line_width(LINE_WIDTH)
  166. cr.set_source_rgba (0, 0, 0) # need to make this a constant
  167.  
  168. # draw start labels at the correct positions
  169. # cheating a bit here as the code doesn't (yet) line up 
  170. # the actual data values
  171.  
  172. for k in sorted(startKeys):
  173.  
  174. 	label = starts[k]
  175. 	xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))
  176.  
  177. 	val = float(k)
  178.  
  179. 	cr.move_to(X_MARGIN + (sWidth - lWidth), Y_MARGIN + (highest - val) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  180. 	cr.show_text(label + " " + k)
  181. 	cr.stroke()
  182.  
  183. # draw end labels at the correct positions
  184. # cheating a bit here as the code doesn't (yet) line up 
  185. # the actual data values
  186.  
  187. for k in sorted(endKeys):
  188.  
  189. 	label = ends[k]
  190. 	xbearing, ybearing, lWidth, lHeight, xadvance, yadvance = (cr.text_extents(label))
  191.  
  192. 	val = float(k)
  193.  
  194. 	cr.move_to(width - X_MARGIN - eWidth - (4*spaceWidth), Y_MARGIN + (highest - val) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  195. 	cr.show_text(k + " " + label)
  196. 	cr.stroke()
  197.  
  198. # do the actual plotting
  199.  
  200. cr.set_line_width(PLOT_LINE_WIDTH)
  201. cr.set_source_rgba (0.75, 0.75, 0.75) # need to make this a constant
  202.  
  203. for s1,e1 in pairs:
  204. 	cr.move_to(X_MARGIN + sWidth + spaceWidth + 20, Y_MARGIN + (highest - s1) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  205. 	cr.line_to(width - X_MARGIN - eWidth - spaceWidth - 20, Y_MARGIN + (highest - e1) * LINE_HEIGHT * (1/delta) + LINE_HEIGHT/2)
  206. 	cr.stroke()
  207.  
  208. cr.restore()
  209. cr.show_page()
  210. surface.finish()
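The vertical layout in the script comes down to two small calculations: find the smallest gap between adjacent values, then give that gap exactly one line height. Here is a minimal standalone sketch of that idea, written for Python 3 rather than the 2.7 used above. The function names and the sample numbers are made up for illustration and are not part of slopegraph.py or television.csv.

```python
def smallest_gap(values):
    """Smallest difference between adjacent distinct values."""
    ordered = sorted(set(values))
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # With fewer than two distinct values there is no gap; fall back to 1.0
    return min(gaps) if gaps else 1.0


def y_position(value, highest, delta, line_height=15, y_margin=10):
    """Vertical offset for a value; the highest value sits nearest the top."""
    return y_margin + (highest - value) * line_height / delta + line_height / 2


# Made-up sample values (hypothetical, not taken from television.csv)
male = [70.0, 68.5, 72.0]
female = [76.0, 74.5, 78.0]

# The script checks gaps within the start column and within the end column
delta = min(smallest_gap(male), smallest_gap(female))
highest = max(male + female)
```

Because the closest pair of values ends up exactly one LINE_HEIGHT apart, labels in the same column cannot overlap, at the cost of a very tall page when two values are nearly equal. Note that Python 3 division gives a half line height of 7.5, where the Python 2 script's integer division gives 7.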

I’ve been an unapologetic Alfred user since @hatlessec recommended it and have recently been cobbling together quick shell scripts that make life a bit easier.

The following ones – lip & rip – copy your local & remote IP addresses (respectively) to your clipboard and also display a Growl message (if you’re a Growl user).

Nothing really special about them. They are each one-liners and are easily customizable once you install them.

Download: liprip.zip

I’m on a “three things” motif for 2012, as it’s really difficult for most folks to focus on more than three core elements well. This is especially true for web developers as they have so much to contend with on a daily basis, whether it be new features, bug reports, user help requests or just ensuring proper caffeine levels are maintained.

In 2011, web sites took more hits than they ever have and, sadly, most attacks could have been prevented. I fear that the pastings will continue in 2012, but there are some steps you can take to help make your site less of a target.

Bookmark & Use OWASP’s Web Site Regularly

I’d feel a little sorry for hacked web sites if it weren’t for resources like OWASP, tools like IronBee and principles like Rugged being in abundance, with many smart folks associated with them being more than willing to offer counsel and advice.

If you run a web site or develop web applications and have not inhaled all the information OWASP has to provide, then you are engaging in the Internet equivalent of driving a Ford Pinto (the exploding kind) without seat belts, airbags, doors and a working dashboard console. There is so much good information and advice out there with solid examples that prove some truly effective security measures can really be implemented in a single line of code.

Make it a point to read, re-read and keep up to date on new articles and resources that OWASP provides. I know you also need to beat the competition to new features and crank out “x” lines of code per day, but you also need to do what it takes to avoid joining the ranks of those in DataLossDB.

Patch & Properly Configure Your Bootstrap Components

Your web app uses frameworks, runs in some type of web container and sits on top of an operating system. Unfortunately, vulnerabilities pop up in each of those components from time to time and you need to keep on top of those and determine which ones you will patch and when. Sites like Secunia and US-CERT aggregate patch information pretty well for operating systems and popular server software components, but it’s best to also subscribe to release and security mailing lists for your frameworks and other bootstrap components.

Configuring your bootstrap environment securely is also important and you can use handy guides over at the Center for Internet Security and the National Vulnerability Database (which is also good for vulnerability reports). The good news is that you probably only need to double-check this a couple times a year and can also integrate secure configuration baselines into tools like Chef & Puppet.

Secure Data Appropriately

I won’t belabor this point (especially if you promise to read the OWASP guidance on this thoroughly) but you need to look at the data being stored and how it is accessed and determine the most appropriate way to secure it. Don’t store more than you absolutely need to. Protect passwords with a salted, deliberately slow hash (bcrypt or PBKDF2, for example) rather than a plain MD5 hash, and encrypt other sensitive data. Don’t store any credit card numbers (really, just don’t) or tokenize them if you do (but you really don’t). Keep data off the front-end environment and watch the database and application logs with a service like Loggly (to see if there’s anything fishy going on).
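As one illustration of going beyond a plain MD5 hash for passwords, here is a minimal Python 3 sketch using PBKDF2 from the standard library. The function names, the 16-byte salt and the iteration count are assumptions chosen for the example, not recommendations from OWASP or from this post; check current guidance before picking a work factor.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor (an assumption); tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a per-user random salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

You would store the salt and digest alongside the user record and call verify_password at login. The random salt keeps identical passwords from producing identical hashes, and the iteration count makes each guess expensive for anyone who walks off with the table.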

I’m going to cheat and close with a fourth resolution for you: Create (and test) a data breach response plan. If any security professional is being honest, it’s virtually impossible to prevent a breach if a hacker is determined enough and the best thing you can do for your user base is to respond well when it happens. The only way to do that is have a plan and to test it (so you know what you are doing when the breach occurs). And, you should run your communications plan by other folks to make sure it’s adequate (ping @securitytwits for suggestions for good resources).

You want to be able to walk away from a breach with your reputation as intact as possible (so you’ll have to keep the other three resolutions anyway) with your users feeling fully informed and assured that you did everything you could to prevent it.

What other security-related resolutions are you making this year as a web developer or web site owner and what other tools/services are you using to secure your sites?

Spent some time today updating the missing bits of the OS X version of the Dropbox cloner I uploaded last night. You can just grab the executable or grab the whole project from the github repository.

The app can now back up/restore the local config, clone Dropbox configs to a URL/file and also impersonate a captured Dropbox config.

Use it all at your own risk. As stated in the original post, all comments, bugs, additions, fixes etc. are welcome either here or at github.

UPDATE: Check out the newer post on additional features.

There has been much ado of late about Dropbox security with one of the most egregious issues being how easy it is to surreptitiously “clone” someone else’s Dropbox by obtaining just one piece of data – the host id – from the Dropbox SQLite config.db.

Moloch built a Windows & Linux impersonation/cloning utility in Python that was/is meant to be used from a USB/external volume. The utility can save the cloned host id to a local file and also has the capability to use a simple HTTP GET request to log data to a “mothership” web site.

Since many Dropbox users use OS X (including me) I didn’t want them to feel left out or smugly more secure. So, I set about creating a native version of the utility.

This release is not as feature-rich as Moloch’s Python script but it won’t take much more effort to crank out a version that duplicates all of the functionality. “Release early. Release often.” as the kids these days are wont to say.

You can find the source at its github repository. When building it or just downloading & running the executable (see below), you should heed the repo’s README and take care to change the following items in the application’s Info.plist property list:

  • MothershipURL – this is the URL of the remote host you want to store the cloned info to. It defaults to somesite.domain/mothership.php to avoid accidentally sending your own Dropbox data to a remote host. PLEASE NOTE that you will need to get the mothership.php script from the original Windows/Linux code distribution as I have not asked for permission to distribute it here. You can grab the original dbClone.rar directly from here: dl.dropbox.com/u/341940/dbClone.rar (I love the irony of it being hosted on Dropbox itself).

    ALSO NOTE that there’s no need to modify the application’s property list if you don’t mind typing in a URL each run. I eventually plan on making this a separate property list file that allows for multiple URLs so you can select it from a drop-down (and still type a new one if you like).

  • LogFilename – just include the filename you want to use when storing the cloned info locally if you do not like the default (it’s the same as Moloch’s – "GroceryList.txt"). It defaults to the top-level of the mounted volume (the original Linux & Windows dbClone was meant to be run from a USB/external volume) or "~/" if running it on your boot drive.

You can use the property list editor(s) that come with Apple’s Developer Tools or use vim, TextEdit, TextWrangler (or your favorite text editor) and modify these lines appropriately:

[code]
<key>LogFilename</key>
<string>GroceryList.txt</string>
<key>MothershipURL</key>
<string>http://somesite.domain/mothership.php</string>
[/code]

If you do use the “backup” option, the current naming scheme is "backup-config.db" and it’s important to note that the program will not attempt to overwrite the file. I may change that behaviour in an upcoming release.

I tested the build on OS X 10.6.7 but the Xcode project is set to build for compatibility with 10.5.x or 10.6.x. Feedback on behaviour on other systems would be most welcome.

If you just want the executable, grab the zip’d app and give it a go.

Any and all feedback is welcome (via github or in the comments).

One of my most popular blog posts — 24,000 reads — in the old, co-mingled site was a short snippet on how to strip HTML tags from a block of content in Objective-C. It’s been used by many-an-iOS developer (which was the original intent).

An intrepid reader & user (“Brian” – no other attribution available) found a memory leak that really rears its ugly head when parsing large content blocks. The updated code is below (with the original post text) and also in the comments on the old site. If Brian reads this, please post full attribution info in the comments or to @hrbrmstr so I can give you proper credit.

I needed to strip the tags from some HTML that was embedded in an XML feed so I could display a short summary from the full content in a UITableView. Rather than go through the effort of parsing HTML on the iPhone (as I already parsed the XML file) I built this simple method from some half-finished snippets I found. It has worked in all of the cases I have needed, but your mileage may vary. It is at least a working method (which cannot be said about most of the other examples). It works both in iOS (iPhone/iPad) and in plain-old OS X code, too.

- (NSString *) stripTags:(NSString *)str {

    NSMutableString *html = [NSMutableString stringWithCapacity:[str length]];

    NSScanner *scanner = [NSScanner scannerWithString:str];
    NSString *tempText = nil;

    while (![scanner isAtEnd]) {

        // grab everything up to the next tag
        [scanner scanUpToString:@"<" intoString:&tempText];

        if (tempText != nil)
            [html appendString:tempText];

        // skip over the tag itself
        [scanner scanUpToString:@">" intoString:NULL];

        if (![scanner isAtEnd])
            [scanner setScanLocation:[scanner scanLocation] + 1];

        tempText = nil;

    }

    return html;

}
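If you want to poke at the scan-and-skip logic away from Xcode, the same idea can be sketched in a few lines of Python 3. This is an approximation for experimenting, not a port: the function name is made up, and it does not reproduce NSScanner details such as its default skipping of whitespace.

```python
def strip_tags(text):
    """Remove anything between '<' and '>' with a simple left-to-right scan."""
    pieces = []
    pos = 0
    while pos < len(text):
        lt = text.find("<", pos)
        if lt == -1:
            # no more tags: keep the remainder and stop
            pieces.append(text[pos:])
            break
        # keep the text that precedes the tag
        pieces.append(text[pos:lt])
        gt = text.find(">", lt)
        if gt == -1:
            # unterminated tag: drop the rest of the input
            break
        pos = gt + 1
    return "".join(pieces)
```

Like the Objective-C method, this treats every "<" as the start of a tag, so a literal less-than sign in the content swallows text up to the next ">". That is usually fine for short feed summaries, but it is no substitute for a real HTML parser.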