Category Archives: Development

I played around with OSE Firewall for WordPress for a couple of days to see if it was worth switching to from the plugin I was previously using. It’s definitely not as full-featured, and I didn’t see any WP database extensions where it kept a log I could review/analyze, so I whipped up a little script to extract all the alert data from the Gmail account I set up for it to log to.

The script below – while focused on getting OSE Firewall alert data – can be easily modified to search for other types of automated/formatted e-mails and build a CSV file with the results. Remember, though, that you’re going to be putting your e-mail credentials in this file (if you end up using it), so either use a mailbox you don’t care about or make sure you use sane permissions on the script and keep it somewhere safe.
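If you would rather not keep the password in the script at all, one option is to read it from the environment. This is a minimal sketch, not part of the original script; the variable names OSEFW_IMAP_USER and OSEFW_IMAP_PASS are made up for illustration.

```python
import os


def load_imap_credentials(env=None):
    """Return (username, password) taken from environment variables.

    OSEFW_IMAP_USER and OSEFW_IMAP_PASS are hypothetical names chosen
    for this sketch; pick whatever fits your setup.
    """
    env = os.environ if env is None else env
    try:
        return env["OSEFW_IMAP_USER"], env["OSEFW_IMAP_PASS"]
    except KeyError as missing:
        # Stop early with a readable message instead of a failed login later
        raise SystemExit("missing environment variable: %s" % missing.args[0])
```

The returned pair could then be passed to the login call in place of the hard-coded strings.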

I tested it on Linux boxes, but it should work anywhere you have Python and mailbox access.

I highly doubt there will be any updates to this version (I’m not using OSE Firewall anymore), but you can grab the source below or on github. There should be sufficient annotation in the code, but if you have any questions, drop a note in the comments.

# oswfw.py - extract WordPress OSE Firewall mail alerts to CSV
# 
# Author: @hrbrmstr
#

import imaplib
import datetime
import re

# date for the optional SENTSINCE search below: yesterday, so the search
# picks up the most recent day's hits
date = (datetime.date.today() - datetime.timedelta(1)).strftime("%d-%b-%Y")

# setup IMAP connection

gmail = imaplib.IMAP4_SSL('imap.gmail.com',993) # use your IMAP server if not Gmail
gmail.login("YOUR_IMAP_USERNAME","YOUR_PASSWORD")
gmail.select('[Gmail]/All Mail') # Your IMAP's "all mail" if not using Gmail

# now search for all mails with "OSE Firewall" in the subject

# uncomment this line and comment out the next one to just get results from 'today'
#result, data = gmail.uid('search', None, '(SENTSINCE {date} HEADER Subject "OSE Firewall*")'.format(date=date))
result, data = gmail.uid('search', None, '(HEADER Subject "OSE Firewall*")')

# setup CSV file for output

f = open("osefw.csv", "w+")
f.write("Date,IP,URI,Method,UserAgent,Referer\n")

# cycle through result set from IMAP search query, extracting salient info
# from headers/body of each found message

for msg in data[0].split():

    # fetch the msg for the UID
    res, msg_txt = gmail.uid('fetch', msg, '(RFC822)')

    # decode the raw message bytes (needed on Python 3) and get rid of carriage returns
    body = msg_txt[0][1].decode('utf-8', errors='replace').replace('\r', '')

    # extract salient fields from the message body/header
    DATE = re.findall(r'^Date: (.*?)$', body, re.M)
    IP = re.findall(r'^FROM IP: http://whois\.domaintools\.com/(.*?)$', body, re.M)
    URI = re.findall(r'^URI: (.*?)$', body, re.M)
    METHOD = re.findall(r'^METHOD: (.*?)$', body, re.M)
    USERAGENT = re.findall(r'^USERAGENT: (.*?)$', body, re.M)
    REFERER = re.findall(r'^REFERER: (.*?)$', body, re.M)

    # format for CSV output
    ose_log  = "%s,%s,%s,%s,%s,%s\n" % (DATE, IP, URI, METHOD, USERAGENT, REFERER)

    # quicker to replace array output brackets than to deal with non-array results checking
    f.write(re.sub(r"[\[\]]", "", ose_log))

    f.flush()

gmail.logout()
f.close()
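The bracket-stripping trick above is quick, but it leaves Python's quote characters in the output, and a comma inside a URI or user agent will shift columns. A minimal sketch of an alternative, assuming the same field labels as the script, is to take the first match per field and let the standard csv module handle quoting. The function names here are my own and not part of the original script.

```python
import csv
import re

# (CSV column, regex) pairs; the labels mirror the ones used in the script above
FIELDS = [
    ("Date", r"^Date: (.*?)$"),
    ("IP", r"^FROM IP: http://whois\.domaintools\.com/(.*?)$"),
    ("URI", r"^URI: (.*?)$"),
    ("Method", r"^METHOD: (.*?)$"),
    ("UserAgent", r"^USERAGENT: (.*?)$"),
    ("Referer", r"^REFERER: (.*?)$"),
]


def extract_row(body):
    """Return one CSV row (list of strings) for a decoded message body."""
    body = body.replace("\r", "")
    row = []
    for _name, pattern in FIELDS:
        match = re.search(pattern, body, re.M)
        # An empty string stands in for any field the alert did not include
        row.append(match.group(1) if match else "")
    return row


def write_csv(bodies, out):
    """Write a header plus one row per message body to the file-like `out`."""
    writer = csv.writer(out)
    writer.writerow([name for name, _pattern in FIELDS])
    for body in bodies:
        writer.writerow(extract_row(body))
```

In the loop above, each decoded body would be collected (or written as it arrives) instead of formatting the findall lists by hand.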

There’s a good FAQ on how to do the MongoDB query -> R data frame but I wanted to post a more complete example that included the database connection and query setup since I suspect there are folks new to Mongo who would appreciate the end-to-end view. The code is fully annotated with comments, and I’ll caveat that this was for pulling data from my solar radiation sensor (it provides some context for the query and values).

library(rmongodb)
library(chron) # NOTE: you don't need this for Mongo; it's for the sensor readings plot
 
# connect to mongodb server on host and connect to db
mongo = mongo.create(host="MONGODB_HOST",db="DATABASE_NAME")
 
if (mongo.is.connected(mongo)) {
 
  # this sets up the query (there are other "buffer.append…" functions)
  today = format(Sys.time(), "%Y-%m-%d")
  buf = mongo.bson.buffer.create()
  mongo.bson.buffer.append.string(buf,"date",today)
  query = mongo.bson.from.buffer(buf)
 
  # run the query and get total results & the starting db cursor  
  todays.readings.count = mongo.count(mongo,"solar.readings",query)
  todays.readings.cursor = mongo.find(mongo,"solar.readings",query)
 
  # setup some vectors to hold our results  
  time = vector("character",todays.readings.count)
  lux = vector("numeric",todays.readings.count)
  full = vector("numeric",todays.readings.count)
  IR = vector("numeric",todays.readings.count)
 
  i = 1
 
  # iterate over the results with the cursor    
  while (mongo.cursor.next(todays.readings.cursor)) {
 
    # get the values of the current record
    cval = mongo.cursor.value(todays.readings.cursor)
 
    # split it out into our vectors    
    time[i] = mongo.bson.value(cval,"time")
    full[i] = mongo.bson.value(cval,"Full")
    lux[i] = mongo.bson.value(cval,"Lux")
    IR[i] = mongo.bson.value(cval,"IR")
 
    i = i + 1
 
  }
 
  # packages all our values up into a data frame  
  df = as.data.frame(list(time=time,full=full,lux=lux,IR=IR))
 
  # (for my wx data, I need 'time' as an actual time value)  
  df$Time = times(df$time)
  df$time = NULL
 
  par(mfrow=c(3,1))
  plot(df$full~df$Time,type="l",col="blue",lwd="1",xlab="",ylab="Full Spectrum",main=paste(today," Solar Radiation Readings"))
  plot(df$lux~df$Time,type="l",col="blue",lwd="1",xlab="",ylab="Lux (calculated)")
  plot(df$IR~df$Time,type="l",col="blue",lwd="1",xlab="Time",ylab="IR")
  par(mfrow=c(1,1))
 
}
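For readers coming from Python, the same cursor-to-columns pattern can be sketched without any database at all. This is only an illustration of the loop above, not a port of the R code: the function name is made up, and the list of dicts stands in for a cursor (pymongo's find() also yields one dict per document, so the same function should apply there).

```python
def columns_from_cursor(cursor, fields):
    """Collect the named fields of each record into per-field lists.

    `cursor` is any iterable of dict-like records; `fields` lists the
    keys to pull out. Missing keys become None.
    """
    columns = {name: [] for name in fields}
    for record in cursor:
        for name in fields:
            columns[name].append(record.get(name))
    return columns


# Made-up readings, shaped like the sensor documents queried above
readings = [
    {"time": "06:00:00", "Full": 10, "Lux": 5.5, "IR": 3},
    {"time": "06:05:00", "Full": 12, "Lux": 6.0, "IR": 4},
]
columns = columns_from_cursor(readings, ["time", "Full", "Lux", "IR"])
```

Unlike the R version, there is no need to count the results first, since Python lists grow as you append.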

Dan Kaminsky (@dakami) tweeted a cool, small (11 instruction) TRNG called Jytter on Friday:

The authors have Windows & Linux ports but no OS X port (and I play mostly on OS X when not in virtual environments). So, I threw together a quick port of it to OS X. It should work on Snow Leopard or Lion, but YMMV.

You need to install nasm via a quick brew install, then copy linux32_build.sh to osx_build.sh and change the nasm and gcc lines to:

nasm -D_32_ -O0 -fmacho -ojytter.o jytter.asm
nasm -D_32_ -O0 -fmacho -otimestamp.o timestamp.asm
gcc -D_32_ -m32 jytter.o timestamp.o -o demo -O2 demo.c

I ran the resultant demo binary and got:

True random integers:
 
64-bit: 10DE4A7EA676A869
128-bit: E2B9F86CADC854B540090E125A7C7611
256-bit: 7F3AC590F6EE2AC13F136B802BEBCC8323CB26665BC354CDAC488ED86E153641
 
True random passwords:
 
66-bit: OEqQaY8UQeO
132-bit: Gwi9DCMtFy7XzHWHII37Hj
258-bit: TPzqJfLL84Mjq3VZXpQDW0.WhWSFq2HA9X6FL7GSjaX
 
Execution time in CPU ticks:
 
000000004397D59A

which tracks with the Linux output I received (obviously not the same values) from the demo program on one of my non-VPS Linux nodes.

Russell Leidich (the author) did some really impressive work here. I did virtually nothing (just enabled playing with it on OS X). The posts at the Jytter site are well worth the time spent absorbing.

The Fund For Peace (FFP) and Foreign Policy jointly released the 2012 version of the “failed states index” (FSI). From the FFP site, the FSI:

…focuses on the indicators of risk and is based on thousands of articles and reports that are processed by our CAST Software from electronically available sources.

I read it every year (mostly due to being an ardent reader of Foreign Policy magazine) and find the rankings, methodology & insights quite intriguing. With my recent work on slopegraphs, I thought this would be a good data set to play with to determine what – if any – features were necessary to support rank order (and to provide some impetus to finally refactor the code to support multi-column slopegraphs…more on that later).

However, I was not looking forward to transcribing the data from the Flash visualization on the Foreign Policy web site. There are HTML grids on the FFP site but I really just wanted the overall rankings (i.e. no sub-indices) and noticed this interesting scrollable mini-grid on one of the FFP FSI pages:

Thankfully[?] it’s an IFRAME and I was able to pull 2010, 2011 & 2012 data in a very usable format by manipulating this URL: http://www.fundforpeace.org/global/tables/fsiindex2010_sml.htm.

After some quick transformations, I had two CSV files for a 2010-2012 comparison and a 2011-2012 comparison.

(Before continuing, I feel the need to point out that the data, methodology, etc. are 100% Copyright © 2012 The Fund for Peace, as they overtly point out many times on their site.)

When I threw the data into the slopegraph tool, it was immediately obvious that I was missing something important: the ability to specify sort order for the data. For most slopegraphs, the code works well since our brains expect the larger values on the top. For a rank-order slopegraph, that sort order (for the most part) should be ascending vs descending to best represent changes in rank position. It does feel odd that being “#1” in the FSI actually means you’re really a loser, but I didn’t make the rules for their index.
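To make the sort-order point concrete, here is a minimal sketch of the idea. It is not the actual PySlopegraph code; the function name, the rank_order flag and the placeholder labels are all made up for illustration.

```python
def order_for_slopegraph(values, rank_order=False):
    """Return (label, value) pairs in top-to-bottom drawing order.

    values     -- dict mapping a label to its number for one column
    rank_order -- False: largest value on top (the usual slopegraph);
                  True: smallest value on top, so rank #1 leads the column
    """
    return sorted(values.items(), key=lambda item: item[1],
                  reverse=not rank_order)


# Placeholder labels and ranks, not real FSI data
ranks = {"Country A": 3, "Country B": 1, "Country C": 2}
top_down = order_for_slopegraph(ranks, rank_order=True)
```

With rank_order=True, "Country B" (rank 1) lands at the top of the column; with the default it would sit at the bottom.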

So, PySlopegraph now handles two column rank order slopegraphs and, as you’ll see in part two, also handles multi-column slopegraphs (but that bit needs some work). The code will be up on github in a couple days as I’ve also got some half-finished support for Processing.js and Paper.js that I want to finish before another push. If anyone needs it sooner, just @ or DM me.

Now, For The Data

The “Top 25” (that sounds way too positive for what it really means) slopegraph is the easiest to read (as it’s the smallest). It is also where Foreign Policy & FFP focus some dataviz effort (though they do have visualizations for all the data). Here’s the slopegraph showing the rank order change from 2010 to 2012:

The full slopegraphs are tall slopegraphs (I’ve been prototyping some ways to make tall ones more useful, but that’s nowhere near ready for public consumption). You may just want to grab the two PDFs and look there vs in this post:

Rank Order Comparison :: 2010/2012


Rank Order Comparison :: 2011/2012

While it requires scrolling, the changes in rank are immediately noticeable, as is the fact that the FFP folk allow for ties that leave “holes” in the table. I think you really get a feel for which countries are stable, improving and declining very quickly with the slopegraph version, but I’d like to hear your thoughts if you have an opine you’d like to share.

Stay tuned for part two!

[@hrbrmstr starts working in javascript again]
The Internets: What do you think?
@hrbrmstr: It’s vile.
The Internets: I know. It’s so bubbly and cloying and happy.
@hrbrmstr: Just like the Federation.
The Internets: And you know what’s really frightening? If you develop with it enough, you begin to like it.
@hrbrmstr: It’s insidious.
The Internets: Just like the Federation.

(With apologies to ST:DS9)

UPDATE: It seems my use of <script async> optimization for Raphaël busted the inline slopegraph generation. Will work on tweaking the example posts to wait for Raphaël to load when I get some time.

So, I had to alter this to start after a user interaction. It loaded fine as a static, local page but seems to get a bit wonky embedded in a complex page. I also see some artifacts in Chrome but not in Safari. Still, not a bad foray into basic animation.

Animate Slopegraph


There were enough eye-catching glitches in the experimental javascript support and the ugly large-number display in the spam example post that I felt compelled to make a couple formatting tweaks in the code. I also didn’t have time to do “real” work on the codebase this weekend.

So, along with spacing adjustments, there’s now an “add_commas” non-mandatory option that will toss commas in large numbers so they’re easy to read. Here’s an example of the new output (both the Raphaël display and commas):


As usual, it’s up on github
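For anyone curious what a comma option like this boils down to, Python's format mini-language already does the heavy lifting. This is a minimal sketch of the idea; the helper below is illustrative and may differ from what PySlopegraph actually does internally.

```python
def add_commas(value):
    """Format a number with commas as thousands separators."""
    # The "," format specifier is built into Python's format mini-language
    return format(value, ",")


label = add_commas(1234567)
```

Here label ends up as the string "1,234,567", which is much easier on the eyes than the raw digits.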

In preparation for the upcoming 1.0 release and with the hopes of laying a foundation for more interactive slopegraphs, I threw together some rudimentary output support over lunch today for Raphaël, which means that all you have to do is generate a new slopegraph with the “js” output type and include the salient portions of the generated html/css/javascript into a web page (along with including the Raphaël script code).
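As a rough illustration of what a “js” output type involves, the sketch below emits Raphaël calls for a single slope line from Python. It is not the generator in PySlopegraph: the function name, the coordinates and the "slopegraph" container id are assumptions, and only the basic Raphaël calls (Raphael(), paper.text(), paper.path(), attr()) are used.

```python
import json


def slope_to_raphael(label, left_y, right_y, width=300, height=200,
                     holder="slopegraph"):
    """Return JavaScript that draws one labeled slope line with Raphaël.

    `holder` is the id of the page element to draw into; json.dumps is
    used so labels are emitted as properly escaped JavaScript strings.
    """
    path = "M80 %d L%d %d" % (left_y, width - 80, right_y)
    lines = [
        "var paper = Raphael(%s, %d, %d);" % (json.dumps(holder), width, height),
        "paper.text(40, %d, %s);" % (left_y, json.dumps(label)),
        "paper.text(%d, %d, %s);" % (width - 40, right_y, json.dumps(label)),
        "paper.path(%s).attr({stroke: '#777'});" % json.dumps(path),
    ]
    return "\n".join(lines)


js = slope_to_raphael("Example", 50, 120)
```

The resulting string would go inside a script tag on a page that has already loaded Raphaël and contains an element with the matching id.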

The next github push will have this update. Here’s an example of the output, using the classic Tufte example chart:


It’s definitely a bit rough around the edges (my eyes immediately fixate upon spacing discrepancies) and lacking any interactivity, but the basic building blocks are in place. It also does not render on my Android phone (HTC Incredible 2) but it does render in Chrome, Safari & on my iPad. Embedding a Raphaël graphic in a web page will definitely have advantages over a PNG or PDF in most situations even if it’s not interactive, so I’ll probably keep the support in regardless of whether I continue to improve upon it.

As I was playing with the code, I kept thinking how neat it would be if there was a “Raphaël Cairo surface” option. Perhaps that will be a side project if all goes well, since it would not be that much more complicated (in fact, it may be less complicated) than the Cairo SVG surface code.