Skip navigation

Category Archives: MongoDB

I’ve had a Nest thermometer for a while now and it’s been an overall positive experience. It’s given me more visibility into our heating/cooling system usage, patterns and preferences; plus, it actually saved us money last winter.

We try to avoid running the A/C during the summer, and it would have been really helpful if Nest had supported notifications (or had a proper API) for events such as “A/C turned on/off” for the few times it kicked in when we were away and had left the windows open (yes, we could have made “away” mode a bit less susceptible to big temperature swings). So, I decided to whip up a notification system and data logger using Scott Baker’s pynest library (and a little help from redis, mongo and pushover.net).

If you have a Nest thermometer, have an always on Linux box (this script should work nicely on a RaspberryPi) and want this functionality,

  • grab the code over at github
  • create a Pushover app so you can point the API interface there
  • install and start mongo and redis (both are very easy to setup)
  • create the config file
  • tell the script where to find the config file
  • setup a cron job. Every 5 mins shld work nicely:
    */5 * * * * /opt/nest/nizdos.py

Mongo is used for storing the readings (temp and humidity, for the moment; you can change the code to log whatever you want, tho) since it sends nice JSON to D3 without having to whip it into shape.

Redis is used for storing and updating the last known state of the heat/AC system. Technically you could use mongo or a flat file or memcached or sqlite or MySQL (you get the idea) for that, but I have redis running for other things and it’s just far too easy to setup and use.

Pushover is used for iOS and Android notifications (I really hope they add OS X soon :-)

Once @jayjacobs & I are done with our book in November, I’ll be doing another post and adding some code to the github repo to show how to do data analysis and visualization on all this data you’re logging.

If you’re wondering where the name nizdos came from and haven’t googled it yet, it’s an ancient Indo-European word for nest.

Drop me a note here or on github if you use the script (pls)! Send me a pull request on github if you fork the code make any cool changes. Definitely leave a bug report on github if you find any glaring errors.

For those who want the alerting without the overhead of actually dealing with this script, drop me tweet (@hrbrmstr). I’m pretty close to just having the alerting function working in Google’s AppEngine, which won’t require much setup for those without the infrastructure or time to use this script.

Many thanks to all who attended the talk @jayjacobs & I gave at @Secure360 on Wednesday, May 15, 2013. As promised, here are the [slides](https://dl.dropboxusercontent.com/u/43553/Secure360-2013.pdf).

We’ve enumerated quite a bit of non-slide-but-in-presentation information that we wanted to aggregate into a blog post so you can vi[sz] along at home. If you need more of a guided path, I strongly encourage you to take a look at some of the free courses over at [Coursera](https://www.coursera.org/).

For starters, here’s a bit.ly bundle of data analysis & visualization bookmarks that @dseverski & I maintain. We’ve been doing (IMO) a pretty good job adding new resources as they come up and may have some duplicates to the ones below.

People Mentioned

– [Stephen Few’s Perceptual Edge blog](http://www.perceptualedge.com/) : Start from the beginning to learn from a giant in information visualization
– [Andy Kirk’s Visualising Data blog](http://www.visualisingdata.com/) (@visualisingdata) : Perhaps the quintessential leader in the modern visualization movement.
– [Mike Bostock’s blog](http://bost.ocks.org/mike/) (@mbostock) : Creator of D3 and producer of amazing, interactive graphics for the @NYTimes
– [Edward Tufte’s blog](http://www.edwardtufte.com/tufte/) : The father of what we would now identify as our core visualization principles & practices.
– [Nathan Yau’s Flowing Data blog](http://flowingdata.com/) : Making visualization accessible, practical and repeatable.
– [Data Stories Podcast](http://datastori.es/) : Yes, you can learn much about data visualization from an audio podacst (@datastories)
– [storytelling with data](http://www.storytellingwithdata.com/) (@storywithdata) : Extremely practical blog by Cole Nussbaumer that will especially help folks “stuck” in Excel
– [Jay’s blog](http://beechplane.wordpress.com/)
– [My {this} blog](http://rud.is/b)

Tools Mentioned

– [R](http://www.r-project.org/) : Jay & I probably use this a bit too much as a hammer (i.e. treat every data project as a nail) but it’s just far too flexible and powerful to not use as a go-to resource
– [RStudio](http://www.rstudio.com/) : An *amazing* IDE for R. I, personally, usually despise IDEs (yes, I even dislike Xcode), but RStudio truly improves workflow by several orders of magnitude. There are both desktop and server versions of it; the latter gives you the ability to setup a multi-user environment and use the IDE from practically anywhere you are. RStudio also makes generating [reproducible research](http://cran.r-project.org/web/views/ReproducibleResearch.html) a joy with built-in easy access to tools like [kintr](http://yihui.name/knitr/).
– [iPython](http://ipython.org/) : This version of Python takes an already amazing language and kicks it up a few notches. It brings it up to the level of R+RStudio, especially with it’s knitr-like [iPython Notebooks](http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html) for–again–reproducible research.
– [SecViz](http://secviz.org/) : Security-centric Visualization Site & Tools by @raffaelmarty
– [Mondrian](http://www.theusrus.de/Mondrian/) : This tool needs far more visibility. It enables extremely quick visualization of even very large data sets. The interface takes a bit of getting used to, but it’s faster then typing R commands or fumbling in Excel.
– [Tableau](http://www.tableausoftware.com/) : This tool may be one of the most accessible, fast & flexible ways to explore data sets to get an idea of where you need to/can do further analysis.
– [Processing](http://processing.org/) : A tool that was designed from the ground up to help journalists create powerful, interactive data visualizations that you can slipstream directly onto the web via the [Processing.js](http://processingjs.org/) library.
– [D3](http://d3js.org/) : The foundation of modern, data-driven visualization on the web.
– [Gephi](https://gephi.org/) : A very powerful tool when you need to explore networks & create beautiful, publication-worthy visualizations.
– [MongoDB](http://www.mongodb.org/) : NoSQL database that’s highly & easily scaleable without a steep learning curve.
– [CRUSH Tools by Google](https://code.google.com/p/crush-tools/) : Kicks up your command-line data munging.

alogoWhile you can (and should) view [all the presentations](https://speakerdeck.com/pyconslides) from #PyCon2013, here are my picks for the ones that interested me the most, as they focus on scaling, mapping, automation (both web & electronics) and data analysis:

– [Chef: Why you should automate your web infrastructure](https://speakerdeck.com/pyconslides/chef-why-you-should-automate-your-web-infrastructure-by-kate-heddleston) by Kate Heddleston
– [Messaging at Scale at Instagram](https://speakerdeck.com/pyconslides/messaging-at-scale-at-instagram-by-rick-branson) by Rick Branson
– [Python at Netflix](https://speakerdeck.com/pyconslides/python-at-netflix-by-jeremy-edberg-corey-bertram-and-roy-rapoport) by Jeremy Edberg, Corey Bertram, and Roy Rapoport
– [Real-time Tracking and Mapping of Geographic Objects](https://speakerdeck.com/pyconslides/real-time-tracking-and-mapping-of-geographic-objects-by-ragi-burhum) by Ragi Burhum
– [Scaling Realtime at DISQUS](https://speakerdeck.com/pyconslides/scaling-realtime-at-disqus-by-adam-hitchcock) by Adam Hitchcock
– [A Crash Course in MongoDB](https://speakerdeck.com/pyconslides/a-crash-course-in-mongodb)
– [Server Log Analysis with Pandas](https://speakerdeck.com/pyconslides/server-log-analysis-with-pandas-by-taavi-burns) by Taavi Burns
– [Who’s There – Home Automation with Arduino and RaspberryPi](https://speakerdeck.com/pyconslides/whos-there-home-automation-with-arduino-and-raspberrypi-by-rupa-dachere) by Rupa Dachere
x
– [Why you should use Python 3 for text processing](https://speakerdeck.com/pyconslides/why-you-should-use-python-3-for-text-processing-by-david-mertz) by David Mertz
– [Awesome Big Data Algorithms](https://speakerdeck.com/pyconslides/awesome-big-data-algorithms-by-titus-brown) by Titus Brown

A huge thanks to the speakers and conference organizers for making these resources freely available, especially to those of us who were not able to attend the conference.

There’s a good FAQ on how to do the MongoDB query -> R data frame but I wanted to post a more complete example that included the database connection and query setup since I suspect there are folks new to Mongo who would appreciate the end-to-end view. The code is fully annotated with comments, and I’ll caveat that this was for pulling data from my solar radiation sensor (it provides some context for the query and values).

library(rmongodb)
library(chron) # NOTE: you don't need this for Mongo; it's for the sensor readings plot
 
# connect to mongodb server on host and connect to db
mongo = mongo.create(host="MONGODB_HOST",db="DATABASE_NAME")
 
if (mongo.is.connected(mongo)) {
 
  # this sets up the query (there are other "buffer.append…" functions
  today = format(Sys.time(), "%Y-%m-%d")
  buf = mongo.bson.buffer.create()
  mongo.bson.buffer.append.string(buf,"date",today)
  query = mongo.bson.from.buffer(buf)
 
  # run the query and get total results & the starting db cursor  
  todays.readings.count = mongo.count(mongo,"solar.readings",query)
  todays.readings.cursor = mongo.find(mongo,"solar.readings",query)
 
  # setup some vectors to hold our results  
  time = vector("character",todays.readings.count)
  lux = vector("numeric",todays.readings.count)
  full = vector("numeric",todays.readings.count)
  IR = vector("numeric",todays.readings.count)
 
  i = 1
 
  # iterate over the results with the cursor    
  while (mongo.cursor.next(todays.readings.cursor)) {
 
    # get the values of the current record
    cval = mongo.cursor.value(todays.readings.cursor)
 
    # split it out into our vectors    
    time[i] = mongo.bson.value(cval,"time")
    full[i] = mongo.bson.value(cval,"Full")
    lux[i] = mongo.bson.value(cval,"Lux")
    IR[i] = mongo.bson.value(cval,"IR")
 
    i = i + 1
 
  }
 
  # packages all our values up into a data frame  
  df = as.data.frame(list(time=time,full=full,lux=lux,IR=IR))
 
  # (for my wx data, I need 'time' as an actual time value)  
  df$Time = times(df$time)
  df$time = NULL
 
  par(mfrow=c(3,1))
  plot(df$full~df$Time,type="l",col="blue",lwd="1",xlab="",ylab="Full Spectrum",main=paste(today," Solar Radiation Readings"))
  plot(df$lux~df$Time,type="l",col="blue",lwd="1",xlab="",ylab="Lux (calculated)")
  plot(df$IR~df$Time,type="l",col="blue",lwd="1",xlab="Time",ylab="IR")
  par(mfrow=c(1,1))
 
}