Weather Archives

Category Archives: Weather

Rust CLI For Apple’s WeatherKit REST API

Apple is in the final stages of shuttering the DarkSky service/API. They’ve replaced it with WeatherKit, which has both an xOS framework version as well as a REST API. To use either, you need to be a member of the Apple Developer Program (ADP) — $99.00/USD per-year — and calls to the service via either method are free up to 500K/month. After that, Apple has pricing tears.

As a result of the forced-ADP membership fee, I’m not sure how many folks are going to invest in building anything but freemium native or web apps. DarkSky had a generous free tier that only required an API key.

Since I had a {darksky} R package, I recently made a similar {weatherkit} package —https://rud.is/b/2022/07/07/introducing-weatherkit-the-eventual-replacement-r-package-for-darksky/ — complete with a CLI demo program.

Lots of R folks will disagree with the following, but R is a terrible language for CLI tools if you’re not already invested in the R ecosystem. CRAN makes it a pain to modify the user’s local system, and most R things have a ton of dependencies. So, while I generally code R-first, I do not code R-only, especially for CLI tools.

I like Rust more than Golang, and am also getting used to it over C/C++, so I threw together a Rust-based WeatherKit CLI tool shortly after the R one — https://github.com/hrbrmstr/weatherkit-rust. There’s documentation for how to cross all the t’s and dot all the i’s required to get authentication to work.

The GH releases have a signed macOS universal binary and I’m working on decomposing Starship’s wicked cool Rust release builder that uses the equally cool Google release-please to deal up binaries for virtually every platform.

I may make the Rust version a full WeatherKit API library, but I don’t know if I’m going to invest time into something that may just get shoved aside due to the hate I’m expecting to see pointed in Apple’s direction.

Introducing {weatherkit}: The Eventual Replacement R Package for {darksky}

My {darksky} package has been around for years, now, and the service that powers it was purchased by Apple before the pandemic. The DarkSky API is slated to be shuttered in December of this year and is being replaced by Apple’s WeatherKit xOS Framework and REST API.

I’ve started work on a {weatherkit} package which uses the WeatherKit REST API. You’ll need an Apple Developer account and will also need to setup some items in said account, and locally so you can authenticate to the API. Once all the authentication bits are setup, it’s pretty easy to get the weather data:

wx <- wxkit_weather(43.2683199, -70.8635506)
wx <- wx_tidy(wx)
str(wx)
## List of 4
##  $ currentWeather  :List of 18
##   ..$ name                  : chr "CurrentWeather"
##   ..$ metadata              :List of 8
##   .. ..$ attributionURL: chr "https://weather-data.apple.com/legal-attribution.html"
##   .. ..$ expireTime    : POSIXct[1:1], format: "2022-07-07 12:28:07"
##   .. ..$ latitude      : num 43.3
##   .. ..$ longitude     : num -70.9
##   .. ..$ readTime      : POSIXct[1:1], format: "2022-07-07 12:23:07"
##   .. ..$ reportedTime  : POSIXct[1:1], format: "2022-07-07 10:48:55"
##   .. ..$ units         : chr "m"
##   .. ..$ version       : int 1
##   ..$ asOf                  : POSIXct[1:1], format: "2022-07-07 12:23:07"
##   ..$ cloudCover            : num 0.29
##   ..$ conditionCode         : chr "MostlyClear"
##   ..$ daylight              : logi TRUE
##   ..$ humidity              : num 0.68
##   ..$ precipitationIntensity: num 0
##   ..$ pressure              : num 1018
##   ..$ pressureTrend         : chr "rising"
##   ..$ temperature           : num 19.5
##   ..$ temperatureApparent   : num 19.4
##   ..$ temperatureDewPoint   : num 13.5
##   ..$ uvIndex               : int 2
##   ..$ visibility            : num 29413
##   ..$ windDirection         : int 50
##   ..$ windGust              : num 12
##   ..$ windSpeed             : num 4.42
##  $ forecastDaily   :List of 3
##   ..$ name    : chr "DailyForecast"
##   ..$ metadata:List of 8
##   .. ..$ attributionURL: chr "https://weather-data.apple.com/legal-attribution.html"
##   .. ..$ expireTime    : POSIXct[1:1], format: "2022-07-07 13:23:07"
##   .. ..$ latitude      : num 43.3
##   .. ..$ longitude     : num -70.9
##   .. ..$ readTime      : POSIXct[1:1], format: "2022-07-07 12:23:07"
##   .. ..$ reportedTime  : POSIXct[1:1], format: "2022-07-07 10:48:55"
##   .. ..$ units         : chr "m"
##   .. ..$ version       : int 1
##   ..$ days    :'data.frame': 10 obs. of  26 variables:
##   .. ..$ forecastStart      : POSIXct[1:10], format: "2022-07-07 04:00:00" "2022-07-08 04:00:00" "2022-07-09 04:00:00" "2022-07-10 04:00:00" ...
##   .. ..$ forecastEnd        : POSIXct[1:10], format: "2022-07-08 04:00:00" "2022-07-09 04:00:00" "2022-07-10 04:00:00" "2022-07-11 04:00:00" ...
##   .. ..$ conditionCode      : chr [1:10] "PartlyCloudy" "PartlyCloudy" "MostlyClear" "MostlyClear" ...
##   .. ..$ maxUvIndex         : int [1:10] 7 7 7 8 7 6 7 4 5 4
##   .. ..$ moonPhase          : chr [1:10] "firstQuarter" "firstQuarter" "waxingGibbous" "waxingGibbous" ...
##   .. ..$ moonrise           : POSIXct[1:10], format: "2022-07-07 17:38:12" "2022-07-08 18:50:47" "2022-07-09 20:07:35" "2022-07-10 21:27:35" ...
##   .. ..$ moonset            : POSIXct[1:10], format: "2022-07-07 04:32:48" "2022-07-08 04:54:51" "2022-07-09 05:20:27" "2022-07-10 05:51:50" ...
##   .. ..$ precipitationAmount: num [1:10] 0 0.49 0 0 0 1.32 0.24 3.44 5.07 8.35
##   .. ..$ precipitationChance: num [1:10] 0.01 0.15 0.07 0 0.07 0.39 0.37 0.4 0.47 0.44
##   .. ..$ precipitationType  : chr [1:10] "clear" "rain" "clear" "clear" ...
##   .. ..$ snowfallAmount     : num [1:10] 0 0 0 0 0 0 0 0 0 0
##   .. ..$ solarMidnight      : POSIXct[1:10], format: "2022-07-07 04:48:29" "2022-07-08 04:48:39" "2022-07-09 04:48:49" "2022-07-10 04:48:58" ...
##   .. ..$ solarNoon          : POSIXct[1:10], format: "2022-07-07 16:48:26" "2022-07-08 16:48:35" "2022-07-09 16:48:44" "2022-07-10 16:48:53" ...
##   .. ..$ sunrise            : POSIXct[1:10], format: "2022-07-07 09:10:59" "2022-07-08 09:11:42" "2022-07-09 09:12:26" "2022-07-10 09:13:11" ...
##   .. ..$ sunriseCivil       : POSIXct[1:10], format: "2022-07-07 08:36:06" "2022-07-08 08:36:53" "2022-07-09 08:37:41" "2022-07-10 08:38:31" ...
##   .. ..$ sunriseNautical    : POSIXct[1:10], format: "2022-07-07 07:50:45" "2022-07-08 07:51:39" "2022-07-09 07:52:36" "2022-07-10 07:53:34" ...
##   .. ..$ sunriseAstronomical: POSIXct[1:10], format: "2022-07-07 06:55:17" "2022-07-08 06:56:30" "2022-07-09 06:57:46" "2022-07-10 06:59:04" ...
##   .. ..$ sunset             : POSIXct[1:10], format: "2022-07-08 00:25:50" "2022-07-09 00:25:26" "2022-07-10 00:24:59" "2022-07-11 00:24:30" ...
##   .. ..$ sunsetCivil        : POSIXct[1:10], format: "2022-07-08 01:00:39" "2022-07-09 01:00:10" "2022-07-10 00:59:38" "2022-07-11 00:59:04" ...
##   .. ..$ sunsetNautical     : POSIXct[1:10], format: "2022-07-08 01:46:01" "2022-07-09 01:45:23" "2022-07-10 01:44:42" "2022-07-11 01:43:58" ...
##   .. ..$ sunsetAstronomical : POSIXct[1:10], format: "2022-07-08 02:41:14" "2022-07-09 02:40:16" "2022-07-10 02:39:14" "2022-07-11 02:38:09" ...
##   .. ..$ temperatureMax     : num [1:10] 25.8 28.7 24.9 25.4 28.9 ...
##   .. ..$ temperatureMin     : num [1:10] 13.7 16.3 14.8 12.2 12.4 ...
##   .. ..$ daytimeForecast    :'data.frame':   10 obs. of  11 variables:
##   .. .. ..$ forecastStart      : POSIXct[1:10], format: "2022-07-07 11:00:00" "2022-07-08 11:00:00" "2022-07-09 11:00:00" "2022-07-10 11:00:00" ...
##   .. .. ..$ forecastEnd        : POSIXct[1:10], format: "2022-07-07 23:00:00" "2022-07-08 23:00:00" "2022-07-09 23:00:00" "2022-07-10 23:00:00" ...
##   .. .. ..$ cloudCover         : num [1:10] 0.39 0.45 0.33 0.11 0.42 0.69 0.39 0.95 0.87 0.88
##   .. .. ..$ conditionCode      : chr [1:10] "PartlyCloudy" "PartlyCloudy" "MostlyClear" "Clear" ...
##   .. .. ..$ humidity           : num [1:10] 0.57 0.58 0.54 0.47 0.49 0.63 0.64 0.71 0.7 0.66
##   .. .. ..$ precipitationAmount: num [1:10] 0 0.31 0 0 0 0.26 0.17 3.15 0.22 1.37
##   .. .. ..$ precipitationChance: num [1:10] 0 0.09 0.04 0 0.02 0.29 0.16 0.31 0.33 0.3
##   .. .. ..$ precipitationType  : chr [1:10] "clear" "rain" "clear" "clear" ...
##   .. .. ..$ snowfallAmount     : num [1:10] 0 0 0 0 0 0 0 0 0 0
##   .. .. ..$ windDirection      : int [1:10] 155 263 122 237 231 228 219 98 39 62
##   .. .. ..$ windSpeed          : num [1:10] 8.73 9.42 7.42 6.23 9.75 ...
##   .. ..$ overnightForecast  :'data.frame':   10 obs. of  11 variables:
##   .. .. ..$ forecastStart      : POSIXct[1:10], format: "2022-07-07 23:00:00" "2022-07-08 23:00:00" "2022-07-09 23:00:00" "2022-07-10 23:00:00" ...
##   .. .. ..$ forecastEnd        : POSIXct[1:10], format: "2022-07-08 11:00:00" "2022-07-09 11:00:00" "2022-07-10 11:00:00" "2022-07-11 11:00:00" ...
##   .. .. ..$ cloudCover         : num [1:10] 0.49 0.5 0.15 0.37 0.46 0.4 0.88 0.91 0.8 NA
##   .. .. ..$ conditionCode      : chr [1:10] "PartlyCloudy" "PartlyCloudy" "MostlyClear" "MostlyClear" ...
##   .. .. ..$ humidity           : num [1:10] 0.78 0.78 0.71 0.73 0.69 0.81 0.83 0.85 0.84 NA
##   .. .. ..$ precipitationAmount: num [1:10] 0.06 0.11 0 0 0 1.11 0.04 2.26 6.47 NA
##   .. .. ..$ precipitationChance: num [1:10] 0.05 0.07 0.01 0.02 0.1 0.27 0.24 0.31 0.31 NA
##   .. .. ..$ precipitationType  : chr [1:10] "rain" "rain" "clear" "clear" ...
##   .. .. ..$ snowfallAmount     : num [1:10] 0 0 0 0 0 0 0 0 0 NA
##   .. .. ..$ windDirection      : int [1:10] 192 341 347 223 218 242 276 13 49 NA
##   .. .. ..$ windSpeed          : num [1:10] 9.92 7.15 6.59 5.52 10.95 ...
##   .. ..$ restOfDayForecast  :'data.frame':   10 obs. of  11 variables:
##   .. .. ..$ forecastStart      : POSIXct[1:10], format: "2022-07-07 12:23:07" NA NA NA ...
##   .. .. ..$ forecastEnd        : POSIXct[1:10], format: "2022-07-08 04:00:00" NA NA NA ...
##   .. .. ..$ cloudCover         : num [1:10] 0.47 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ conditionCode      : chr [1:10] "PartlyCloudy" NA NA NA ...
##   .. .. ..$ humidity           : num [1:10] 0.6 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ precipitationAmount: num [1:10] 0 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ precipitationChance: num [1:10] 0.01 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ precipitationType  : chr [1:10] "clear" NA NA NA ...
##   .. .. ..$ snowfallAmount     : num [1:10] 0 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ windDirection      : int [1:10] 163 NA NA NA NA NA NA NA NA NA
##   .. .. ..$ windSpeed          : num [1:10] 9.7 NA NA NA NA NA NA NA NA NA
##  $ forecastHourly  :List of 3
##   ..$ name    : chr "HourlyForecast"
##   ..$ metadata:List of 8
##   .. ..$ attributionURL: chr "https://weather-data.apple.com/legal-attribution.html"
##   .. ..$ expireTime    : POSIXct[1:1], format: "2022-07-07 13:23:07"
##   .. ..$ latitude      : num 43.3
##   .. ..$ longitude     : num -70.9
##   .. ..$ readTime      : POSIXct[1:1], format: "2022-07-07 12:23:07"
##   .. ..$ reportedTime  : POSIXct[1:1], format: "2022-07-07 10:48:55"
##   .. ..$ units         : chr "m"
##   .. ..$ version       : int 1
##   ..$ hours   :'data.frame': 243 obs. of  20 variables:
##   .. ..$ forecastStart         : POSIXct[1:243], format: "2022-07-07 02:00:00" "2022-07-07 03:00:00" "2022-07-07 04:00:00" "2022-07-07 05:00:00" ...
##   .. ..$ cloudCover            : num [1:243] 0.02 0.01 0.02 0.31 0.44 0.74 0.3 1 0.96 0.32 ...
##   .. ..$ conditionCode         : chr [1:243] "Clear" "Clear" "Clear" "MostlyClear" ...
##   .. ..$ daylight              : logi [1:243] FALSE FALSE FALSE FALSE FALSE FALSE ...
##   .. ..$ humidity              : num [1:243] 0.74 0.78 0.81 0.83 0.86 0.88 0.9 0.92 0.88 0.83 ...
##   .. ..$ precipitationAmount   : num [1:243] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ precipitationIntensity: num [1:243] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ precipitationChance   : num [1:243] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ precipitationType     : chr [1:243] "clear" "clear" "clear" "clear" ...
##   .. ..$ pressure              : num [1:243] 1014 1015 1016 1016 1016 ...
##   .. ..$ pressureTrend         : chr [1:243] "rising" "rising" "rising" "rising" ...
##   .. ..$ snowfallIntensity     : num [1:243] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ temperature           : num [1:243] 18.3 17 16.3 15.7 14.9 ...
##   .. ..$ temperatureApparent   : num [1:243] 18.2 16.9 16.1 15.5 14.7 ...
##   .. ..$ temperatureDewPoint   : num [1:243] 13.6 13.1 12.9 12.8 12.6 ...
##   .. ..$ uvIndex               : int [1:243] 0 0 0 0 0 0 0 0 0 1 ...
##   .. ..$ visibility            : num [1:243] 28105 26514 24730 23883 23669 ...
##   .. ..$ windDirection         : int [1:243] 315 302 315 308 310 298 307 316 319 6 ...
##   .. ..$ windGust              : num [1:243] 2.93 2.56 2.92 3.25 3.35 ...
##   .. ..$ windSpeed             : num [1:243] 2.93 2.56 2.92 3.25 3.35 3.09 3.51 2.91 2.36 4.55 ...
##  $ forecastNextHour:List of 6
##   ..$ name         : chr "NextHourForecast"
##   ..$ metadata     :List of 9
##   .. ..$ attributionURL: chr "https://weather-data.apple.com/legal-attribution.html"
##   .. ..$ expireTime    : POSIXct[1:1], format: "2022-07-07 13:23:07"
##   .. ..$ language      : chr "en-US"
##   .. ..$ latitude      : num 43.3
##   .. ..$ longitude     : num -70.9
##   .. ..$ providerName  : chr "US National Weather Service"
##   .. ..$ readTime      : POSIXct[1:1], format: "2022-07-07 12:23:07"
##   .. ..$ units         : chr "m"
##   .. ..$ version       : int 1
##   ..$ summary      :'data.frame':    1 obs. of  4 variables:
##   .. ..$ startTime             : POSIXct[1:1], format: "2022-07-07 12:24:00"
##   .. ..$ condition             : chr "clear"
##   .. ..$ precipitationChance   : num 0
##   .. ..$ precipitationIntensity: num 0
##   ..$ forecastStart: POSIXct[1:1], format: "2022-07-07 12:24:00"
##   ..$ forecastEnd  : POSIXct[1:1], format: "2022-07-07 13:45:00"
##   ..$ minutes      :'data.frame':    81 obs. of  3 variables:
##   .. ..$ startTime             : POSIXct[1:81], format: "2022-07-07 12:24:00" "2022-07-07 12:25:00" "2022-07-07 12:26:00" "2022-07-07 12:27:00" ...
##   .. ..$ precipitationChance   : num [1:81] 0 0 0 0 0 0 0 0 0 0 ...
##   .. ..$ precipitationIntensity: num [1:81] 0 0 0 0 0 0 0 0 0 0 ...

The wx_tidy() function, for now, only does date-time string conversion to POSIXct objects, but it may do more in the future.

It doesn’t appear that historical weather data is available, yet, so you’re limited to using the API to get daily and hourly conditions and forecasts for the present plus some days. As such, I’ve focused a bit on some helper functions to show current conditions and forecasts in the R console/stdout:

current_conditions(wx)

##  Weather for (43.268, -70.864) as of 2022-07-07 08:23:07
## 
##  Conditions: Mostly Clear
## Temperature: 67.08°F
##  Feels like: 66.92°F
##   Dew Point: 56.28°F
##        Wind: 2.7 mph (NE)
##    Pressure: 1017.68 mb (rising)
##  Visibility: 18 miles
##    UV Index: 🟩 2 (Low)
## 
## https://weather-data.apple.com/legal-attribution.html

hourly_forecast(wx)

##  Weather forecast for (43.268, -70.864) as of 2022-07-07 08:23:07
## 
## Today @ 09:00 │ 🌡 69°F │ 💦 63% │ 1018 mb — │ 😎 │ Mostly Clear  │ 🟨
##       @ 10:00 │ 🌡 71°F │ 💦 58% │ 1018 mb — │ 😎 │ Mostly Clear  │ 🟨
##       @ 11:00 │ 🌡 74°F │ 💦 55% │ 1018 mb — │ 😎 │ Mostly Clear  │ 🟧
##       @ 12:00 │ 🌡 75°F │ 💦 53% │ 1017 mb — │ 😎 │ Partly Cloudy │ 🟧
##       @ 13:00 │ 🌡 77°F │ 💦 51% │ 1017 mb ↓ │ 😎 │ Partly Cloudy │ 🟧
##       @ 14:00 │ 🌡 78°F │ 💦 50% │ 1016 mb ↓ │ 😎 │ Partly Cloudy │ 🟧
##       @ 15:00 │ 🌡 78°F │ 💦 50% │ 1016 mb ↓ │ 😎 │ Partly Cloudy │ 🟨
##       @ 16:00 │ 🌡 77°F │ 💦 52% │ 1016 mb ↓ │ 😎 │ Partly Cloudy │ 🟨
##       @ 17:00 │ 🌡 76°F │ 💦 55% │ 1015 mb ↓ │ 😎 │ Partly Cloudy │ 🟩
##       @ 18:00 │ 🌡 75°F │ 💦 58% │ 1015 mb — │ 😎 │ Partly Cloudy │ 🟩
##       @ 19:00 │ 🌡 73°F │ 💦 62% │ 1015 mb — │ 😎 │ Mostly Clear  │ 🟩
##       @ 20:00 │ 🌡 70°F │ 💦 67% │ 1015 mb — │ 😎 │ Partly Cloudy │ 🟩
##       @ 21:00 │ 🌡 68°F │ 💦 71% │ 1015 mb — │ 🌕 │ Mostly Cloudy │ 🟩
##       @ 22:00 │ 🌡 67°F │ 💦 74% │ 1015 mb — │ 🌕 │ Mostly Cloudy │ 🟩
##       @ 23:00 │ 🌡 67°F │ 💦 74% │ 1015 mb — │ 🌕 │ Mostly Cloudy │ 🟩
##   Fri @ 00:00 │ 🌡 66°F │ 💦 74% │ 1015 mb — │ 🌕 │ Partly Cloudy │ 🟩
##       @ 01:00 │ 🌡 65°F │ 💦 78% │ 1015 mb — │ 🌕 │ Partly Cloudy │ 🟩
##       @ 02:00 │ 🌡 64°F │ 💦 81% │ 1015 mb — │ 🌕 │ Mostly Clear  │ 🟩
##       @ 03:00 │ 🌡 63°F │ 💦 83% │ 1015 mb — │ 🌕 │ Partly Cloudy │ 🟩
##       @ 04:00 │ 🌡 62°F │ 💦 85% │ 1015 mb — │ 🌕 │ Partly Cloudy │ 🟩
## 
## https://weather-data.apple.com/legal-attribution.html

Note that the attribution is required by Apple.

I’ll likely add a daily forecast console printer soon.

There are a few helper functions in the package for value conversion between unit systems, iconifying some values, and working with time zones. {weatherkit} uses lutz::tz_lookup_coords() in places to auto-determine the time zone from lat/lng pairs, and also includes a function to intuit lat/lng from an IP address using ipapi (they have a generous free tier).

As of the timestamp on this blog post, Apple’s WeatherKit provides up to 500,000 API calls a month per Apple Developer Program membership. If you need additional API calls, monthly subscription plans will be available for purchase sometime after the beta is officially over. This is the expected pricing:

500,000 calls/month: Included with membership
1 million calls/month: US$ 49.99
2 million calls/month: US$ 99.99
5 million calls/month: US$ 249.99
10 million calls/month: US$ 499.99
20 million calls/month: US$ 999.99

Apple’s WeatherKit documentation consistently says “Apple Developer Program membership”, which seems to indicate you need to give them money every year to use the REST API. We’ll see if that’s truly the case after the service leaves beta status.

FIN

Kick the tyres & drop issues/PRs as one may be wont to do.

Track Hurricane Danny (Interactively) with R + leaflet

2015-08-20 – 11:41
Posted in Data Visualization, DataVis, DataViz, R, Weather
Tagged post
Comments (5)

poster image

Danny became the [first hurricane of the 2015 Season](http://www.accuweather.com/en/weather-news/atlantic-gives-birth-to-tropical-depression-four-danny/51857239), so it’s a good time to revisit how one might be able to track them with R.

We’ll pull track data from [Unisys](http://weather.unisys.com/hurricane/atlantic/2015/index.php) and just look at Danny, but it should be easy to extrapolate from the code.

For this visualization, we’ll use [leaflet](http://rstudio.github.io/leaflet/) since it’s all the rage and makes the plots interactive without any real work (thanks to the very real work by the HTML Widgets folks and the Leaflet.JS folks).

Let’s get the library calls out of the way:

library(leaflet)
library(stringi)
library(htmltools)
library(RColorBrewer)

Now, we’ll get the tracks:

danny <- readLines("http://weather.unisys.com/hurricane/atlantic/2015/DANNY/track.dat")

Why aren’t we using `read.csv` or `read.table` directly, you ask? Well, the data is in a _really_ ugly format thanks to the spaces in the `STATUS` column and two prefix lines:

Date: 18-20 AUG 2015
Hurricane-1 DANNY
ADV  LAT    LON      TIME     WIND  PR  STAT
  1  10.60  -36.50 08/18/15Z   30  1009 TROPICAL DEPRESSION
  2  10.90  -37.50 08/18/21Z    -     - TROPICAL DEPRESSION
  3  11.20  -38.80 08/19/03Z    -     - TROPICAL DEPRESSION
  4  11.30  -40.20 08/19/09Z    -     - TROPICAL DEPRESSION
  5  11.20  -41.10 08/19/15Z    -     - TROPICAL DEPRESSION
  6  11.50  -42.00 08/19/21Z    -     - TROPICAL DEPRESSION
  7  12.10  -42.70 08/20/03Z    -     - TROPICAL DEPRESSION
  8  12.20  -43.70 08/20/09Z    -     - TROPICAL DEPRESSION
  9  12.50  -44.80 08/20/15Z    -     - TROPICAL DEPRESSION
+12  13.10  -46.00 08/21/00Z   70     - HURRICANE-1
+24  14.00  -47.60 08/21/12Z   75     - HURRICANE-1
+36  14.70  -49.40 08/22/00Z   75     - HURRICANE-1
+48  15.20  -51.50 08/22/12Z   70     - HURRICANE-1
+72  16.00  -56.40 08/23/12Z   65     - HURRICANE-1
+96  16.90  -61.70 08/24/12Z   65     - HURRICANE-1
+120  18.00  -66.60 08/25/12Z   55     - TROPICAL STORM

But, we can put that into shape pretty easily, using `gsub` to make it easier to read everything with `read.table` and we just skip over the first two lines (we’d use them if we were doing other things with more of the data).

danny_dat <- read.table(textConnection(gsub("TROPICAL ", "TROPICAL_", danny[3:length(danny)])), 
           header=TRUE, stringsAsFactors=FALSE)

Now, let’s make the data a bit prettier to work with:

# make storm type names prettier
danny_dat$STAT <- stri_trans_totitle(gsub("_", " ", danny_dat$STAT))
 
# make column names prettier
colnames(danny_dat) <- c("advisory", "lat", "lon", "time", "wind_speed", "pressure", "status")

Those steps weren’t absolutely necessary, but why do something half-baked (unless it’s chocolate chip cookies)?

Let’s pick better colors than Unisys did. We’ll use a color-blind safe palette from Color Brewer:

danny_dat$color <- as.character(factor(danny_dat$status, 
                          levels=c("Tropical Depression", "Tropical Storm",
                                   "Hurricane-1", "Hurricane-2", "Hurricane-3",
                                   "Hurricane-4", "Hurricane-5"),
                          labels=rev(brewer.pal(7, "YlOrBr"))))

And, now for the map! We’ll make lines for the path that was already traced by Danny, then make interactive points for the forecast locations from the advisory data:

last_advisory <- tail(which(grepl("^[[:digit:]]+$", danny_dat$advisory)), 1)
 
# draw the map
leaflet() %>% 
  addTiles() %>% 
  addPolylines(data=danny_dat[1:last_advisory,], ~lon, ~lat, color=~color) -> tmp_map
 
if (last_advisory < nrow(danny_dat)) {
 
   tmp_map <- tmp_map %>% 
     addCircles(data=danny_dat[last_advisory:nrow(danny_dat),], ~lon, ~lat, color=~color, fill=~color, radius=25000,
             popup=~sprintf("<b>Advisory forecast for +%sh (%s)</b><hr noshade size='1'/>
                           Position: %3.2f, %3.2f<br/>
                           Expected strength: <span style='color:%s'><strong>%s</strong></span><br/>
                           Forecast wind: %s (knots)<br/>Forecast pressure: %s",
                           htmlEscape(advisory), htmlEscape(time), htmlEscape(lon),
                           htmlEscape(lat), htmlEscape(color), htmlEscape(status), 
                           htmlEscape(wind_speed), htmlEscape(pressure)))
}
 
html_print(tmp_map)

Click on one of the circles to see the popup.

The entire source code is in [this gist](https://gist.github.com/hrbrmstr/e3253ddd353f1a489bb4) and, provided you have the proper packages installed, you can run this at any time with:

devtools::source_gist("e3253ddd353f1a489bb4", sha1="00074e03e92c48c470dc182f67c91ccac612107e")

The use of the `sha1` hash parameter will help ensure you aren’t being asked to run a potentially modified & harmful gist, but you should visit the gist first to make sure I’m not messing with you (which, I’m not).

If you riff off of this or have suggestions for improvement, drop a note here or in the gist comments.

NASA GISS’s Annual Global Temperature Anomaly Trends (dplyr/ggplot version)

2015-01-18 – 09:50
Posted in Charts & Graphs, Data Visualization, DataVis, DataViz, R, Weather
Tagged post
Comments (1)

D Kelly O’Day did a [great post](https://chartsgraphs.wordpress.com/2015/01/16/nasa-gisss-annual-global-temperature-anomaly-trends/) on charting NASA’s Goddard Institute for Space Studies (GISS) temperature anomaly data, but it sticks with base R for data munging & plotting. While there’s absolutely nothing wrong with base R operations, I thought a modern take on the chart using `dplyr`, `magrittr` & `tidyr` for data manipulation and `ggplot2` for formatting would be helpful for the scores of new folk learning R this year (our little language is becoming [all the rage](http://redmonk.com/sogrady/2015/01/14/language-rankings-1-15/), it seems). I also really enjoy working with weather data.

Before further exposition, here’s the result:

forwp

I made liberal use of the “piping” idiom encouraged `magrittr`, `dplyr` and other new R packages, including the forward assignment operator `->` (which may put some folks off a bit). That also meant using `magrittr`’s aliases for `[` and `[[`, which are more readable in pipes.

I don’t use `library(tidyr)` since `tidyr`’s `extract` conflicts with `magrittr`’s, but you’ll see a `tidyr::gather` in the code for wide-to-long data shaping.

I chose to use the monthly temperature anomaly data as a base layer in the chart as a contrast to the monthly- and annual-anomaly means. I also marked the hottest annual- and annual-mean anomalies and framed the decades with vertical markers.

There are no hardcoded years or decades anywhere in the `ggplot2` code, so this should be quite reusable as the data source gets updated.

As I come back to the chart, I think there may be a bit too much “chart junk” on it, but you can tweak it to your own aesthetic preferences (if you do, drop a note in the comments with a link to your creation).

The code is below and in [this gist](https://gist.github.com/hrbrmstr/07ba10fb4c3fe9c9f3a0).

library(httr)
library(magrittr)
library(dplyr)
library(ggplot2)
 
# data retrieval ----------------------------------------------------------
 
# the user agent string was necessary for me; YMMV
 
pg <- GET("http://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt",
          user_agent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A"))
 
# extract monthly data ----------------------------------------------------
 
content(pg, as="text") %>%
  strsplit("\n") %>%
  extract2(1) %>%
  grep("^[[:digit:]]", ., value=TRUE) -> lines
 
# extract column names ----------------------------------------------------
 
content(pg, as="text") %>%
  strsplit("\n") %>%
  extract2(1) %>%
  extract(8) %>%
  strsplit("\ +") %>%
  extract2(1) -> lines_colnames
 
# make data frame ---------------------------------------------------------
 
data <- read.table(text=lines, stringsAsFactors=FALSE)
colnames(data) <- lines_colnames
 
# transform data frame ----------------------------------------------------
 
data %>%
  tidyr::gather(month, value, Jan, Feb, Mar, Apr, May, Jun,
                       Jul, Aug, Sep, Oct, Nov, Dec) %>%     # wide to long
  mutate(value=value/100) %>%                                # convert to degree Celcius change
  select(year=Year, month, value) %>%                        # only need these fields
  mutate(date=as.Date(sprintf("%d-%d-%d", year, month, 1)),  # make proper dates
         decade=year %/% 10,                                 # calc decade
         start=decade*10, end=decade*10+9) %>%               # calc decade start/end
  group_by(decade) %>%
    mutate(decade_mean=mean(value)) %>%                      # calc decade mean
  group_by(year) %>%
    mutate(annum_mean=mean(value)) %>%                       # calc annual mean
  ungroup -> data
 
# start plot --------------------------------------------------------------
 
gg <- ggplot()
 
# decade vertical markers -------------------------------------------------
 
gg <- gg + geom_vline(data=data %>% select(end),
                      aes(xintercept=as.numeric(as.Date(sprintf("%d-12-31", end)))),
                          size=0.5, color="#4575b4", linetype="dotted", alpha=0.5)
 
# monthly data ------------------------------------------------------------
 
gg <- gg + geom_line(data=data, aes(x=date, y=value, color="monthly anomaly"),
                     size=0.35, alpha=0.25)
gg <- gg + geom_point(data=data, aes(x=date, y=value, color"monthly anomaly"),
                      size=0.75, alpha=0.5)
 
# decade mean -------------------------------------------------------------
 
gg <- gg + geom_segment(data=data %>% distinct(decade, decade_mean, start, end),
                        aes(x=as.Date(sprintf("%d-01-01", start)),
                            xend=as.Date(sprintf("%d-12-31", end)),
                            y=decade_mean, yend=decade_mean,
                            color="decade mean anomaly"),
                        linetype="dashed")
 
# annual data -------------------------------------------------------------
 
gg <- gg + geom_line(data=data %>% distinct(year, annum_mean),
                      aes(x=as.Date(sprintf("%d-06-15", year)), y=annum_mean,
                          color="annual mean anomaly"),
                      size=0.5)
gg <- gg + geom_point(data=data %>% distinct(year, annum_mean),
                      aes(x=as.Date(sprintf("%d-06-15", year)), y=annum_mean,
                          color="annual mean anomaly"),
                      size=2)
 
# additional annotations --------------------------------------------------
 
# max annual mean anomaly horizontal marker/text
 
gg <- gg + geom_hline(yintercept=max(data$annum_mean),  alpha=0.9,
                      color="#d73027", linetype="dashed", size=0.25)
 
gg <- gg + annotate("text",
                    x=as.Date(sprintf("%d-12-31", mean(range(data$year)))),
                    y=max(data$annum_mean),
                    color="#d73027", alpha=0.9,
                    hjust=0.25, vjust=-1, size=3,
                    label=sprintf("Max annual mean anomaly %2.1fºC", max(data$annum_mean)))
 
gg <- gg + geom_hline(yintercept=max(data$value),  alpha=0.9,
                      color="#7f7f7f", linetype="dashed", size=0.25)
 
# max annual anomaly horizontal marker/text
 
gg <- gg + annotate("text",
                    x=as.Date(sprintf("%d-12-31", mean(range(data$year)))),
                    y=max(data$value),
                    color="#7f7f7f",  alpha=0.9,
                    hjust=0.25, vjust=-1, size=3,
                    label=sprintf("Max annual anomaly %2.1fºC", max(data$value)))
 
gg <- gg + annotate("text",
                    x=as.Date(sprintf("%d-12-31", range(data$year)[2])),
                    y=min(data$value), size=3, hjust=1,
                    label="Data: http://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt")
 
# set colors --------------------------------------------------------------
 
gg <- gg + scale_color_manual(name="", values=c("#d73027", "#4575b4", "#7f7f7f"))
 
# set x axis limits -------------------------------------------------------
 
gg <- gg + scale_x_date(expand=c(0, 1),
                        limits=c(as.Date(sprintf("%d-01-01", range(data$year)[1])),
                                 as.Date(sprintf("%d-12-31", range(data$year)[2]))))
 
# add labels --------------------------------------------------------------
 
gg <- gg + labs(x=NULL, y="GLOBAL Temp Anomalies in 1.0ºC",
                title=sprintf("GISS Land and Sea Temperature Annual Anomaly Trend (%d to %d)\n",
                              range(data$year)[1], range(data$year)[2]))
 
# theme/legend tweaks -----------------------------------------------------
 
gg <- gg + theme_bw()
gg <- gg + theme(panel.grid=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(legend.position=c(0.9, 0.2))
gg <- gg + theme(legend.key=element_blank())
gg <- gg + theme(legend.background=element_blank())
gg

Visualizing Historical & Most-likely First Snowfall Dates for U.S. Regions

2014-11-26 – 01:47
Posted in Charts & Graphs, Data Analysis, Data Visualization, DataVis, DataViz, maine, R, Weather
Tagged post
Comments (4)

UPDATE: You can now run this as a local Shiny app by entering shiny::runGist("95ec24c1b0cb433a76a5", launch.browser=TRUE) at an R prompt (provided all the dependent libraries (below) are installed) or use it interactively over at Shiny Apps.

The impending arrival of the first real snowfall of the year in my part of Maine got me curious about what the most likely “first snow” dates are for my region. The U.S. Historical Climatology Network (USHCN) maintains [historical daily climate records](http://cdiac.ornl.gov/epubs/ndp/ushcn/daily_doc.html) for each station in each state and has data (for some stations) going back as far as the 1800’s. A quick look at their [data files](http://cdiac.ornl.gov/ftp/ushcn_daily/) indicated that they would definitely help satiate my curiosity (and make for a late night of cranking out some R code and ggplot visualizations).

To start, we’ll need a bit more than base R to get the job done:

library(pbapply)
library(data.table)
library(dplyr)
library(ggplot2)
library(grid)
library(gridExtra)
library(stringi)

In all honesty, `pbapply`, `dplyr` and `stringi` are not necessary, but they definitely make life easier by (respectively) giving us:

– free progress bars for `*apply` operations,
– high efficacy data manipulation idioms, and
– a handy utility for converting strings to title case.

With setup out of the way, the first real task is to see which observer station is closest to my area. To figure that out we need to read in the station data file which is, sadly, in fixed-width format. Some stations have `#` characters in their titles, to we have to account for that when we call `read.fwf`. After reading in the station database we use a naive–but-usable distance calculation to find the closest station:

stations <- read.fwf("data/ushcn-stations.txt",
                     widths=c(6, 9, 10, 7, 3, 31, 7, 7, 7, 3),
                     col.names=c("coop_id", "latitude", "longitude", "elevation",
                                 "state", "name", "component_1", "component_2",
                                 "component_3", "utc_offset"),
                     colClasses=c("character", "numeric", "numeric", "numeric",
                                  "character", "character", "character", "character",
                                  "character", "character"),
                     comment.char="", strip.white=TRUE)
 
# not a great circle, but it gets the job done here
closestStation <- function(stations, lat, lon) {
  index <- which.min(sqrt((stations$latitude-lat)^2 +
                          (stations$longitude-lon)^2))
  stations[index,]
}
 
# what's the closest station?
closestStation(stations, 43.2672, -70.8617)
 
##     coop_id latitude longitude elevation state   name component_1 component_2 component_3 utc_offset
633  272174    43.15    -70.95      24.4    NH DURHAM      ------      ------      ------         +5

As a Mainer, I’m not thrilled that this is the actual, closest station, so we’ll also see what the closest one is in Maine:

closestStation(stations %>% filter(state=="ME"), 43.2672, -70.8617)
##    coop_id latitude longitude elevation state             name component_1 component_2 component_3 utc_offset
10  176905  43.6497  -70.3003      13.7    ME PORTLAND JETPORT      ------      ------      ------         +5

The analysis is easy enough to do for both, so we’ll first take a look at Durham, New Hampshire then do the exact same valuation for Portland, Maine.

Despite being fixed-width, the station database was not too difficult to wrangle. The state-level files that contain the readings are another matter:

Variable	Columns	Type
COOP ID	1-6	Character
YEAR	7-10	Integer
MONTH	11-12	Integer
ELEMENT	13-16	Character
VALUE1	17-21	Integer
MFLAG1	22	Character
QFLAG1	23	Character
SFLAG1	24	Character
VALUE2	25-29	Integer
MFLAG2	30	Character
QFLAG2	31	Character
SFLAG2	32	Character
…	…	…
VALUE31	257-261	Integer
MFLAG31	262	Character
QFLAG31	263	Character
SFLAG31	264	Character

We have fixed-width, wide-format records with 31 days for each month, which proves the existence of truly insidious people in the world. Rather than use `read.fwf` again, we’ll take a different approach (since we ultimately need the data in long format) and use `readLines` to read in all the records from the NH data file, then filter out everything but snowfall entries from the station we’re interested in.

Next, we setup nested `lapply` calls to build a long data frame from each month then combine them all together into a single data frame:

snow <- readLines("data/state27_NH.txt")
 
snow <- grep("SNOW", snow, value=TRUE)
snow <- grep("^272174", snow, value=TRUE)
 
snow_dat <- rbindlist(pblapply(snow, function(x) {
 
  rbindlist(lapply(1:31, function(i) {
 
    # record format described here:
    # http://cdiac.ornl.gov/ftp/ushcn_daily/data_format.txt
 
    start <- 17 + (i-1)*8
 
    list(coop_id=substr(x, 1, 6),
         date=sprintf("%s-%02d-%02d", substr(x, 7, 10), as.numeric(substr(x, 11, 12)), i),
         element=substr(x, 13, 16),
         value=as.numeric(substr(x, start, start+4)),
         mflag=substr(x, start+5, start+5),
         qflag=substr(x, start+6, start+6),
         sflag=substr(x, start+7, start+7))
 
  }))
 
}))

Now, we’ll clean up the records even further by removing invalid entries (those with a `value` == `-9999`) and convert record dates to actual `Date` objects and filter out invalid dates:

snow_dat <- snow_dat %>% filter(value != -9999)
 
# since the data file has 31 days for each records regardless of whether
# that's valid or not we do a shortcut to remove invalid dates by doing the
# a vectorized Date conversion, then removing records with NA dates
 
snow_dat$date <- as.Date(snow_dat$date)
snow_dat <- snow_dat %>% filter(!is.na(date))
 
# having the year extracted is handy for filtering
snow_dat$year <- format(snow_dat$date, "%Y")

Given that Winter in the U.S. spans across two calendar years, we need a way to keep dates in January-May associated with the previous year (yes, that adds an inherent assumption that no first snow is in June, which might not hold true for Alaska). To facilitate this, we’ll convert each date to its corresponding day of year value then add the number of total days in the start year to those values for all months <= May. We really do need to do this, too, since there are many cases where the first snowfall will be in January-March for many states.

snow_dat$doy <- as.numeric(format(snow_dat$date, "%j"))
snow_dat$doy <- ifelse(snow_dat$doy<=180,
                       snow_dat$doy + as.numeric(format(as.Date(sprintf("%s-12-31", snow_dat$year)), "%j")),
                       snow_dat$doy)

Now, the fun begins. We use (mostly) `dplyr` to extract the first snowfall day from each year, then make a dot-line plot from the data:

first <- snow_dat %>%
  filter(value>0) %>%                           # ignore 0 values
  filter(date>=as.Date("1950-01-01")) %>%       # start at 1950 (arbitrary)
  merge(stations, by="coop_id", all.x=TRUE) %>% # merge station details
  group_by(coop_id, year) %>%                   # group by station and year
  arrange(doy) %>%                              # sort by our munged day of year
  filter(row_number(doy) == 1) %>%              # grab the first entry by group
  select(name, state, date, value, doy)         # we only need some variables
 
title_1 <- sprintf("First observed snowfall (historical) at %s, %s", stri_trans_totitle(unique(first$name)), unique(first$state))
 
gg <- ggplot(first, aes(y=year, x=doy))
gg <- gg + geom_segment(aes(xend=min(first$doy)-20, yend=year), color="#9ecae1", size=0.25)
gg <- gg + geom_point(aes(color=coop_id), shape=8, size=3, color="#3182bd")
gg <- gg + geom_text(aes(label=format(date, "%b-%d")), size=3, hjust=-0.2)
gg <- gg + scale_x_continuous(expand=c(0, 0), limits=c(min(first$doy)-20, max(first$doy)+20))
gg <- gg + labs(x=NULL, y=NULL, title=title_1)
gg <- gg + theme_bw()
gg <- gg + theme(legend.position="none")
gg <- gg + theme(panel.grid=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(axis.ticks.x=element_blank())
gg <- gg + theme(axis.ticks.y=element_blank())
gg <- gg + theme(axis.text.x=element_blank())
gg <- gg + theme(axis.text.y=element_text(color="#08306b"))
by_year <- gg

While that will help us see the diversity across years, we have to do quite a bit of eye tracking to get the most likely date range for the first snowfall, so we’ll add a boxplot into the mix and use `summary` to figure out the quartiles (for labeling the chart) for the actual date values:

wx_range <- summary(as.Date(format(first$date, "2013-%m-%d")))
names(wx_range) <- NULL
min_wx <- gsub("2013-", "", wx_range[2])
max_wx <- gsub("2013-", "", wx_range[5])
 
title_2 <- sprintf("Most likely first snowfall will be between %s & %s", min_wx, max_wx)
 
gg <- ggplot(first %>% mutate(name="0000"), aes(name, doy))
gg <- gg + geom_boxplot(fill="#3182bd", color="#08306b", outlier.colour="#08306b")
gg <- gg + scale_y_continuous(expand=c(0, 0),
                              limits=c(min(first$doy)-20, max(first$doy)+20))
gg <- gg + coord_flip()
gg <- gg + labs(x=NULL, y=NULL, title=title_2)
gg <- gg + theme_bw()
gg <- gg + theme(legend.position="none")
gg <- gg + theme(panel.grid=element_blank())
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(axis.ticks.x=element_blank())
gg <- gg + theme(axis.text.x=element_blank())
gg <- gg + theme(axis.ticks.y=element_line(color="white"))
gg <- gg + theme(axis.text.y=element_text(color="white"))
gg <- gg + theme(plot.title=element_text(size=11))
box_wx <- gg

Finally, we’ll combine them together to get the finished product:

grid.arrange(by_year, box_wx, nrow=2, heights=unit(c(0.9, 0.1), "npc"))

And, do the same for Portland:

Click for larger

There are many more analyses and visualizations that can be performed on these data sets, but be wary when creating them as I’ve seen a few files with fixed-width formatting errors and have also noticed missing records for some observer stations.

You can find the complete, commented code up on [github](https://github.com/hrbrmstr/snowfirst).

Plot Me Like a Hurricane (a.k.a. animating historical North Atlantic basin tropical storm tracks)

2014-10-07 – 22:34
Posted in Data Visualization, DataVis, DataViz, maps, R, Weather
Tagged post
Comments (2)

Markus Gessman (@MarkusGesmann) did a beautiful job [Visualising the seasonality of Atlantic windstorms](http://www.magesblog.com/2014/10/visualising-seasonality-of-atlantic.html) using small multiples, which was inspired by both a [post](http://freakonometrics.hypotheses.org/17113) by Arthur Charpentier (@freakonometrics) on using Markov spatial processes to “generate” hurricanes—which was [tweaked a bit](http://robertgrantstats.wordpress.com/2014/10/01/transparent-hurricane-paths-in-r/) by Robert Grant (@robertstats)—and [Gaston Sanchez](https://github.com/gastonstat)’s [Visualizing Hurricane Trajectories](http://rpubs.com/gaston/hurricanes) RPub.

I have [some history](http://rud.is/b/2012/10/28/watch-sandy-in-r-including-forecast-cone/) with hurricane data and thought I’d jump on the bandwagon using the same data and making some stop-frame animations. I borrowed from previous work (hence starting with all the credits above) but have used `dplyr` idioms for data-frame filtering & mutating and my own month/year extraction code.

The first animation accumulates storm tracks in-year and displays the names of the storms in a list down the left side while the second does a full historical accumulation of tracks. I changed the storm path gradient but kept most of the other formatting bits and made the plots suitable for 1080p output/playback.

Rather than go the `ffmpeg` route, I used [ImageMagick](http://www.imagemagick.org/) since it makes equally quick work out of converting a bunch of `png` files to an `mp4` file. I made the animations go quickly, but they can be advanced forward/back one frame at a time in any decent player.

library(maps)
library(data.table)
library(dplyr)
library(ggplot2)
library(grid)
library(RColorBrewer)
 
# takes in a numeric vector and returns a sequence from low to high
rangeseq <- function(x, by=1) {
  rng <- range(x)
  seq(from=rng[1], to=rng[2], by=by)
}
 
# etract the months (as a factor of full month names) from
# a date+time "x" that can be converted to a POSIXct object,
extractMonth <- function(x) {
  months <- format(as.POSIXct(x), "%m")
  factor(months, labels=month.name[rangeseq(as.numeric(months))])
}
 
# etract the years (as a factor of full 4-charater-digit years) from
# a date+time "x" that can be converted to a POSIXct object,
extractYear <- function(x) {
  factor(as.numeric(format(as.POSIXct(x), "%Y")))
}
 
# get from: ftp://eclipse.ncdc.noaa.gov/pub/ibtracs/v03r06/all/csv/Allstorms.ibtracs_all.v03r06.csv.gz
storms_file <- "data/Allstorms.ibtracs_all.v03r06.csv"
storms <-  fread(storms_file, skip=10, select=1:18)
 
col_names <- c("Season", "Num", "Basin", "Sub_basin", "Name", "ISO_time", "Nature",
             "Latitude", "Longitude", "Wind.kt", "Pressure.mb", "Degrees_North", "Deegrees_East")
setnames(storms, paste0("V", c(2:12, 17, 18)), col_names)
 
# use dplyr idioms to filter & mutate the data
 
storms <- storms %>%
  filter(Latitude > -999,                                  # remove missing data
         Longitude > -999,
         Wind.kt > 0,
         !(Name %in% c("UNNAMED", "NONAME:UNNAMED"))) %>%
  mutate(Basin=gsub(" ", "", Basin),                       # clean up fields
         ID=paste(Name, Season, sep="."),
         Month=extractMonth(ISO_time),
         Year=extractYear(ISO_time)) %>%
  filter(Season >= 1989, Basin %in% "NA")                  # limit to North Atlantic basin
 
season_range <- paste(range(storms$Season), collapse=" - ")
knots_range <- range(storms$Wind.kt)
 
# setup base plotting parameters (these won't change)
 
base <- ggplot()
base <- base + geom_polygon(data=map_data("world"),
                            aes(x=long, y=lat, group=group),
                            fill="gray25", colour="gray25", size=0.2)
base <- base + scale_color_gradientn(colours=rev(brewer.pal(n=9, name="RdBu")),
                                     space="Lab", limits=knots_range)
base <- base + xlim(-138, -20) + ylim(3, 55)
base <- base + coord_map()
base <- base + labs(x=NULL, y=NULL, title=NULL, colour = "Wind (knots)")
base <- base + theme_bw()
base <- base + theme(text=element_text(family="Arial", face="plain", size=rel(5)),
                     panel.background = element_rect(fill = "gray10", colour = "gray30"),
                     panel.margin = unit(c(0,0), "lines"),
                     panel.grid.major = element_blank(),
                     panel.grid.minor = element_blank(),
                     plot.margin = unit(c(0,0,0,0), "lines"),
                     axis.text.x = element_blank(),
                     axis.text.y = element_blank(),
                     axis.ticks = element_blank(),
                     legend.position = c(0.25, 0.1),
                     legend.background = element_rect(fill="gray10", color="gray10"),
                     legend.text = element_text(color="white", size=rel(2)),
                     legend.title = element_text(color="white", size=rel(5)),
                     legend.direction = "horizontal")
 
# loop over each year, producing plot files that accumulate tracks over each month
 
for (year in unique(storms$Year)) {
 
  storm_ids <- unique(storms[storms$Year==year,]$ID)
 
  for (i in 1:length(storm_ids)) {
 
    storms_yr <- storms %>% filter(Year==year, ID %in% storm_ids[1:i])
 
    # stuff takes a while, so it's good to have a progress message
    message(sprintf("%s %s", year, storm_ids[i]))
 
    gg <- base
    gg <- gg + geom_path(data=storms_yr,
                         aes(x=Longitude, y=Latitude, group=ID, colour=Wind.kt),
                         size=1.0, alpha=1/4)
    gg <- gg + geom_text(label=year, aes(x=-135, y=51), size=rel(6), color="white", vjust=1)
    gg <- gg + geom_text(label=paste(gsub(".[[:digit:]]+$", "", storm_ids[1:i]), collapse="\n"),
                         aes(x=-135, y=49.5), size=rel(4.5), color="white", vjust=1)
 
    # change "quartz" to "cairo" if you're not on OS X
 
    png(filename=sprintf("output/%s%03d.png", year, i),
        width=1920, height=1080, type="quartz", bg="gray25")
    print(gg)
    dev.off()
 
  }
 
}
 
# convert to mp4 animation - needs imagemagick
system("convert -delay 8 output/*png output/hurr-1.mp4")
# unlink("output/*png") # do this after verifying convert works

# take an alternate approach for accumulating the entire hurricane history
# start with the base, but add to the ggplot object in a loop, which will
# accumulate all the tracks.
 
gg <- base
 
for (year in unique(storms$Year)) {
 
  storm_ids <- unique(storms[storms$Year==year,]$ID)
 
  for (i in 1:length(storm_ids)) {
 
    storms_yr <- storms %>% filter(ID %in% storm_ids[i])
 
    message(sprintf("%s %s", year, storm_ids[i]))
    gg <- gg + geom_path(data=storms_yr,
                         aes(x=Longitude, y=Latitude, group=ID, colour=Wind.kt),
                         size=1.0, alpha=1/4)
 
    png(filename=sprintf("output/%s%03d.png", year, i),
        width=1920, height=1080, type="quartz", bg="gray25")
    print(gg)
    dev.off()
 
  }
 
}
 
system("convert -delay 8 output/*png output/hurr-2.mp4")
# unlink("output/*png") # do this after verifying convert works

Full code in [this gist](https://gist.github.com/hrbrmstr/23bf06784e898871dd61).

Seeing the (day)light with R

2014-09-23 – 12:28
Posted in Charts & Graphs, Data Visualization, DataVis, R, Weather
Tagged post
Comments (4)

The arrival of the autumnal equinox foreshadows the reality of longer nights and shorter days here in the northeast US. We can both see that reality and distract ourselves from it at the same time by firing up RStudio (or your favorite editor) and taking a look at the sunrise & sunset times based on our map coordinates using some functions from the R {maptools} package.

The sunriset function takes in a lat/lon pair, a range of dates and whether we want sunrise or sunset calculated and returns when those ephemeral events occur. For example, we can see the sunrise time for Portsmouth, NH on Christmas day this year (2014) via:

library(maptools)

# these functions need the lat/lon in an unusual format
portsmouth <- matrix(c(-70.762553, 43.071755), nrow=1)
for_date <- as.POSIXct("2014-12-25", tz="America/New_York")
sunriset(portsmouth, for_date, direction="sunrise", POSIXct.out=TRUE)

##         day_frac                time
## newlon 0.3007444 2014-12-25 07:13:04

We can pass in a vector of dates, to this function, and that means we’ll have data points we can work with to visualize this change. Let’s wrap the sequence generation into a function of our own and extract:

sunrise
sunset
solar noon
# hours of daylight

for every day in the sequence, returning the result as a data frame.

# adapted from http://r.789695.n4.nabble.com/maptools-sunrise-sunset-function-td874148.html
ephemeris <- function(lat, lon, date, span=1, tz="UTC") {

  # convert to the format we need
  lon.lat <- matrix(c(lon, lat), nrow=1)

  # make our sequence - using noon gets us around daylight saving time issues
  day <- as.POSIXct(date, tz=tz)
  sequence <- seq(from=day, length.out=span , by="days")

  # get our data
  sunrise <- sunriset(lon.lat, sequence, direction="sunrise", POSIXct.out=TRUE)
  sunset <- sunriset(lon.lat, sequence, direction="sunset", POSIXct.out=TRUE)
  solar_noon <- solarnoon(lon.lat, sequence, POSIXct.out=TRUE)

  # build a data frame from the vectors
  data.frame(date=as.Date(sunrise$time),
             sunrise=as.numeric(format(sunrise$time, "%H%M")),
             solarnoon=as.numeric(format(solar_noon$time, "%H%M")),
             sunset=as.numeric(format(sunset$time, "%H%M")),
             day_length=as.numeric(sunset$time-sunrise$time))

}

Now we can take a look at these values over 10 days near All Hallows Eve:

ephemeris(43.071755, -70.762553, "2014-10-31", 10, tz="America/New_York")

##          date sunrise solarnoon sunset day_length
## 1  2014-10-31     716      1226   1736  10.332477
## 2  2014-11-01     717      1226   1734  10.289145
## 3  2014-11-02     518      1026   1533  10.246169
## 4  2014-11-03     620      1126   1632  10.203563
## 5  2014-11-04     621      1126   1631  10.161346
## 6  2014-11-05     622      1126   1629  10.119535
## 7  2014-11-06     624      1126   1628  10.078148
## 8  2014-11-07     625      1126   1627  10.037204
## 9  2014-11-08     626      1126   1626   9.996721
## 10 2014-11-09     627      1126   1625   9.956719

We now have everything we need to visualize the seasonal daylight changes. We’ll use ggplot (with some help from the grid package) and build a two panel graph, one that gives us a “ribbon” view of what hours of the day are in daylight and the other just showing the changes in the total number of hours of daylight available during the day. We’ll build the function so that it will:

optionally show the current date/time (TRUE by default)
optionally show when solar noon is (FALSE by default)
optionally plot the graphs (TRUE by default)
return an arrangeGrob of the charts in the event we want to use them in other charts

library(ggplot2)
library(scales)
library(gridExtra)

# create two formatter functions for the x-axis display

# for graph #1 y-axis
time_format <- function(hrmn) substr(sprintf("%04d", hrmn),1,2)

# for graph #2 y-axis
pad5 <- function(num) sprintf("%2d", num)

daylight <- function(lat, lon, place, start_date, span=2, tz="UTC", 
                     show_solar_noon=FALSE, show_now=TRUE, plot=TRUE) {

  stopifnot(span>=2) # really doesn't make much sense to plot 1 value

  srss <- ephemeris(lat, lon, start_date, span, tz)

  x_label = ""

  gg <- ggplot(srss, aes(x=date))
  gg <- gg + geom_ribbon(aes(ymin=sunrise, ymax=sunset), fill="#ffeda0")

  if (show_solar_noon) gg <- gg + geom_line(aes(y=solarnoon), color="#fd8d3c")

  if (show_now) {
    gg <- gg + geom_vline(xintercept=as.numeric(as.Date(Sys.time())), color="#800026", linetype="longdash", size=0.25)
    gg <- gg + geom_hline(yintercept=as.numeric(format(Sys.time(), "%H%M")), color="#800026", linetype="longdash", size=0.25)
    x_label = sprintf("Current Date / Time: %s", format(Sys.time(), "%Y-%m-%d / %H:%M"))
  }

  gg <- gg + scale_x_date(expand=c(0,0), labels=date_format("%b '%y"))
  gg <- gg + scale_y_continuous(labels=time_format, limits=c(0,2400), breaks=seq(0, 2400, 200), expand=c(0,0))
  gg <- gg + labs(x=x_label, y="",
                  title=sprintf("Sunrise/set for %s\n%s ", place, paste0(range(srss$date), sep=" ", collapse="to ")))
  gg <- gg + theme_bw()
  gg <- gg + theme(panel.background=element_rect(fill="#525252"))
  gg <- gg + theme(panel.grid=element_blank())

  gg1 <- ggplot(srss, aes(x=date, y=day_length))
  gg1 <- gg1 + geom_area(fill="#ffeda0")
  gg1 <- gg1 + geom_line(color="#525252")

  if (show_now) gg1 <- gg1 + geom_vline(xintercept=as.numeric(as.Date(Sys.time())), color="#800026", linetype="longdash", size=0.25)

  gg1 <- gg1 + scale_x_date(expand=c(0,0), labels=date_format("%b '%y"))
  gg1 <- gg1 + scale_y_continuous(labels=pad5, limits=c(0,24), expand=c(0,0))
  gg1 <- gg1 + labs(x="", y="", title="Day(light) Length (hrs)")
  gg1 <- gg1 + theme_bw()

  if (plot) grid.arrange(gg, gg1, nrow=2)

  arrangeGrob(gg, gg1, nrow=2)

}

We can test our our new function using the same location and graph the sunlight data for a year starting September 1, 2014 (select graph for full-size version):

daylight(43.071755, -70.762553, "Portsmouth, NH", "2014-09-01", 365, tz="America/New_York")

With the longer nights approaching we can further enhance the plotting function to add markers for solstices and perhaps even make a new version that compares sunlight across different geographical locations.

Complete code example is in this gist.

An Alfred (OS X) Workflow for Weather Underground Personal Weather Station (PWS) Data

2014-09-14 – 20:10
Posted in Alfred, Apple, Mavericks, OS X, Weather
Tagged post
Comments (1)

I’ve operated a [Weather Underground](http://www.wunderground.com/) [Personal Weather Station](http://www.wunderground.com/weatherstation/about.asp) (PWS) [[KMEBERWI7](http://www.wunderground.com/personal-weather-station/dashboard?ID=KMEBERWI7#history)] off-and-on (hardware issues notwithstanding) for as long as I can remember, and I thought it was about time to finally do an Alfred↔PWS mashup. My personal requirements were quite modest:

– 5 reading history (including most current)
– Ability to copy all the current day’s readings as CSV
– Quickly get to my PWS data w/o a bookmark

[Alfred](http://www.alfredapp.com/) makes all this possible via customized workflows that support many scripting environments, including Python. Here’s a quick preview before going into the details:

It’s a fairly simple workflow:

– Grab today’s “raw” data (and clean it up)
– Select the last 5 entries
– Connect a URL action and use the full CSV as clipboard contents for any copy action

The full Python code is below and [on github](https://github.com/hrbrmstr/alfred-pws), and you can hit that github link or [Packal](http://www.packal.org/workflow/pws-history) for the compiled workflow. It’s been tested on Mavericks, but more eyes are always welcome.

Customizing the workflow for your own/favorite PWS is as simple as changing the `station` variable.

There’s plenty of room for improvement, including

– performing a background download of the PWS data
– using a sparkline graph as the icon
– customizing which data fields are returned
– providing commands to get/set your/favorite PWS
– providing options for the “copy” return type (currently CSV, but potentially XML or JSON)

Don’t hesitate to post issues or pull requests and check back for updates (as I’m sure some of those improvements will be making their way into the workflow).

import re
import csv
import sys
import datetime
from lib import requests
from workflow import Workflow, web
from StringIO import StringIO
 
# retrieve today's history for station "X"
 
def get_wx_data(station):
 
  tdy = datetime.datetime.today()
 
  # construct the URL for "today"
  url = 'http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=%s&day=%d&month=%d&year=%d&graphspan=day&format=1' % (station, tdy.day, tdy.month, tdy.year)
 
  r = web.get(url) # get the data
 
  r.raise_for_status() # report any errors
 
  return(re.sub("\n\n", "\n", re.sub("^\n|<br>", "", r.text))) # clean up the output & pass it back to main control
 
 
# main workflow control
 
def main(wf):
 
  station = "KMEBERWI7" # change to use your own/favorite station
 
  resp = get_wx_data(station)
 
  # only want last 5 readings, change this to whatever you want
  max = 5
 
  i=0
  for row in reversed(list(csv.reader(StringIO(resp)))):
    wf.add_item(title=row[0] + " | " + row[1] + "F | " + row[3] + "in | " + row[8] + "%", 
                subtitle=station, # so you know where you're getting data from
                arg=station, # passed as query to URL action - http://www.wunderground.com/personal-weather-station/dashboard?ID={query}#history
                valid=True, # it can be opened in the browser
                icon="/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/ToolbarInfo.icns", # info icon
                copytext=resp) # get the whole CSV file in a copy
    if (i==max): break
    i += 1
 
  # output to alfred
 
  wf.send_feedback()
 
if __name__ == u"__main__":
  wf = Workflow()
  sys.exit(wf.run(main))