splashr 0.6.0 Now Uses the CRAN-nascent stevedore Package for Docker Orchestration

The splashr package [srht|GL|GH], an alternative to Selenium for JavaScript-enabled/browser-emulated web scraping, is now at version 0.6.0 (still in dev-mode but on its way to CRAN in the next 14 days).

The major change from version 0.5.x (which never made it to CRAN) is swapping out the reticulate-based docker package for the pure-R stevedore package. This should make splashr far more compatible across the landscape of R installs, since it removes a somewhat heavy dependency on a working Python environment (something quite challenging to consistently achieve in that fragmented language ecosystem).

Another addition is a set of new user agents for Android, Kindle, Apple TV & Chromecast, as an increasing number of sites are changing what type of HTML (et al.) they send to those and other alternative glowing rectangles. A more efficient/sane user agent system will also be introduced prior to the CRAN release. Now’s the time to vote on existing issues or file new ones if there is a burning desire for new or modified functionality.

Since the Travis tests now work (they were failing miserably because of the Python dependency), I’ve merged the changes from the 0.6.0 branch into the master branch, but you can follow the machinations of the 0.6.0 branch up until the CRAN release.

