21 Recipes For Mining Bluesky Data with {atproto}

Author

boB Rudis

Published

July 8, 2023

Preface

This book is similar to my 2018 “21 Recipes For Mining Twitter Data with {rtweet}”. Both it and this tome are based on Matthew R. Russell’s “21 Recipes” book. That book is out of distribution, and much of its content is in Matthew’s “Mining the Social Web” book. There will be many similarities between his “21 Recipes” book and this one, on purpose. I am not claiming originality in this work, just making an R-centric version of the cookbook.

Well, that’s a (how shall we describe it?) “misdirection”. While we will most certainly be using R for the examples, we’ll be using the atproto Python module by Ilya Siamionau via R’s {reticulate} package. We’re doing this because the AT Protocol is super new and has some complexity the Twitter API does not, and because Ilya has done a fantastic job designing the Python code and is keeping up with API changes. Expect many tweaks to this book as the AT Protocol settles down.

What You’ll Need

Well, you will need both R and Python installed. To keep this short, we’ll assume you know how to get both on your system by following the myriad of possible directions. However, I will add that I highly suggest reading “Relieving your Python packaging pain” by Bite Code if you value your sanity. Despite the name, it’s not about writing Python packages but how to not turn into The Incredible Hulk whilst using Python.

To wire up the atproto Python package, run the following in your R console:

install.packages("reticulate")
reticulate::py_install("atproto")

I do not normally do that dance since it highly favors the use of Anaconda, and I am not a huge fan of that ecosystem. But it will follow the best practices in that “Relieving” post, and should make wiring up R and Python a bit less painful.

To test the setup, you can do:

library(reticulate)

(atproto <- import("atproto"))
Module(atproto)

If you don’t see Module(atproto), drop an issue on this repo and I’ll try to help as best as possible.
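Before filing an issue, it may help to see which Python {reticulate} actually picked up and whether that interpreter can find the module. The following is a minimal diagnostic sketch, not an official troubleshooting procedure; it assumes only that {reticulate} is installed and uses its py_config() and py_module_available() functions.

```r
library(reticulate)

# Show which Python interpreter reticulate is bound to (path and version).
py_config()

# TRUE if that interpreter can find the atproto module, FALSE otherwise.
atproto_ok <- py_module_available("atproto")
atproto_ok
```

If atproto_ok is FALSE, the module was most likely installed into a different Python environment than the one reported by py_config(). Including that output in your issue should make it easier to help.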

These are the versions of R and Python that were used to execute all the examples contained within this tome:

R.version.string
[1] "R version 4.3.1 Patched (2023-06-16 r84562)"
py_version()
[1] '3.11'
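Since the Python module is tracking a fast-moving API, you may also want to note which release of atproto you have installed. The helper below is a sketch under assumptions: atproto_version() is a name made up for illustration, and it relies on Python's standard importlib.metadata module (available in Python 3.8 and later) instead of anything specific to atproto.

```r
library(reticulate)

# Hypothetical helper: report the installed version of the atproto
# Python module, or NA if the module cannot be found.
atproto_version <- function() {
  if (!py_module_available("atproto")) {
    return(NA_character_)
  }
  # importlib.metadata is part of the Python standard library (3.8+)
  metadata <- import("importlib.metadata")
  metadata$version("atproto")
}

atproto_version()
```

Recording this alongside the R and Python versions makes it easier to tell whether a recipe broke because of an API change or because of something in your local setup.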

Conventions

Code blocks will appear as they do above. I will be using other R packages, and leave the install.packages(…) dance to you if you encounter any that aren’t already in your R setup. If I ever use “development mode” packages, I’ll add a comment after the library(…) call for them. They’ll generally be installed from R Universe, but I may include some direct-to-repo install commands as well.
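If you would rather not do the install.packages(…) dance by hand, a small helper along these lines can do it for you. This is only a sketch: install_missing() is a hypothetical name, not something used elsewhere in this book, and the R Universe URL in the usage comment is a placeholder, not a real repository.

```r
# Hypothetical helper: install whichever of `pkgs` are not already available.
# Returns (invisibly) the names of the packages it tried to install.
install_missing <- function(pkgs, repos = getOption("repos")) {
  # requireNamespace() returns FALSE for packages that cannot be loaded
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing <- pkgs[!available]
  if (length(missing) > 0) {
    install.packages(missing, repos = repos)
  }
  invisible(missing)
}

# Packages that ship with R are always present, so nothing is installed here.
install_missing(c("stats", "utils"))

# For a "development mode" package, add the relevant R Universe repository
# ahead of CRAN (placeholder names, shown commented out):
# install_missing(
#   "somepkg",
#   repos = c("https://someuser.r-universe.dev", "https://cloud.r-project.org")
# )
```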