1  Using Application Passwords To Access Bluesky

1.1 Problem

You want to access your own data or another user’s data for analysis.

1.2 Solution

Take advantage of Bluesky’s / the AT Protocol’s app passwords to authenticate against Bluesky’s API without handing over your main account password.

1.3 Discussion

You will need a Bluesky account to use almost all the code in this book. One exception is access to the firehose (which we’ll get into later).

Unlike Twitter and other sites that make you go through an OAuth dance, Bluesky / the AT Protocol relies on app passwords. These are simpler than OAuth and also prevent apps from abusing your most sensitive settings.

On Bluesky, you create said beasts here: https://bsky.app/settings/app-passwords. The rest of this book relies on you storing your Bluesky handle (e.g. handle.bsky.social or a domain you’ve claimed on it, like mine, hrbrmstr.dev) in an environment variable BSKY_USER and the generated application password in BSKY_PASS. In R, the best way to ensure those are available to your R sessions is by putting them in ~/.Renviron or equivalent.
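As a minimal sketch of that setup, the snippet below shows what the two ~/.Renviron entries might look like (placeholder values only) and a small check that both variables are actually visible to the session. The helper name bsky_credentials() is hypothetical and not part of any package; it only wraps base R's Sys.getenv().

```r
# Example ~/.Renviron entries (placeholder values, not real credentials):
#   BSKY_USER=handle.bsky.social
#   BSKY_PASS=xxxx-xxxx-xxxx-xxxx

# Hypothetical helper: read both variables and stop early if either is unset.
bsky_credentials <- function() {
  creds <- c(
    user = Sys.getenv("BSKY_USER", unset = ""),
    pass = Sys.getenv("BSKY_PASS", unset = "")
  )

  # Collect the names of any variables that came back empty
  missing <- names(creds)[!nzchar(creds)]

  if (length(missing) > 0) {
    stop(
      "Missing Bluesky credentials: ", paste(missing, collapse = ", "),
      ". Set BSKY_USER and BSKY_PASS in ~/.Renviron and restart R.",
      call. = FALSE
    )
  }

  as.list(creds)
}
```

Calling bsky_credentials() at the top of a script gives a clear error when ~/.Renviron was not picked up, rather than a less obvious authentication failure later on.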

Before logging in to Bluesky programmatically, we need to import the atproto Python package (via reticulate) and create a client:

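library(reticulate)

# Import the atproto Python module through reticulate
atproto <- import("atproto")

# Create a client (not yet authenticated)
client <- atproto$Client()

# Log in with the handle and app password stored in the environment
profile <- client$login(Sys.getenv("BSKY_USER"), Sys.getenv("BSKY_PASS"))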

names(profile)
 [1] "avatar"         "banner"         "description"    "did"           
 [5] "displayName"    "followersCount" "followsCount"   "handle"        
 [9] "indexedAt"      "labels"         "postsCount"     "viewer"        

Once the login succeeds, profile holds a number of slots that contain information about the logged-in user. We can inspect them all at once:

names(profile) |> 
  sapply(\(.x) profile[[.x]]) |> 
  str()
List of 12
 $ avatar        : chr "https://cdn.bsky.social/imgproxy/XP3hrl0QyMpTYe9MnmXfUCEUol_27qqRGOloaJPYxmQ/rs:fill:1000:1000:1:0/plain/bafkre"| __truncated__
 $ banner        : chr "https://cdn.bsky.social/imgproxy/0OAqfyczYKU1ATNSPWbFkwM3J6CEYTCDTJZHxGlR2GI/rs:fill:3000:1000:1:0/plain/bafkre"| __truncated__
 $ description   : chr "a.k.a. boB Rudis • 🇺🇦 Pampa • Don't look at me…I do what he does—just slower. #rstats #javascript #datascience "| __truncated__
 $ did           : chr "did:plc:hgyzg2hn6zxpqokmp5c2xrdo"
 $ displayName   : chr "hrbrmstr"
 $ followersCount: int 60
 $ followsCount  : int 70
 $ handle        : chr "hrbrmstr.dev"
 $ indexedAt     : chr "2023-07-05T20:13:23.951Z"
 $ labels        : list()
 $ postsCount    : int 50
 $ viewer        :ViewerState(blockedBy=False, blocking=None, followedBy=None, following=None, muted=False, mutedByList=None, _type='app.bsky.actor.defs#viewerState')

The did, or decentralized identifier (DID), is super important. This particular DID method, did:plc, is specific to Bluesky and is referred to as a placeholder DID, as the Bluesky developers do not want it to stick around forever in its current form. They are actively hoping to replace it with, or evolve it into, something less centralized, likely a permissioned DID consortium. They will, however, support did:plc in its current form until after any successor is deployed, with a reasonable grace period. They’ll also provide a migration route to allow continued use of existing did:plc identifiers.

You can use this did:plc in place of your handle in BSKY_USER if you’d prefer.
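If you do switch between handles and DIDs, it can help to tell the two apart in code. The sketch below assumes a did:plc identifier is the prefix did:plc: followed by 24 lowercase base32 characters (a-z and 2-7), which matches the DID shown in the profile output; the function name is_did_plc() is hypothetical.

```r
# Assumption: a did:plc identifier is "did:plc:" plus 24 lowercase
# base32 characters (a-z, 2-7). This is a format check only; it does not
# confirm that the DID actually exists.
is_did_plc <- function(x) {
  grepl("^did:plc:[a-z2-7]{24}$", x)
}

is_did_plc("did:plc:hgyzg2hn6zxpqokmp5c2xrdo")  # TRUE
is_did_plc("hrbrmstr.dev")                      # FALSE: a handle, not a DID
```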

Calling client$login(…) creates and stores a session in your running environment, so you can call further client methods without reauthenticating each time. These are the top-level methods:

sort(names(client))
 [1] "bsky"             "com"              "delete_post"      "invoke_procedure"
 [5] "invoke_query"     "like"             "login"            "me"              
 [9] "repost"           "request"          "send_image"       "send_post"       
[13] "unlike"          
  • bsky provides access to lower-level methods in the bsky namespace which include:
    • actor
    • feed
    • graph
    • notification
    • unspecced
  • com does the same for the com namespace, which (for now) has one next-level domain, atproto, with the following methods:
    • admin
    • identity
    • label
    • moderation
    • repo
    • server
    • sync

The rest of the client methods, such as send_post, like, and repost, are pretty self-explanatory.

1.4 See Also