Keeping Track Of URLs Shared On Bluesky

While the future of Bluesky is nowhere near certain, it is most certainly growing. It’s also the largest community of users for the AT Protocol.

Folks are using Bluesky much the same way as any online forum/chat. One of those ways is to share URLs to content.

For the moment, it is possible to eavesdrop on the Bluesky “firehose” sans authentication. I’ve been curious as to what folks are sharing on the platform and decided to do more than poke at it casually in my hacky terminal firehose viewer.

This GitLab project contains all the code necessary to log URLs seen in the firehose to a local SQLite database. As Bluesky grows, this will definitely not scale, but it’s fine for right now, and scaling just means moving the websocket capture client to a more capable environment than my home server and setting up something like a Kafka stream. Might as well move to Postgres while we’re at it.

But, for now, this lightweight script/database is fine.

NOTE: I’m deliberately not tracking any other data, but the code is easy to modify to log whatever you want from the firehose post.

I’m syncing the data to this server every ~30 minutes or so and have created an Observable notebook which keeps track of the most popular domains.

I don’t know what is (Perplexity had some ideas), but it appears to be some AI-driven “card” game that has AT protocol and ActivityPub integration. Due to the programmatic nature of the posts with URLs containing that domain, I suspect it’ll be in the lead for quite some time.

There are some neat sites in the long tail of the distribution.

I think I’ll set up one to monitor post with CVE’s, soon, too.

Cover image from Data-Driven Security
Amazon Author Page

3 Comments Keeping Track Of URLs Shared On Bluesky

  1. Pingback: Keeping Track Of URLs Shared On Bluesky - Ciberdefensa

  2. Pingback: Keeping Track Of URLs Shared On Bluesky - Source: - CISO2CISO.COM & CYBER SECURITY GROUP

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.