
Visiting #2 and doing some $WORK-work, but I was intrigued by Hanukkah of Data since Puzzle 0 was solvable with a ZIP password cracker (the calendar date math seemed too trivial to bother with).

Decided to fall back to R for this (vs. Observable for the Advent of Code, which I’ll dedicate time to finishing next week).

R has a {phonenumber} package, so we’ll cheat and use that despite it being very brutish in how it does the letterToNumber() conversion.
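
If you’ve not used it before, a quick taste of what that conversion does (letters map to their telephone keypad digits, and it takes whole strings):

library(phonenumber)

# N and O live on 6, A on 2, H on 4, so this should yield "6624"
letterToNumber("NOAH")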

No spoilers besides the code.

library(phonenumber)
library(stringi)    # stri_replace_all_regex() below comes from {stringi}
library(tidyverse)

cust <- read_csv("~/Downloads/noahs-csv/noahs-customers.csv")

cust |> 
  filter(!grepl("[01]", phone)) |> # only care abt letters
  mutate(
    last_name = stri_replace_all_regex(name, "(II|III|IV|Jr\\.)", ""), # get rid of suffix if any
  ) |> 
  separate( # get only the last name
    col = last_name,
    into = c("x1", "x2", "last_name"),
    sep = " ",
    fill = "left"
  ) |> 
  filter(
    nchar(last_name) == 10 # only complete last names
  ) |> 
  mutate(
    last_name = toupper(last_name),
    phone = gsub("-", "", phone) # we're going to compare so remove the '-'
  ) |> 
  select(last_name, phone) |> 
  mutate(
    trans = strsplit(last_name, "") |> 
      map_chr(~map(.x, letterToNumber) |> paste0(collapse="")) # feels like I cld optimize this (see below)
  ) |> 
  filter(trans == phone)
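
Re: that “cld optimize this” comment: since letterToNumber() accepts whole strings, the per-character strsplit()/map()/paste0() dance can probably collapse into a single map_chr() call. An untested sketch:

library(phonenumber)
library(purrr)

last_names <- c("SMITH", "JOHNSON") # stand-ins for the 10-character last names above

# one conversion per name instead of one per letter
map_chr(last_names, letterToNumber)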

Cross-post to Substack where I dropped some details on the newest browser in town: Arc. Intro:

It feels like it’s been forever since The Browser Company started teasing us about their new browser, Arc. I did the dance many of you almost certainly did and typed in my throwaway email address to try to get access to the beta when it came out. I noticed some tech rags starting to cover Arc in-depth this past week, so I checked my email (50/50 chance I’m reading email on any given day), and — sure enough — I had my download link as well.

I won’t be able to give a multi-thousand-word review today, especially since I did not get time to capture Netflow over a couple of hours to see how skeezy Arc may be, so consider this an Arc introduction vs. a full review. (I am also, sadly, out of invite codes, but drop me a message if you want one, as I’m trying to get more.)

This is a re-post from today’s newsletter. I generally avoid doing this but the content here is def more “bloggy” than “newslettery”.

You can now receive these blog posts in your activity stream. Just follow @hrbrmstr@rud.is and the new posts from here will slide right into your timeline.

So, you’ve committed to abandoning the bird site, joined a 🐘 instance, or three, and are now a citizen of the Fediverse. This is 👍🏽! But, what if you want to dig into this brave new universe a bit and see how it works? Or, perhaps you would like to engage with a specific set of other folks without committing to a particular BBS node?

Running a full-on Mastodon instance means dealing with PostgreSQL, Redis, Ruby (ugh), and NodeJS. Sure, Docker is an option, but this is still a big project, and it’s more than likely that you’re not a Ruby programmer (which makes it difficult to poke at the code to bend it to your will).

What if I told you there’s a way to run your own ActivityPub (et al.) server that:

  • is built with Golang (requires libsqlite3)
  • uses SQLite for persistence
  • compiles in seconds
  • sets up in minutes
  • takes almost no system resources
  • supports custom emojis
  • allows markdown in posts (including source code block syntax highlighting)
  • features location check-ins (like Foursquare back in the old days)
  • enables filtering and censorship (for abuse prevention)
  • sports a tiny but quite useful API
  • lets folks consume your activity stream as an RSS feed

If that sounds more to your liking, let me introduce you to Honk by Ted Unangst (@tedu@honk.tedunangst.com), and walk you through my Honk (@bob@honk.hrbrmstr.de) setup.

Prepare To Honk

You can either use Mercurial and:

$ hg clone https://humungus.tedunangst.com/r/honk

or grab a tarball and expand it (do either of these things on the box you will be running Honk). Then, just:

$ cd honk
$ make

and in a few seconds you’ll have a honk server binary ready to use.

Now, you’re also going to need a “TLS terminating reverse proxy”. We’ll be using Nginx since it is nigh ubiquitous and straightforward to set up. If you’ve never set up an HTTPS Nginx instance: Nginx drops in nicely almost everywhere, and this certbot/nginx ’splainer isn’t terrible. The rest of this drop assumes you have an Nginx instance ready to configure for honking.

I’m hosting my actual Honk instance on an overkill home data science server (you can use a Raspberry Pi to run Honk), which is exposed to one of my internet-facing Nginx reverse proxy servers via Tailscale. Using a setup like this means you can go super-cheap ($5/mo) on a VPS. You can also 100% just run Honk on the same internet-facing box as you do Nginx; just make sure to follow the specific guidance for that setup below, and mebbe spring for a slightly bigger server. FWIW I use SSD Nodes (full disclosure: that is an affiliate link).

Finally, you’ll need a FQDN configured that points to your reverse proxy. Mine is honk.hrbrmstr.de which has an A record pointing to my internet-facing VPS. Your ActivityPub handle will be @user@ThisFQDNyouChose, so pick one you can live with.

Make sure all three of those things are ready for the remaining steps.

Configure + Run Honk

This part is pretty straightforward. Run:

$ ./honk init

on the box you’re running Honk from. It’s going to ask you for four pieces of information:

  • username: the username you want. Again, this will be your @username@TheHonkFQDNyouChose identity, so pick one you can live with.
  • password: the password you want; pick a long passphrase from a password manager generator that you’ll keep in said password manager. Honk does not support MFA. Attackers will find you. You cannot hide. Just make it hard for them.
  • listenaddr: host + port Honk will listen on. If running Honk on the same system as Nginx make it something like 127.0.0.1:31337 so Honk itself is only accessible locally. I used my VPS’ Tailscale IP address.
  • servername: the FQDN you configured in the previous section.

If you mess up, just remove the SQLite databases Honk just made and start over.

Feel free to get all fancy with whatever system service runner you like, but I just run Honk from a custom application directory with:

$ nohup ./honk &

Leave out the nohup and & (or tail -f nohup.out) if you want to watch the log as you configure Nginx in the next section.
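
If you’d rather have a proper service manager keep Honk running, a bare-bones systemd unit along these lines should do it (the user, paths, and unit name are placeholders for wherever you built and initialized Honk):

[Unit]
Description=Honk ActivityPub server
After=network.target

[Service]
# run as an unprivileged user from the directory where you ran ./honk init
User=honk
WorkingDirectory=/opt/honk
ExecStart=/opt/honk/honk
Restart=on-failure

[Install]
WantedBy=multi-user.target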

Configure Nginx

Remember that this bit assumes you’ve installed Nginx and set it up with a certbot TLS certificate. See above for links to resources to help you do that.

Now you need to tell Nginx where to serve up your honk instance. Modify your base config to look something like this:

server {

  server_name honk.example.com;

  location / {
    proxy_pass http://127.0.0.1:31337;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme; 
  }

  listen 443 ssl; # managed by Certbot

  ssl_certificate /etc/letsencrypt/live/honk.example.com/fullchain.pem; # managed by Certbot
  ssl_certificate_key /etc/letsencrypt/live/honk.example.com/privkey.pem; # managed by Certbot
  include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

}

server {

  if ($host = honk.example.com) {
    return 301 https://$host$request_uri;
  } # managed by Certbot

  listen 80;

  server_name honk.example.com;

  return 404; # managed by Certbot

}

Replace the honk.example.com and 127.0.0.1:31337 (proxy_pass) values with your specific config.

Restart Nginx and go to honk.example.com. If you don’t see the Honk main page, check the Nginx logs and the Honk logs and give it a go again.

Go Honkin’ Crazy

The Honk docs are useful and quite fun reads.

  • The user manual should be required reading so, at the very least, you can grok the honking terminology.
  • The server manual shows you some options you can use when starting Honk, explains how to customize it (it supports some fun customizations), covers user management, and offers some additional care and feeding tips. Read this thoroughly.
  • The composition manual is a must-read since it shows off all the post composition features.
  • The filtering and censorship system manual will help you deal with any abuse.
  • The ActivityPub manual explains what Honk does and does not support in that protocol.
  • The API manual has all the information you need to use your Honk instance via some programming language or just curl.

Stuff To Try!

  • Follow folks on other servers! Hit me up at @hrbrmstr@mastodon.social or @bob@honk.hrbrmstr.de if you want to test following out (and get a reply).
  • Customize your site CSS! Make it yours. The manuals provide all the information you need to do this.
  • Add custom emoji and other components (again, the manuals are great).
  • Write an API wrapper package so you can use your instance programmatically (this is a good way to make an ActivityPub bot).
  • Look at the toys/ subdirectory. It has some fine example programs you can riff from (or just use):
    • autobonker.go – repeats mentioned posts
    • gettoken.go – obtains an authorization token
    • saytheday.go – posts a new honk that’s a date-based look-and-say sequence
    • sprayandpray.go – send an activity with no error checking and hope it works
    • youvegothonks.go – polls for new messages
  • Import your toots or Twitter archive.
  • Start a Honk instance for one of the communities you’re in. Honk really cannot support a large community, but small clubs can use Honk instead of dealing with a full-on Mastodon instance.
  • Poke around the SQLite databases Honk uses (see the sketch after this list).
  • Help someone else set up a Honk instance.
  • Customize the Honk codebase and show off your additions.
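
Re: poking at the SQLite databases, here’s a minimal sketch from R, assuming the main database file is the honk.db that init created in your Honk directory (check your directory listing for the actual filenames):

library(DBI)
library(RSQLite)

# open the database Honk created at init time (filename assumed; adjust to taste)
con <- dbConnect(SQLite(), "honk.db")

dbListTables(con)   # see what Honk keeps where

dbDisconnect(con)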

Get Familiar With The Protocols

WebFinger (mentioned yesterday) is the on-ramp to poking at things, and I prefer playing with instances I own vs. annoying folks trying to run “real” Mastodon servers. Honk lets you play without being a bad netizen.

$ webfinger acct:bob@honk.hrbrmstr.de

drops the following to the terminal:

{
  "aliases": [
    "https://honk.hrbrmstr.de/u/bob"
  ],
  "links": [
    {
      "href": "https://honk.hrbrmstr.de/u/bob",
      "rel": "self",
      "type": "application/activity+json"
    }
  ],
  "subject": "acct:bob@honk.hrbrmstr.de"
}

Visit the aliases in a private browser session (so no cookies/etc are used and you see what the world sees) or just curl it from the terminal.
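
The same lookup works from R if that’s more your speed; WebFinger is just a GET against a well-known path with a resource query parameter (a quick sketch using {httr} and {jsonlite}):

library(httr)
library(jsonlite)

# https://<server>/.well-known/webfinger?resource=acct:user@server
res <- GET(
  "https://honk.hrbrmstr.de/.well-known/webfinger",
  query = list(resource = "acct:bob@honk.hrbrmstr.de")
)

fromJSON(content(res, as = "text", encoding = "UTF-8")) |> str()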

Explore links:

$ curl --header "Accept: application/activity+json" https://honk.hrbrmstr.de/u/bob

drops the following to the terminal (see what happens w/o that custom Accept header, too):

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "followers": "https://honk.hrbrmstr.de/u/bob/followers",
  "following": "https://honk.hrbrmstr.de/u/bob/following",
  "icon": {
    "mediaType": "image/png",
    "type": "Image",
    "url": "https://honk.hrbrmstr.de/a?a=https%3A%2F%2Fhonk.hrbrmstr.de%2Fu%2Fbob&hex=1"
  },
  "id": "https://honk.hrbrmstr.de/u/bob",
  "inbox": "https://honk.hrbrmstr.de/u/bob/inbox",
  "name": "bob",
  "outbox": "https://honk.hrbrmstr.de/u/bob/outbox",
  "preferredUsername": "bob",
  "publicKey": {
    "id": "https://honk.hrbrmstr.de/u/bob#key",
    "owner": "https://honk.hrbrmstr.de/u/bob",
    "publicKeyPem": "CLIPPED B/C TOO BIG FOR SUBSTACK"
  },
  "summary": "BIO CLIPPED B/C TOO BIG FOR SUBSTACK",
  "tag": [
    {
      "href": "https://honk.hrbrmstr.de/o/rstats",
      "name": "#rstats",
      "type": "Hashtag"
    },
    {
      "href": "https://honk.hrbrmstr.de/o/blm",
      "name": "#blm",
      "type": "Hashtag"
    }
  ],
  "type": "Person",
  "url": "https://honk.hrbrmstr.de/u/bob"
}

Try all those endpoints and see what they drop (feel free to hit my server).
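
For example, here’s a sketch of pulling the outbox from R; the same Accept header trick applies to any of the endpoints above:

library(httr)
library(jsonlite)

# ActivityPub endpoints want the activity+json media type
outbox <- GET(
  "https://honk.hrbrmstr.de/u/bob/outbox",
  add_headers(Accept = "application/activity+json")
)

fromJSON(
  content(outbox, as = "text", encoding = "UTF-8"),
  simplifyVector = FALSE
) |> names()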

FIN

Honk is a great way to explore the various components of the Fediverse and I encourage folks to use it to get more familiar/comfortable with this tech. Drop comments if you run into any issues or have q’s (feel free to honk/toot q’s as well). ☮

This is more of a test post after enabling some new Fediverse features on the server.

Said Fediverse got a bit more real-ish this week (with moderate apologies to the pioneers in this space who’ve languished for ~five years).

You can find me at:

  • @hrbrmstr (general blathering/primary masto-account)
  • @hrbrmstr (reserved for cybersecurity chatter)
  • @hrbrmstr (the reason for this post; in theory, following that one will have all new blog posts show up in whatever Fediverse client/account you’re using)

My time on the bird site will be thin, and mainly reserved to snarkpost Musk, defend U.S. liberal democracy, and keep up with some journalists.

I’ll do a follow-up post if this works.

I’ve been (mostly) keeping up with annual updates for my R/{sf} U.S. foliage post, which you can find on GH. This year, we have Quarto, and it comes with so many batteries included that you’d think it was Christmas. One of those batteries is full support for the Observable runtime, which is used in {ojs} Quarto blocks, and rendered versions can run anywhere.

The Observable platform is great for both tinkering and publishing (we’re using it at work for some quick or experimental vis work), and with a few recent posts here showing how to turn Observable notebooks into Quarto documents, you’re literally two clicks or one command-line invocation away from using any public Observable notebook right in Quarto.

I made a version of the foliage vis in Observable and then did the qmd conversion using the Chrome extension, tweaked the source a bit and published the same in Quarto.

The interactive datavis uses some foundational Observable/D3 libraries.

In the JS code we set some datavis-centric values:

foliage_levels = [0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
foliage_colors = ["#83A87C", "#FCF6B4", "#FDCC5C", "#F68C3F", "#EF3C23", "#BD1F29", "#98371F"]
foliage_labels = ["No Change", "Minimal", "Patchy", "Partial", "Near Peak", "Peak", "Past Peak"]
week_label = ["Sept 5th", "Sept 12th", "Sept 19th", "Sept 26th", "Oct 3rd", "Oct 10th", "Oct 17th", "Oct 24th", "Oct 31st", "Nov 7th", "Nov 14th", "Nov 21st"]

We then borrow the U.S. Albers-projected topojson file from the Choropleth example notebook and rebuild the outline mesh and county geometry collections, since we need to get rid of Alaska and Hawaii (they’re not present in the source data). We do this by filtering out two FIPS codes:

counties = {
  var cty = topojson.feature(us, us.objects.counties);
  cty.features = cty.features.filter(
    (d) => (d.id.substr(0, 2) != "02") & (d.id.substr(0, 2) != "15")
  );
  return cty;
}

I also ended up modifying the source CSV a bit to account for missing counties.

After that, it was a straightforward call to our imported Choropleth function:

chart = Choropleth(rendered2022, {
  id: (d) => d.id.toString().padStart(5, "0"), // this is needed since the CSV id column is numeric
  value: (d) => d[week_label.indexOf(week) + 1], // this gets the foliage value based on which index the selected week is at
  scale: d3.scaleLinear, // this says to map foliage_levels to foliage_colors directly
  domain: foliage_levels,
  range: foliage_colors,
  title: (f, d) =>
    `${f.properties.name}, ${statemap.get(f.id.slice(0, 2)).properties.name}`, // this makes the county hover text the county + state names
  features: counties, // this is the counties we modified
  borders: statemesh, // this is the statemesh
  width: 975,
  height: 610
})

and placing the legend and scrubbing slider.

The only real difference between the notebook and the qmd is the inclusion of the source functions rather than using Observable’s import (I’ve found that there’s a slight load delay for imports when network conditions aren’t super perfect, and the inclusion of the source — WITH copyrights — makes up for that).

I’ve set up the Quarto project so that renders go to the docs/ directory, which makes it easy to publish as a GH page.
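
For the curious, the project bit that makes that happen in _quarto.yml is just the output-dir setting; something along these lines (the actual file in the repo may carry more options):

project:
  output-dir: docs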

FIN

Drop issues on GH if anything needs clarifying or fixing and go experiment! You can’t break anything either on Observable or locally that version control can’t fix (yes, Observable has version control!).

Some things to consider modifying/adding:

  • have a click take you to a (selectable?) mapping service, so folks can get driving directions
  • turn the hover text into a proper tooltip
  • speed up or slow down the animation when ‘Play’ is tapped
  • use different colors
  • bring in older datasets (see the foliage GH repo) and make multiple maps or let the user select them or have them compare across years

My previous post announced a Rust-based command line tool for generating Quarto projects from Observable Notebooks.

Some folks may not want to use yet-another command line tool, and it dawned on me that it’d be more convenient to just do the conversion in-browser when one is already on a Notebook page. So, I dusted off some very creaky Chrome extension skills and put together an extension for doing just that.

It’s pretty straightforward:

  • navigate to a Notebook you want to serialize to a Quarto project
  • press the button
  • profit!

You can download individual resources by hand or just use the zip file that automagically downloads.

[Screen capture: an Observable notebook, showing where to press the Quartize button]

[Screen capture: the aftermath of the Quartize process, with annotations]

[Screen capture: VS Code showing the downloaded Quarto project in source and rendered form]

The previous post had some hacky R code to grab seekrit JSON data in ObservableHQ (OHQ) Notebooks and spit out a directory with a Quarto qmd and any associated FileAttachments. Holding firm to my “no more generic public R packages” decree, that’s as far as the R code for that utility is going to get.

Quarto, by design, is not married to the R ecosystem. This should help it get more traction, both inside and outside the broader data science crowd, than R Markdown was able to attain. One big element that makes Quarto enticing to me is that it ships with the Observable runtime and stdlib. Observable JavaScript tools make coding up reactive data visualizations and analyses fun, but I still really dislike using a browser for data science work. I’d much rather use Quarto’s {ojs} sections in a proper editor/IDE when iterating over a concept.

I also learn best by example-first -> experiment-second. Up until now, I’ve been doing the example-ing and experiment-ing over at OHQ. When I discovered that the OHQ notebook code and metadata ship with the HTML page of the notebook (in a JSON <script> block at the end of the document), I just had to build a tool to yank that out and turn it into a Quarto project.
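
The gist of that yanking, in R (just a sketch; the full conversion also downloads the FileAttachments and assembles the qmd):

library(rvest)
library(jsonlite)

pg <- read_html("https://observablehq.com/@hrbrmstr/just-one-more-thing")

# the notebook cells and metadata ride along in a JSON <script> block at the end of the page
nb <- pg |>
  html_element("script#__NEXT_DATA__") |>
  html_text() |>
  fromJSON()

str(nb$props$pageProps$initialNotebook, max.level = 1)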

The R code in the previous post is fine for R folks (someone should 100% take that, make it nicer, and turn it into a small package with an RStudio addin and CLI wrapper; no credit back to me is required), but — as noted above — Quarto is not just for R folks. Rather than confine a conversion utility to some scripting language, I decided to port the R experiment over to Rust. That is how ohq2quarto was born.

You can grab Windows & macOS binaries from the Releases, or just:

cargo install --git https://github.com/hrbrmstr/ohq2quarto

to install it if you’re on Linux or already have a Rust environment set up. I’m working on the configuration of a GitHub Action that will make shipping binaries for all platforms stupid simple and automated.

When run in verbose mode, you’ll see something like this when converting a notebook:

$ ohq2quarto --ohq-ref @hrbrmstr/just-one-more-thing --output-dir ./examples --verbose 
      Title: Just One More Thing
       Slug: just-one-more-thing
  Author(s): boB Rudis
  Copyright: Copyright 2022 boB Rudis
    License: "mit"
 Observable: https://observablehq.com/@hrbrmstr/just-one-more-thing

A look at examples shows:

$ tree examples
├── _quarto.yml
├── columbo_data.csv
└── just-one-more-thing.qmd

The utility made the directory, created the qmd and downloaded the FileAttachment.

This is what the first few lines of the qmd look like:

$ head -16 examples/just-one-more-thing.qmd
---
title: 'Just One More Thing'
author: 'boB Rudis'
format: html
echo: false
observable: 'https://observablehq.com/@hrbrmstr/just-one-more-thing'
---

```{ojs}
md`# Just One More Thing`
```

```{ojs}
md`This week, Chris Holmes tweeted something super dangerous:`
```

FIN

I’ve tried it on some seriously complex OHQ notebooks and it seems to do what I’ve claimed it does on the tin’s label. If you run into issues, or have some feature requests, please drop an issue in the repo.

Quarto is amazing! And, it’s eating the world! OK. Perhaps not the entire world. But it’s still amazing!

If you browse around the HQ, you’ll find many interesting notebooks. You may even have a few yourself! Wouldn’t it be great if you could just import an Observable notebook right into Quarto? Well, now you can.

#' Transform an Observable Notebook into a Quarto project
#' 
#' This will yank the cells from a live Observable notebook and turn it into a Quarto project,
#' downloading all the `FileAttachments` as well.
#' 
#' @param ohq_ref either a short ref (e.g. `@@hrbrmstr/just-one-more-thing`) or a full
#'     URL to a published Observable notebook
#' @param output_dir quarto project directory (will be created if not already present)
#' @param quarto_filename if `NULL` (the default) the name will be the slug (e.g. `just-one-more-thing`
#'     as in the `ohq_ref` param example) with a `.qmd` suffix
#' @param echo set `echo` to `true` or `false` in the YAML
ohq_to_quarto <- function(ohq_ref, output_dir, quarto_filename = NULL, echo = FALSE) {

  ohq_ref <- ohq_ref[1]
  if (grepl("^@", ohq_ref)) ohq_ref <- sprintf("https://observablehq.com/%s", ohq_ref)

  output_dir <- output_dir[1]
  if (!dir.exists(output_dir)) dir.create(output_dir)

  quarto_filename <- quarto_filename[1]

  pg <- rvest::read_html(ohq_ref)

  pg |> 
    rvest::html_nodes("script#__NEXT_DATA__") |> 
    rvest::html_text() |> 
    jsonlite::fromJSON() -> x

  meta <- x$props$pageProps$initialNotebook
  nodes <- x$props$pageProps$initialNotebook$nodes

  if (is.null(quarto_filename)) quarto_filename <- sprintf("%s.qmd", meta$slug)

  c(
    "---", 
    sprintf("title: '%s'", meta$title), 
    "format: html", 
    if (echo) "echo: true" else "echo: false",
    "---",
    "",
    purrr::map2(nodes$value, nodes$mode, ~{
      c(

        "```{ojs}",
        dplyr::case_when(
          .y == "md" ~ sprintf("md`%s`", .x),
          .y == "html" ~ sprintf("html`%s`", .x),
          TRUE ~ .x
        ),
        "```",
        ""
      )

    })
  ) |> 
    purrr::flatten_chr() |> 
    cat(
      file = file.path(output_dir, quarto_filename), 
      sep = "\n"
    )

  if (length(meta$files)) {
    if (nrow(meta$files) > 0) {
      purrr::walk2(
        meta$files$download_url,
        meta$files$name, ~{
          download.file(
            url = .x,
            destfile = file.path(output_dir, .y),
            quiet = TRUE
          )
        }
      )
    }
  }

}

You can try that out with my Columbo notebook:

ohq_to_quarto(
  ohq_ref = "@hrbrmstr/just-one-more-thing", 
  output_dir = "~/Development/columbo",
  quarto_filename = "columbo.qmd",
  echo = FALSE
)

That will download the CSV file into the specified directory and convert the cells to a .qmd. You can download that example file, but you’ll need the data to run it (or just run the converter).
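
If you want to render the result straight from R, the {quarto} package wraps the CLI (running quarto render columbo.qmd from a terminal in that directory does the same thing):

quarto::quarto_render("~/Development/columbo/columbo.qmd")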

This is what the directory tree looks like after the script is run and the document is rendered:

columbo/
├── columbo.html
├── columbo.qmd
├── columbo_data.csv
└── columbo_files
    └── libs
        ├── bootstrap
        │   ├── bootstrap-icons.css
        │   ├── bootstrap-icons.woff
        │   ├── bootstrap.min.css
        │   └── bootstrap.min.js
        ├── clipboard
        │   └── clipboard.min.js
        ├── quarto-html
        │   ├── anchor.min.js
        │   ├── popper.min.js
        │   ├── quarto-syntax-highlighting.css
        │   ├── quarto.js
        │   ├── tippy.css
        │   └── tippy.umd.min.js
        └── quarto-ojs
            ├── quarto-ojs-runtime.js
            └── quarto-ojs.css

The function has not been battle tested, and it’s limited to the current functionality, but it should do what it says on the tin.

I’ll turn this into a Rust binary so it’s more usable outside of the R ecosystem.

You can try out a fledgling Rust version here.