Author Archives: hrbrmstr

Don't look at me…I do what he does — just slower. #rstats avuncular • ?Resistance Fighter • Cook • Christian • [Master] Chef des Données de Sécurité @ @rapid7

I have graphics working in Vanilla JS WebR, now, and I’ll cover the path to that in two parts.

The intent was to jump straight into ggplot2-land, but, as you saw in my previous post, WASM’d ggplot2 is a bear. And, I really didn’t grok what the WebR site docs were saying about how to deal with the special WebR canvas() device until I actually tried to work with it and failed miserably.

You will need to have gotten caught up on the previous WebR blog posts and experiments as I’m just covering some of the gnarly bits.

Not Your Parents’ evalR…()

If you’ve either been playing a bit with WebR or peeked under the covers of what others are doing, you’ve seen the evalR…() family of functions which evaluate supplied R code and optionally return a result. Despite reading the WebR docs on “canvas”, daft me tried to simply use one of those “eval” functions, to no avail.

The solution involves:

I’m going to block quote a key passage about captureR since it explains the “why” pretty well. Hit me up anywhere you like if you desire more info.

Unlike evalR() which only returns one R object, captureR() returns a variable number of objects when R conditions are captured. Since this makes memory management of individual objects unwieldy, captureR() requires the shelter approach to memory management, where all the sheltered objects are destroyed at once.

Let’s work through the “plottR” function I made to avoid repeating code to just get images out of R. It takes, as input:

  • an initialized WebR context
  • code that will produce something on a graphics device
  • dimensions of the image
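  • the HTML <canvas> id to shove the image data to (we’ll explain this after the code block)

async function plottR(webR, plot_code = "plot(mtcars, col='blue')",
                      width = 400, height = 400,
                      id = "base-canvas") {

  // a throwaway shelter for whatever captureR() hands back
  const webRCodeShelter = await new webR.Shelter();

  // open WebR's canvas() graphics device
  await webR.evalRVoid(`canvas(width=${width}, height=${height})`);

  const result = await webRCodeShelter.captureR(`${plot_code}`, {
    withAutoprint: true,
    captureStreams: true,
    captureConditions: false,
    env: webR.objs.emptyEnv,
  });

  // close the device so the drawing instructions get emitted
  await webR.evalRVoid("dev.off()");

  // collect the queued output messages (drawing instructions included)
  const msgs = await webR.flush();

  // the sheltered R objects aren't needed any more, so free them all at once
  await webRCodeShelter.purge();

  const canvas = document.getElementById(id);
  canvas.setAttribute("width", 2 * width);
  canvas.setAttribute("height", 2 * height);

  // replay each canvasExec instruction against the canvas' 2D context
  msgs.forEach(msg => {
    if (msg.type === "canvasExec") Function(`this.getContext("2d").${msg.data}`).bind(canvas)()
  });

}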

You 100% need to read up on the HTML canvas element if you’re going to wield WebR yourself vs use Quarto, Shiny, Jupyter-lite, or anything else clever folks come up with. The output of your plots is going to be a series of HTML canvas instructions to do things like “move here”, “switch to this color”, “draw an ellipse”, etc. I will be linking to a full example of the canvas instructions output towards the end.

Now, let’s work through the function’s innards.

const webRCodeShelter = await new webR.Shelter();

gives us a temporary place to execute R code, knowing all the memory consumed will go away as soon as we’re done with it. Unlike the baked-in “global” shelter, this one is super ephemeral.

await webR.evalRVoid(`canvas(width=${width}, height=${height})`);

This is just like a call to png(…), svglite(…), pdf(…), etc. Check out coolbutuseless’ repo for tons of great examples of alternate graphics devices. I have a half-finished one for omnigraffle. They aren’t “hard” to write, but I think they are very tedious to crank through.

const result = await webRCodeShelter.captureR(`${plot_code}`, {
  withAutoprint: true,
  captureStreams: true,
  captureConditions: false,
  env: webR.objs.emptyEnv,
});

is different from what you’re used to. The captureR function will evaluate the given code, and takes some more options, described in the docs. TL;DR: we’re asking the evaluator to give us back pretty much what we’d see in the R console: all console messages and output streams, plus it does the familiar “R object autoprint” that you get for free when you fire up an R console.

So, we’ve sent our plot code into the abyss, and — since this is 100% like “normal” graphics devices — we also need to do the dev.off dance:

await webR.evalRVoid("dev.off()");

This will cause the rendering to happen.

Right now, where you can’t see it, is the digital manifestation of your wonderful plot. That’s great, but we’d like to see it!

const msgs = await webR.flush();

will tell it to get on with it and make sure everything that needs to be done is done. If you’re not familiar with async/await yet, you really need to dig into that to survive in DIY WebR land.

const canvas = document.getElementById(id)
canvas.setAttribute("width", 2 * width);   // i still need to read "why 2x"
canvas.setAttribute("height", 2 * height);

msgs.forEach(msg => {
  if (msg.type === "canvasExec") Function(`this.getContext("2d").${msg.data}`).bind(canvas)()
});

finds our HTML canvas element and then throws messages at it; a lot of messages. To see the generated code for the 3D perspective plot example, head to this gist where I’ve pasted all ~10K instructions.
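To make that dispatch step a bit less magical, here is a minimal sketch that replays canvasExec-style messages against a fake canvas that only records what it was asked to do. The makeFakeCanvas() helper and the sample messages are made up for illustration (real WebR output is far longer); only the Function(...).bind(canvas)() trick mirrors what plottR() does.

```javascript
// Hypothetical stand-in for an HTML <canvas>: it records calls instead of drawing.
function makeFakeCanvas() {
  const calls = [];
  const ctx = {
    fillStyle: null,
    beginPath: () => calls.push("beginPath"),
    moveTo: (x, y) => calls.push(`moveTo ${x} ${y}`),
    lineTo: (x, y) => calls.push(`lineTo ${x} ${y}`),
    stroke: () => calls.push("stroke"),
  };
  return { calls, ctx, getContext: () => ctx };
}

// Same dispatch trick as plottR(): each message's `data` is a snippet of
// JavaScript that gets evaluated against the canvas' 2D context.
function replayCanvasMessages(canvas, msgs) {
  let executed = 0;
  for (const msg of msgs) {
    if (msg.type !== "canvasExec") continue; // skip console output, prompts, etc.
    Function(`this.getContext("2d").${msg.data}`).bind(canvas)();
    executed += 1;
  }
  return executed;
}

// Made-up messages shaped like the ones webR.flush() hands back.
const sampleMsgs = [
  { type: "stdout", data: "not a drawing instruction" },
  { type: "canvasExec", data: "fillStyle = 'rgb(0, 0, 255)'" },
  { type: "canvasExec", data: "beginPath()" },
  { type: "canvasExec", data: "moveTo(0, 0)" },
  { type: "canvasExec", data: "lineTo(10, 20)" },
  { type: "canvasExec", data: "stroke()" },
];
```

Since every message is executed as code, this pattern only makes sense for messages that come straight from your own WebR session.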

To make said persp plot, it’s just a simple call, now:

await plottR(webR, `basetheme("dark"); persp(x, y, z, theta=-45)`)

I used the default id for the canvas in the online example.

“How Did You Use The basetheme Package? It’s Not In The WASM R Repo?”

I yanked the four R source code files from the package and just source‘d them into the WebR environment:

const baseThemePackage = [ "basetheme.R", "coltools.R", "themes.R", "utils.R" ];

// load up the source from the basetheme pkg
for (const rSource of baseThemePackage) {
  console.log(`Sourcing: ${rSource}…`)
  await globalThis.webR.evalRVoid(`source("https://rud.is/w/ggwebr/r/${rSource}")`)
}

10K+ Lines Is A Lot Of Canvas Code…

Yep! But, that’s how the HTML canvas element works and it’s shockingly fast, as you’ve seen via the demo.

FIN

We’ll cover a bit more in part 2 when we see how to get ggplot2 working, which will include a WebR version of {hrbrthemes}! I also want to thank James Balamuta for the Quarto WebR project which helped me out quite a bit in figuring this new tech out.

Before I let you go, I wanted to note that in those “messages” (the ones we pulled the canvasExec calls out of), there are message types that are not canvasExec (hence our need to filter them).

I thought you might want to know what they are, so I extracted the JSON, and ran it through some {dplyr}:

msgs |> 
  filter(
    type != "canvasExec"
  ) |> 
  pull(data) |> 
  writeLines()

R version 4.1.3 (2022-03-10) -- "One Push-Up"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: wasm32-unknown-emscripten (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
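If you would rather poke at those leftover messages without leaving JavaScript, something like the sketch below should do. Both function names are hypothetical, and the only assumption is that each message is an object with type and data fields, as in plottR().

```javascript
// Count how many messages of each type came back from a flush.
function tallyMessageTypes(msgs) {
  const counts = {};
  for (const { type } of msgs) {
    counts[type] = (counts[type] ?? 0) + 1;
  }
  return counts;
}

// Rough JS equivalent of the filter() |> pull() |> writeLines() pipeline:
// keep everything that is not a drawing instruction and join the text.
function nonCanvasText(msgs) {
  return msgs
    .filter((msg) => msg.type !== "canvasExec")
    .map((msg) => msg.data)
    .join("\n");
}
```

Calling console.log(nonCanvasText(msgs)) right after the flush in plottR() would print the same kind of banner shown above.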

I have a post coming on using base and {ggplot2} plots in VanillaJS WebR, but after posting some bits on social media regarding how slow {ggplot2} is to deal with, I had some “performance”-related inquiries, which led me down a rabbit hole that I’m, now, dragging y’all down into as well.

First, a preview of the aforementioned plot/graphics:

I encourage you to load both of them before continuing to see why I was curious about package load times.

Getting A Package Into WebR: A Look At {ggplot2}

If we strip away all the cruft, this is the core way to install a package into WebR and make it available to a freshly minted WebR context:

import { WebR } from '/webr/webr.mjs';
globalThis.webR = new WebR({ WEBR_URL: "/webr/", SW_URL: "/w/bench/",});
await globalThis.webR.init();
await globalThis.webR.installPackages(['PACKAGE'])
await globalThis.webR.evalRVoid('library(PACKAGE)')

Let’s look at what happens in the browser during the call to installPackages() when PACKAGE is ggplot2:

Screen capture of DevTools showing ggplot2 dependent packages loading.

Dependent libraries are sequentially loaded until we finally get to ggplot2 (forgoing the {} from now on). There are 28 packages for ggplot2 (including itself) and they have a really skewed package size distribution:

Min.   :   6K
1st Qu.: 108K
Median : 481K
Mean   : 950K
3rd Qu.: 1.2M
Max.   : 5.4M

The good thing is, though, that the browser will cache them (for some period of time) so they aren’t re-downloaded every time you need them. Because of this, we’re going to leave download time out of consideration since they’re all, as we’ll see below, yanked from cache in single-digit milliseconds.

When you call library(PACKAGE) R code gets executed, and that takes time. On modern desktops with local R installs, you almost never notice the time passage for this. This is not the case for WebR:

Screen capture of the ggplot2 package loading part of a Developer Tools waterfall chart.

The Matrix, mgcv, and farver packages grind things to a halt. You felt that if you hit up the example at the beginning of the post. Brutal. Painful. Terrible.

This got me curious about all the other packages that are available to WebR (93 as of the date on this post).

Approaching R Package Load/library Benchmarking In A Browser

Much like the skewed package file size distribution of presently available R WASM packages, the per-package dependency distribution is also pretty skewed:

Min.   :  1
1st Qu.:  1
Median :  1
Mean   :  2
3rd Qu.:  2
Max.   : 15

This is good! It means you’re mostly safe to have fun with WebR and do not have to focus on working around an initial slowdown. Still, this did not deter me from a time sink.

I had to figure out a way to test the install/library of each WASM R package independently, in a fresh WebR context.

One obvious way is to make 93 HTML files and load them all by hand.

O_O

There had to be a better way, and I immediately turned to “iframes” as a solution.

While I could have scripted the creation of proper HTML for 93 iframes to be put into a page, that’s not a great idea for a number of reasons:

  • that’ll crash every modern browser: far too many child iframes, all with their own DOM contexts sounds horrible
  • 93 “simultaneous” WebR initializations would consume all browser resources and DoS the tab
  • the “simultaneous” loading would skew timing results, even when the package files are cached

The solution was to use dynamically created iframes. One potential “gotcha” for this could have been the modern browser security model. Thanks to some dangerous hardware-level weaknesses that were discovered and exploited a few years back, Chrome and other browsers shored up the safety contracts between iframes and parent pages. Not doing so could have allowed attackers to have some fun at your expense.

If you’ve been following along the past week or so, to get the best performance with WebR, you need to make sure certain HTTP headers are in place so the browser can trust what you’re doing enough to relax some restrictions. Dynamically created iframes have no “headers”, per-se, but the clever folks who make browser bits for a living came up with a way to handle this. We just need to mark the frame as credentialless and we’ll get good performance (please read the link to get more context).

So, we can run a slightly expanded version of the (way) above JavaScript code to get timer stats, but how do we collect them?

Well, the parent of the iframe can talk to the iframe and vice-versa via postMessage(), so all we need to do is have the iframe send data back to the parent when it is done. This is also a signal we can kill the child iframe, freeing up resources, and then move on to the next one.
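Here is a rough sketch of that one-at-a-time loop. It is not the code from the benchmark page: bench.html is a hypothetical page that benchmarks the package named in its query string and postMessage()s its timings to the parent, and both function names are made up.

```javascript
// Browser-only: load one benchmark page in a throwaway iframe and resolve
// with whatever it postMessage()s back to us.
function benchInIframe(pkg) {
  return new Promise((resolve) => {
    const frame = document.createElement("iframe");
    frame.setAttribute("credentialless", ""); // see the note on credentialless iframes
    const onMessage = (event) => {
      if (event.source !== frame.contentWindow) return; // not from our child
      window.removeEventListener("message", onMessage);
      frame.remove(); // free the child's resources before moving on
      resolve(event.data);
    };
    window.addEventListener("message", onMessage);
    frame.src = `bench.html?pkg=${encodeURIComponent(pkg)}`; // hypothetical page
    document.body.appendChild(frame);
  });
}

// Run the benchmarks strictly one after another so they can't skew each other.
// `benchOne` is swappable, which also makes the loop easy to try outside a browser.
async function runSequentially(pkgs, benchOne = benchInIframe) {
  const results = [];
  for (const pkg of pkgs) {
    results.push(await benchOne(pkg)); // wait for this one before starting the next
  }
  return results;
}
```

The await inside the for loop is the whole trick: the next iframe does not get created until the previous one has reported back and been removed.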

An Unexpected Twist

It turns out that some WASM-ified R packages are busted. Specifically:

  • fs
  • Hmisc
  • latticeExtra
  • pkgload

Some functions in each of them are needed by one or more other packages, but — as you’ll see if you run the benchmark site — they fail to library() after installation.

This was a “gotcha” I just had to wrap a try/catch block around, and also pass back information about.
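The try/catch idea can be sketched like so. This is a simplified take rather than the benchmark page’s actual code: timeStep() and benchPackage() are hypothetical names, and it assumes a failed library() call surfaces as a rejected promise from evalRVoid().

```javascript
// Time one async step; report a failure instead of throwing it.
async function timeStep(step) {
  const start = performance.now();
  try {
    await step();
    return { ok: true, ms: performance.now() - start, error: null };
  } catch (err) {
    return { ok: false, ms: performance.now() - start, error: String(err) };
  }
}

// `webR` is an initialized WebR context; installPackages() and evalRVoid()
// are the same calls used in the core snippet earlier in the post.
async function benchPackage(webR, pkg) {
  const install = await timeStep(() => webR.installPackages([pkg]));
  const load = await timeStep(() => webR.evalRVoid(`library(${pkg})`));
  return { pkg, install, load };
}
```

A busted package then shows up as load.ok === false with the error text attached, and the rest of the run carries on.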

Putting It All Together

You can run your own benchmarks at this playground page. View-source on the page to see the code (there’s just index.html and style.css). You can also see it at the WebR Experiments repo.

When the page loads, it fetches the last produced copy of https://rud.is/data/webr-packages.json. This is a JSON file I’m generating every night that contains all the packages available in “WASM notCRAN”. It just steals PACKAGES.rds every day and serializes it to JSON. Feel free to use it (if you get a CORS error lemme know; you shouldn’t but it’s an odd year).

Controls and sample output for the benchmark site.

The first thing your eyes will likely be drawn to is: “✅ Context is cross-origin isolated!”. When I was debugging early on WebR performance issues, George (the Godfather of WebR) noted that we needed certain headers to get those aforementioned safety restrictions loosened up a bit. You can test the global crossOriginIsolated variable to see if you’ve set up the headers correctly and read more about it when you have time. While it’s not needed on that page, I left it in so I could write this paragraph.

You’ll also see a “download results?” checkbox that is by default un-checked. If checked, you’ll get a JSON file with all the results in the table that is dynamically constructed.

After you tap “Begin Benchmark”, you can go get a matcha and come back.

You’ll see the results in a table and a surprise Observable Plot histogram (the post’s featured image).

I disable the controls after the run since you really should close the tab and start a fresh one (not just a reload) to get a clean context.

If you use the site and download the JSON, you can hit up this Observable notebook and put the JSON in a fork of it. I would also not mind it if you could post your JSON to the WebR Experiments repo as an issue and include the browser and system config you were using at the time.

FIN

This was a fun distraction, and shows you can use most of the presently available WebR packages without concern.

Make sure to check back for those WebR graphics posts!

Let’s walk through how to set up a ~minimal HTML/JS/CSS + WebR-powered “app” on a server you own. This will be vanilla JS (i.e. no React/Vue/npm/bundler) you can hack on at-will.

TL;DR: You can find the source to the app and track changes to it over on GitHub if you want to jump right in.

In the docs/ directory in the GH repo you’ll see an example of using this in GH Pages. Here it is live: https://hrbrmstr.github.io/webr-app/index.html. Info on what you need to do for that is below.

If all went well, you should see the output of a call to WebR right here (it may take a few seconds):

Getting Your Server Set Up

I’ll try to keep updating this with newer WebR releases. Current version is 0.1.0 and you can grab that from: https://github.com/r-wasm/webr/releases/download/v0.1.0/webr-0.1.0.tar.gz.

System-Wide WebR

You should read this section in the official WebR documentation before continuing.

I’m using a server-wide /webr directory on my rud.is domain so I can use it on any page I serve.

WebR performance will suffer if it can’t use SharedArrayBuffers. So, I have these headers enabled on my /webr directory:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp

I use nginx, so that looks like:

location ^~ /webr {
  add_header "Cross-Origin-Opener-Policy" "same-origin";
  add_header "Cross-Origin-Embedder-Policy" "require-corp";
}

YMMV.

For good measure (and in case I move things around), I stick those headers on any app dir of mine that will use WebR. I don’t use them server-wide, though.
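If you want to double-check that a directory is really sending both headers, a small helper along these lines can help. The function name is made up, and it only checks for the exact two values shown above.

```javascript
// `headers` is anything with a .get(name) method that accepts lowercase
// header names, e.g. the `headers` property of a fetch() Response.
function missingIsolationHeaders(headers) {
  const wanted = {
    "cross-origin-opener-policy": "same-origin",
    "cross-origin-embedder-policy": "require-corp",
  };
  return Object.entries(wanted)
    .filter(([name, value]) => headers.get(name) !== value)
    .map(([name]) => name);
}
```

In a browser console, something like missingIsolationHeaders((await fetch("/webr/webr.mjs", { method: "HEAD" })).headers) should come back as an empty array when the server is configured as described.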

And They Call It a MIME. A MIME!

WebR is a JavaScript module, and you need to make sure that files with an mjs extension have a MIME type of text/javascript, or some browsers won’t be happy.

A typical way for webservers to know how to communicate this is via a mime.types file. That is not true for all webservers, and I’ll add steps for ones that use a different way to configure this. The entry should look like this:

text/javascript  mjs;

Testing The WebR Set Up

You should be able to hit that path on your webserver in your browser and see the WebR console app. If you do, you can continue. If not, leave an issue and I can try to help you debug it, but that’s on a best-effort basis for me.

Installing The App

We’ll dig into the app in a bit, but you probably want to see it working, so let’s install this ~minimal app.

My personal demo app is anchored off of /webr-app on my rud.is web server. Here’s how to replicate it:

# Go someplace safe
$ cd $TMPDIR

# Get the app bundle
# You can also use the GH release version, just delete the README after installing it.
$ curl -o webr-app.tgz https://rud.is/dl/webr-app.tgz

# Expand it
$ tar -xvzf webr-app.tgz
x ./webr-app/
x ./webr-app/modules/
x ./webr-app/modules/webr-app.js
x ./webr-app/modules/webr-helpers.js
x ./webr-app/css/
x ./webr-app/css/simple.min.css
x ./webr-app/css/app.css
x ./webr-app/main.js
x ./webr-app/index.html

# 🚨 GO THROUGH EACH FILE
# 🚨 to make sure I'm not pwning you!
# 🚨 Don't trust anything or anyone.

# Go to the webserver root
$ cd $PATH_TO_WEBSERVER_DOC_ROOT_PATH

# Move the directory
$ mv $TMPDIR/webr-app .

# Delete the tarball (optional)
$ rm $TMPDIR/webr-app.tgz

Hit up that path on your web server and you should see what you saw on mine.

WebR-Powered App Structure

.
├── css                  # CSS (obvsly)
│   ├── app.css          # app-specific ones
│   └── simple.min.css   # more on this in a bit
├── index.html           # The main app page
├── main.js              # The main app JS
└── modules              # We use ES6 JS modules
    ├── webr-app.js      # Main app module
    └── webr-helpers.js  # Some WebR JS Helpers I wrote

Simple CSS

If you sub to my newsletter, you know I play with tons of tools and frameworks. Please use what you prefer. For folks who don’t normally do this type of stuff, I included a copy of Simple CSS b/c, well, it is simple to use. Please use this resource to get familiar with it if you do continue to use it.

JavaScript Modules

When I’m in “hack” mode (like I was for the first few days after WebR’s launch), I revert to old, bad habits. We will not replicate those here.

We’re using JavaScript Modules as the project structure. We aren’t “bundling” (slurping up all app support files into a single, minified file) since not every R person is a JS tooling expert. A bundler also really isn’t needed here, and I like to keep things simple and as dependency-free as possible.

In index.html you’ll see this line:

<script type="module" src="./main.js"></script> 

This tells the browser to load that JS file as if it were a module. As you read (you did read the MDN link, above, right?), modules give us locally-scoped names/objects/features and protection from clobbering imported names.

Our main module contains all the crunchy-goodness core functionality of our app, which does nothing more than:

  • Loads WebR
  • Tells you how fast it loaded + instantiated
  • Yanks mtcars from the instantiated R session (mtcars was the third “thing” I typed into R, ever, so my brain defaults to it).
  • Makes an HTML table from it using D3.

It’s small enough to include here:

import { format } from "https://cdn.skypack.dev/d3-format@3";
import * as HelpR from './modules/webr-helpers.js'; // WebR-specific helpers
// import * as App from './modules/webr-app.js'; // our app's functions, if it had some

console.time('Execution Time'); // keeps on tickin'
const timerStart = performance.now();

import { WebR } from '/webr/webr.mjs'; // service workers == full path starting with /

globalThis.webR = new WebR({
    WEBR_URL: "/webr/", // our system-wide WebR
    SW_URL: "/webr/"    // what ^^ said
}); 
await globalThis.webR.init(); 

// WebR is ready to use. So, brag about it!

const timerEnd = performance.now();
console.timeEnd('Execution Time');

document.getElementById('loading').innerText = `WebR Loaded! (${format(",.2r")((timerEnd - timerStart) / 1000)} seconds)`;

const mtcars = await HelpR.getDataFrame(globalThis.webR, "mtcars");
console.table(mtcars);
HelpR.simpleDataFrameTable("#tbl", mtcars);

globalThis is a special JS object that lets you shove stuff into the global JS environment. Not 100% needed, but if you want to use the same WebR context in other app module blocks, this is how you’d do it.

Let’s focus on the last three lines.

const mtcars = await HelpR.getDataFrame(globalThis.webR, "mtcars");

This uses a helper function I made to get a data frame object from R in a format more compatible with most JS code and libraries than the default JS object that WebR’s toJs() function converts R objects to.

console.table(mtcars);

This makes a nice table in the browser’s Developer Tools console. I did this so I could have you open up the console to see it, but I also want you to inspect the contents of the object (just type mtcars and hit enter/return) to see this nice format.

We pass in a WebR context we know will work, and then any R code that will evaluate and return a data frame. It is all on you (for the moment) to ensure the code runs and that it returns a data frame.
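To give a feel for what “more compatible” means, here is a sketch of turning a column-oriented object into an array of row objects (the shape console.table() and D3 are happiest with). This is not the actual webRDataFrameToJS helper, and the input shape (names plus a values array of columns, each with its own values) is an assumption about what toJs() returns for a data frame.

```javascript
// Hypothetical converter: column-oriented list -> array of row objects.
// Assumes `obj.names` holds the column names and `obj.values[j].values`
// holds the cells of column j.
function columnsToRows(obj) {
  const colNames = obj.names;
  const nRows = obj.values.length ? obj.values[0].values.length : 0;
  const rows = [];
  for (let i = 0; i < nRows; i++) {
    const row = {};
    colNames.forEach((name, j) => {
      row[name] = obj.values[j].values[i];
    });
    rows.push(row);
  }
  return rows;
}
```

Row names (the car names, in the mtcars case) are simply dropped in this sketch.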

The last line:

HelpR.simpleDataFrameTable("#tbl", mtcars);

calls another helper function to make the table.

HelpR

I may eventually blather eloquently and completely about what’s in modules/webr-helpers.js. For now, let me focus on just a couple things, especially since it’s got some sweet JSDoc comments.

First off, let’s talk more about those comments.

I use VS Code for ~60% of my daily ops, and used it for this project. If you open up the project root in VS Code and select/hover over simpleDataFrameTable in that last line, you’ll get some sweet lookin’ formatted help. VS Code is wired up for this (other editors/IDEs are too), so I encourage you to make liberal use of JSDoc comments in your own functions/modules.

Now, let’s peek behind the curtain of getDataFrame:

export async function getDataFrame(ctx, rEvalCode) {
    let result = await ctx.evalR(`${rEvalCode}`);
    let output = await result.toJs();
    return (Promise.resolve(webRDataFrameToJS(output)));
}

The export tells the JS environment that that function is available if imported properly. Without the export the function is local to the module.

let result = await ctx.evalR(`${rEvalCode}`);

A proper app would use JS try/catch to handle potential errors. There’s an example of that in the fancy React app code over at WebR’s site. We just throw caution to the wind and evaluate whatever we’re given. In theory, we should have R ensure it’s a data frame which we kind of can’t do on the JS side since the next line:

let output = await result.toJs();

will show the type as a list (b/c data.frames are lists).
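One way to push that check over to R is to wrap whatever code we are handed before evaluating it. A minimal sketch, assuming a hypothetical asDataFrameCode() helper that is not part of webr-helpers.js:

```javascript
// Wrap arbitrary R code so that R itself errors out unless the result is a
// data frame. The user code sits on its own lines so a trailing R comment
// in it cannot swallow the closing brace.
function asDataFrameCode(rEvalCode) {
  return `local({
  .res <- {
${rEvalCode}
  }
  stopifnot(is.data.frame(.res))
  .res
})`;
}
```

Inside getDataFrame() that would look like await ctx.evalR(asDataFrameCode(rEvalCode)), ideally inside a try/catch so the R error can be reported nicely.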

I’ll likely add some more helpers to a more standalone helper module, but I suspect that corporate R will beat me to that, so I will likely also not invest too much time on it, at least externally.

Await! Await! Do Tell Me (about await)!

Before we can talk about the last line:

return (Promise.resolve(webRDataFrameToJS(output)));

let’s briefly talk about async ops in JS.

The JavaScript environment in your browser is single-threaded. Asynchronous (async) ops let you hand code off so it doesn’t block page operations. These get executed “whenever”, so all you get is a vapid and shallow promise of code execution and, potentially, of getting something back.

We explicitly use await for when we really need the code to run and, in this case, give us something back. We can keep chaining async function calls, but — if we need to make sure the code runs and/or we get data back — we will eventually need to keep our promise to do so; hence, Promise.resolve.
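A tiny sketch of that, with made-up function names. One detail worth knowing: an async function always wraps its return value in a promise, so the explicit Promise.resolve() is harmless but optional.

```javascript
// Both of these hand back a promise for the same array.
async function viaResolve() {
  return Promise.resolve([1, 2, 3]);
}

async function direct() {
  return [1, 2, 3];
}

// Without `await` (or .then()), all you hold is the promise, not the data.
const pending = direct();
const isPromise = pending instanceof Promise;

// With `await`, execution of this function pauses until the data shows up.
async function total() {
  const a = await viaResolve();
  const b = await direct();
  return [...a, ...b].reduce((sum, n) => sum + n, 0);
}
```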

Serving WebR From GitHub Pages

The docs/ directory in the repo shows a working version on GH pages.

main.js needs a few tweaks:

// This will use Posit's CDN

import('https://webr.r-wasm.org/latest/webr.mjs').then( // this wraps the main app code
    async ({ WebR }) => {

        globalThis.webR = new WebR({
            SW_URL: "/webr-app/"            // 👈🏼 needs to be your GHP main path
        });
        await globalThis.webR.init();

        const timerEnd = performance.now();
        console.timeEnd('Execution Time');

        document.getElementById('loading').innerText = `WebR Loaded! (${format(",.2r")((timerEnd - timerStart) / 1000)} seconds)`;

        const mtcars = await HelpR.getDataFrame(globalThis.webR, "mtcars");
        console.table(mtcars);
        HelpR.simpleDataFrameTable("#tbl", mtcars);

  }
);

Moar To Come

Please hit up this terribly coded dashboard app to see some fancier use. I’ll be converting that to modules and expanding it a bit.

WebR 0.1.0 was released! I had been git-stalking George (the absolute genius who we all must thank for this) for a while and noticed the GH org and repos being updated earlier this week, so I was already pretty excited.

It dropped today, and you can hit that link for all the details and other links.

I threw together a small demo to show how to get it up and running without worrying about fancy “npm projects” and the like.

View-source on that link, or look below for a very small (so, hopefully accessible) example of how to start working with WASM-ified R in a web context.

UPDATE:

Four more links:


<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>WebR Super Simple Demo</title>
  <link rel="stylesheet" href="/style.css" type="text/css">
  <style>
    li { font-family:monospace; }
    .nospace { margin-bottom: 2px; }
  </style>
</head>
<body>
  <div id="main">
    <p>Simple demo to show the basics of calling the new WebR WASM!!!!</p>
    <p><code>view-source</code> to see how the sausage is made</p>
    <p class="nospace">Input a number, press "Call R" (when it is enabled) and magic will happen.</p>

    <!-- We'll pull the value from here -->
    <input type="text" id="x" value="10">

    <!-- This button is disabled until WebR is loaded -->
    <button disabled="" id="callr">Call R</button>

    <!-- Output goes here -->
    <div id="output"></div>

    <!-- WebR is a module so you have to do this. -->
    <!-- NOTE: Many browsers will not like it if `.mjs` files are served -->
    <!-- with a content-type that isn't text/javascript -->
    <!-- Try renaming it from .mjs to .js if you hit that snag. -->
    <script type="module">
      // https://github.com/r-wasm/webr/releases/download/v0.1.0/webr-0.1.0.tar.gz
      //
      // I was lazy and just left it in one directory
      import { WebR } from '/webr-d3-demo/webr.mjs'; // service workers == full path starting with /

      const webR = new WebR(); // get ready to Rumble
      await webR.init(); // shot's fired

      console.log("WebR"); // just for me b/c I don't trust anything anymore

      // we call this function on the button press
      async function callR() {
        let x = document.getElementById('x').value.trim(); // get the value we input; be better than me and do validation
        console.log(`x = ${x}`) // as noted, i don't trust anything
        let result = await webR.evalR(`rnorm(${x},5,1)`); // call some R!
        let output = await result.toArray(); // make the result something JS can work with
        document.getElementById("output").replaceChildren() // clear out the <div> (this is ugly; be better than me)

        // d3 ops
        d3.select("#output").append("ul")
        const ul = d3.select("ul")
        ul.selectAll("li")
          .data(output)
          .enter()
          .append("li")
          .text(d => d)
      }

      // by the time we get here, WebR is ready, so we tell the button what to do and re-enable the button
      document.getElementById('callr').onclick = callR;
      document.getElementById('callr').disabled = false;
    </script>

    <!-- d/l from D3 site or here if you trust me -->
    <script src="d3.min.js"></script>
  </div>
</body>
</html>

I’ve been (slowly) making my way through FOSDEM ’23 presentations and caught up to Peter Lowe’s “Bizarre and Unusual Uses of DNS • Rule 53: If you can think of it, someone’s done it in the DNS” talk. DNS oddities are items I collect whenever I see them, and while I knew about a good number of the ones in Peter’s presentation, the ones where DNS is used to retrieve your external IP address were oddly missing from my collection.

His presentation mentioned both a Google DNS hack and OpenDNS DNS hack, and I learned of a similar DNS hack from Akamai from John Payne. I keep saying “hack” because these folks are most certainly abusing the original intentions and design of DNS. “Hack” is not being used pejoratively, as this is a pretty cool and efficient way of discovering your external IP address vs setting up a full-blown HTTP TLS session, making a GET request and retrieving the payload.

I’ve been Down and Out on COVID Street for the past few weeks (#4 brought it home from high school, making multiple years of being overly cautious and careful outside the house quite moot), and had a bit of a level drain relapse over the weekend, so I decided to get my mind directed away from malicious spike proteins and build a client for the existing services and then a server anyone could run to host the same type of service.

I’ve been nerding out on Rust for the past few years, but chose Go (also calling it “Golang” in this parenthetical for the sake of SEO) since I really wanted a small binary, and DNS ops are part of Go’s “batteries included” libraries (and, I’ve worked with them before).

dig-ging The Hacks

Shaft Silhouette with Can You Dig It below

You don’t need a special client for these hacks. dig can do all the hard work for you, and it is (for the most part) on every modern system (or easily installed).

Here are three shell executable statements that will return your external IP address into a shell variable (just remove the VAR= and outer $() to see the result vs store it):

MY_OPENDNS_IP="$(dig myip.opendns.com @resolver1.opendns.com +short)"

MY_GOOGLE_IP="$(dig o-o.myaddr.1.google.com @ns1.google.com TXT +short | tr -d '"')"

MY_AKAMAI_IP="$(dig +short TXT whoami.ds.akahelp.net @$(dig +short +answer NS akamai.com | head -1) | grep ns | sed -e 's/[^0-9\.\:]//g')"

The Akamai one is a bit longer since I didn’t want to lock it in to a pre-specified Akamai resolver (you never know when orgs are going to change things). So, it looks up the nameserver first, then does the IP check.

Remove the pipes to see the “raw” output.

[Client] Hacks In Go

Go's mascot in a hacker hoodie

I’ll eventually set up a GitHub Action to build out binaries for various platforms (and setup a Homebrew tap for it) but you can get started using the nascent Go CLI via:

go install -ldflags "-s -w" github.com/hrbrmstr/extip@latest

The extra flags are there to make the binary size smaller than it otherwise would be (Go and Rust both make larger binaries than I care for, but they do that for good reasons).

At present, there are no command line options, so when you run extip, the executable makes the DNS calls to all three services and will return just your IP address if they all agree (if you’re being service intercepted in a really nasty way, that might not be the case). If any fail, the discrepancies are shown.
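The “agree or show the discrepancies” logic is simple enough to sketch. This is a JavaScript illustration to match the other examples on this page, not the Go source of extip; the function name and the result shape are made up.

```javascript
// `answers` maps a service name to the IP it reported, or null if the
// lookup failed.
function reconcile(answers) {
  const ips = Object.values(answers);
  const distinct = new Set(ips);
  if (ips.length > 0 && !ips.includes(null) && distinct.size === 1) {
    return { agreed: true, ip: ips[0] }; // everyone said the same thing
  }
  return { agreed: false, answers }; // hand back all answers for inspection
}
```

With all three services reporting the same address you get a single IP back; any failure or mismatch returns the full set so the discrepancy is visible.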

Serving Up Your Own Hack

Another reason to use Go is that building a DNS server in it is super straightforward, thanks to Miek Gieben’s battle-tested DNS library.

Now, thanks to this tiny, hacky DNS server I whipped up, you can:

go install -ldflags "-s -w" github.com/hrbrmstr/extip-svr@latest

and run it anywhere you’d like to have the same type of service.

It supports A, AAAA and TXT queries, though I’d use the TXT one if I were you, since you don’t need to know what type of network you’re on or interface the request is coming from. I’ve got it running on one of my random internet nodes, so you can try it out before running it:

dig myip.is TXT @ip.rudis.net

(any FQDN ending in .is can be used)

FIN

Peter’s talk was super fun and informative, so you should 100% watch it. It was great being able to have something to focus on whilst getting better, and also cool to stretch some Golang muscles.

If you have any opines on the CLI argument parser I should use, drop a comment or issue on the repos. I’ll be tweaking both the client and server quite a bit over the coming weeks.

I’ll follow up with a more detailed post in a ~week or so, but if you are considering purchasing a Kucht appliance, please, please reconsider your decision. They have repeatedly lied to us (I have proof) and are incapable of manufacturing functioning equipment.

Thanks to them, we are out thousands of dollars and are in the process of contacting the Maine AG and having our personal legal counsel help us to recoup our losses. I’m just thankful our “professional” oven didn’t harm any of our family as it continued to malfunction and degrade whilst the Kucht representatives just ignored us.

Ref: https://rud.is/b/2022/12/19/2022-hanukkah-of-data-puzzle-1/

library(tidyverse)

cust <- read_csv("~/Downloads/noahs-csv/noahs-customers.csv")
orders_items <- read_csv("~/Downloads/noahs-csv/noahs-orders_items.csv")
orders <- read_csv("~/Downloads/noahs-csv/noahs-orders.csv")
products <- read_csv("~/Downloads/noahs-csv/noahs-products.csv")

orders_items |> 
  left_join(products) -> oip

orders |> 
  left_join(oip) -> orders

orders |> 
  filter(
    2017 == lubridate::year(ordered),
    grepl("cleaner|bagel", desc, ignore.case=TRUE)
  ) |> 
  group_by(customerid, orderid) |> 
  summarise(
    ord = paste0(desc, collapse="; "),
    n = n()
  ) |> 
  arrange(desc(n)) # look for bagel + rug cleaner

cust |> 
  filter(customerid == '####') |> 
  select(phone)

Visiting #2 and doing some $WORK-work, but intrigued with Hanukkah of Data since Puzzle 0 was solvable with a ZIP password cracker (the calendar date math seemed too trivial to bother with).

Decided to fall back to R for this (vs Observable for the Advent of Code which I’ll dedicate time to finishing next week).

R has a {phonenumber} package, so we’ll cheat and use that despite it being very brutish in how it does the letterToNumber() conversion.

No spoilers besides the code.

library(phonenumber)
library(stringi) # for stri_replace_all_regex()
library(tidyverse)

cust <- read_csv("~/Downloads/noahs-csv/noahs-customers.csv")

cust |> 
  filter(!grepl("[01]", phone)) |> # only care abt letters
  mutate(
    last_name = stri_replace_all_regex(name, "(II|III|IV|Jr\\.)", ""), # get rid of suffix if any
  ) |> 
  separate( # get only the last name
    col = last_name,
    into = c("x1", "x2", "last_name"),
    sep = " ",
    fill = "left"
  ) |> 
  filter(
    nchar(last_name) == 10 # only complete last names
  ) |> 
  mutate(
    last_name = toupper(last_name),
    phone = gsub("-", "", phone) # we're going to compare so remove the '-'
  ) |> 
  select(last_name, phone) |> 
  mutate(
    trans = strsplit(last_name, "") |> 
      map_chr(~map(.x, letterToNumber) |> paste0(collapse="")) # feels like I cld optimize this
  ) |> 
  filter(trans == phone)
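For anyone curious about what the letter-to-number step boils down to, here is the standard phone keypad mapping spelled out in JavaScript (to match the other examples on this page). It is only an illustration of the idea, not how {phonenumber} implements letterToNumber(), and the names are made up.

```javascript
// Standard phone keypad: which letters live on which digit.
const KEYPAD = {
  2: "ABC", 3: "DEF", 4: "GHI", 5: "JKL",
  6: "MNO", 7: "PQRS", 8: "TUV", 9: "WXYZ",
};

// Invert it into a letter -> digit lookup table.
const LETTER_TO_DIGIT = {};
for (const [digit, letters] of Object.entries(KEYPAD)) {
  for (const letter of letters) LETTER_TO_DIGIT[letter] = digit;
}

// Translate a word to keypad digits; anything that is not A-Z passes through.
function lettersToDigits(word) {
  return [...word.toUpperCase()]
    .map((ch) => LETTER_TO_DIGIT[ch] ?? ch)
    .join("");
}
```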