@mrshrbrmstr hinted that she would like this post by @RickWicklin translated into R for her stats class. She’s quite capable of cranking out the translation of the core component of that post (a call to chisq.test), but she wanted to show the entire post in R and really didn’t have time (she’s teaching a full load of classes and is department chair + a mom). I suggested that I, too, was a bit short on time, which resulted in her putting out a call to the twitterverse for assistance, which ultimately ended up coercing me into tackling the problem.
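For anyone who hasn’t seen that core component, here’s a minimal sketch of a chi-square goodness-of-fit test in base R. The counts and the uniform null proportions below are made up purely for illustration; they are not the data or the hypothesized proportions from Rick’s post.

```r
# Hypothetical counts of candies by color (illustrative only, not Rick's data)
counts <- c(Blue = 20, Brown = 12, Green = 18, Orange = 22, Red = 14, Yellow = 14)

# Null hypothesis: assumed here to be equal proportions for every color.
# Swap in whatever proportions you actually want to test against.
p0 <- rep(1 / length(counts), length(counts))

# Goodness-of-fit test of the observed counts against p0
fit <- chisq.test(counts, p = p0)

fit$statistic  # chi-square test statistic
fit$parameter  # degrees of freedom (number of categories minus one)
fit$p.value    # p-value for the test
```

The returned object is a plain list, so the observed and expected counts are also available as fit$observed and fit$expected.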
So, why a blog post if not to present the translation?
Two reasons: I needed tidy Goodman simultaneous confidence intervals (SCIs) and Rick’s final plot was just begging to have “real” M&M’s as the point “geom”.
S[c]imple & Tidy SCIs
We’ve got options for calculating simultaneous CIs in R, and I could have just used DescTools::MultinomCI, except that I wanted a tibble and it returns a matrix. It also only has three of the more common methods implemented (yes, I am the ultimate package snob). I recalled that the CoinMinD package was tailor-made for working with SCIs and has many more methods implemented, but its output is only that: output print()ed to the console.
Yes, I shouted in disbelief at the glowing rectangle in front of me when I noticed that almost as loudly as you did when you read that sentence.
The algorithms implemented in CoinMinD are just dandy and the package is coming up on its 4th birthday. So, as a present from it (via me) to the R community, I whipped together scimple, which generates tidy tibbles and has a function, scimple_ci(), which is similar to binom::binom.confint() in that it will generate the SCIs for all the available (non-Bayesian) methods, including Goodman.
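To make “tidy Goodman SCIs” concrete, here’s a rough base R sketch of the Goodman (1965) interval that returns a data frame with one row per category. This is not the scimple implementation: goodman_ci() is a hypothetical helper written for this illustration, and the counts are made up.

```r
# Hypothetical helper: Goodman simultaneous CIs as a tidy data frame.
# counts: vector of category counts (optionally named); alpha: overall error rate.
goodman_ci <- function(counts, alpha = 0.05) {
  stopifnot(is.numeric(counts), all(counts >= 0), sum(counts) > 0)
  k <- length(counts)  # number of categories
  n <- sum(counts)     # total sample size

  # Bonferroni-adjusted chi-square quantile with 1 degree of freedom
  a <- qchisq(1 - alpha / k, df = 1)

  # Half-width term shared by both limits
  half <- sqrt(a * (a + 4 * counts * (n - counts) / n))

  data.frame(
    category = if (is.null(names(counts))) seq_along(counts) else names(counts),
    estimate = as.numeric(counts) / n,
    lower    = as.numeric(a + 2 * counts - half) / (2 * (n + a)),
    upper    = as.numeric(a + 2 * counts + half) / (2 * (n + a)),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

# Made-up counts, for illustration only
counts <- c(Blue = 20, Brown = 12, Green = 18, Orange = 22, Red = 14, Yellow = 14)
sci <- goodman_ci(counts, alpha = 0.05)
sci
```

Because the result has one row per category with estimate, lower and upper columns, it can go straight into ggplot2 without any matrix wrangling.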
Kick the tyres (pls!) and drop issues and/or PRs as you see fit.
You can’t plot just one
Rick’s post analyzes distributions of M&M’s so I went to the official M&M’s site to grab the official colors for the ones in his data set. I casually went about making the rest of the post with standard points with a superimposed white “m” when it dawned on me that those lentils (yes, it seems the candies are called lentils, or at least their icons are) were all over the M&M’s site. After some site spelunking with Chrome Developer Tools I had the URLs for the candies in question and managed to use the nascent ggimage::geom_image() to place them on the plot:
The plot is a bit sparse as you have to get the aspect ratio just right to keep those tasty, tiny circles as circles.
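Here’s a rough sketch of the general idea, assuming ggplot2 and ggimage are installed. It isn’t the code behind the plot in this post: the icon is a placeholder circle drawn to a temporary file (not one of the M&M’s site images), the proportions are made up, and the size and aspect ratio values are just starting points that will likely need tweaking for your ggimage version and output device.

```r
library(ggplot2)
library(ggimage)

# Placeholder icon: draw a filled circle to a temporary PNG.
# (Stand-in only; the post used candy images from the M&M's site.)
icon <- tempfile(fileext = ".png")
png(icon, width = 200, height = 200, bg = "transparent")
par(mar = c(0, 0, 0, 0))
plot.new()
plot.window(xlim = c(-1, 1), ylim = c(-1, 1), asp = 1)
symbols(0, 0, circles = 1, inches = FALSE, add = TRUE,
        bg = "firebrick", fg = "firebrick")
invisible(dev.off())

# Made-up proportions, for layout purposes only
candies <- data.frame(
  color    = c("Blue", "Green", "Red"),
  estimate = c(0.22, 0.18, 0.14),
  image    = icon,
  stringsAsFactors = FALSE
)

p <- ggplot(candies, aes(x = estimate, y = color)) +
  geom_image(aes(image = image), size = 0.08) +  # image column holds file paths
  theme_minimal() +
  theme(aspect.ratio = 1)  # fix the panel shape instead of letting it stretch

# Saving at a fixed size keeps the rendered shape reproducible
out <- tempfile(fileext = ".png")
ggsave(out, p, width = 5, height = 5)
```

Fixing the panel aspect ratio and the saved dimensions together is what keeps the result from changing every time the plot window is resized.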
geom_image() opens up many new possibilities for R visualizations (and not all are good possibilities). I think @mrshrbrmstr’s students got a kick out of a stats-y plot having real M&M’s on it so it worked OK this time. Just be wary of using gratuitous imagery and overdoing your watermarking.
As stated earlier you can get the code and see how you can improve upon Rick’s original post and my attempt at a quick riff. If you do end up cranking something out, drop a comment here or a tweet (@hrbrmstr) to show off your creation(s)!