R 4.0.0 has been out for a while now, and, apart from a case where merge() was slower than dirt, it’s been really stable, at least for me (I use it daily on macOS, Linux, and Windows). Sure, it came with some headline-grabbing features/upgrades, but I’ve started looking at what other useful nuggets might be in the changelog and decided to blog them as I find them.
Today’s nugget is the venerable stopifnot() function, which was significantly enhanced by a PR from Neal Fultz.
Prior to R 4.0.0, if you wanted to use stopifnot() to perform some input validation (a.k.a., in this case, [assertions](https://en.wikipedia.org/wiki/Assertion_(software_development))) you’d do something like this (I’m borrowing from Neal’s example):
some_ƒ <- function(alpha, gradtol, steptol, interlim) {
  stopifnot(
    (is.numeric(alpha)),
    (length(alpha) == 1),
    (alpha > 0),
    (alpha < 1),
    (is.numeric(gradtol)),
    (length(gradtol) == 1),
    (gradtol > 0),
    (is.numeric(steptol)),
    (length(steptol) == 1),
    (steptol > 0),
    (is.numeric(interlim)),
    (length(interlim) == 1),
    (interlim > 0)
  )
  message("Do something awesome")
}
When run with acceptable inputs we get:
some_ƒ(0.5, 3, 10, 100)
## Do something awesome
But, when run with something out of kilter:
some_ƒ("a", 3, 10, 100)
## Error in some_ƒ("a", 3, 10, 100) : (is.numeric(alpha)) is not TRUE
we get a semi-useful but somewhat unfriendly message back. Sure, it points to the right expression, but we’re supposed to be the kinder, friendlier data science (and general purpose) language that cares a bit more about its users. To that end, many folks switch to doing something like this:
some_ƒ <- function(alpha, gradtol, steptol, interlim) {
  # Each check mirrors one stopifnot() condition, with a hand-written message
  if (!is.numeric(alpha)) { stop('Error: alpha should be numeric') }
  if (length(alpha) != 1) { stop('Error: alpha should be a single value') }
  if (alpha <= 0) { stop('Error: alpha should be greater than zero') }
  if (alpha >= 1) { stop('Error: alpha should be less than one') }
  if (!is.numeric(gradtol)) { stop('Error: gradtol should be numeric') }
  if (length(gradtol) != 1) { stop('Error: gradtol should be a single value') }
  if (gradtol <= 0) { stop('Error: gradtol should be positive') }
  if (!is.numeric(steptol)) { stop('Error: steptol should be numeric') }
  if (length(steptol) != 1) { stop('Error: steptol should be a single value') }
  if (steptol <= 0) { stop('Error: steptol should be positive') }
  if (!is.numeric(interlim)) { stop('Error: interlim should be numeric') }
  if (length(interlim) != 1) { stop('Error: interlim should be a single value') }
  if (interlim <= 0) { stop('Error: interlim should be positive') }
  message("Do something awesome")
}
which results in:
some_ƒ("a", 3, 10, 100)
## Error in some_ƒ("a", 3, 10, 100) : Error: alpha should be numeric
(you can make even better error messages than that).
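As one hedged illustration of what “better” could mean, the sketch below reports what was actually received and drops the call prefix with call. = FALSE. The function name check_alpha_verbose() is hypothetical and only covers the first two checks:

```r
# Hypothetical helper (not from the original example): hand-rolled checks
# whose messages describe the offending input.
check_alpha_verbose <- function(alpha) {
  if (!is.numeric(alpha)) {
    # class(alpha)[1] reports the primary class of whatever was passed in
    stop(
      sprintf("`alpha` should be numeric, not <%s>", class(alpha)[1]),
      call. = FALSE
    )
  }
  if (length(alpha) != 1) {
    stop(
      sprintf("`alpha` should be a single value, not length %d", length(alpha)),
      call. = FALSE
    )
  }
  invisible(alpha)
}
```

Calling check_alpha_verbose("a") should stop with a message that names the character class, at the cost of even more boilerplate per check.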
Neal thought there had to be a better way, and made one! The expressions passed to stopifnot() via ... can now be named, and the name of the first expression that fails becomes the error message:
some_ƒ <- function(alpha, gradtol, steptol, interlim) {
  # Each name is the message shown when its expression is not TRUE
  stopifnot(
    'alpha should be numeric' = (is.numeric(alpha)),
    'alpha should be a single value' = (length(alpha) == 1),
    'alpha should be greater than zero' = (alpha > 0),
    'alpha should be less than one' = (alpha < 1),
    'gradtol should be numeric' = (is.numeric(gradtol)),
    'gradtol should be a single value' = (length(gradtol) == 1),
    'gradtol should be positive' = (gradtol > 0),
    'steptol should be numeric' = (is.numeric(steptol)),
    'steptol should be a single value' = (length(steptol) == 1),
    'steptol should be positive' = (steptol > 0),
    'interlim should be numeric' = (is.numeric(interlim)),
    'interlim should be a single value' = (length(interlim) == 1),
    'interlim should be positive' = (interlim > 0)
  )
  message("Do something awesome")
}
some_ƒ("a", 3, 10, 100)
## Error in some_ƒ("a", 3, 10, 100) : alpha should be numeric
Way easier to write and way more respectful to the caller.
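If you want to poke at the named messages from code rather than the console, here is a minimal sketch (it needs R >= 4.0.0). Both check_alpha() and msg_for() are hypothetical helpers made up for this illustration; they are not part of the PR or of base R. It also shows that stopifnot() evaluates its expressions in order and stops at the first one that is not TRUE:

```r
# Hypothetical helper: validates one argument with named assertions.
# Requires R >= 4.0.0 for the names to be used as error messages.
check_alpha <- function(alpha) {
  stopifnot(
    "alpha should be numeric" = is.numeric(alpha),
    "alpha should be a single value" = length(alpha) == 1,
    "alpha should be greater than zero" = alpha > 0,
    "alpha should be less than one" = alpha < 1
  )
  invisible(alpha)
}

# Hypothetical helper: returns the error message an expression raises,
# or NA_character_ if it runs cleanly.
msg_for <- function(expr) {
  tryCatch(
    {
      expr  # the lazy argument is forced here, inside tryCatch()
      NA_character_
    },
    error = function(e) conditionMessage(e)
  )
}

msg_for(check_alpha("a"))          # "alpha should be numeric"
msg_for(check_alpha(c(0.2, 0.4)))  # "alpha should be a single value"
msg_for(check_alpha(0))            # "alpha should be greater than zero"
msg_for(check_alpha(0.5))          # NA
```

Because evaluation is sequential, the later comparisons never run when an earlier check has already failed, which is why the ordering of the assertions (type, then length, then range) matters.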
Gratuitous Statistics
CRAN has ~2,600 packages that use stopifnot() in their package /R/ code with the following selected distributions (charts are all log10 scale):
Here are the packages with 50 or more files using stopifnot():
  pkg              n
  <chr>        <int>
1 spatstat       252
2 pracma         145
3 QuACN           80
4 raster          74
5 spdep           61
6 lavaan          54
7 surveillance    53
8 copula          50
Here are the packages with one or more files that have 100 or more calls to stopifnot() in them:
   pkg                 fil                            ct
   <chr>               <chr>                       <int>
 1 ff                  ordermerge.R                  278
 2 OneArmPhaseTwoStudy zzz.R                         142
 3 bit64               integer64.R                   137
 4 updog               rflexdog.R                    124
 5 RNetCDF             RNetCDF.R                     123
 6 Rlda                rlda.R                        105
 7 aster2              transform.R                   105
 8 ads                 fads.R                        104
 9 georob              georob_exported_functions.R   104
10 bit64               highlevel64.R                 101
O_O That’s quite a bit of checking!
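For anyone who wants to run a similar tally on a package they have unpacked locally, here is a rough sketch. It is not necessarily how the numbers above were produced: count_stopifnot() is a hypothetical helper, it assumes pkg_dir points at an unpacked source package with an R/ subdirectory, and it does a plain text match, so mentions inside comments or strings are counted too:

```r
# Hypothetical helper: counts textual occurrences of "stopifnot(" in each
# .R file under <pkg_dir>/R. This is a regex tally, not a parse of the code.
count_stopifnot <- function(pkg_dir) {
  files <- list.files(
    file.path(pkg_dir, "R"),
    pattern = "\\.[Rr]$",
    full.names = TRUE
  )
  counts <- vapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    hits <- gregexpr("stopifnot\\s*\\(", lines, perl = TRUE)
    # gregexpr() returns -1 for lines without a match, so count positions > 0
    sum(vapply(hits, function(h) sum(h > 0), integer(1)))
  }, integer(1))
  data.frame(
    fil = basename(files),
    ct = unname(counts),
    stringsAsFactors = FALSE
  )
}
```

Running it over every package directory and binding the results together would give per-file counts in the same shape as the table above.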
FIN
If you’re working on switching to R 4.0.0 or have switched, this and many other new features await! Drop a note in the comments with your favorite new feature (or, even better, a link to a blog post on said feature!).
As I get time to dig out some more nuggets I’ll add more posts to this series.