*valaddin* is a lightweight R package that enables you to transform an existing function into a function with input validation checks. It does so without requiring you to modify the body of the function, in contrast to doing input validation using `stop` or `stopifnot`, and is therefore suitable for both programmatic and interactive use.

This document illustrates the use of valaddin by example. For usage details, see the main documentation page, `?firmly`.

The workhorse of valaddin is the function `firmly`, which applies input validation to a function, *in situ*. It can be used in several ways, illustrated by the examples that follow.

For example, to require that all arguments of the function

`f <- function(x, h) (sin(x + h) - sin(x)) / h`

are numerical, apply `firmly` with the check formula `~is.numeric`^{1}:

`ff <- firmly(f, ~is.numeric)`

`ff` behaves just like `f`, but with a constraint on the type of its arguments:

```
ff(0.0, 0.1)
#> [1] 0.9983342
ff("0.0", 0.1)
#> Error: ff(x = "0.0", h = 0.1)
#> FALSE: is.numeric(x)
```
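For contrast, here is a minimal base-R sketch of the same constraint written directly into the function body with `stopifnot`. The name `f_checked` is hypothetical and used only for illustration; unlike `firmly`, this approach requires editing the body of `f`.

```r
# Hypothetical in-body variant of f: the checks live inside the function
f_checked <- function(x, h) {
  stopifnot(is.numeric(x), is.numeric(h))
  (sin(x + h) - sin(x)) / h
}

f_checked(0.0, 0.1)

# A non-numeric argument is rejected before any arithmetic happens;
# the error message is captured here instead of halting the script
res <- tryCatch(f_checked("0.0", 0.1), error = function(e) conditionMessage(e))
res
```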

For example, use `firmly` to put a cap on potentially long-running computations:

```
fib <- function(n) {
if (n <= 1) return(1L)
Recall(n - 1) + Recall(n - 2)
}
capped_fib <- firmly(fib, list("n capped at 30" ~ ceiling(n)) ~ {. <= 30L})
capped_fib(10)
#> [1] 89
capped_fib(50)
#> Error: capped_fib(n = 50)
#> n capped at 30
```

The role of each part of the value-constraining formula is evident:

- The right-hand side `{. <= 30L}` is the constraint itself, expressed as a condition on `.`, a placeholder argument.
- The left-hand side `list("n capped at 30" ~ ceiling(n))` specifies the expression for the placeholder, namely `ceiling(n)`, along with a message to produce if the constraint is violated.
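To make the two roles concrete, the check formula can be taken apart with plain base-R indexing and evaluated by hand. This is only an inspection sketch; it is not a description of how valaddin processes the formula internally, and the variable names are illustrative.

```r
# Creating a formula does not evaluate either side
chk <- list("n capped at 30" ~ ceiling(n)) ~ {. <= 30L}

rhs  <- chk[[3]]    # the constraint: {. <= 30L}
lhs  <- chk[[2]]    # the call: list("n capped at 30" ~ ceiling(n))
item <- lhs[[2]]    # the inner formula: "n capped at 30" ~ ceiling(n)
msg  <- item[[2]]   # the message
expr <- item[[3]]   # the expression for the placeholder: ceiling(n)

# Evaluate by hand for n = 50: compute the expression, then bind it to `.`
value <- eval(expr, list(n = 50))
ok    <- eval(rhs, list(. = value))

msg
ok
```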

If the default behavior of a function is problematic, or unexpected, you can use `firmly` to warn you. Consider the function `as.POSIXct`, which creates a date-time object:

```
Sys.setenv(TZ = "CET")
(d <- as.POSIXct("2017-01-01 09:30:00"))
#> [1] "2017-01-01 09:30:00 CET"
```

The problem is that `d` is a potentially *ambiguous* object (with hidden state), because it is not explicitly assigned a time zone. If you compute the local hour of `d` using `as.POSIXlt`, you get an answer that interprets `d` according to your current time zone; another user (or you, in another country, in the future) may get a different result.

If you’re in the CET time zone:

```
as.POSIXlt(d, tz = "EST")$hour
#> [1] 3
```

If you were to change to the EST time zone and rerun the code:

```
Sys.setenv(TZ = "EST")
d <- as.POSIXct("2017-01-01 09:30:00")
as.POSIXlt(d, tz = "EST")$hour
#> [1] 9
```
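The ambiguity disappears once the time zone is stated when the object is created. The following base-R sketch assumes that the system's time zone database recognizes the names `"CET"`, `"EST"`, and `"UTC"`; the variable names are illustrative.

```r
# With an explicit tz, the instant no longer depends on the session's TZ setting
d <- as.POSIXct("2017-01-01 09:30:00", tz = "CET")

hour_est <- as.POSIXlt(d, tz = "EST")$hour   # 09:30 CET is 03:30 EST
hour_utc <- as.POSIXlt(d, tz = "UTC")$hour   # 09:30 CET is 08:30 UTC

c(EST = hour_est, UTC = hour_utc)
```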

To warn yourself about this pitfall, you can modify `as.POSIXct` to complain when you’ve forgotten to specify a time zone:

`as.POSIXct <- firmly(as.POSIXct, .warn_missing = "tz")`

Now when you call `as.POSIXct`, you get a cautionary reminder:

```
as.POSIXct("2017-01-01 09:30:00")
#> Warning: Argument(s) expected but not specified in call as.POSIXct(x =
#> "2017-01-01 09:30:00"): `tz`
#> [1] "2017-01-01 09:30:00 CET"
as.POSIXct("2017-01-01 09:30:00", tz = "CET")
#> [1] "2017-01-01 09:30:00 CET"
```

**NB**: The missing-argument warning is implemented by wrapping functions. The underlying function `base::as.POSIXct` is called *unmodified*.
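The wrapping idea can be sketched in a few lines of base R. This is not valaddin's implementation: `with_check` is a hypothetical helper that returns a new function, which tests one predicate against every argument and then delegates to the original, untouched function.

```r
# Hypothetical helper: wrap f so that every argument must satisfy pred
with_check <- function(f, pred, msg) {
  force(f); force(pred); force(msg)
  function(...) {
    args <- list(...)  # note: this evaluates all arguments up front
    ok <- vapply(args, function(a) isTRUE(pred(a)), logical(1))
    if (!all(ok)) stop(msg, call. = FALSE)
    f(...)             # the original function is called as-is
  }
}

f <- function(x, h) (sin(x + h) - sin(x)) / h
ff <- with_check(f, is.numeric, "all arguments must be numeric")

ff(0.0, 0.1)
```

Because the check lives in the wrapper, `f` itself is never edited, which is the property the note above relies on.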

Use `loosely` to access the original function

Though reassigning `as.POSIXct` may seem risky, it is not, for the behavior is unchanged (aside from the extra precaution), and the original `as.POSIXct` remains accessible:

- With a namespace prefix: `base::as.POSIXct`
- By applying `loosely` to strip input validation: `loosely(as.POSIXct)`

```
loosely(as.POSIXct)("2017-01-01 09:30:00")
#> [1] "2017-01-01 09:30:00 CET"
identical(loosely(as.POSIXct), base::as.POSIXct)
#> [1] TRUE
```

R tries to help you express your ideas as concisely as possible. Suppose you want to truncate the positive values of a vector `w` to zero:

```
w <- {set.seed(1); rnorm(5)}
ifelse(w > 0, 0, w)
#> [1] -0.6264538 0.0000000 -0.8356286 0.0000000 0.0000000
```

`ifelse` assumes (correctly) that you intend the `0` to be repeated 5 times, and does that for you, automatically.

Nonetheless, R’s good intentions have a darker side:

```
z <- rep(1, 6)
pos <- 1:5
neg <- -6:-1
ifelse(z > 0, pos, neg)
#> [1] 1 2 3 4 5 1
```

This smells like a coding error. Instead of complaining that `pos` is too short, `ifelse` recycles it to line it up with `z`. The result is probably not what you wanted.
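For comparison, base-R arithmetic does raise a warning when a shorter vector is recycled a fractional number of times, whereas the `ifelse` call above ran silently. The sketch below captures that warning and shows the kind of length comparison one could make by hand; the variable names are illustrative.

```r
z <- rep(1, 6)
pos <- 1:5

# Arithmetic warns when the longer length is not a multiple of the shorter one;
# the warning text is captured here (NA would mean no warning was raised)
arith_warning <- tryCatch(
  { z + pos; NA_character_ },
  warning = function(w) conditionMessage(w)
)
arith_warning

# A one-line comparison exposes the mismatch before calling ifelse
lengths_match <- length(pos) == length(z)
lengths_match
```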

In this case, you don’t need a helping hand, but rather a firm one:

```
chk_length_type <- list(
"'yes', 'no' differ in length" ~ length(yes) == length(no),
"'yes', 'no' differ in type" ~ typeof(yes) == typeof(no)
) ~ isTRUE
ifelse_f <- firmly(ifelse, chk_length_type)
```

`ifelse_f` is more pedantic than `ifelse`. But it also spares you the consequences of invalid inputs:

```
ifelse_f(w > 0, 0, w)
#> Error: ifelse_f(test = w > 0, yes = 0, no = w)
#> 'yes', 'no' differ in length
ifelse_f(w > 0, rep(0, length(w)), w)
#> [1] -0.6264538 0.0000000 -0.8356286 0.0000000 0.0000000
ifelse(z > 0, pos, neg)
#> [1] 1 2 3 4 5 1
ifelse_f(z > 0, pos, neg)
#> Error: ifelse_f(test = z > 0, yes = pos, no = neg)
#> 'yes', 'no' differ in length
ifelse(z > 0, as.character(pos), neg)
#> [1] "1" "2" "3" "4" "5" "1"
ifelse_f(z > 0, as.character(pos), neg)
#> Error: ifelse_f(test = z > 0, yes = as.character(pos), no = neg)
#> 1) 'yes', 'no' differ in length
#> 2) 'yes', 'no' differ in type
```

When R makes a function call, say, `f(a)`, the *value* of the argument `a` is not materialized in the body of `f` until it is actually needed. Usually, you can safely ignore this as a technicality of R’s evaluation model; but in some situations, it can be problematic if you’re not mindful of it.
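Two small base-R illustrations of this behavior follow. The function names (`ignore_arg`, `note`, `use_late`) are made up for the example.

```r
# An argument that is never used is never evaluated, so no error is raised
ignore_arg <- function(a) "done"
r1 <- ignore_arg(stop("never evaluated"))

# An argument is evaluated at the point of first use, not at call time
order_seen <- character()
note <- function(tag) {
  order_seen <<- c(order_seen, tag)  # record the order of evaluation
  tag
}
use_late <- function(a) {
  note("body")  # runs first
  a             # the argument is evaluated only here
  invisible(NULL)
}
use_late(note("argument"))

r1
order_seen
```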

Consider a bank that waives fees for students. A function to make deposits might look like this^{2}:

```
deposit <- function(account, value) {
if (is_student(account)) {
account$fees <- 0
}
account$balance <- account$balance + value
account
}
is_student <- function(account) {
if (isTRUE(account$is_student)) TRUE else FALSE
}
```

Suppose Bob is an account holder, currently not in school:

`bobs_acct <- list(balance = 10, fees = 3, is_student = FALSE)`

If Bob were to deposit an amount to cover a future fee payment, his account balance would be updated to:

```
deposit(bobs_acct, bobs_acct$fees)$balance
#> [1] 13
```

Bob goes back to school and informs the bank, so that his fees will be waived:

`bobs_acct$is_student <- TRUE`

But now suppose that, somewhere in the bowels of the bank’s software, the type of Bob’s account object is converted from a list to an environment:

`bobs_acct <- list2env(bobs_acct)`

If Bob were to deposit an amount to cover a future fee payment, his account balance would now be updated to:

```
deposit(bobs_acct, bobs_acct$fees)$balance
#> [1] 10
```

Becoming a student has cost Bob money. What happened to the amount deposited?

The culprit is lazy evaluation and the modify-in-place semantics of environments. In the call `deposit(account = bobs_acct, value = bobs_acct$fees)`, the value of the argument `value` is only set when it’s used, which comes after the object `fees` in the environment `bobs_acct` has already been zeroed out.
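The interaction can be reduced to a few lines of base R. In this sketch, `zero_then_read` is a hypothetical stand-in for `deposit`; the last part shows `force`, a different technique from the one used in this document, which pins down the argument before anything is mutated.

```r
# Hypothetical reduction of deposit(): zero a field, then read `value`
zero_then_read <- function(acct, value) {
  acct$fees <- 0
  value
}

as_list <- list(fees = 3)
as_env  <- list2env(list(fees = 3))

# List: only a local copy is modified, so the caller's fees are still 3
from_list <- zero_then_read(as_list, as_list$fees)
# Environment: modified in place before `value` is evaluated
from_env <- zero_then_read(as_env, as_env$fees)

# Alternative: force the argument first, then mutate
zero_then_read_forced <- function(acct, value) {
  force(value)
  acct$fees <- 0
  value
}
as_env2 <- list2env(list(fees = 3))
from_env_forced <- zero_then_read_forced(as_env2, as_env2$fees)

c(list = from_list, env = from_env, env_forced = from_env_forced)
```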

To minimize such risks, forbid `account` from being an environment:

```
err_msg <- "`account` should not be an environment"
deposit <- firmly(deposit, list(err_msg ~ account) ~ Negate(is.environment))
```

This makes Bob a happy customer, and reduces the bank’s liability:

```
bobs_acct <- list2env(list(balance = 10, fees = 3, is_student = TRUE))
deposit(bobs_acct, bobs_acct$fees)$balance
#> Error: deposit(account = bobs_acct, value = bobs_acct$fees)
#> `account` should not be an environment
deposit(as.list(bobs_acct), bobs_acct$fees)$balance
#> [1] 13
```

You don’t mean to shoot yourself, but sometimes it happens, nonetheless:

```
x <- "An expensive object"
save(x, file = "my-precious.rda")
x <- "Oops! A bug or lapse has tarnished your expensive object"
# Many computations later, you again save x, oblivious to the accident ...
save(x, file = "my-precious.rda")
```

`firmly` can safeguard you from such mishaps: implement a safety procedure

```
# Argument `gear` is a list with components:
# fun: Function name
# ns : Namespace of `fun`
# chk: Formula that specifies input checks
hardhat <- function(gear, env = .GlobalEnv) {
for (. in gear) {
safe_fun <- firmly(getFromNamespace(.$fun, .$ns), .$chk)
assign(.$fun, safe_fun, envir = env)
}
}
```

gather your safety gear

```
protection <- list(
list(
fun = "save",
ns = "base",
chk = list("Won't overwrite `file`" ~ file) ~ Negate(file.exists)
),
list(
fun = "load",
ns = "base",
chk = list("Won't load objects into current environment" ~ envir) ~
{!identical(., parent.frame(2))}
)
)
```

then put it on

`hardhat(protection)`

Now `save` and `load` engage safety features that prevent you from inadvertently destroying your data:

```
x <- "An expensive object"
save(x, file = "my-precious.rda")
x <- "Oops! A bug or lapse has tarnished your expensive object"
save(x, file = "my-precious.rda")
#> Error: save(x, file = "my-precious.rda")
#> Won't overwrite `file`
# Inspecting x, you notice it's changed, so you try to retrieve the original ...
x
#> [1] "Oops! A bug or lapse has tarnished your expensive object"
load("my-precious.rda")
#> Error: load(file = "my-precious.rda")
#> Won't load objects into current environment
# Keep calm and carry on
loosely(load)("my-precious.rda")
x
#> [1] "An expensive object"
```

**NB**: Input validation is implemented by wrapping functions; thus, if the arguments are valid, the underlying functions `base::save`, `base::load` are called *unmodified*.
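The predicate behind the `save` check, `Negate(file.exists)`, can be tried on its own in base R. The sketch below uses a temporary file so that nothing in the working directory is touched; the variable names are illustrative.

```r
not_exists <- Negate(file.exists)

path <- tempfile(fileext = ".rda")
before <- not_exists(path)  # no file yet, so the check would pass

x <- "An expensive object"
save(x, file = path)
after <- not_exists(path)   # the file now exists, so the check would fail

unlink(path)                # clean up the temporary file
c(before = before, after = after)
```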

*valaddin* provides a collection of over 50 pre-made input checkers to facilitate typical kinds of argument checks. These checkers are prefixed by `vld_`, for convenient browsing and look-up in editors and IDEs that support name completion.

For example, to create a type-checked version of the function `upper.tri`, which returns an upper-triangular logical matrix, apply the checkers `vld_matrix` and `vld_boolean` (here “boolean” is shorthand for “logical vector of length 1”):

```
upper_tri <- firmly(upper.tri, vld_matrix(~x), vld_boolean(~diag))
# upper.tri assumes you mean a vector to be a column matrix
upper.tri(1:2)
#> [,1]
#> [1,] FALSE
#> [2,] FALSE
upper_tri(1:2)
#> Error: upper_tri(x = 1:2)
#> Not matrix: x
# But say you actually meant (1, 2) to be a diagonal matrix
upper_tri(diag(1:2))
#> [,1] [,2]
#> [1,] FALSE TRUE
#> [2,] FALSE FALSE
upper_tri(diag(1:2), diag = "true")
#> Error: upper_tri(x = diag(1:2), diag = "true")
#> Not boolean: diag
upper_tri(diag(1:2), TRUE)
#> [,1] [,2]
#> [1,] TRUE TRUE
#> [2,] FALSE TRUE
```

`vld_true`

Any input validation can be expressed as an assertion that “such and such must be true”; to apply it as such, use `vld_true` (or its complement, `vld_false`).

For example, the above hardening of `ifelse` can be redone as:

```
chk_length_type <- vld_true(
"'yes', 'no' differ in length" ~ length(yes) == length(no),
"'yes', 'no' differ in type" ~ typeof(yes) == typeof(no)
)
ifelse_f <- firmly(ifelse, chk_length_type)
z <- rep(1, 6)
pos <- 1:5
neg <- -6:-1
ifelse_f(z > 0, as.character(pos), neg)
#> Error: ifelse_f(test = z > 0, yes = as.character(pos), no = neg)
#> 1) 'yes', 'no' differ in length
#> 2) 'yes', 'no' differ in type
ifelse_f(z > 0, c(pos, 6), neg)
#> Error: ifelse_f(test = z > 0, yes = c(pos, 6), no = neg)
#> 'yes', 'no' differ in type
ifelse_f(z > 0, c(pos, 6L), neg)
#> [1] 1 2 3 4 5 6
```

`localize`

A check formula such as `~ is.numeric` (or `"Not number" ~ is.numeric`, if you want a custom error message) imposes its condition “globally”:

```
difference <- firmly(function(x, y) x - y, ~ is.numeric)
difference(3, 1)
#> [1] 2
difference(as.POSIXct("2017-01-01", "UTC"), as.POSIXct("2016-01-01", "UTC"))
#> Error: difference(x = as.POSIXct("2017-01-01", "UTC"), y = as.POSIXct("2016-01-01","UTC"))
#> 1) FALSE: is.numeric(x)
#> 2) FALSE: is.numeric(y)
```

With `localize`, you can restrict a globally applied check formula to specific expressions. The result is a *reusable* custom checker:

```
chk_numeric <- localize("Not numeric" ~ is.numeric)
secant <- firmly(function(f, x, h) (f(x + h) - f(x)) / h, chk_numeric(~x, ~h))
secant(sin, 0, .1)
#> [1] 0.9983342
secant(sin, "0", .1)
#> Error: secant(f = sin, x = "0", h = 0.1)
#> Not numeric: x
```

(In fact, `chk_numeric` is equivalent to the pre-built checker `vld_numeric`.)

Conversely, apply `globalize` to impose your localized checker globally:

```
difference <- firmly(function(x, y) x - y, globalize(chk_numeric))
difference(3, 1)
#> [1] 2
difference(as.POSIXct("2017-01-01", "UTC"), as.POSIXct("2016-01-01", "UTC"))
#> Error: difference(x = as.POSIXct("2017-01-01", "UTC"), y = as.POSIXct("2016-01-01","UTC"))
#> 1) Not numeric: `x`
#> 2) Not numeric: `y`
```

1. The inspiration to use `~` as a quoting operator came from the vignette Non-standard evaluation, by Hadley Wickham.
2. Adapted from an example in Section 6.3 of Chambers, *Extending R*, CRC Press, 2016. For the sake of the example, ignore the fact that logic to handle fees does not belong in a function for deposits!