Glue is advertised as
Fast, dependency free string literals
So what do we mean when we say that glue is fast? It does not mean glue is the fastest thing to use in all cases; however, for the features it provides, we can confidently say it is fast.
A good way to determine this is to compare its speed of execution to some alternatives.
- base::paste0(), base::sprintf() - Functions in base R implemented in C that provide variable insertion (but not interpolation).
- R.utils::gstring(), stringr::str_interp() - Provide a similar interface to glue, but use ${} to delimit blocks to interpolate.
- pystr::pystr_format()[1], rprintf::rprintf() - Provide interfaces similar to Python string formatters, with variable replacement but not arbitrary interpolation.

bar <- "baz"
simple <-
microbenchmark::microbenchmark(
glue = glue::glue("foo{bar}"),
gstring = R.utils::gstring("foo${bar}"),
paste0 = paste0("foo", bar),
sprintf = sprintf("foo%s", bar),
str_interp = stringr::str_interp("foo${bar}"),
rprintf = rprintf::rprintf("foo$bar", bar = bar)
)
print(unit = "eps", order = "median", signif = 4, simple)
#> Unit: evaluations per second
#> expr min lq mean median uq max neval cld
#> rprintf 265.70 1848 1935 1957 2129 2331 100 a
#> gstring 19.56 2189 2312 2358 2522 2885 100 a
#> str_interp 203.40 2783 3060 3050 3549 3845 100 a
#> glue 476.60 5028 5392 5498 6154 7384 100 a
#> sprintf 53900.00 411900 579900 501300 599000 1534000 100 b
#> paste0 111000.00 312700 527000 535700 633300 1065000 100 b
plot_comparison(simple)
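While glue() is slower than paste0() and sprintf(), it is roughly twice as fast as str_interp() and gstring(). In the run shown above it is also faster than rprintf(), at about 5,500 versus about 2,000 evaluations per second at the median.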
paste0() and sprintf() don’t do string interpolation and will likely always be significantly faster than glue. glue was never meant to be a direct replacement for them.
rprintf() does only variable interpolation, not arbitrary expressions. Supporting arbitrary expressions was one of the explicit goals of writing glue.
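To make the distinction between variable insertion and arbitrary interpolation concrete, here is a minimal sketch. It assumes the glue package is installed; the variable n and the particular expressions inside the braces are illustrative choices, not part of the benchmark.

```r
library(glue)

bar <- "baz"
n <- 3  # illustrative value, not used in the benchmarks

# Variable insertion: the value is supplied as a separate argument.
inserted <- sprintf("foo%s", bar)

# Interpolation: each pair of braces may hold any R expression,
# which is evaluated and its result spliced into the string.
interpolated <- glue("foo{toupper(bar)} has {nchar(bar)} characters; n squared is {n^2}")

print(inserted)
print(interpolated)
```

The sprintf() call can only place an existing value into the template, whereas the glue() call computes toupper(bar), nchar(bar) and n^2 inside the string itself.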
So glue is ~2x as fast as the two functions (str_interp(), gstring()) which do have roughly equivalent functionality.
It is also still quite fast in absolute terms, with a median of roughly 5,500 evaluations per second on this machine.
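The relative speeds can be checked directly from the median column of the benchmark table above. The sketch below copies those medians by hand (in evaluations per second) and uses only base R; the variable names are illustrative.

```r
# Median evaluations per second, copied from the benchmark table above.
medians <- c(rprintf = 1957, gstring = 2358, str_interp = 3050, glue = 5498)

# How many times faster glue is than each interpolating alternative.
speedup <- medians[["glue"]] / medians[c("str_interp", "gstring", "rprintf")]
print(round(speedup, 1))

# Evaluations per second converted to microseconds per call.
glue_us_per_call <- 1e6 / medians[["glue"]]
print(round(glue_us_per_call))
```

On these numbers, a single glue() call takes a little under 200 microseconds, which is the fixed overhead that vectorization amortizes.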
Taking advantage of glue’s vectorization is the best way to improve performance. For instance, the vectorized form of the previous benchmark is able to generate 100,000 strings in only about 18 ms (median), with performance much closer to that of paste0() and sprintf(). NB. str_interp() does not support vectorization, so it was removed.
bar <- rep("bar", 1e5)
vectorized <-
microbenchmark::microbenchmark(
glue = glue::glue("foo{bar}"),
gstring = R.utils::gstring("foo${bar}"),
paste0 = paste0("foo", bar),
sprintf = sprintf("foo%s", bar),
rprintf = rprintf::rprintf("foo$bar", bar = bar)
)
print(unit = "ms", order = "median", signif = 4, vectorized)
#> Unit: milliseconds
#> expr min lq mean median uq max neval cld
#> paste0 14.28 14.97 15.26 15.11 15.22 21.50 100 a
#> glue 15.03 17.11 19.29 17.60 18.28 144.50 100 b
#> sprintf 17.11 17.62 18.07 17.86 18.07 23.46 100 b
#> gstring 26.35 27.92 28.81 28.34 28.72 51.18 100 c
#> rprintf 64.18 66.44 67.88 67.29 68.28 80.57 100 d
plot_comparison(vectorized, log = FALSE)
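As a small illustration of what the vectorized benchmark measures, the sketch below builds the same strings in one vectorized call and in an element-by-element loop. It assumes the glue package is installed; the three-element input and the names vectorized and looped are illustrative.

```r
library(glue)

bar <- c("a", "b", "c")

# One vectorized call: the template is applied to every element of `bar`.
vectorized <- glue("foo{bar}")

# The same strings built one at a time, paying glue's fixed
# per-call overhead once for each element.
looped <- vapply(
  bar,
  function(b) as.character(glue("foo{b}")),
  character(1),
  USE.NAMES = FALSE
)

# Both approaches produce the same strings.
stopifnot(identical(as.character(vectorized), looped))
```

Both forms give identical output, so the vectorized call is usually the better choice whenever the inputs are already available as a vector.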
[1] pystr is no longer available from CRAN due to failure to correct installation errors and was therefore removed from further testing.