# Introduction

Another one of the most common tables in medical literature includes summary statistics for a set of variables paired across two time points. Locally at Mayo, the SAS macro %paired was written to create summary tables with a single call. With the increasing interest in R, we have developed the function paired() to create similar tables within the R environment.

This vignette is light on purpose; paired() piggybacks off of tableby, so most documentation there applies here, too.

# Simple Example

The first step when using the paired() function is to load the arsenal package. We can’t use mockstudy here because we need a dataset with paired observations, so we’ll create our own dataset.

library(arsenal)
dat <- data.frame(
tp = paste0("Time Point ", c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)),
id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6),
Cat = c("A", "A", "A", "B", "B", "B", "B", "A", NA, "B"),
Fac = factor(c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A")),
Num = c(1, 2, 3, 4, 4, 3, 3, 4, 0, NA),
Ord = ordered(c("I", "II", "II", "III", "III", "III", "I", "III", "II", "I")),
Lgl = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE),
Dat = as.Date("2018-05-01") + c(1, 1, 2, 2, 3, 4, 5, 6, 3, 4),
stringsAsFactors = FALSE
)

To create a simple table stratified by time point, use a formula= statement to specify the variables that you want summarized and the id= argument to specify the paired observations.

p <- paired(tp ~ Cat + Fac + Num + Ord + Lgl + Dat, data = dat, id = id, signed.rank.exact = FALSE)
summary(p)
Time Point 1 (N=4) Time Point 2 (N=4) Difference (N=4) p value
Cat 1.000
A 2 (50.0%) 2 (50.0%) 1 (50.0%)
B 2 (50.0%) 2 (50.0%) 1 (50.0%)
Fac 0.261
A 2 (50.0%) 1 (25.0%) 2 (100.0%)
B 1 (25.0%) 2 (50.0%) 1 (100.0%)
C 1 (25.0%) 1 (25.0%) 1 (100.0%)
Num 0.391
Mean (SD) 2.750 (1.258) 3.250 (0.957) 0.500 (1.000)
Range 1.000 - 4.000 2.000 - 4.000 -1.000 - 1.000
Ord 0.174
I 2 (50.0%) 0 (0.0%) 2 (100.0%)
II 1 (25.0%) 1 (25.0%) 1 (100.0%)
III 1 (25.0%) 3 (75.0%) 0 (0.0%)
Lgl 1.000
FALSE 2 (50.0%) 1 (25.0%) 2 (100.0%)
TRUE 2 (50.0%) 3 (75.0%) 1 (50.0%)
Dat 0.182
median 2018-05-03 2018-05-04 0.500
Range 2018-05-02 - 2018-05-06 2018-05-02 - 2018-05-07 0.000 - 1.000

The third column shows the difference between time point 1 and time point 2. For categorical variables, it reports the percent of observations from time point 1 which changed in time point 2.

# NAs

Note that by default, observations which do not have both timepoints are removed. This is easily changed using the na.action = na.paired("<arg>") argument. For example:

p <- paired(tp ~ Cat + Fac + Num + Ord + Lgl + Dat, data = dat, id = id,
signed.rank.exact = FALSE, na.action = na.paired("fill"))
summary(p)
Time Point 1 (N=6) Time Point 2 (N=6) Difference (N=6) p value
Cat 1.000
N-Miss 2 1 2
A 2 (50.0%) 2 (40.0%) 1 (50.0%)
B 2 (50.0%) 3 (60.0%) 1 (50.0%)
Fac 0.261
N-Miss 1 1 2
A 2 (40.0%) 2 (40.0%) 2 (100.0%)
B 1 (20.0%) 2 (40.0%) 1 (100.0%)
C 2 (40.0%) 1 (20.0%) 1 (100.0%)
Num 0.391
N-Miss 1 2 2
Mean (SD) 2.200 (1.643) 3.250 (0.957) 0.500 (1.000)
Range 0.000 - 4.000 2.000 - 4.000 -1.000 - 1.000
Ord 0.174
N-Miss 1 1 2
I 2 (40.0%) 1 (20.0%) 2 (100.0%)
II 2 (40.0%) 1 (20.0%) 1 (100.0%)
III 1 (20.0%) 3 (60.0%) 0 (0.0%)
Lgl 1.000
N-Miss 1 1 2
FALSE 3 (60.0%) 2 (40.0%) 2 (100.0%)
TRUE 2 (40.0%) 3 (60.0%) 1 (50.0%)
Dat 0.182
N-Miss 1 1 2
median 2018-05-04 2018-05-05 0.500
Range 2018-05-02 - 2018-05-06 2018-05-02 - 2018-05-07 0.000 - 1.000

For more details, see the help page for na.paired().

# Available Function Options

## Testing options

The tests used to calculate p-values differ by the variable type, but can be specified explicitly in the formula statement or in the control function.

The following tests are accepted:

• paired.t: A paired t-test.

• mcnemar: McNemar’s test.

• signed.rank: the signed-rank test.

• sign.test: the sign test.

• notest: Don’t perform a test.

## paired.control settings

A quick way to see what arguments are possible to utilize in a function is to use the args() command. Settings involving the number of digits can be set in paired.control or in summary.tableby.

args(paired.control)
## function (test = TRUE, diff = TRUE, test.pname = NULL, numeric.test = "paired.t",
##     cat.test = "mcnemar", ordered.test = "signed.rank", date.test = "paired.t",
##     numeric.stats = c("Nmiss", "meansd", "range"), cat.stats = c("Nmiss",
##         "countpct"), ordered.stats = c("Nmiss", "countpct"),
##     date.stats = c("Nmiss", "median", "range"), stats.labels = list(Nmiss = "N-Miss",
##         Nmiss2 = "N-Miss", meansd = "Mean (SD)", medianq1q3 = "Median (Q1, Q3)",
##         q1q3 = "Q1, Q3", range = "Range", countpct = "Count (Pct)"),
##     digits = 3L, digits.count = 0L, digits.p = 3L, format.p = TRUE,
##     conf.level = 0.95, mcnemar.correct = TRUE, signed.rank.exact = NULL,
##     signed.rank.correct = TRUE, ...)
## NULL

## summary.tableby settings

Since the “paired” object inherits “tableby”, the summary.tableby function is what’s actually used to format and print the table.

args(arsenal:::summary.tableby)
## function (object, ..., labelTranslations = NULL, text = FALSE,
##     title = NULL, pfootnote = FALSE, term.name = "")
## NULL