This vignette covers the less-typical uses of REDCapR to interact with REDCap through its API.
Some information is specific to a REDCap project, as opposed to an individual operation. This includes (1) the uri of the server and (2) the token for the user’s project. The project used here is hosted on a machine used in REDCapR’s public test suite, so you can run this example from any computer (unless the tests happen to be running at that moment).
library(REDCapR) #Load the package into the current R session.
uri <- "https://bbmc.ouhsc.edu/redcap/api/"
token_simple <- "9A81268476645C4E5F03428B8AC3AA7B"
token_longitudinal <- "0434F0E9CF53ED0587847AB6E51DE762"
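Hard-coding a token is fine for this public test project, but for a real project you’d probably keep it out of the script. The sketch below is only an illustration: the helper `looks_like_token()` and the environment variable `REDCAP_TOKEN_SIMPLE` are hypothetical names (neither is part of REDCapR), and the check assumes a token is 32 hexadecimal characters, as the two tokens above are.

```r
# Hypothetical helper (not part of REDCapR): TRUE when `token` is a single
# string of exactly 32 hexadecimal characters (an assumption based on the
# two tokens shown above).
looks_like_token <- function(token) {
  is.character(token) && length(token) == 1L && !is.na(token) &&
    grepl("^[0-9A-Fa-f]{32}$", token)
}

# For the sketch, the environment variable is set here; normally it would be
# set outside of R (the variable name is an assumption).
Sys.setenv(REDCAP_TOKEN_SIMPLE = "9A81268476645C4E5F03428B8AC3AA7B")
token_from_env <- Sys.getenv("REDCAP_TOKEN_SIMPLE")

looks_like_token(token_from_env)  # the public test token passes the check
looks_like_token("not-a-token")   # a malformed value does not
```

Checking the shape of the token before calling the server tends to give a clearer error than a failed http request.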
Disclaimer: Occasionally we’re asked for a longitudinal dataset to be converted from a “long/tall format” (where typically each row is one observation for a participant) to a “wide format” (where each row is one participant). Usually we advise against it. Besides all the database benefits of a long structure, a wide structure restricts your options with the stat routine. No modern longitudinal analysis procedures (eg, growth curve models or multilevel/hierarchical models) accept wide data. You’re pretty much stuck with repeated measures ANOVA, which is very inflexible for real-world medical-ish analyses. It requires a patient to have a measurement at every time point; otherwise the ANOVA excludes the patient entirely.
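To make the last point concrete, here’s a small base-R sketch. The patients and scores are made up for illustration (they don’t come from REDCap), and `complete.cases()` stands in for any procedure that needs a value at every time point. It isn’t an ANOVA itself.

```r
# Made-up scores for three hypothetical patients; patient 3 missed visit 2.
ds_wide_demo <- data.frame(
  patient = 1:3,
  visit_1 = c(5, 7, 6),
  visit_2 = c(4, 6, NA),
  visit_3 = c(3, 5, 4)
)

# Requiring a complete row drops patient 3 entirely.
patients_kept_wide <- ds_wide_demo$patient[stats::complete.cases(ds_wide_demo)]

# The same data in long format: one row per patient per visit.
ds_long_demo <- stats::reshape(
  ds_wide_demo,
  direction = "long",
  varying   = c("visit_1", "visit_2", "visit_3"),
  v.names   = "score",
  timevar   = "visit",
  times     = 1:3,
  idvar     = "patient"
)

# Dropping only the missing observation keeps all three patients.
ds_long_observed   <- ds_long_demo[!is.na(ds_long_demo$score), ]
patients_kept_long <- sort(unique(ds_long_observed$patient))

patients_kept_wide
patients_kept_long
```

In the long structure only the single missing observation is lost, so patient 3 still contributes two visits.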
However, we like going wide to produce visual tables for publications, and here’s one way to do it in R. First retrieve the dataset from REDCap.
library(magrittr)
suppressPackageStartupMessages(requireNamespace("dplyr"))
suppressPackageStartupMessages(requireNamespace("tidyr"))
events_to_retain <- c("dose_1_arm_1", "visit_1_arm_1", "dose_2_arm_1", "visit_2_arm_1")
ds_long <- REDCapR::redcap_read_oneshot(redcap_uri=uri, token=token_longitudinal)$data
#> 18 records and 125 columns were read from REDCap in 0.4 seconds. The http status code was 200.
ds_long %>%
  dplyr::select(study_id, redcap_event_name, pmq1, pmq2, pmq3, pmq4)
study id | redcap event name | pmq1 | pmq2 | pmq3 | pmq4 |
---|---|---|---|---|---|
100 | enrollment_arm_1 | NA | NA | NA | NA |
100 | dose_1_arm_1 | 2 | 2 | 1 | 1 |
100 | visit_1_arm_1 | 1 | 0 | 0 | 0 |
100 | dose_2_arm_1 | 3 | 1 | 0 | 0 |
100 | visit_2_arm_1 | 0 | 1 | 0 | 0 |
100 | final_visit_arm_1 | NA | NA | NA | NA |
220 | enrollment_arm_1 | NA | NA | NA | NA |
220 | dose_1_arm_1 | 0 | 1 | 0 | 2 |
220 | visit_1_arm_1 | 0 | 3 | 1 | 0 |
220 | dose_2_arm_1 | 1 | 2 | 0 | 1 |
220 | visit_2_arm_1 | 3 | 4 | 1 | 0 |
220 | final_visit_arm_1 | NA | NA | NA | NA |
304 | enrollment_arm_2 | NA | NA | NA | NA |
304 | deadline_to_opt_ou_arm_2 | NA | NA | NA | NA |
304 | first_dose_arm_2 | 0 | 1 | 0 | 0 |
304 | first_visit_arm_2 | 2 | 0 | 0 | 0 |
304 | final_visit_arm_2 | NA | NA | NA | NA |
304 | deadline_to_return_arm_2 | NA | NA | NA | NA |
When widening only one variable (eg, pmq1), the code’s pretty simple:
ds_wide <- ds_long %>%
  dplyr::select(study_id, redcap_event_name, pmq1) %>%
  dplyr::filter(redcap_event_name %in% events_to_retain) %>%
  tidyr::spread(key=redcap_event_name, value=pmq1)
ds_wide
study id | dose 1 arm 1 | dose 2 arm 1 | visit 1 arm 1 | visit 2 arm 1 |
---|---|---|---|---|
100 | 2 | 3 | 1 | 0 |
220 | 0 | 1 | 0 | 3 |
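
In newer versions of tidyr, `spread()` is superseded by `pivot_wider()`. Here’s a minimal sketch of the same single-variable widening, assuming tidyr 1.0.0 or later is installed. The small data frame re-types the pmq1 values shown in the tables above so the example runs without a REDCap connection; the object names are ours.

```r
# pmq1 values for participants 100 and 220, copied from the tables above.
ds_long_pmq1 <- data.frame(
  study_id          = rep(c(100L, 220L), each = 4),
  redcap_event_name = rep(
    c("dose_1_arm_1", "visit_1_arm_1", "dose_2_arm_1", "visit_2_arm_1"),
    times = 2
  ),
  pmq1              = c(2L, 1L, 3L, 0L, 0L, 0L, 1L, 3L),
  stringsAsFactors  = FALSE
)

# `names_from` plays the role of spread()'s `key`,
# and `values_from` plays the role of `value`.
ds_wide_pmq1 <- tidyr::pivot_wider(
  ds_long_pmq1,
  names_from  = redcap_event_name,
  values_from = pmq1
)
ds_wide_pmq1
```

One difference to expect: by default `pivot_wider()` orders the new columns by first appearance in the data, whereas the `spread()` output above is ordered alphabetically.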
When widening more than one variable (eg, pmq1 - pmq4), it’s usually easiest to go even longer/taller (eg, ds_eav) before reversing direction and going wide:
pattern <- "^(\\w+?)_arm_(\\d)$"
ds_eav <- ds_long %>%
  dplyr::select(study_id, redcap_event_name, pmq1, pmq2, pmq3, pmq4) %>%
  dplyr::mutate(
    event = sub(pattern, "\\1", redcap_event_name),
    arm   = as.integer(sub(pattern, "\\2", redcap_event_name))
  ) %>%
  dplyr::select(study_id, event, arm, pmq1, pmq2, pmq3, pmq4) %>%
  tidyr::gather(key=key, value=value, pmq1, pmq2, pmq3, pmq4) %>%
  dplyr::filter(!(event %in% c(
    "enrollment", "final_visit", "deadline_to_return", "deadline_to_opt_ou")
  )) %>%
  dplyr::mutate( # Simulate correcting for mismatched names across arms:
    event = dplyr::recode(event, "first_dose"="dose_1", "first_visit"="visit_1"),
    key   = paste0(event, "_", key)
  ) %>%
  dplyr::select(-event)
# Show the first 10 rows of the EAV table.
ds_eav %>%
  head(10)
study id | arm | key | value |
---|---|---|---|
100 | 1 | dose_1_pmq1 | 2 |
100 | 1 | visit_1_pmq1 | 1 |
100 | 1 | dose_2_pmq1 | 3 |
100 | 1 | visit_2_pmq1 | 0 |
220 | 1 | dose_1_pmq1 | 0 |
220 | 1 | visit_1_pmq1 | 0 |
220 | 1 | dose_2_pmq1 | 1 |
220 | 1 | visit_2_pmq1 | 3 |
304 | 2 | dose_1_pmq1 | 0 |
304 | 2 | visit_1_pmq1 | 2 |
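
The regular expression in `pattern` does most of the work when splitting `redcap_event_name`, so it can help to try it on a few values in isolation. This base-R sketch uses event names that appear in the tables above, plus one hypothetical two-digit arm (`dose_1_arm_12`, which isn’t in the project) to show a limitation.

```r
pattern <- "^(\\w+?)_arm_(\\d)$"

event_names <- c("dose_1_arm_1", "first_visit_arm_2", "deadline_to_opt_ou_arm_2")

# The first capture group is everything before "_arm_".
events <- sub(pattern, "\\1", event_names)

# The second capture group is the single digit after "_arm_".
arms <- as.integer(sub(pattern, "\\2", event_names))

events
arms

# `\\d` matches exactly one digit, so a hypothetical two-digit arm doesn't
# match. In that case sub() returns its input unchanged, and as.integer()
# would produce NA with a warning.
grepl(pattern, "dose_1_arm_12")
```

If a project had ten or more arms, the pattern would need `\\d+` in place of `\\d`.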
# Spread the EAV to wide.
ds_wide <- ds_eav %>%
  tidyr::spread(key=key, value=value)
ds_wide
study id | arm | dose 1 pmq1 | dose 1 pmq2 | dose 1 pmq3 | dose 1 pmq4 | dose 2 pmq1 | dose 2 pmq2 | dose 2 pmq3 | dose 2 pmq4 | visit 1 pmq1 | visit 1 pmq2 | visit 1 pmq3 | visit 1 pmq4 | visit 2 pmq1 | visit 2 pmq2 | visit 2 pmq3 | visit 2 pmq4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
100 | 1 | 2 | 2 | 1 | 1 | 3 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
220 | 1 | 0 | 1 | 0 | 2 | 1 | 2 | 0 | 1 | 0 | 3 | 1 | 0 | 3 | 4 | 1 | 0 |
304 | 2 | 0 | 1 | 0 | 0 | NA | NA | NA | NA | 2 | 0 | 0 | 0 | NA | NA | NA | NA |
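
With a recent tidyr, several variables can be widened in one step, without the intermediate EAV table. This is a sketch of an alternative, not the approach used above. It assumes a tidyr version whose `pivot_wider()` has the `names_glue` argument (1.1.0 or later, as far as we know), and it uses a small data frame of pmq1 and pmq2 values copied from the tables above, with the event already stripped of its arm suffix.

```r
# pmq1 and pmq2 for participants 100 and 220 at two events,
# copied from the tables above.
ds_long_two <- data.frame(
  study_id         = c(100L, 100L, 220L, 220L),
  event            = c("dose_1", "visit_1", "dose_1", "visit_1"),
  pmq1             = c(2L, 1L, 0L, 0L),
  pmq2             = c(2L, 0L, 1L, 3L),
  stringsAsFactors = FALSE
)

# `names_glue` builds each new column name from the event and from the name
# of the value column (available as `.value`), matching the
# `dose_1_pmq1` style of the keys used above.
ds_wide_two <- tidyr::pivot_wider(
  ds_long_two,
  names_from  = event,
  values_from = c(pmq1, pmq2),
  names_glue  = "{event}_{.value}"
)
ds_wide_two
```

The EAV route above is still handy when the keys need cleaning (such as recoding mismatched event names across arms) before the data goes wide.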
The official cURL site discusses the process of using SSL to verify the server being connected to.
Use the SSL cert file that comes with the openssl package.
cert_location <- system.file("cacert.pem", package="openssl")
if( file.exists(cert_location) ) {
  config_options <- list(cainfo=cert_location)
  ds_different_cert_file <- redcap_read_oneshot(
    redcap_uri     = uri,
    token          = token_simple,
    config_options = config_options
  )$data
}
#> 5 records and 24 columns were read from REDCap in 0.5 seconds. The http status code was 200.
Force the connection to use SSL version 3 (which is not preferred, and possibly insecure).
config_options <- list(sslversion=3)
ds_ssl_3 <- redcap_read_oneshot(
  redcap_uri     = uri,
  token          = token_simple,
  config_options = config_options
)$data
#> 5 records and 24 columns were read from REDCap in 0.4 seconds. The http status code was 200.
Bypass the check of the server’s SSL certificate (which is insecure, and not recommended outside of troubleshooting).
config_options <- list(ssl.verifypeer=FALSE)
ds_no_ssl <- redcap_read_oneshot(
  redcap_uri     = uri,
  token          = token_simple,
  config_options = config_options
)$data
#> 5 records and 24 columns were read from REDCap in 0.5 seconds. The http status code was 200.
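Since the less secure settings are easy to leave in a script by accident, one option is to assemble `config_options` in a single place. The helper below is a hypothetical sketch (`build_config_options()` is not part of REDCapR); it only uses the `cainfo` and `ssl.verifypeer` names shown above, and it makes no network calls.

```r
# Hypothetical helper (not part of REDCapR): assemble the `config_options` list.
build_config_options <- function(cert_path = "", verify_peer = TRUE) {
  config_options <- list()

  # Point to a specific cert file only if it actually exists.
  if (nzchar(cert_path) && file.exists(cert_path)) {
    config_options$cainfo <- cert_path
  }

  # Disabling peer verification has to be requested explicitly,
  # and it is announced with a warning.
  if (!verify_peer) {
    warning("Peer verification is disabled; the server's certificate is not checked.")
    config_options$ssl.verifypeer <- FALSE
  }

  config_options
}

build_config_options()  # an empty list: no options are overridden
```

The returned list would then be handed to the `config_options` argument, as in the calls above.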
For the sake of documentation and reproducibility, the current report was rendered in the following environment.
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.4.0 (2017-04-21)
#> system x86_64, linux-gnu
#> ui X11
#> language en_US
#> collate C
#> tz America/Chicago
#> date 2017-05-18
#> Packages -----------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.4.0)
#> backports 1.0.5 2017-01-18 CRAN (R 3.3.1)
#> base * 3.4.0 2017-04-21 local
#> bindr 0.1 2016-11-13 cran (@0.1)
#> bindrcpp * 0.1 2016-12-11 cran (@0.1)
#> compiler 3.4.0 2017-04-21 local
#> curl 2.6 2017-04-27 CRAN (R 3.4.0)
#> data.table 1.10.4 2017-02-01 CRAN (R 3.3.1)
#> datasets * 3.4.0 2017-04-21 local
#> devtools 1.13.1 2017-05-13 CRAN (R 3.4.0)
#> digest 0.6.12 2017-01-27 CRAN (R 3.3.1)
#> dplyr 0.5.0.9005 2017-05-18 Github (tidyverse/dplyr@aece1a5)
#> evaluate 0.10 2016-10-11 CRAN (R 3.3.1)
#> glue 1.0.0 2017-04-17 CRAN (R 3.4.0)
#> graphics * 3.4.0 2017-04-21 local
#> grDevices * 3.4.0 2017-04-21 local
#> highr 0.6 2016-05-09 CRAN (R 3.3.0)
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.4.0)
#> httr 1.2.1 2016-07-03 CRAN (R 3.3.1)
#> kableExtra 0.1.0 2017-03-02 CRAN (R 3.3.1)
#> knitr * 1.16 2017-05-18 CRAN (R 3.4.0)
#> magrittr * 1.5 2014-11-22 CRAN (R 3.3.0)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.4.0)
#> methods * 3.4.0 2017-04-21 local
#> R6 2.2.1 2017-05-10 CRAN (R 3.4.0)
#> Rcpp 0.12.10 2017-03-19 CRAN (R 3.3.1)
#> REDCapR * 0.9.8 2017-05-18 local
#> rlang 0.1.1 2017-05-18 CRAN (R 3.4.0)
#> rmarkdown 1.5 2017-04-26 CRAN (R 3.4.0)
#> rprojroot 1.2 2017-01-16 CRAN (R 3.3.1)
#> rstudioapi 0.6 2016-06-27 CRAN (R 3.3.1)
#> rvest 0.3.2 2016-06-17 CRAN (R 3.3.1)
#> selectr 0.3-1 2016-12-19 CRAN (R 3.3.1)
#> stats * 3.4.0 2017-04-21 local
#> stringi 1.1.5 2017-04-07 CRAN (R 3.3.3)
#> stringr 1.2.0 2017-02-18 CRAN (R 3.3.1)
#> tibble 1.3.1 2017-05-18 Github (tidyverse/tibble@8f30072)
#> tidyr 0.6.3 2017-05-15 CRAN (R 3.4.0)
#> tools 3.4.0 2017-04-21 local
#> utils * 3.4.0 2017-04-21 local
#> withr 1.0.2 2016-06-20 CRAN (R 3.3.0)
#> XML 3.98-1.7 2017-05-03 CRAN (R 3.4.0)
#> xml2 1.1.1 2017-01-24 CRAN (R 3.3.1)
#> yaml 2.1.14 2016-11-12 CRAN (R 3.3.1)
Report rendered by wibeasley at 2017-05-18, 12:45 -0500 in 2 seconds.
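If you’d like a similar record at the end of your own report, base R can produce one. This is a minimal sketch using `utils::sessionInfo()`; its layout differs from the listing above, which appears to come from a different session-info function, and the timestamp format is just our choice.

```r
# Capture the R version, platform, locale, and attached/loaded packages.
session <- utils::sessionInfo()
print(session)

# Record when the report was rendered, including the UTC offset.
rendered_at <- format(Sys.time(), "%Y-%m-%d, %H:%M %z")
message("Report rendered at ", rendered_at)
```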