psrwe: Propensity Score-Integrated Approaches for Incorporating Real-World Evidence in Clinical Studies

Chenguang Wang

2020-09-01


Introduction

In the R package psrwe, we implement a series of approaches for leveraging real-world evidence in clinical study design and analysis.

Propensity score estimation

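The approaches implemented in psrwe are mostly based on propensity score adjustment. Propensity scores are estimated with the function rwe_ps. In the call below, v_covs names the covariates, v_grp names the column that identifies the data source, cur_grp_level gives the value of that column marking the current study, and nstrata sets the number of propensity score strata.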

data(ex_dta)
dta_ps <- rwe_ps(ex_dta,
                 v_covs = paste("V", 1:7, sep = ""),
                 v_grp = "Group",
                 cur_grp_level = "current",
                 nstrata = 5)
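
To make the two steps concrete, the following base R sketch estimates propensity scores and forms strata on simulated data. It assumes a logistic regression model for the scores and strata cut at quantiles of the current-study scores; the data frame sim and its columns are hypothetical stand-ins, and the code is not a description of how rwe_ps is implemented.

```r
# Simulated stand-in for ex_dta (hypothetical columns Group, V1, V2).
set.seed(1)
n_cur <- 200
n_ext <- 1000
sim <- data.frame(
  Group = rep(c("current", "rwd"), c(n_cur, n_ext)),
  V1    = c(rnorm(n_cur, 0.3), rnorm(n_ext, 0)),
  V2    = c(rbinom(n_cur, 1, 0.6), rbinom(n_ext, 1, 0.5))
)
sim$is_cur <- as.integer(sim$Group == "current")

# Propensity score: estimated probability of belonging to the current study.
fit    <- glm(is_cur ~ V1 + V2, data = sim, family = binomial())
sim$ps <- fitted(fit)

# Stratum cut points: quantiles of the current-study scores (assumption).
nstrata <- 5
cuts <- quantile(sim$ps[sim$is_cur == 1],
                 probs = seq(0, 1, length.out = nstrata + 1))

# External patients whose score falls outside the current-study range
# receive NA and are thereby excluded from every stratum.
sim$Strata <- cut(sim$ps, breaks = cuts, labels = FALSE, include.lowest = TRUE)

table(sim$Group, sim$Strata, useNA = "ifany")
```

Cutting at current-study quantiles places the same number of current-study patients in each stratum, while the number of external patients per stratum is free to vary.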

It is extremely important to evaluate the propensity score adjustment results. In psrwe, functions are provided to visualize the balance in covariate distributions and propensity score distributions based on propensity score stratification.

plot(dta_ps, "balance")

plot(dta_ps, "ps")
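
A numeric companion to the balance plots is the standardized mean difference of a covariate between the two data sources, overall and within each stratum. The sketch below uses simulated data with a single hypothetical covariate V1; because the propensity score from a one-covariate logistic model is monotone in that covariate, stratifying on V1 directly stands in for stratifying on the score. None of this reproduces the package's plotting code.

```r
set.seed(2)
cur <- data.frame(Group = "current", V1 = rnorm(200, mean = 0.5))
ext <- data.frame(Group = "rwd",     V1 = rnorm(800, mean = 0))
dat <- rbind(cur, ext)

# Strata defined by quintiles of the current-study values.
cuts <- quantile(cur$V1, probs = seq(0, 1, by = 0.2))
dat$Strata <- cut(dat$V1, breaks = cuts, labels = FALSE, include.lowest = TRUE)
dat <- dat[!is.na(dat$Strata), ]  # drop external patients outside the range

# Standardized mean difference: mean difference over the pooled SD.
smd <- function(d) {
  x1 <- d$V1[d$Group == "current"]
  x0 <- d$V1[d$Group == "rwd"]
  (mean(x1) - mean(x0)) / sqrt((var(x1) + var(x0)) / 2)
}

smd_overall    <- smd(rbind(cur, ext))
smd_by_stratum <- sapply(split(dat, dat$Strata), smd)

smd_overall
smd_by_stratum
```

Within-stratum differences that are clearly smaller than the overall difference indicate that stratification has improved covariate balance.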

PS-integrated power prior approach for single arm studies

For single arm studies with one external data source, the function rwe_ps_powerp conducts the analysis proposed in Wang et al. (2019). The method uses propensity scores to pre-select a subset of real-world data containing patients who are similar to those in the current study in terms of covariates, and to stratify the selected patients together with those in the current study into more homogeneous strata. The power prior approach is then applied in each stratum to obtain stratum-specific posterior distributions, which are combined to complete the Bayesian inference for the parameters of interest.

ps_dist   <- rwe_ps_dist(dta_ps)
post_smps <- rwe_ps_powerp(dta_ps,
                           total_borrow = 40,
                           v_distance   = ps_dist$Dist[1:dta_ps$nstrata],
                           outcome_type = "binary",
                           v_outcome    = "Y")
## Warning: There were 44 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
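
The stratum-level logic can be illustrated with a conjugate shortcut. The sketch below assumes that the nominal number of borrowed patients is split across strata in proportion to a similarity measure, that the power parameter in each stratum is the borrowed number divided by the number of external patients (capped at 1), and that the prior is Beta(1, 1). The responder counts y1 and y0 and the similarities r are hypothetical, and the closed-form posterior replaces the Stan model that rwe_ps_powerp actually fits.

```r
# Hypothetical per-stratum summaries.
n1 <- c(40, 40, 40, 40, 40)            # current-study patients
y1 <- c(16, 10, 8, 14, 13)             # current-study responders
n0 <- c(720, 143, 95, 57, 16)          # external patients
y0 <- c(300, 40, 20, 20, 5)            # external responders
r  <- c(0.60, 0.80, 0.85, 0.80, 0.55)  # similarity of PS distributions

# Split the nominal number of borrowed patients in proportion to similarity.
total_borrow <- 40
borrow <- total_borrow * r / sum(r)

# Power parameter: fraction of each external patient's information retained.
a <- pmin(1, borrow / n0)

# Beta(1, 1) prior with the external likelihood raised to the power a.
shape1 <- 1 + y1 + a * y0
shape2 <- 1 + (n1 - y1) + a * (n0 - y0)

theta_mean <- shape1 / (shape1 + shape2)
theta_var  <- shape1 * shape2 /
  ((shape1 + shape2)^2 * (shape1 + shape2 + 1))

data.frame(Strata = seq_along(n1), a = a,
           Theta = theta_mean, Variance = theta_var)
```

A stratum with many external patients receives a small power parameter, so each of its external patients contributes only a fraction of an observation.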

The mixing of the posterior samples should be checked to assess convergence of the sampler. The divergent-transition warnings reported above make this check especially relevant here.

traceplot(post_smps$stan_rst, pars = c("theta", "thetas"))

Results can be further summarized as:

summary(post_smps)
## $overall_mean
## [1] 0.3092625
## 
## $overall_variance
## [1] 0.0008288703
## 
## $theta_by_stratum
##   Strata     Theta    Variance
## 1      1 0.4057454 0.004544006
## 2      2 0.2594251 0.003561362
## 3      3 0.2063270 0.003229119
## 4      4 0.3506278 0.004614629
## 5      5 0.3241875 0.004484550
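
As a quick check on how the stratum-specific results relate to the overall estimate, the printed stratum means can be averaged directly. This is an observation about the numbers shown above, consistent with equal weighting of the five strata, and not a statement about the package internals.

```r
# Stratum-specific posterior means copied from the summary output.
theta <- c(0.4057454, 0.2594251, 0.2063270, 0.3506278, 0.3241875)

# Unweighted average across strata; compare with $overall_mean.
mean(theta)
```

The average agrees with the reported overall_mean to six decimal places. The same does not hold for the variance, which cannot be recovered from the stratum variances alone.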

PS-integrated composite likelihood approach for single arm studies

For single arm studies with one external data source, the function rwe_ps_cl conducts the analysis proposed in Wang et al. (2020). In this approach, a composite likelihood function is specified within each propensity score stratum and used to down-weight the information contributed by the external data source. Estimates of the stratum-specific parameters are obtained by maximizing the composite likelihood function. These stratum-specific estimates are then combined to obtain an overall population-level estimate of the parameter of interest.

ps_borrow <- rwe_ps_borrow(total_borrow = 40, ps_dist)
rst_cl    <- rwe_ps_cl(dta_ps, v_borrow = ps_borrow, v_outcome = "Y")
summary(rst_cl)
## $overall_mean
## [1] 0.3009473
## 
## $jackknife_variance
## [1] 0.0007589453
## 
## $theta_by_stratum
##   Strata N1  N0     Theta    Variance
## 1      1 40 720 0.4010585 0.003486166
## 2      2 40 143 0.2503177 0.003297067
## 3      3 40  95 0.1924889 0.002852081
## 4      4 40  57 0.3440490 0.004661073
## 5      5 40  16 0.3168223 0.004414814
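
For a binary outcome, the down-weighting within one stratum can be sketched as a weighted Bernoulli log-likelihood in which each external patient contributes with weight lambda. The counts, the number borrowed, and the choice of lambda as the borrowed number divided by the number of external patients are illustrative assumptions; the code shows the idea and is not the implementation in rwe_ps_cl.

```r
# One hypothetical stratum with a binary outcome.
n1 <- 40; y1 <- 14   # current study: patients, responders
n0 <- 57; y0 <- 20   # external data: patients, responders
borrow <- 8          # nominal number of external patients to borrow
lambda <- min(1, borrow / n0)

# Composite log-likelihood: external contributions are scaled by lambda.
cl_loglik <- function(theta) {
  (y1 + lambda * y0) * log(theta) +
    ((n1 - y1) + lambda * (n0 - y0)) * log(1 - theta)
}

# Numerical maximizer and the closed-form weighted proportion.
theta_hat   <- optimize(cl_loglik, interval = c(1e-6, 1 - 1e-6),
                        maximum = TRUE)$maximum
closed_form <- (y1 + lambda * y0) / (n1 + lambda * n0)

c(numerical = theta_hat, closed_form = closed_form)
```

With these numbers the 57 external patients count as 8 effective observations, so the estimate stays close to the current-study proportion.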

PS-integrated composite likelihood approach for randomized studies

For randomized studies with one external data source that contains control subjects, the function rwe_ps_cl2arm conducts the analysis proposed in Chen et al. (2020). This propensity score-integrated composite likelihood approach augments the control arm of a two-arm randomized controlled trial with patients from the external data source. An example is given below.

data(ex_dta_rct)
dta_ps_2arm <- rwe_ps(ex_dta_rct,
                      v_covs = paste("V", 1:7, sep = ""),
                      v_grp = "Group",
                      cur_grp_level = "current",
                      nstrata = 5)

rst_2arm <- rwe_ps_cl2arm(dta_ps_2arm,
                          v_arm = "Arm",
                          trt_arm_level = 1,
                          outcome_type = "continuous",
                          v_outcome = "Y",
                          total_borrow = 40)

print(rst_2arm)
## $treatment
## $treatment$overall_mean
## [1] 368.4005
## 
## $treatment$jackknife_variance
## [1] 19.15712
## 
## $treatment$theta_by_stratum
##   Strata N1 N0    Theta  Variance
## 1      1 21 21 386.4337 101.98887
## 2      2 15 15 353.5991  50.85397
## 3      3 22 22 374.9533 102.84782
## 4      4 19 19 367.9795  69.01982
## 5      5 23 23 355.6684 105.97465
## 
## 
## $control
## $control$overall_mean
## [1] 358.1482
## 
## $control$jackknife_variance
## [1] 10.25328
## 
## $control$theta_by_stratum
##   Strata N1  N0    Theta Variance
## 1      1 19 720 373.8150 41.71388
## 2      2 25 143 362.4157 32.31339
## 3      3 18  95 356.2292 99.33514
## 4      4 21  57 354.5110 37.34027
## 5      5 17  16 340.8876 47.54278
## 
## 
## $effect
## $effect$Estimate
## [1] 10.1551
## 
## $effect$Variance
## [1] 6.897587
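
The printed output can be used to see how the stratum-level results relate to the summaries. The arithmetic below uses only numbers copied from the output. It suggests that each arm's overall mean weights the strata by that arm's current-study sample size (N1), while the effect estimate is the unweighted average of the stratum-specific differences. This is an observation about the printed numbers, not a description of the package internals.

```r
# Values copied from the printed output above.
trt <- data.frame(
  N1    = c(21, 15, 22, 19, 23),
  Theta = c(386.4337, 353.5991, 374.9533, 367.9795, 355.6684)
)
ctl <- data.frame(
  N1    = c(19, 25, 18, 21, 17),
  Theta = c(373.8150, 362.4157, 356.2292, 354.5110, 340.8876)
)

# Arm-level means weighted by the current-study sample size per stratum.
trt_mean <- weighted.mean(trt$Theta, trt$N1)  # compare: 368.4005
ctl_mean <- weighted.mean(ctl$Theta, ctl$N1)  # compare: 358.1482

# Unweighted average of the stratum-specific differences.
effect <- mean(trt$Theta - ctl$Theta)         # compare: 10.1551

c(treatment = trt_mean, control = ctl_mean, effect = effect)
```

The difference between the two overall means, 368.4005 - 358.1482 = 10.2523, is therefore not the same quantity as the reported effect estimate of 10.1551.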

References

Chen, W. C., Wang, C., Li, H., Lu, N., Tiwari, R., Xu, Y., & Yue, L. Q. (2020). Propensity score-integrated composite likelihood approach for augmenting the control arm of a randomized controlled trial by incorporating real-world data. Journal of Biopharmaceutical Statistics, 30(3), 508-520.

Wang, C., Lu, N., Chen, W. C., Li, H., Tiwari, R., Xu, Y., & Yue, L. Q. (2020). Propensity score-integrated composite likelihood approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics, 30(3), 495-507.

Wang, C., Li, H., Chen, W. C., Lu, N., Tiwari, R., Xu, Y., & Yue, L. Q. (2019). Propensity score-integrated power prior approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics, 29(5), 731-748.