# Statistical Trending on Medical Device Surveillance Data

#### 2018-09-21

The mdsstat package:

• Standardizes the output of various statistical trending algorithms
• Allows running of multiple algorithms on the same data
• Allows running of both disproportionality and quality control algorithms
• Creates lightweight analysis definitions and output files for auditability, documentation, and reproducibility

### Why?

There are many ways to trend medical device event data. Some are drawn from the quality control discipline, others from disproportionality analysis used in pharmacoepidemiology, and yet others from the general field of statistics.

There is a need to rigorously compare and contrast these various methods to more fully understand their respective performance and applicability in surveillance of medical devices.

### How?

The mdsstat package aims to provide a collection of statistical trending algorithms used in medical device surveillance. Each algorithm is written against a standardized, reusable framework, so the same input data can be fed through multiple algorithms and every algorithm returns results that can be sorted, stacked, and compared.

This package is written in tandem with the mds package. These are complementary in the sense that:

• mds standardizes medical device event data.
• mdsstat standardizes the statistical trending of medical device event data.

While mdsstat algorithms can run on generic R data frames, additional efficiency and traceability benefits are derived by running on data frames generated by mds::time_series() from the mds package.

### Purpose of This Vignette

## Data: MAUDE Time Series Generated from mds::time_series()

The following examples use a sample list of three time series generated by mds::time_series() from the mds package, saved as mds_ts in this package. The underlying event data were queried from the FDA MAUDE API. The exposure values come from a simulated exposure dataset.

```r
library(mdsstat)
data(mds_ts)
```

## The Algorithms

This is the current list of algorithms available:

| Function | Description |
|---|---|
| shewhart() | Shewhart x-bar Control Chart with 4 Western Electric Rules |
| prr() | Proportional Reporting Ratio |
| poisson_rare() | Poisson Test on Rare Events |

These are planned/proposed algorithms to add:

| Function | Description |
|---|---|
| ebgm() | Empirical Bayes Geometric Mean (basis of the Gamma Poisson Shrinker) |
| bcpnn() | Bayesian Confidence Propagation Neural Network |
| ror() | Reporting Odds Ratio |
| chi_square() | Chi-Square Test |
| changepoint() | Binary Segmentation Changepoint |
| cusum() | Cumulative Sum Control Chart with 4 Western Electric Rules |
| ewma() | Exponentially Weighted Moving Average with 4 Western Electric Rules |
| cox_stuart() | Cox-Stuart Test |
| uptrend() | Linear Uptrend by Linear Modeling |
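
To give a feel for what a control chart rule checks, here is a minimal base R sketch of the first Western Electric rule: flag any single point more than three standard deviations from the center line. This is not the shewhart() implementation. The helper we_rule1() and its baseline argument are hypothetical, and estimating sigma as the sample standard deviation of a baseline window is a simplifying assumption.

```r
# Hypothetical helper, not part of mdsstat: Western Electric rule 1.
# Center line and sigma are estimated from a baseline window (an assumption).
we_rule1 <- function(x, baseline = seq_along(x)) {
  center <- mean(x[baseline])
  sigma <- stats::sd(x[baseline])
  abs(x - center) > 3 * sigma
}

# Ten stable periods followed by one spike
events <- c(98, 102, 101, 99, 100, 103, 97, 100, 101, 99, 140)
flags <- we_rule1(events, baseline = 1:10)
which(flags)  # index of the flagged period(s)
```

Estimating the limits from a baseline window rather than the full series keeps a large spike from inflating sigma and masking itself.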

## Run One Algorithm

In basic usage, running an mdsstat algorithm requires two considerations:

1. Input data format (may be reused in other algorithms)
2. Algorithm parameter settings (unique to the algorithm)

Here are some example algorithm calls:

```r
# Example mds_ts data
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure

# Four different algorithm calls
shewhart(data)
prr(data)
shewhart(data, ts_event=c(Rate="rate"), we_rule=2L)
poisson_rare(data, p_rate=0.3)
```

### Input Data Format

Input data shall be either a generic data frame (general usage) or an mds_ts data frame. Both are conceptually structured like time series.

#### mds_ts Usage

mds_ts data frames are generated by mds::time_series() from the mds package. These data frames are already structured for seamless integration into mdsstat algorithms.

Note the following:

• Disproportionality algorithms will run only if the mds_ts data contains the columns nA, nB, nC, and nD. These are generated by specifying device and event hierarchies using mds package functions.
• Algorithms run by default using the nA column for event occurrence.
• To run on event rate instead, calculate an additional column and specify that column using the ts_event parameter.
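
Before calling a disproportionality algorithm, it can help to confirm that the four count columns are present. The helper has_dpa_columns() below is hypothetical and belongs to neither mdsstat nor mds. It is only a base R sketch of such a check.

```r
# Hypothetical helper: TRUE only if all four contingency-table columns exist
has_dpa_columns <- function(df) {
  all(c("nA", "nB", "nC", "nD") %in% names(df))
}

# Toy data frames for illustration only
counts_only <- data.frame(time = 1:3, nA = c(4L, 6L, 5L))
full_table <- data.frame(time = 1:3,
                         nA = c(4L, 6L, 5L),
                         nB = c(10L, 12L, 11L),
                         nC = c(40L, 38L, 42L),
                         nD = c(90L, 95L, 88L))

has_dpa_columns(counts_only)  # FALSE
has_dpa_columns(full_table)   # TRUE
```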

Running an algorithm on event rate instead of event count:

```r
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
shewhart(data, ts_event=c("Rate"="rate"))
```

#### General Usage: Count or Rate Data

A generic data frame contains two columns, time and event. Each row pairs a unique, sequential time (numeric or Date) with a number measuring event occurrence, most commonly an event count or an event rate.

An example:

```r
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
shewhart(data)
```
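
Since the time column may also be a Date, the same structure can be built on calendar time. The following base R sketch assumes monthly periods starting in January 2017 and Poisson-distributed counts. Both choices are illustrative only.

```r
set.seed(1)  # reproducible toy counts

# 25 consecutive monthly periods as Date values
monthly <- data.frame(
  time = seq(as.Date("2017-01-01"), by = "month", length.out = 25),
  event = stats::rpois(25, lambda = 100)
)

str(monthly)
```

Per the description above, a data frame shaped like this should be usable as algorithm input in the same way as the numeric-time example, for instance with shewhart(monthly).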

#### General Usage: Data for Disproportionality Analysis (DPA)

Because disproportionality analysis is run on count data in a 2x2 contingency table, this data frame requires five columns: time, nA, nB, nC, and nD. Each row pairs a unique, sequential time (numeric or Date) with the four counts for cells A through D of the contingency table.

An example:

```r
data <- data.frame(time=c(1:25),
                   nA=as.integer(stats::rnorm(25, 25, 5)),
                   nB=as.integer(stats::rnorm(25, 50, 5)),
                   nC=as.integer(stats::rnorm(25, 100, 25)),
                   nD=as.integer(stats::rnorm(25, 200, 25)))
prr(data)
```
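
To make the role of the four cells concrete, here is the textbook point estimate of the proportional reporting ratio computed by hand in base R. It assumes the conventional layout (A: target device with target event, B: target device with other events, C: other devices with target event, D: other devices with other events), which this vignette does not spell out. The function prr_by_hand() is hypothetical, and prr() may apply additional steps that are not shown here.

```r
# Hypothetical helper: textbook PRR point estimate for a single 2x2 table.
# PRR = [A / (A + B)] / [C / (C + D)]
prr_by_hand <- function(nA, nB, nC, nD) {
  (nA / (nA + nB)) / (nC / (nC + nD))
}

# Same event proportion for the target device and the comparators: PRR of 1
prr_equal <- prr_by_hand(nA = 25, nB = 50, nC = 100, nD = 200)

# Higher event proportion for the target device: PRR above 1
prr_raised <- prr_by_hand(nA = 30, nB = 50, nC = 100, nD = 200)

c(prr_equal, prr_raised)
```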

#### General Usage: Count/Rate and DPA Data

To run algorithms on both counts/rates and DPA, include all of the above columns in one data frame, such as:

```r
data <- data.frame(time=c(1:25),
                   event=as.integer(stats::rnorm(25, 100, 25)),
                   nA=as.integer(stats::rnorm(25, 25, 5)),
                   nB=as.integer(stats::rnorm(25, 50, 5)),
                   nC=as.integer(stats::rnorm(25, 100, 25)),
                   nD=as.integer(stats::rnorm(25, 200, 25)))
shewhart(data)
prr(data)
```

## Run Multiple Algorithms

mdsstat makes it easy to run multiple algorithms and variants of the same algorithm on your data.

Just two steps are required:

1. Use define_algos() to set a list of algorithms with corresponding parameter settings.
2. Use run_algos() to run the algorithms defined in Step 1 on your data.

For example:

```r
# Your data
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure

# Save a list of algorithms to run
x <- list(prr=list(),
          shewhart=list(),
          shewhart=list(ts_event=c(Rate="rate"), we_rule=2L),
          poisson_rare=list(p_rate=0.3))
algos <- define_algos(x)

# Run algorithms
run_algos(data, algos)
```

By default, run_algos() returns the results of each algorithm as a row in a data frame. This allows for easy tabular review of algorithm performance.
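
The define-then-run pattern can be illustrated in plain base R. The sketch below is not how define_algos() or run_algos() are implemented. The stand-in algorithms mean_test() and max_test(), the spec structure, and run_all() are all hypothetical and exist only to show how a list of parameter settings maps to one result row per algorithm.

```r
# Hypothetical stand-in algorithms; each returns a one-row data frame
mean_test <- function(df, threshold = 100) {
  data.frame(algorithm = "mean_test", statistic = mean(df$event),
             signal = mean(df$event) > threshold)
}
max_test <- function(df, limit = 150) {
  data.frame(algorithm = "max_test", statistic = max(df$event),
             signal = max(df$event) > limit)
}

# Step 1: a list of algorithms with their parameter settings
spec <- list(
  list(fn = mean_test, args = list()),
  list(fn = mean_test, args = list(threshold = 90)),
  list(fn = max_test, args = list(limit = 120))
)

# Step 2: run every entry on the same data and stack the rows
run_all <- function(df, spec) {
  rows <- lapply(spec, function(s) do.call(s$fn, c(list(df), s$args)))
  do.call(rbind, rows)
}

toy <- data.frame(time = 1:5, event = c(95, 100, 105, 110, 90))
results <- run_all(toy, spec)
results
```

Keeping the parameter settings in a plain list means the analysis definition can be saved and rerun on new data, which is the auditability benefit described at the top of this vignette.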

## One Algorithm Returned as a Data Frame Row

Each mdsstat algorithm returns a list by default. To get the same row format that run_algos() produces, use test_as_row() on any algorithm output.

For example:

```r
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
result <- shewhart(data)
test_as_row(result)
```
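
The idea of flattening a list result into a single row can be sketched in base R. The nested list below and the helper as_row() are hypothetical. They do not reflect the actual fields returned by mdsstat algorithms or the internals of test_as_row().

```r
# Hypothetical nested result, loosely shaped like an algorithm output
out <- list(
  test_name = "Hypothetical test",
  result = list(statistic = 2.4, signal = TRUE),
  params = list(threshold = 2)
)

# Hypothetical helper: spread the nested lists into columns of one row
as_row <- function(x) {
  data.frame(test_name = x$test_name, x$result, x$params,
             stringsAsFactors = FALSE)
}

row <- as_row(out)
row
```

Rows built this way share column names, so results from several runs can be stacked with rbind() for side-by-side review.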