The `mdsstat`

package:

- Standardizes the output of various statistical trending algorithms
- Allows running of multiple algorithms on the same data
- Allows running of both
**disproportionality**and**quality control**algorithms - Creates lightweight analysis definitions and output files for auditability, documentation, and reproducibility

**Why?**

There are many ways to trend medical device event data. Some are drawn from the quality control discipline, others from disproportionality analysis used in pharmacoepidemiology, and yet others from the general field of statistics.

There is a need to rigorously compare and contrast these various methods to more fully understand their respective performance and applicability in surveillance of medical devices.

**How?**

The `mdsstat`

package aims to provide a collection of statistical trending algorithms used in medical device surveillance. Furthermore, each algorithm is written with a standardized, reusable framework philosophy. The same input data can be fed through multiple algorithms. All algorithms return **results that can be sorted, stacked, and compared.**

This package is written in tandem with the `mds`

package. These are complementary in the sense that:

`mds`

standardizes medical device event data.`mdsstat`

standardizes the statistical trending of medical device event data.

While `mdsstat`

algorithms can run on generic R data frames, additional efficiency and traceability benefits are derived by running on data frames generated by `mds::time_series()`

from the `mds`

package.

**Purpose of This Vignette**

- Introduce the algorithms
- Show how to run one algorithm
- Show how to run multiple algorithms

`mds::time_series()`

The following examples use a sample list of three time series generated by `mds::time_series()`

from the `mds`

package, saved as `mds_ts`

in this package. The underlying data were queried from the FDA MAUDE API. Furthermore, a simulated exposure dataset was generated to provide exposure data.

This is the current list of algorithms available:

Function | Description |
---|---|

`shewhart()` |
Shewhart x-bar Control Chart with 4 Western Electric Rules |

`prr()` |
Proportional Reporting Ratio |

`poisson_rare()` |
Poisson Test on Rare Events |

These are planned/proposed algorithms to add:

Function | Description |
---|---|

`ebgm()` |
Empirical Bayes Geometric Mean (basis of the Gamma Poisson Shrinker) |

`bcpnn()` |
Bayesian Confidence Propagation Neural Network |

`ror()` |
Reporting Odds Ratio |

`chi_square()` |
Chi-Square Test |

`poisson_rare()` |
Poisson Test on Rare Events |

`changepoint()` |
Binary Segmentation Changepoint |

`cusum()` |
Cumulative Sum Control Chart with 4 Western Electric Rules |

`ewma()` |
Exponentially Weighted Moving Average with 4 WE Rules |

`cox_stuart()` |
Cox-Stuart Test |

`uptrend()` |
Linear Uptrend by Linear Modeling |

In basic usage, running an `mdsstat`

algorithm requires two considerations:

- Input data format (may be reused in other algorithms)
- Algorithm parameter settings (unique to the algorithm)

Here are some example algorithm calls:

```
# Example mds_ts data
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
# Four different algorithm calls
shewhart(data)
prr(data)
shewhart(data, ts_event=c(Rate="rate"), we_rule=2L)
poisson_rare(data, p_rate=0.3)
```

Input data shall be either a generic data frame (general usage) or an `mds_ts`

data frame. Both are conceptually structured like time series.

`mds_ts`

Usage`mds_ts`

data frames are generated by `mds::time_series()`

from the `mds`

package. These data frames are already structured for seamless integration into `mdsstat`

algorithms.

Note the following:

- Disproportionality algorithms will run only if the
`mds_ts`

data contains the columns`nA`

,`nB`

,`nC`

, and`nD`

. These are generated by specifying device and event hierarchies using`mds`

package functions. - Algorithms run by default using the
`nA`

column for event occurrence. - If running on event rate is desired, you may calculate an additional field and specify that field using the
`ts_event`

parameter.

*Running an algorithm on event rate instead of event count*

```
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
shewhart(data, ts_event=c("Rate"="rate"))
```

A generic data frame contains two columns, `time`

and `event`

, where for each unique sequential time (numeric or Date), there corresponds a number indicating the event occurrence. The event occurrence may commonly be the count of events or event rate.

An example:

```
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
shewhart(data)
```

Because disproportionality analysis is run on count data in a 2x2 contingency table, this data frame requires five columns, `time`

, `nA`

, `nB`

, `nC`

, and `nD`

. For each unique sequential time (numeric or Date), there corresponds a set of numbers indicating the event counts. The latter four columns correspond to counts for cells A through D of the contingency table.

An example:

```
data <- data.frame(time=c(1:25),
nA=as.integer(stats::rnorm(25, 25, 5)),
nB=as.integer(stats::rnorm(25, 50, 5)),
nC=as.integer(stats::rnorm(25, 100, 25)),
nD=as.integer(stats::rnorm(25, 200, 25)))
prr(data)
```

To run algorihtms on both counts/rates and DPA, just include all the above columns, such as:

```
data <- data.frame(time=c(1:25),
event=as.integer(stats::rnorm(25, 100, 25)),
nA=as.integer(stats::rnorm(25, 25, 5)),
nB=as.integer(stats::rnorm(25, 50, 5)),
nC=as.integer(stats::rnorm(25, 100, 25)),
nD=as.integer(stats::rnorm(25, 200, 25)))
shewhart(data)
prr(data)
```

`mdsstat`

makes it easy to run multiple algorithms and variants of the same algorithm on your data.

Just two steps are required:

- Use
`define_algos()`

to set a list of algorithms with corresponding parameter settings. - Use
`run_algos()`

to run the algorithms defined in Step 1 on your data.

For example:

```
# Your data
data <- mds_ts[[3]]
data$rate <- ifelse(is.na(data$nA), 0, data$nA) / data$exposure
# Save a list of algorithms to run
x <- list(prr=list(),
shewhart=list(),
shewhart=list(ts_event=c(Rate="rate"), we_rule=2L),
poisson_rare=list(p_rate=0.3))
algos <- define_algos(x)
# Run algorithms
run_algos(data, algos)
```

By default, `run_algos()`

returns the results of each algorithm as a row in a data frame. This allows for easy tabular review of algorithm performance.

Similar to the default output of `run_algos()`

, you may convert the output of any `mdsstat`

algorithm from the default list to a data frame row. Simply use `test_as_row()`

on any algorithm output.

For example:

```
data <- data.frame(time=c(1:25), event=as.integer(stats::rnorm(25, 100, 25)))
result <- shewhart(data)
test_as_row(result)
```