FLAME (Fast, Large-scale Almost Matching Exactly) is a fast, interpretable matching method for causal inference. It matches units via a learned, weighted Hamming distance that determines which covariates are more important to match on. For more details, see the section *Description of the Algorithm* below or the original FLAME paper.

We can start by loading FLAME and generating some toy data using the included `gen_data` function.

```
library(FLAME)

set.seed(45)
n <- 100
p <- 5
data <- gen_data(n, p) # Data we would like to match
holdout <- gen_data(n, p) # Data we will train on, to compute PE
```

Note that all our covariates are *factors*, because FLAME is designed to work with categorical covariates.

If this is not the case, they will be assumed to be continuous covariates and binned prior to matching. *This use of FLAME is not recommended.* **To be clear: any covariates that are not continuous, that you would like to match exactly on, must be passed to FLAME as factors.**
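As a minimal base R sketch of preparing such covariates, a categorical column can be recoded as a factor whose levels are \(0, 1, \dots, k - 1\), which is the coding that the description of the `data` argument later in this document asks for. The data frame `raw` and its column are hypothetical, and no FLAME function is called:

```r
# Hypothetical raw data: one categorical covariate stored as character strings
raw <- data.frame(
  color = c("red", "blue", "red", "green"),
  stringsAsFactors = FALSE
)

# Map each distinct value to an integer code starting at 0
# (unique() keeps order of first appearance: red = 0, blue = 1, green = 2)
codes <- match(raw$color, unique(raw$color)) - 1L

# Store the codes as a factor with levels 0, 1, ..., k - 1
raw$color <- factor(codes, levels = 0:max(codes))

levels(raw$color)
is.factor(raw$color)
```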

In addition to the covariates to match on, `data` contains an outcome and a treated column:

The outcome must be numeric, either binary or continuous. FLAME focuses on binary treatments and the treatment column must either be logical or binary numeric.

From here, we can run FLAME with its default parameters. This will match units on the covariates (here, `X1`, `X2`, `X3`, `X4`, `X5`) and output information about the matches that were made.

By default, `FLAME` returns a list with 6 entries:

The first, `FLAME_out$data`, contains the original data frame with several modifications:

- There is an extra logical column, `FLAME_out$data$matched`, that indicates whether or not a unit was matched. This can be useful if, for example, you’d like to use only the units that were matched for subsequent analysis.
- There is an extra numeric column, `FLAME_out$data$weight`, that denotes on how many different sets of covariates a unit was matched. By default, this will be 1 if a unit is matched and 0 otherwise. With the `replace = TRUE` argument, however, units are allowed to match several times on multiple sets of covariates and their values for `weight` can therefore be greater than 1. These weights can be used when estimating treatment effects.
- Regardless of their original names, the columns denoting treatment and outcome in the data will be renamed `treated` and `outcome` and moved to be located after all the covariate data.
- Units that were not matched on all covariates will have a `*` in place of their covariate value for all covariates on which they were not matched.

```
head(FLAME_out$data)
#> X1 X2 X3 X4 X5 outcome treated matched weight
#> 1 1 1 0 * * 10.6908610 1 TRUE 1
#> 2 0 0 0 0 0 -0.3718777 0 TRUE 1
#> 3 * * * * * -0.4749807 1 FALSE 0
#> 4 0 0 0 1 0 -0.8616791 0 TRUE 1
#> 5 * * * * * 2.5016291 1 FALSE 0
#> 6 0 0 0 1 0 -0.1160383 0 TRUE 1
```

The above, for example, implies that while unit 2 was matched to units that also had values (`X1`, `X2`, `X3`, `X4`, `X5`) = (0, 0, 0, 0, 0), unit 1 was matched to units that shared values of (`X1`, `X2`, `X3`) = (1, 1, 0), but that differed in their values of `X4` and `X5`. Units 3 and 5 were not matched at all.
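To restrict a later analysis to matched units, the `matched` column can be used as a row filter. The sketch below rebuilds the six rows printed above by hand rather than calling `FLAME`; the object name `flame_data` is hypothetical and stands in for `FLAME_out$data`:

```r
# Stand-in for head(FLAME_out$data): the six rows printed above, typed in by hand
flame_data <- data.frame(
  X1 = c("1", "0", "*", "0", "*", "0"),
  X2 = c("1", "0", "*", "0", "*", "0"),
  X3 = c("0", "0", "*", "0", "*", "0"),
  X4 = c("*", "0", "*", "1", "*", "1"),
  X5 = c("*", "0", "*", "0", "*", "0"),
  outcome = c(10.6908610, -0.3718777, -0.4749807,
              -0.8616791, 2.5016291, -0.1160383),
  treated = c(1, 0, 1, 0, 1, 0),
  matched = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  weight = c(1, 1, 0, 1, 0, 1)
)

# Keep only the units that were matched
matched_only <- flame_data[flame_data$matched, ]
rownames(matched_only)
```

Row names are preserved by the subsetting, so the remaining rows can still be traced back to the original unit numbers.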

The second, `MGs`, is a list, each entry of which contains the units in a single matched group.

That is, units 2, 58, 60, 75, and 95 were all matched together and there are 19 matched groups total.

The third, `CATE`, complements `MGs` by supplying the conditional average treatment effect (CATE) for each matched group. For example, the CATE of the matched group above is given by:

The fourth, `matched_on`, is a list also corresponding to `MGs` that gives the covariates, and their values, on which units in each matched group were matched.

The above shows that each of the units in the first matched group had covariate values (`X1`, `X2`, `X3`, `X4`, `X5`) = (0, 0, 0, 0, 0). For matched groups not formed on all covariates, some of these entries will be missing:

Thus, the units in the 17th matched group, as defined by `MGs[[17]]`, shared the same values of `X1`, `X2`, `X3`, and `X4`, but not of `X5`.

The fifth, `matching_covs`, is a list that shows the covariates used for matching on every iteration of FLAME:

```
FLAME_out$matching_covs
#> [[1]]
#> [1] "X1" "X2" "X3" "X4" "X5"
#>
#> [[2]]
#> [1] "X1" "X2" "X3" "X4"
#>
#> [[3]]
#> [1] "X1" "X2" "X3"
```

Thus, first, matches were attempted on covariates `X1`, `X2`, `X3`, `X4`, `X5`. Then, matches were attempted on all covariates but `X5`, and so on. Note that entries of `matching_covs` do not necessarily denote covariates on which matches were *successfully* made; rather, they denote the covariates which were used to (try and) match on every iteration of FLAME.

The sixth, `dropped`, describes the order in which covariates were dropped:

Thus, covariate `X5` was dropped first, then `X4`, and so on. This information can be inferred directly from `matching_covs`, but for large numbers of covariates, `dropped` provides an easier way of identifying this order.

After `FLAME` has been run, the matched data can be used for a variety of purposes. The `FLAME` package provides functionality for a few quick, post-matching analyses, via the functions `MG`, `CATE`, `ATE`, and `ATT`.

The function `MG(units, FLAME_out, index_only = FALSE)` takes in a vector of units, whose matched groups you would like returned, and the output of a call to `FLAME`. If we want to see the matched group of units 1 and 2, for example, we can run:

```
MG(c(1, 2), FLAME_out)
#> [[1]]
#> X1 X2 X3 X4 X5 outcome treated
#> 1 1 1 0 * * 10.690861 1
#> 8 1 1 0 * * 8.796627 1
#> 12 1 1 0 * * 11.365257 1
#> 28 1 1 0 * * 8.200261 1
#> 74 1 1 0 * * 5.205795 0
#> 76 1 1 0 * * 10.140320 1
#> 91 1 1 0 * * 5.551738 0
#> 93 1 1 0 * * 3.492163 0
#> 97 1 1 0 * * 3.752434 0
#>
#> [[2]]
#> X1 X2 X3 X4 X5 outcome treated
#> 2 0 0 0 0 0 -0.3718777 0
#> 58 0 0 0 0 0 0.6972013 0
#> 60 0 0 0 0 0 2.9151768 1
#> 75 0 0 0 0 0 1.0590852 0
#> 95 0 0 0 0 0 0.3293956 0
```

This returns a list of two data frames, the first corresponding to unit 1 and the second to unit 2. Each contains information for all units in the corresponding matched groups. The asterisks in the last two columns of the first data frame indicate that these units did not match on `X4` or `X5`. If we only want the indices of the units in each matched group, we can specify `index_only = TRUE`:

```
MG(c(1, 2), FLAME_out, index_only = TRUE)
#> [[1]]
#> [1] 1 8 12 28 74 76 91 93 97
#>
#> [[2]]
#> [1] 2 58 60 75 95
```

`CATE(units, FLAME_out)` takes in the same first two arguments and gives the estimated CATEs of the units in `units`. The CATE of a unit is defined to be the CATE of its matched group, and the CATE of a matched group is the difference between the average treated and average control outcomes in the matched group.

The CATEs of units 1 and 2 are thus
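As a hand check of that definition, the difference in mean outcomes can be computed directly from the two matched groups printed by `MG` above. This is a base R sketch that does not call the package's `CATE` function, and the helper `group_cate` is a hypothetical name:

```r
# Difference between mean treated and mean control outcomes in one matched group
group_cate <- function(outcome, treated) {
  mean(outcome[treated == 1]) - mean(outcome[treated == 0])
}

# Matched group of unit 1 (values copied from the MG() output above)
mg1_outcome <- c(10.690861, 8.796627, 11.365257, 8.200261, 5.205795,
                 10.140320, 5.551738, 3.492163, 3.752434)
mg1_treated <- c(1, 1, 1, 1, 0, 1, 0, 0, 0)

# Matched group of unit 2 (values copied from the MG() output above)
mg2_outcome <- c(-0.3718777, 0.6972013, 2.9151768, 1.0590852, 0.3293956)
mg2_treated <- c(0, 0, 1, 0, 0)

cate_unit1 <- group_cate(mg1_outcome, mg1_treated)
cate_unit2 <- group_cate(mg2_outcome, mg2_treated)
c(unit1 = cate_unit1, unit2 = cate_unit2)
```

Under this definition, the differences come to about 5.34 for unit 1's group and about 2.49 for unit 2's group.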

`ATE(FLAME_out)` and `ATT(FLAME_out)` take in the output of a call to `FLAME` and return the estimated average treatment effect and the estimated average treatment effect on the treated, respectively.

Below are brief descriptions of the main arguments that may be passed to `FLAME`. For their complete descriptions, and those of all acceptable arguments, please refer to the documentation.

These are arguments that govern the format in which data is passed to `FLAME`.

- `data`: Either a data frame or path to a .csv file containing the data to be matched. If a path to a .csv file, all covariates will be assumed to be categorical. Treatments are assumed to be binary (can be input as logical) and outcomes numeric or binary. Treatments and outcome should not be coded as factors. Covariates should be factors; otherwise, they will be interpreted as continuous covariates and binned prior to matching. Using FLAME to match on binned, continuous covariates is *not* recommended. In addition, if a supplied factor has \(k\) levels, they must be: \(0, 1, \dots, k - 1\). This will change in a future update.
- `holdout`: Either a data frame, a path to a .csv file, or a value between 0 and 1. In the first two cases, the argument indicates the holdout set to be used for computing predictive error. In the third case, that proportion of `data` will be used as a holdout set and only the remaining proportion will be matched. In this case, the rows (units) of the original `data` input to `FLAME` that are matched are those specified by `rownames(FLAME_out$data)`. Restrictions on column types are the same as for `data`. Must have the same column names and order as `data`.
- `treated_column_name`: A character with the name of the column to be used as treatment in `data`. Defaults to ‘treated’.
- `outcome_column_name`: A character with the name of the column to be used as outcome in `data`. Defaults to ‘outcome’.
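When `holdout` is a proportion, the split described above can be pictured with the following base R sketch. The data frame `toy` and the random sampling of rows are illustrative assumptions; how `FLAME` selects the holdout rows internally is not specified here:

```r
set.seed(1)

# Hypothetical data with 10 units
toy <- data.frame(
  X1 = factor(rep(0:1, 5)),
  outcome = rnorm(10),
  treated = rep(c(0, 1), each = 5)
)

holdout_prop <- 0.3
n_holdout <- round(holdout_prop * nrow(toy))

# Randomly pick the rows that form the holdout set; the rest are left to be matched
holdout_rows <- sample(nrow(toy), n_holdout)
holdout_set <- toy[holdout_rows, ]
to_match <- toy[-holdout_rows, ]

# Row names of to_match identify which original units remain to be matched
rownames(to_match)
```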

These are arguments that deal with features of the underlying FLAME algorithm.

- `C`: The hyperparameter governing the relative weights of the balancing factor and predictive error in determining match quality.
- `replace`: If `TRUE`, allows the same unit to be matched multiple times, on different sets of covariates. For example, if `TRUE` and two units match exactly on all covariates, they will also match on every subsequent iteration of FLAME.
- `verbose`: Controls how FLAME displays progress while running. If 0, no output. If 1, only outputs the stopping condition. If 2, outputs the iteration and number of unmatched units every 5 iterations, and the stopping condition. If 3, outputs the iteration and number of unmatched units every iteration, and the stopping condition.
- `PE_method`: One of ‘ridge’ or ‘xgb’, respectively denoting whether ridge regression or xgboost is used to compute the predictive error on the holdout set. The former relies on `glmnet::cv.glmnet` and cross validates over \(\lambda\), with `alpha = 0`, `nfolds = 5`, and all other parameters at their defaults. The latter relies on `xgboost::xgb.cv` and cross validates over a grid of `eta`, `max_depth`, `alpha`, `nrounds`, and `subsample`, leaving all other parameters at their defaults. Ignored if `user_PE_fit` is supplied.
- `user_PE_fit` and `user_PE_fit_params`: `user_PE_fit` is an optional, user-supplied function that fits a model for an outcome from covariates. It must take in a matrix of covariates as its first argument and a vector outcome as its second argument. If supplied, `PE_method` will be ignored. `user_PE_fit_params` is a named list of optional parameters to be used by `user_PE_fit`.
- `user_PE_predict` and `user_PE_predict_params`: `user_PE_predict` is an optional, user-supplied function to generate predictions from the output of `user_PE_fit`. It must take the output of `user_PE_fit` as its first argument and a matrix of values for which to make predictions as its second argument. If not supplied, defaults to `predict`. `user_PE_predict_params` is a named list of optional parameters to be used by `user_PE_predict`.

To illustrate the usage of these last four parameters, we can have `FLAME` compute PE via Bayesian Additive Regression Trees (BART) with 100 trees as follows:

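```
library(dbarts)

# Fit function: takes a covariate matrix first and an outcome vector second
my_fit <- dbarts::bart
# keeptrees = TRUE stores the trees so that predict() can be called on the fit
my_fit_params <- list(ntree = 100, verbose = FALSE, keeptrees = TRUE)

# predict() on a bart fit returns posterior draws; average them per unit
my_predict <- function(bart_fit, new_data) {
  return(colMeans(predict(bart_fit, new_data)))
}

FLAME_out <-
  FLAME(data = data, holdout = holdout,
        user_PE_fit = my_fit, user_PE_fit_params = my_fit_params,
        user_PE_predict = my_predict)
```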

By default, FLAME terminates when all covariates have been dropped or all control / treatment units have been matched. There are various early stopping arguments that can be supplied to alter this behavior. In all cases, however, FLAME still terminates if all covariates have been dropped or all control / treatment units have been matched, even if the user-specified stopping condition has not yet been met.

- `early_stop_iterations`: A number of iterations, corresponding to a number of covariates dropped, after which FLAME will automatically stop. A value of 0 has FLAME perform a single round of exact matching on all covariates and then stop.
- `early_stop_epsilon`: If FLAME attempts to drop a covariate that would raise the PE above (1 + `early_stop_epsilon`) times the baseline PE (the PE before any covariates have been dropped), FLAME will stop.
- `early_stop_bf`: If FLAME attempts to drop a covariate that would lead to a BF below this value, FLAME stops.
- `early_stop_pe`: If FLAME attempts to drop a covariate that would lead to a PE above this value, FLAME stops.
- `early_stop_control`: If FLAME attempts to drop a covariate that would lead the proportion of control units that are unmatched to fall below this value, FLAME stops.
- `early_stop_treated`: If FLAME attempts to drop a covariate that would lead the proportion of treatment units that are unmatched to fall below this value, FLAME stops.
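The `early_stop_epsilon` rule reduces to a single comparison against a threshold. In the sketch below, all PE values are made up for illustration and no FLAME function is called:

```r
baseline_pe <- 2.0           # PE before any covariate is dropped (hypothetical)
early_stop_epsilon <- 0.25

# PE that would result from each of three successive drops (hypothetical)
candidate_pe <- c(2.1, 2.4, 2.6)

# Under the rule described above, the run stops at the first drop whose PE
# would exceed (1 + early_stop_epsilon) * baseline PE
threshold <- (1 + early_stop_epsilon) * baseline_pe
stop_at <- which(candidate_pe > threshold)[1]
```

Here the threshold is 2.5, so the first two drops would be allowed and the third would trigger the stop.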

FLAME offers several options for dealing with missing data, outlined below:

- `missing_data` and `n_data_imputations`: These two arguments govern FLAME’s response to missingness in the data to be matched. If `missing_data` is 0, it is assumed that there is no missingness. If it is 1, units with missingness are dropped. If it is 2, `n_data_imputations` imputed datasets are generated using `mice::mice`. In this case, the FLAME algorithm will be run on each imputed dataset and all results returned. If it is 3, units will be prevented from matching on the covariates they are missing.
- `missing_holdout` and `n_holdout_imputations`: These two arguments govern FLAME’s response to missingness in the holdout data. If `missing_holdout` is 0, it is assumed that there is no missingness. If it is 1, units with missingness are dropped. If it is 2, `n_holdout_imputations` imputed holdout datasets are generated using `mice::mice`. In this case, the predictive error computed by `FLAME` is the average of the predictive errors across the imputed holdout datasets.
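For intuition, option 1 (dropping units with missingness) corresponds to a complete-case filter. This base R sketch uses a hypothetical data frame `toy` and is not the package's internal code:

```r
# Hypothetical data in which units 2 and 3 each have a missing covariate value
toy <- data.frame(
  X1 = factor(c(0, 1, NA, 1)),
  X2 = factor(c(1, NA, 0, 0)),
  outcome = c(1.2, 0.4, 2.2, -0.3),
  treated = c(1, 0, 1, 0)
)

# Keep only units with no missing values in any column
complete_units <- toy[complete.cases(toy), ]
rownames(complete_units)
```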

FLAME operates by iteratively matching all possible units on a set of covariates and then dropping one of those covariates to make more matches. Roughly, units are said to ‘match’ on a set of covariates if they have identical values of all those covariates. FLAME is thus designed to be run on categorical covariates. However, continuous covariates can be discretized via histogram binning rules and then passed to FLAME.

More specifically, we define our inputs to the algorithm as the datasets \(\mathcal{S} = (X, Y, T)\) and \(\mathcal{S}^H = (X^H, Y^H, T^H)\), where \(X \in \mathbb{R}^{n \times d}\) denotes the \(d\) covariates of the \(n\) units, \(Y \in \mathbb{R}^n\) denotes their outcomes, and \(T \in \mathbb{R}^n\) denotes their *binary* treatment assignments. We will refer to a unit \(i\) as ‘control’ if \(T_i = 0\) and as ‘treated’ if \(T_i = 1\). The dataset \(\mathcal{S}^H\) is identically structured, but for a separate, holdout set of units.

We denote the covariates used to match on an iteration \(l\) by a binary vector \(\boldsymbol{\theta}^{l} \in \mathbb{R}^d\). The \(j\)’th entry of \(\boldsymbol{\theta}^{l}\) denotes whether the \(j\)’th covariate is used to match units on iteration \(l\). When we go from iteration \(l\) to iteration \(l + 1\), we change a single entry of \(\boldsymbol{\theta}^{l}\) from 1 to 0 to generate \(\boldsymbol{\theta}^{l+1}\) and then match all possible units on \(\boldsymbol{\theta}^{l+1}\). *There are two key points regarding these matches:* 1: matches are only made for units in \(\mathcal{S}\) and not for units in \(\mathcal{S}^H\) and 2: units with identical values of the covariates indicated by \(\boldsymbol{\theta}^{l+1}\) are only matched if at least one is control and one is treated.
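The matching rule for a single iteration can be sketched in base R as follows. The data frame `toy`, the vector `theta`, and the grouping via pasted keys are illustrative assumptions and not the package's implementation:

```r
# Toy data: three binary covariates and a binary treatment (all values hypothetical)
toy <- data.frame(
  X1 = c(0, 0, 1, 1, 1, 0),
  X2 = c(0, 0, 1, 1, 0, 1),
  X3 = c(1, 0, 0, 1, 1, 1),
  treated = c(1, 0, 1, 0, 1, 1)
)

# theta marks the covariates used for matching on this iteration (X3 is dropped)
theta <- c(X1 = 1, X2 = 1, X3 = 0)
match_covs <- names(theta)[theta == 1]

# Units share a key exactly when they agree on every covariate selected by theta
key <- do.call(paste, c(toy[match_covs], sep = "_"))

# A group counts as matched only if it holds at least one treated and one control unit
has_treated <- ave(toy$treated, key, FUN = function(t) any(t == 1))
has_control <- ave(toy$treated, key, FUN = function(t) any(t == 0))
toy$matched <- has_treated == 1 & has_control == 1

toy
```

Units 1 and 2 agree on (`X1`, `X2`) and include one treated and one control unit, as do units 3 and 4, so all four are matched. Units 5 and 6 have no partner with the same values and remain unmatched.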

Concretely, FLAME begins with \(\boldsymbol{\theta}^{0} = \mathbf{1}_d\); that is, by attempting to match units on all covariates. At any iteration \(l\), it then drops the covariate yielding the greatest increase in match quality (\(\mathtt{MQ}\)), defined as \(\mathtt{MQ} := C \cdot \mathtt{BF} - \mathtt{PE}\), where \(C\) is a hyperparameter. The balancing factor, \(\mathtt{BF}\), at an iteration \(l\), is defined as the proportion of control units, plus the proportion of treated units, that are matched *by the update from* \(\boldsymbol{\theta}^{l}\) *to* \(\boldsymbol{\theta}^{l + 1}\). The predictive error, \(\mathtt{PE}\), at an iteration \(l\), is defined as the training MSE incurred when predicting \(Y^{H}\) from the subset of \(X^H\) indicated by \(\boldsymbol{\theta}^{l + 1}\). In this way, FLAME encourages making many matches (lowering variance of treatment effect estimates) and matching on covariates important to the outcome (lowering bias of treatment effect estimates).
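The covariate choice at one iteration can be sketched with made-up numbers. In the base R code below, the value of `C` and every BF and PE entry are hypothetical; in an actual run they would come from the data to be matched and the holdout set, respectively:

```r
C <- 0.1  # hypothetical value of the hyperparameter

# Hypothetical BF and PE that would result from dropping each candidate covariate
candidates <- data.frame(
  covariate = c("X3", "X4", "X5"),
  BF = c(0.40, 0.25, 0.30),
  PE = c(1.90, 1.20, 1.10),
  stringsAsFactors = FALSE
)

# Match quality: MQ = C * BF - PE
candidates$MQ <- C * candidates$BF - candidates$PE

# Drop the covariate whose removal gives the highest match quality
to_drop <- candidates$covariate[which.max(candidates$MQ)]
to_drop
```

With these numbers, `X5` is dropped: although dropping `X3` would match more units, it would raise the predictive error by far more than the weighted gain in balancing factor.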

By default, the algorithm terminates when all covariates have been dropped or all treated/control units have been matched, but we provide several options for early stopping, described above.

For more details, see the FLAME paper.