# Methodology

#### 2019-02-28

Methodology described in this vignette is adapted from the article “BIGL: Biochemically Intuitive Generalized Loewe null model for prediction of the expected combined effect compatible with partial agonism and antagonism” (2017) by K. Van der Borght, A. Tourny, R. Bagdziunas, O. Thas, M. Nazarov, H. Turner, B. Verbist and H. Ceulemans (doi:10.1038/s41598-017-18068-5) as well as its technical supplement. We advise the reader to consult it for a deeper understanding of the procedure described next.

Further chapters were added as extensions on top of the original article regarding variance heterogeneity and Bliss independence.

## Marginal monotherapy curves

First, a monotherapy model is described by the following equation.

$y\left(d\right) = b + \dfrac{m - b}{1 + \left(\frac{\operatorname{EC50}}{d}\right)^{|h|}}$

where $$y$$ is the response (or effect), $$d$$ is the dose (or concentration) of the compound, $$h$$ is the Hill’s coefficient and $$b$$ and $$m$$ are respectively baseline and maximum response for that compound. Lastly, $$\textrm{EC50}$$ stands for the dose level of the compound needed to attain the midpoint effect, i.e. $y\left(\textrm{EC50}\right) = b + \frac{m - b}{2}$

Note that $$m > b$$ if and only if the response is increasing with the dose of the compound. If the response is decreasing, then $$m < b$$.

This monotherapy equation is estimated for both compounds with the constraint that $$b$$, the baseline level, is shared across compounds. This baseline level is denoted by b in the parameter vector. Additionally, m1 and m2 in the parameter vector stand for estimates of maximal responses $$m_{1}$$ and $$m_{2}$$, respectively, whereas h1 and h2 are Hill’s coefficients (slope) of the monotherapy curve for each compound. Lastly, e1 and e2 are log-transformed inflection points, i.e. e1 $$= \log\left(\textrm{EC50}_{1}\right)$$ and e2 $$= \log\left(\textrm{EC50}_{2}\right)$$.

## Null models of no synergy

### Occupancy

Define the occupancy level $$\textrm{occup}$$, i.e. the fractional (enzymatic) effect or observed effect relative to maximal effect, for both compounds at given dose levels as $\textrm{occup}_{1}\left(d_{1}\right) = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{1}}{d_{1}}\right)^{h_{1}}}$ $\textrm{occup}_{2}\left(d_{2}\right) = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{2}}{d_{2}}\right)^{h_{2}}}$ Alternatively, the above equations can be rearranged to express dose in terms of occupancy so that $d_{1} = \operatorname{EC50}_{1} \left(\frac{1}{\operatorname{occup_{1}}} - 1 \right)^{-1/h_{1}}$ $d_{2} = \operatorname{EC50}_{2} \left(\frac{1}{\operatorname{occup_{2}}} - 1 \right)^{-1/h_{2}}$ Although the occupancy was considered here in the marginal case, it is equally well-defined when compounds are combined and is understood as the fraction of enzyme bound to any compound. It can thus be used to re-express classical Loewe additivity equations.

### Classical Loewe model

In the classical Loewe model where both marginal models share upper $$(m)$$ and lower $$(b)$$ asymptotes, occupancy is defined as the solution to this additivity equation for each dose combination $$(d_{1}, d_{2})$$, namely $\frac{d_1\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}}+ \frac{d_2\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}} = 1$

Once occupancy is computed, in the classical Loewe model the predicted response at dose combination $$(d_{1}, d_{2})$$ can be calculated to be $$$\begin{split} y & = b + \left(m - b\right) \times \textrm{occup} = \\ & = b + \left(m - b\right) \times \textrm{occup} \times \left[\frac{d_{1}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}} + \frac{d_{2}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}}\right] = \\ & = b + \textrm{occup} \times \left[\frac{ \left(m - b\right) d_{1}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}} + \frac{ \left(m - b\right) d_{2}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}}\right] \end{split}$$$

### Generalized Loewe model

Generalized Loewe model extends the classical Loewe model by allowing compounds to have different upper asymptotes so that when adjusted, the above predicted response is written instead as $y = b + \textrm{occup} \times \left[\frac{\left(m_{1} - b\right) d_{1}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}} + \frac{\left(m_{2} - b\right) d_{2}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}}\right]$

In particular, if $$m_{1} = m_{2}$$, then generalized Loewe is equivalent to the classical Loewe.

### Highest Single Agent

A null model based on the Highest Single Agent (HSA) model does not attempt to model interaction effects at all and the predicted effect of a combination is either the minimum (if marginal curves are decreasing) or the maximum (if marginal curves are increasing) of both monotherapy curves.

### Bliss independence model

Bliss independence implies that two agents do not cooperate, i.e. act independently of each other.

Bliss independence model is formulated for the fractional responses $$f$$, where the predicted response $$f_{12}$$ at dose combination $$(d_{1}, d_{2})$$ is defined as:

$f_{12}(d_1, d_2) = f_1(d_1) + f_2(d_2) - f_1(d_1)f_2(d_2),$

with $f_1(d_1) = \frac{y\left(d_1\right) - b}{m_1 - b} = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{1}}{d_{1}}\right)^{|h_{1}|}}$ $f_2(d_2) = \frac{y\left(d_2\right) - b}{m_2 - b} = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{2}}{d_{2}}\right)^{|h_{2}|}}$ In the classical Bliss independence model, marginal models share baseline and maximum response.

To allow the compounds to have different maximal responses, the fractional responses are rescaled to the maximum range (i.e. absolute difference between baseline and maximal response). Then the predicted response is defined as:
$y = b + (m_{max}-b) \left[ \tilde{f_1}(d_1) + \tilde{f_2}(d_2) - \tilde{f_1}(d_1)\tilde{f_2}(d_2) \right],$ where $$m_{max}$$ is one of $$m_1$$ or $$m_2$$, for which the value of $$|m_i - b|$$ is larger, and $\tilde{f_i} = f_i\frac{m_i-b}{m_{max}-b}~~\text{for}~~i = 1, 2.$

### Calculation procedure

In order to evaluate any of the null models described above, the fitSurface function will use the monotherapy parameter estimates from the previous step. The idea is if there are synergistic or antagonistic effects, then administration of both compounds will lead to important deviations from what combined monotherapy data would suggest according to the null model. Routines within fitSurface function do essentially the following.

1. Find occupancy for each combination of doses by solving the additivity equation of the classical Loewe model. This step does not require knowledge of the baseline or maximal response for either of the compounds. Occupancy solution is also reported in the HSA model case although occupancy plays no role in such a model.
2. Compute the predicted response based on the above described response equations and the previously computed occupancy rate for each dose combination.
3. If desired, the function will then calculate the selected statistic to evaluate the deviation of the predictions from the desired null model.

## Synergy evaluation assuming equal variances for on- and off-axis points

Synergy is evaluated for all off-axis dose combinations, i.e. dose combinations that are not used in the monotherapy curve estimation. Synergy evaluation depends on the underlying null model and any of the above models, i.e. generalized or classical Loewe or Highest Single Agent, can be used for this purpose. We provide here a brief summary of both statistical tests. Technical derivations and further details are available in the article cited at the beginning of the document.

To define test statistics, the following notations are used.

• Let $$y_{ij}$$ be the observed effect for replicate $$j$$ of dose combination $$i$$, so that $$y_{11}, y_{12}, y_{13}, y_{21}, ..., y_{kn_{k}}$$ is a set of observed effects. We assume $$k$$ different dose combinations and $$n_{k}$$ replicates for each combination. The number of different off-axis dose combinations is denoted as $$n_{1}$$.
• $$p_{1}, ..., p_{k}$$ are the predicted responses for the $$k$$ off-axis dose combinations.
• $$\sigma^{2}$$ is the variance of the replicate observations, assumed to be constant over all dose combinations, and estimated by taking MSE of the null model.
• $$\operatorname{df}_{0}$$ is the number of degrees of freedom from the marginal model estimation.

We construct a vector $$R = (r_{1}, ..., r_{k})$$ which represents mean deviation from the predicted effect. In particular, $r_{k} = \frac{1}{n_{k}} \sum_{i = 1}^{n_{k}} y_{ki} - p_{k}$

With the help of bootstrapping, the covariance matrix of $$R$$ can be estimated under the null hypothesis of no synergy so that $$\operatorname{Var}\left(R\right) = \sigma^{2}\left(D + C_{p}\right)$$ where $$D$$ is a diagonal matrix with $$1 / n_{i}$$ in the $$i$$-th position and $$C_{p}$$ is the covariance matrix obtained from bootstrap.

### meanR

The meanR test will evaluate whether the null model globally fits well the observed data. It is derived using a lack-of-fit sum of squares technique. In particular, the test statistic is given by $\operatorname{TestStat} = \frac{R^{T}\left(D + C_{p}\right)^{-1}R}{n_{1}\sigma^{2}}$ Assuming that residuals from the generalized Loewe model are normally distributed, it can be shown that this statistic follows an $$F_{n_{1}, \operatorname{df}_{0}}$$ distribution under the null. If these assumptions are not satisfied, the null distribution can be approximated by bootstrapping.

### maxR

The maxR test evaluates whether the null model locally fits the observed data. In particular, it provides a test score for each off-axis combination. Based on the sign of this score, it can be determined whether synergy or antagonism is more likely and a formal test can be constructed. Under the null hypothesis of no lack-of-fit and normally distributed effects, $\max \left| R^{T}\left(D + C_{p}\right)^{-1/2} \right| / \sigma \sim \max \left| Z_{1}, \dots, Z_{k} \right|$ where $$Z_{j} \sim N\left(0,1\right)$$. More particularly, the test statistic for the $$k$$-th off-axis dose combination $$(d_{1}, d_{2})$$ is computed as $\operatorname{TestStat}\left(d_{1}, d_{2}\right) = \left[\left| R^{T}\left(D + C_{p}\right)^{-1/2} \right| / \sigma\right]_{k}$ where $$\left[\cdot\right]_{k}$$ indicates the $$k$$-th coordinate. This test statistic is then compared either to the null distribution based on normal approximation or a bootstrapped approximation.

## Synergy evaluation in case of variance heterogeneity

In the methodology described above one important assumption is made regarding the variance of the on- and off-axis dose combinations. It is considered to be equal across all points. This assumption is also mentioned in the original article and its technical supplement.

In reality it is often seen that the variance of the monotherapies is not equal to the variance of the off-axis combinations. The assumption of equal variances is thus not always valid. That is why the meanR and maxR test-statistics can also estimate the variances for on-axis (monotherapies) and off-axis dose-combinations separately. Two extra methods are described below: the SepVar method (Separated variance) and the ModVar method (Modeled variance). For both methods replicates are required and no variance-stabilizing transformations are required. The latter is often necessary when assuming equal variances.

### Adapted meanR

The adapted meanR test uses two separate variance estimates for (a) the monotherapies (= $$\sigma^{2}_{0}$$) and (b) the dose combinations (= $$\Sigma_{1}$$, a diagonal matrix). The notation for both SepVar as ModVar will be the same, but the estimation of $$\Sigma_{1}$$ will be different. The variance of the monotherapies $$\sigma^{2}_{0}$$ is estimated as $$\sigma^{2}$$ above by taking the MSE of the null model. The test statistic is given by:

$\operatorname{TestStat} = \frac{R^{T}\left(\Sigma_{1}D + \sigma^{2}_{0}C_{p}\right)^{-1}R}{n_{1}}$

1. SepVar method: The variance for the dose combinations is estimated by taking the variance in each dose combination and then taking the mean of all these variances, thus $$\Sigma_{1} = \sigma^2_1 I_{n_1}$$. The downside of this method is that the variance for all combinations is assumed to be equal. In reality the variance often depends on the mean effect. This is taken into account in the ModVar method.

2. ModVar method: In this method the diagonal elements of $$\Sigma_{1}$$ are no longer estimated as single number but rather as a vector of variances. Each off-axis point has now its own variance. A linear model is fitted on the original dataset, modeling the variance of each off-axis point as a function of its mean effect. The estimated model parameters are then used to predict the variance for the corresponding mean effect measured for that dose combination. These predicted variances are placed in the diagonal of $$\Sigma_{1}$$.

### Adapted maxR

The same approach is taken for the adapted maxR test statistic. Instead of using one estimated variance for both on- and off-axis points, two separate estimates are used. The estimates for $$\Sigma_{1}$$ are different depending on the method used (ModVar or SepVar). The methodology of estimating the variance is the same as was described in the “Adapted meanR” section above.

The maxR test becomes

$\max \left| R^{T}\left(\Sigma_{1} D + \sigma^{2}_{0} C_{p}\right)^{-1/2} \right| \sim \max \left| Z_{1}, \dots, Z_{k} \right|$

where $$Z_{j} \sim N\left(0,1\right)$$. In particular, the test statistic for the $$k$$-th off-axis dose combination $$(d_{1}, d_{2})$$ is computed as $\operatorname{TestStat}\left(d_{1}, d_{2}\right) = \left[\left| R^{T}\left(\Sigma_{1} D + \sigma^{2}_{0} C_{p}\right)^{-1/2} \right| \right]_{k}$ where $$\left[\cdot\right]_{k}$$ indicates the $$k$$-th coordinate.

### Advantages of ModVar and SepVar methods compared to assumption of equal variances

The assumption of equal variances between monotherapies and off-axis dose-combinations fails to control the type I error rate around pre-specified level, when the variance of off-axis points increases (natural variance or outliers). This results in false positive synergy calls when in reality there were none. Both the SepVar and the ModVar methods control the type I error rate far better, with slightly better results obtained by the ModVar method.

Furthermore, the sensitivity and specificity of the maxR test statistics are higher with the methods assuming variance heterogeneity compared to the methods where equal variances are assumed.