Marginal Effects at Specific Values

Daniel Lüdecke

2018-10-11

Marginal effects at specific values or levels

This vignettes shows how to calculate marginal effects at specific values or levels for the terms of interest. It is recommended to read the general introduction first, if you haven’t done this yet.

The terms-argument not only defines the model terms of interest, but each model term can be limited to certain values. This allows to compute and plot marginal effects for (grouping) terms at specific values only, or to define values for the main effect of interest.

There are several options to define these values, which always should be placed in square brackets directly after the term name and can vary for each model term.

  1. Concrete values are separated by a comma: terms = "c172code [1,3]". For factors, you could also use factor levels, e.g. terms = "Species [setosa,versicolor]".
  2. Ranges are specified with a colon: terms = c("c12hour [30:80]", "c172code [1,3]"). This would plot all values from 30 to 80 for the variable c12hour.
  3. Convenient shortcuts to calculate common values like mean +/- 1 SD (terms = "c12hour [meansd]"), quartiles (terms = "c12hour [quart]") or minumum and maximum values (terms = "c12hour [mixmax]"). See rprs_values() for the different options.
  4. A function name. The function is then applied to all unique values of the indicated variable, e.g. terms = "hp [exp]".
  5. Finally, if the first variable specified in terms is a numeric vector for which no specific values are given, a “pretty range” is calculated (see pretty_range()), to avoid memory allocation problems for vectors with many unique values. If a numeric vector is specified as second or third variable in term (i.e. if this vector represents a grouping structure), representative values (see rprs_values()) are chosen.

Specific values and value range

Defining value ranges is especially useful when variables are, for instance, log-transformed. ggpredict() then typically only uses the range of the log-transformed variable, which is in most cases not what we want. In such situation, specify the range in the terms-argument.

Choosing representative values

Especially in situations where we have two continuous variables in interaction terms, or where the “grouping” variable is continuous, it is helpful to select representative values of the grouping variable - else, predictions would be made for too many groups, which is no longer helpful when interpreting marginal effects.

You can use

Transforming values with functions

The brackets in the terms-argument also accept the name of a valid function, to (back-)transform predicted valued. In this example, an alternative would be to specify that values should be exponentiated, which is indicated by [exp] in the terms-argument:

Pretty value ranges

This section is intended to show some examples how the plotted output differs, depending on which value range is used. To see the difference in the “curvilinear” trend, we use a quadratic term on a standardized variable.

ggpredict() “prettifies” the vector, resulting in a smaller set of unique values. This is less memory consuming and may be needed especially for more complex models.

You can turn off automatic “prettifying” by adding the "all"-shortcut to the terms-argument.

This results in a smooth plot, as all values from the term of interest are taken into account.

Marginal effects conditioned on specific values of the covariates

By default, the typical-argument determines the function that will be applied to the covariates to hold these terms at constant values. By default, this is the mean-value, but other options (like median or mode) are possible as well.

Use the condition-argument to define other values at which covariates should be held constant. condition requires a named vector, with the name indicating the covariate.