One of the most common questions about multilevel models is how much influence grouping terms have on the outcome. One way to explore this is to simulate the predicted values of an observation across the distribution of random effects for a specific grouping variable and term. This can be described as “marginalizing” predictions over the distribution of random effects. This allows you to explore the influence of the grouping term and grouping levels on the outcome scale by simulating predictions for simulated values of each observation across the distribution of effect sizes.
The REmargins()
function allows you to do this. Here, we take the example sleepstudy
model and marginalize predictions for all of the random effect terms (Subject:Intercept, Subject:Days). By default, the function will marginalize over the quartiles of the expected rank (see expected rank vignette) of the effect distribution for each term.
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
mfx <- REmargins(merMod = fm1, newdata = sleepstudy[1:10,])
head(mfx)
#> Reaction Days Subject case grouping_var term breaks
#> 1 249.56 0 309 1 Subject Intercept 1
#> 2 249.56 0 334 1 Subject Days 1
#> 3 249.56 0 350 1 Subject Intercept 2
#> 4 249.56 0 330 1 Subject Days 2
#> 5 249.56 0 308 1 Subject Intercept 3
#> 6 249.56 0 332 1 Subject Days 3
#> original_group_level fit_combined upr_combined lwr_combined fit_Subject
#> 1 308 211.6022 252.5443 177.4033 -40.220047
#> 2 308 244.4658 283.3303 207.5940 -7.471686
#> 3 308 237.1407 277.0037 201.7126 -15.712567
#> 4 308 276.0716 311.4830 239.3424 22.626852
#> 5 308 256.3923 292.7700 216.6426 3.815614
#> 6 308 262.6514 295.5489 224.5716 10.306149
#> upr_Subject lwr_Subject fit_fixed upr_fixed lwr_fixed
#> 1 -4.177672 -74.44036 249.3960 283.5701 215.7122
#> 2 29.913604 -45.58652 251.8809 286.7075 218.0804
#> 3 21.558164 -52.58777 250.4633 288.5282 216.1605
#> 4 59.942004 -14.46200 249.9996 285.5817 216.0925
#> 5 40.481303 -34.47880 252.2277 285.3248 217.0583
#> 6 48.049021 -28.16187 252.2168 286.4449 218.3318
The new data frame output from REmargins
contains a lot of information. The first few columns contain the original data passed to newdata
. Each observation in newdata
is identified by a case
number, because the function repeats each observation by the number of random effect terms and number of breaks to simulate each term over. Then the grouping_var
Finally - you can plot the results marginalization to evaluate the effect of the random effect terms graphically.