huxreg
From version 0.2, huxtable includes the function huxreg
to build a table of regressions.
huxreg
can be called with a list of models. These models can be of any class which has a tidy
method defined in the broom package. The method should return a list of regression coefficients with names term
, estimate
, std.error
and p.value
. That covers most standard regression packages.
Let’s start by running some regressions to predict a diamond’s price.
data(diamonds, package = 'ggplot2')
lm1 <- lm(price ~ carat + depth, diamonds)
lm2 <- lm(price ~ depth + factor(color, ordered = FALSE), diamonds)
lm3 <- lm(log(price) ~ carat + depth, diamonds)
Now, we call huxreg
to display the regression output side by side.
huxreg(lm1, lm2, lm3)
(1) | (2) | (3) | |
(Intercept) | 4045.333 *** | 6491.466 *** | 7.313 *** |
(286.205) | (730.537) | (0.074) | |
carat | 7765.141 *** | 1.971 *** | |
(14.009) | (0.004) | ||
depth | -102.165 *** | -53.835 *** | -0.018 *** |
(4.635) | (11.815) | (0.001) | |
factor(color, ordered = FALSE)E | -95.142 | ||
(62.037) | |||
factor(color, ordered = FALSE)F | 554.742 *** | ||
(62.374) | |||
factor(color, ordered = FALSE)G | 832.357 *** | ||
(60.338) | |||
factor(color, ordered = FALSE)H | 1324.183 *** | ||
(64.296) | |||
factor(color, ordered = FALSE)I | 1929.902 *** | ||
(71.561) | |||
factor(color, ordered = FALSE)J | 2164.044 *** | ||
(88.144) | |||
N | 53940 | 53940 | 53940 |
R2 | 0.851 | 0.032 | 0.847 |
logLik | -472488.441 | -522908.139 | -26617.649 |
AIC | 944984.882 | 1045834.277 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
The basic output includes estimates, standard errors and summary statistics.
Some of those variable names are hard to read. We can change them by specifying a named list of variables in the coefs
argument, like this:
color_names <- paste0('factor(color, ordered = FALSE)', LETTERS[5:10])
names(color_names) <- paste('Color:', LETTERS[5:10])
huxreg(lm1, lm2, lm3, coefs = c('Carat' = 'carat', 'Depth' = 'depth', color_names))
(1) | (2) | (3) | |
Carat | 7765.141 *** | 1.971 *** | |
(14.009) | (0.004) | ||
Depth | -102.165 *** | -53.835 *** | -0.018 *** |
(4.635) | (11.815) | (0.001) | |
Color: E | -95.142 | ||
(62.037) | |||
Color: F | 554.742 *** | ||
(62.374) | |||
Color: G | 832.357 *** | ||
(60.338) | |||
Color: H | 1324.183 *** | ||
(64.296) | |||
Color: I | 1929.902 *** | ||
(71.561) | |||
Color: J | 2164.044 *** | ||
(88.144) | |||
N | 53940 | 53940 | 53940 |
R2 | 0.851 | 0.032 | 0.847 |
logLik | -472488.441 | -522908.139 | -26617.649 |
AIC | 944984.882 | 1045834.277 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
Alternatively, since the output from huxreg
is just a huxtable, we could just edit its contents directly before we print it:
diamond_regs <- huxreg(lm1, lm2, lm3)
diamond_regs[seq(8, 18, 2), 1] <- paste('Color:', LETTERS[5:10])
diamond_regs
(1) | (2) | (3) | |
(Intercept) | 4045.333 *** | 6491.466 *** | 7.313 *** |
(286.205) | (730.537) | (0.074) | |
carat | 7765.141 *** | 1.971 *** | |
(14.009) | (0.004) | ||
depth | -102.165 *** | -53.835 *** | -0.018 *** |
(4.635) | (11.815) | (0.001) | |
Color: E | -95.142 | ||
(62.037) | |||
Color: F | 554.742 *** | ||
(62.374) | |||
Color: G | 832.357 *** | ||
(60.338) | |||
Color: H | 1324.183 *** | ||
(64.296) | |||
Color: I | 1929.902 *** | ||
(71.561) | |||
Color: J | 2164.044 *** | ||
(88.144) | |||
N | 53940 | 53940 | 53940 |
R2 | 0.851 | 0.032 | 0.847 |
logLik | -472488.441 | -522908.139 | -26617.649 |
AIC | 944984.882 | 1045834.277 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
Of course, we aren’t limited to just changing names. We can also make our table prettier. Let’s add the “article” theme, and a vertical stripe for background colour, tweak a few details like font size, and add a caption. All of these are just standard huxtable commands.
suppressPackageStartupMessages(library(dplyr))
diamond_regs %>%
theme_article %>%
set_background_color(1:nrow(diamond_regs), evens, grey(.95)) %>%
set_font_size(final(), 1, 9) %>%
set_bold(final(), 1, FALSE) %>%
set_top_border(final(), 1, 1) %>%
set_caption('Linear regressions of diamond prices')
(1) | (2) | (3) | |
(Intercept) | 4045.333 *** | 6491.466 *** | 7.313 *** |
(286.205) | (730.537) | (0.074) | |
carat | 7765.141 *** | 1.971 *** | |
(14.009) | (0.004) | ||
depth | -102.165 *** | -53.835 *** | -0.018 *** |
(4.635) | (11.815) | (0.001) | |
Color: E | -95.142 | ||
(62.037) | |||
Color: F | 554.742 *** | ||
(62.374) | |||
Color: G | 832.357 *** | ||
(60.338) | |||
Color: H | 1324.183 *** | ||
(64.296) | |||
Color: I | 1929.902 *** | ||
(71.561) | |||
Color: J | 2164.044 *** | ||
(88.144) | |||
N | 53940 | 53940 | 53940 |
R2 | 0.851 | 0.032 | 0.847 |
logLik | -472488.441 | -522908.139 | -26617.649 |
AIC | 944984.882 | 1045834.277 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
By default, standard errors are shown below coefficient estimates. To display them in a column to the right, use error_pos = 'right'
:
huxreg(lm1, lm3, error_pos = 'right')
(1) | (2) | |||
(Intercept) | 4045.333 *** | (286.205) | 7.313 *** | (0.074) |
carat | 7765.141 *** | (14.009) | 1.971 *** | (0.004) |
depth | -102.165 *** | (4.635) | -0.018 *** | (0.001) |
N | 53940 | 53940 | ||
R2 | 0.851 | 0.847 | ||
logLik | -472488.441 | -26617.649 | ||
AIC | 944984.882 | 53243.298 | ||
*** p < 0.001; ** p < 0.01; * p < 0.05. |
This will give column headings a column span of 2.
To display standard errors in the same cell as estimates, use error_pos = 'same'
:
huxreg(lm1, lm3, error_pos = 'same')
(1) | (2) | |
(Intercept) | 4045.333 *** (286.205) | 7.313 *** (0.074) |
carat | 7765.141 *** (14.009) | 1.971 *** (0.004) |
depth | -102.165 *** (4.635) | -0.018 *** (0.001) |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
You can change the default column headings by giving names to your models:
huxreg('Price' = lm1, 'Log price' = lm3)
Price | Log price | |
(Intercept) | 4045.333 *** | 7.313 *** |
(286.205) | (0.074) | |
carat | 7765.141 *** | 1.971 *** |
(14.009) | (0.004) | |
depth | -102.165 *** | -0.018 *** |
(4.635) | (0.001) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
To display a particular row of summary statistics, use the statistics
parameter. This should be a character vector. Valid values are anything returned from your models by broom::glance
. Another valid value is "nobs"
, which returns the number of observations from the regression. If the statistics
vector has names, these will be used for row headings:
broom::glance(lm1)
r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual |
0.851 | 0.851 | 1.54e+03 | 1.54e+05 | 0 | 3 | -4.72e+05 | 9.45e+05 | 9.45e+05 | 1.28e+11 | 53937 |
huxreg(lm1, lm3, statistics = c('# observations' = 'nobs', 'R squared' = 'r.squared', 'F statistic' = 'statistic',
'P value' = 'p.value'))
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(286.205) | (0.074) | |
carat | 7765.141 *** | 1.971 *** |
(14.009) | (0.004) | |
depth | -102.165 *** | -0.018 *** |
(4.635) | (0.001) | |
# observations | 53940 | 53940 |
R squared | 0.851 | 0.847 |
F statistic | 153634.765 | 149771.327 |
P value | 0.000 | 0.000 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
By default, huxreg
displays significance stars. You can alter the symbols used and significance levels with the stars
parameter, or set stars = NULL
to turn off significance stars completely.
huxreg(lm1, lm3, stars = c(`*` = 0.1, `**` = 0.05, `***` = 0.01)) # a little boastful?
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(286.205) | (0.074) | |
carat | 7765.141 *** | 1.971 *** |
(14.009) | (0.004) | |
depth | -102.165 *** | -0.018 *** |
(4.635) | (0.001) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.01; ** p < 0.05; * p < 0.1. |
huxreg(lm1, lm3, stars = NULL)
(1) | (2) | |
(Intercept) | 4045.333 | 7.313 |
(286.205) | (0.074) | |
carat | 7765.141 | 1.971 |
(14.009) | (0.004) | |
depth | -102.165 | -0.018 |
(4.635) | (0.001) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
You aren’t limited to displaying standard errors of the estimates. If you prefer, you can display t statistics or p values, using the error_format
option. Any column from tidy
can be used by putting it in curly brackets:
huxreg(lm1, lm3, error_format = '({statistic})')
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(14.134) | (99.383) | |
carat | 7765.141 *** | 1.971 *** |
(554.282) | (547.305) | |
depth | -102.165 *** | -0.018 *** |
(-22.041) | (-14.936) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
huxreg(lm1, lm3, error_format = '({p.value})')
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(0.000) | (0.000) | |
carat | 7765.141 *** | 1.971 *** |
(0.000) | (0.000) | |
depth | -102.165 *** | -0.018 *** |
(0.000) | (0.000) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
Or you can display confidence intervals. Use ci_level
to set the confidence level for the interval, then use {conf.low}
and {conf.high}
in error_format
:
huxreg(lm1, lm3, ci_level = .99, error_format = '{conf.low} to {conf.high}')
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
3308.091 to 4782.576 | 7.123 to 7.502 | |
carat | 7765.141 *** | 1.971 *** |
7729.054 to 7801.228 | 1.962 to 1.981 | |
depth | -102.165 *** | -0.018 *** |
-114.105 to -90.225 | -0.021 to -0.015 | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
To change the footnote, use note
. If note
contains the string "{stars}"
it will be replaced by a description of the significance stars used. If you don’t want a footnote, just set note = NULL
.
huxreg(lm1, lm3, note = 'Linear regressions on diamond price. {stars}.')
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(286.205) | (0.074) | |
carat | 7765.141 *** | 1.971 *** |
(14.009) | (0.004) | |
depth | -102.165 *** | -0.018 *** |
(4.635) | (0.001) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
Linear regressions on diamond price. *** p < 0.001; ** p < 0.01; * p < 0.05. |
To change number formatting, set the number_format
parameter. This works the same as the number_format
property for a huxtable - if it is numeric, numbers will be rounded to that many decimal places; if it is character, it will be taken as a format to the base R sprintf
function. huxreg
tries to be smart and to format summary statistics like nobs
as integers.
huxreg(lm1, lm3, number_format = 2)
(1) | (2) | |
(Intercept) | 4045.33 *** | 7.31 *** |
(286.21) | (0.07) | |
carat | 7765.14 *** | 1.97 *** |
(14.01) | (0.00) | |
depth | -102.17 *** | -0.02 *** |
(4.64) | (0.00) | |
N | 53940 | 53940 |
R2 | 0.85 | 0.85 |
logLik | -472488.44 | -26617.65 |
AIC | 944984.88 | 53243.30 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
Lastly, if you want to bold all significant coefficients, set the parameter bold_signif
to a maximum significance level:
huxreg(lm1, lm3, bold_signif = 0.05)
(1) | (2) | |
(Intercept) | 4045.333 *** | 7.313 *** |
(286.205) | (0.074) | |
carat | 7765.141 *** | 1.971 *** |
(14.009) | (0.004) | |
depth | -102.165 *** | -0.018 *** |
(4.635) | (0.001) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.847 |
logLik | -472488.441 | -26617.649 |
AIC | 944984.882 | 53243.298 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
Sometimes, you want to report different statistics for a model. For example, you might want to use robust standard errors.
One way to do this is to pass a tidy
-able test object into huxreg
. The function coeftest
in the “lmtest” package has tidy
methods defined:
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(sandwich)
lm_robust <- coeftest(lm1, vcov = vcovHC)
huxreg("Normal SEs" = lm1, "Robust SEs" = lm_robust)
## Warning in FUN(X[[i]], ...): No `glance` method for model of class coeftest
Normal SEs | Robust SEs | |
(Intercept) | 4045.333 *** | 4045.333 *** |
(286.205) | (369.327) | |
carat | 7765.141 *** | 7765.141 *** |
(14.009) | (25.114) | |
depth | -102.165 *** | -102.165 *** |
(4.635) | (5.948) | |
N | 53940 | |
R2 | 0.851 | |
logLik | -472488.441 | |
AIC | 944984.882 | |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
If that is not possible, you can compute statistics yourself and add them to your model using the tidy_override
function:
lm_fixed <- tidy_override(lm1, p.value = c(0.5, 0.2, 0.06))
huxreg("Normal p values" = lm1, "Supplied p values" = lm_fixed)
Normal p values | Supplied p values | |
(Intercept) | 4045.333 *** | 4045.333 |
(286.205) | (286.205) | |
carat | 7765.141 *** | 7765.141 |
(14.009) | (14.009) | |
depth | -102.165 *** | -102.165 |
(4.635) | (4.635) | |
N | 53940 | 53940 |
R2 | 0.851 | 0.851 |
logLik | -472488.441 | -472488.441 |
AIC | 944984.882 | 944984.882 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
You can override any statistics returned by tidy
or glance
.