Recommendations for Using summarytools With Rmarkdown

Dominic Comtois

2020-02-10

Introduction

This document mainly contains examples using the recommended styles for Rmarkdown documents. The styles available in summarytools are the same as pander's, among them the rmarkdown and grid styles used below.

For freq(), descr() (and ctable(), although with caveats), rmarkdown style is recommended. For dfSummary(), grid is recommended.

Starting with freq(), we’ll review the recommended methods and styles to quickly get satisfying results in Rmarkdown documents.

To see how this vignette is configured, see the section "This Vignette's Setup" near the end of the document.

freq()

freq() is best used with style = 'rmarkdown'; HTML rendering is also possible.

Rmarkdown Style
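A minimal sketch of a call that yields this kind of output, assuming the tobacco example data shipped with summarytools and a chunk whose results option is set to 'asis'. The object name freq_gender is only illustrative.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# plain.ascii = FALSE lets pander emit markdown markup;
# style = "rmarkdown" produces a pipe table that knitr can pass through.
freq_gender <- freq(tobacco$gender, plain.ascii = FALSE, style = "rmarkdown")
freq_gender
```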

explicit NA's detected - temporarily setting 'report.nas' to FALSE

Frequencies

tobacco$gender
Type: Factor

|           | Freq |      % | % Cum. |
|----------:|-----:|-------:|-------:|
|         F |  489 |  48.90 |  48.90 |
|         M |  489 |  48.90 |  97.80 |
| (Missing) |   22 |   2.20 | 100.00 |
|     Total | 1000 | 100.00 | 100.00 |

HTML Rendering
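As a hedged sketch, HTML rendering is requested through print() with method = 'render' rather than through a style argument. The example assumes the tobacco data shipped with summarytools; freq_html_src is an illustrative name.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# Build the frequency table first, then ask print() for an HTML table.
freq_html_src <- freq(tobacco$gender)
print(freq_html_src, method = "render")
```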

explicit NA's detected - temporarily setting 'report.nas' to FALSE

Frequencies

tobacco$gender
Type: Factor

| gender    | Freq |      % | % Cum. |
|:----------|-----:|-------:|-------:|
| F         |  489 |  48.90 |  48.90 |
| M         |  489 |  48.90 |  97.80 |
| (Missing) |   22 |   2.20 | 100.00 |
| Total     | 1000 | 100.00 | 100.00 |

If you find the table too large, you can use table.classes = 'st-small'; an example is provided further below.



ctable()

Rmarkdown Style

Tables whose headings span two rows are not yet fully supported in markdown, although the result is close to acceptable. This does not hold for all themes, however, which is why the HTML rendering method is preferred.
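A sketch of the markdown variant, assuming the tobacco example data shipped with summarytools and a chunk with results = 'asis'. The name xtab_md is illustrative.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# Cross-tabulate gender by smoker; row proportions are the package default.
xtab_md <- ctable(tobacco$gender, tobacco$smoker,
                  plain.ascii = FALSE, style = "rmarkdown")
xtab_md
```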

Cross-Tabulation, Row Proportions

gender * smoker
Data frame: tobacco

|           | smoker |         Yes |          No |         Total |
|----------:|-------:|------------:|------------:|--------------:|
|    gender |        |             |             |               |
|         F |        | 147 (30.1%) | 342 (69.9%) |  489 (100.0%) |
|         M |        | 143 (29.2%) | 346 (70.8%) |  489 (100.0%) |
| (Missing) |        |   8 (36.4%) |  14 (63.6%) |   22 (100.0%) |
|     Total |        | 298 (29.8%) | 702 (70.2%) | 1000 (100.0%) |

HTML Rendering

For best results, use this method.
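A minimal sketch of the rendered variant, again assuming the tobacco example data; xtab is an illustrative name.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# print() with method = "render" builds an HTML table, which can
# represent the two-row heading that markdown tables cannot.
xtab <- ctable(tobacco$gender, tobacco$smoker)
print(xtab, method = "render")
```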

Cross-Tabulation, Row Proportions

gender * smoker
Data frame: tobacco

| gender    | smoker: Yes | smoker: No  |         Total |
|:----------|------------:|------------:|--------------:|
| F         | 147 (30.1%) | 342 (69.9%) |  489 (100.0%) |
| M         | 143 (29.2%) | 346 (70.8%) |  489 (100.0%) |
| (Missing) |   8 (36.4%) |  14 (63.6%) |   22 (100.0%) |
| Total     | 298 (29.8%) | 702 (70.2%) | 1000 (100.0%) |


descr()

descr() is likewise best used with style = 'rmarkdown'; HTML rendering is also supported.

Rmarkdown Style
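A sketch of the corresponding call, assuming the tobacco example data and a chunk with results = 'asis'. descr() only summarises numeric columns, which is why a message about ignored variables appears. The name desc_md is illustrative.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# Non-numeric columns are skipped with a message; the rest are summarised.
desc_md <- descr(tobacco, plain.ascii = FALSE, style = "rmarkdown")
desc_md
```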

Non-numerical variable(s) ignored: gender, age.gr, smoker, diseased, disease

Descriptive Statistics

tobacco
N: 1000

|             |    BMI |    age | cigs.per.day | samp.wgts |
|------------:|-------:|-------:|-------------:|----------:|
|        Mean |  25.73 |  49.60 |         6.78 |      1.00 |
|     Std.Dev |   4.49 |  18.29 |        11.88 |      0.08 |
|         Min |   8.83 |  18.00 |         0.00 |      0.86 |
|          Q1 |  22.93 |  34.00 |         0.00 |      0.86 |
|      Median |  25.62 |  50.00 |         0.00 |      1.04 |
|          Q3 |  28.65 |  66.00 |        11.00 |      1.05 |
|         Max |  39.44 |  80.00 |        40.00 |      1.06 |
|         MAD |   4.18 |  23.72 |         0.00 |      0.01 |
|         IQR |   5.72 |  32.00 |        11.00 |      0.19 |
|          CV |   0.17 |   0.37 |         1.75 |      0.08 |
|    Skewness |   0.02 |  -0.04 |         1.54 |     -1.04 |
| SE.Skewness |   0.08 |   0.08 |         0.08 |      0.08 |
|    Kurtosis |   0.26 |  -1.26 |         0.90 |     -0.90 |
|     N.Valid | 974.00 | 975.00 |       965.00 |   1000.00 |
|   Pct.Valid |  97.40 |  97.50 |        96.50 |    100.00 |

HTML Rendering

We'll use table.classes = 'st-small' to show how it affects the table's size, compared to the freq() table rendered earlier.
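A hedged sketch of how the class can be passed, assuming the tobacco example data. table.classes is an argument of print() for summarytools objects, not of descr() itself; desc is an illustrative name.

```r
library(summarytools)
data("tobacco", package = "summarytools")

desc <- descr(tobacco)

# 'st-small' is a CSS class from summarytools' stylesheet that shrinks
# the rendered table; it only has an effect if that CSS is included.
print(desc, method = "render", table.classes = "st-small")
```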

Non-numerical variable(s) ignored: gender, age.gr, smoker, diseased, disease

Descriptive Statistics

tobacco
N: 1000

|             |   BMI |   age | cigs.per.day | samp.wgts |
|:------------|------:|------:|-------------:|----------:|
| Mean        | 25.73 | 49.60 |         6.78 |      1.00 |
| Std.Dev     |  4.49 | 18.29 |        11.88 |      0.08 |
| Min         |  8.83 | 18.00 |         0.00 |      0.86 |
| Q1          | 22.93 | 34.00 |         0.00 |      0.86 |
| Median      | 25.62 | 50.00 |         0.00 |      1.04 |
| Q3          | 28.65 | 66.00 |        11.00 |      1.05 |
| Max         | 39.44 | 80.00 |        40.00 |      1.06 |
| MAD         |  4.18 | 23.72 |         0.00 |      0.01 |
| IQR         |  5.72 | 32.00 |        11.00 |      0.19 |
| CV          |  0.17 |  0.37 |         1.75 |      0.08 |
| Skewness    |  0.02 | -0.04 |         1.54 |     -1.04 |
| SE.Skewness |  0.08 |  0.08 |         0.08 |      0.08 |
| Kurtosis    |  0.26 | -1.26 |         0.90 |     -0.90 |
| N.Valid     |   974 |   975 |          965 |      1000 |
| Pct.Valid   | 97.40 | 97.50 |        96.50 |    100.00 |


dfSummary()

Grid Style

Don’t forget to specify plain.ascii = FALSE (or set it as a global option with st_options(plain.ascii = FALSE)), or you won’t get good results.
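A minimal sketch of a grid-style call, assuming the tobacco example data and a chunk with results = 'asis'. The graph.magnif value and the "/tmp" directory are illustrative assumptions; on Windows a relative directory would be needed instead.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# In grid style the graphs are written as PNG files, so a directory
# for these temporary images has to be supplied via tmp.img.dir.
dfs_grid <- dfSummary(tobacco,
                      style        = "grid",
                      plain.ascii  = FALSE,
                      graph.magnif = 0.75,    # assumption: slightly smaller graphs
                      tmp.img.dir  = "/tmp")  # assumption: Unix-like system
dfs_grid
```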

HTML Rendering

This method also works really well, and not having to specify the tmp.img.dir parameter is a plus.
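A sketch of the rendered variant, assuming the tobacco example data; graph.magnif = 0.75 is an illustrative choice and dfs is an illustrative name.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# With method = "render" no tmp.img.dir is needed, as noted above.
dfs <- dfSummary(tobacco, graph.magnif = 0.75)
print(dfs, method = "render")
```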

Data Frame Summary

tobacco
Dimensions: 1000 x 9
Duplicates: 2

| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
|---:|:---------|:---------------|:-------------------|:------|:------|:--------|
| 1 | gender [factor] | 1. F; 2. M; 3. (Missing) | 489 (48.9%); 489 (48.9%); 22 (2.2%) | | 1000 (100%) | 0 (0%) |
| 2 | age [numeric] | Mean (sd): 49.6 (18.3); min < med < max: 18 < 50 < 80; IQR (CV): 32 (0.4) | 63 distinct values | | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34; 2. 35-50; 3. 51-70; 4. 71 + | 258 (26.5%); 241 (24.7%); 317 (32.5%); 159 (16.3%) | | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | Mean (sd): 25.7 (4.5); min < med < max: 8.8 < 25.6 < 39.4; IQR (CV): 5.7 (0.2) | 974 distinct values | | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes; 2. No | 298 (29.8%); 702 (70.2%) | | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | Mean (sd): 6.8 (11.9); min < med < max: 0 < 0 < 40; IQR (CV): 11 (1.8) | 37 distinct values | | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes; 2. No | 224 (22.4%); 776 (77.6%) | | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension; 2. Cancer; 3. Cholesterol; 4. Heart; 5. Pulmonary; 6. Musculoskeletal; 7. Diabetes; 8. Hearing; 9. Digestive; 10. Hypotension; [ 3 others ] | 36 (16.2%); 34 (15.3%); 21 (9.5%); 20 (9.0%); 20 (9.0%); 19 (8.6%); 14 (6.3%); 14 (6.3%); 12 (5.4%); 11 (5.0%); 21 (9.5%) | | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | Mean (sd): 1 (0.1); min < med < max: 0.9 < 1 < 1.1; IQR (CV): 0.2 (0.1) | 0.86!: 267 (26.7%); 1.04!: 249 (24.9%); 1.05!: 324 (32.4%); 1.06!: 160 (16.0%); ! rounded | | 1000 (100%) | 0 (0%) |

Managing Lengthy dfSummary() Outputs in Rmarkdown Documents

For data frames containing numerous variables, we can use the max.tbl.height argument to wrap the results in a scrollable window having the specified height, in pixels. For instance:
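A hedged sketch of such a call, assuming the tobacco example data. The 300-pixel height, valid.col = FALSE and graph.magnif = 0.75 are illustrative choices; dfs_scroll is an illustrative name.

```r
library(summarytools)
data("tobacco", package = "summarytools")

# valid.col = FALSE drops the "Valid" column to keep the table narrower.
dfs_scroll <- dfSummary(tobacco, valid.col = FALSE, graph.magnif = 0.75)

# max.tbl.height is given in pixels and belongs to print(), not dfSummary().
print(dfs_scroll, method = "render", max.tbl.height = 300)
```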

Data Frame Summary

tobacco
Dimensions: 1000 x 9
Duplicates: 2

| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Missing |
|---:|:---------|:---------------|:-------------------|:------|:--------|
| 1 | gender [factor] | 1. F; 2. M; 3. (Missing) | 489 (48.9%); 489 (48.9%); 22 (2.2%) | | 0 (0%) |
| 2 | age [numeric] | Mean (sd): 49.6 (18.3); min < med < max: 18 < 50 < 80; IQR (CV): 32 (0.4) | 63 distinct values | | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34; 2. 35-50; 3. 51-70; 4. 71 + | 258 (26.5%); 241 (24.7%); 317 (32.5%); 159 (16.3%) | | 25 (2.5%) |
| 4 | BMI [numeric] | Mean (sd): 25.7 (4.5); min < med < max: 8.8 < 25.6 < 39.4; IQR (CV): 5.7 (0.2) | 974 distinct values | | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes; 2. No | 298 (29.8%); 702 (70.2%) | | 0 (0%) |
| 6 | cigs.per.day [numeric] | Mean (sd): 6.8 (11.9); min < med < max: 0 < 0 < 40; IQR (CV): 11 (1.8) | 37 distinct values | | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes; 2. No | 224 (22.4%); 776 (77.6%) | | 0 (0%) |
| 8 | disease [character] | 1. Hypertension; 2. Cancer; 3. Cholesterol; 4. Heart; 5. Pulmonary; 6. Musculoskeletal; 7. Diabetes; 8. Hearing; 9. Digestive; 10. Hypotension; [ 3 others ] | 36 (16.2%); 34 (15.3%); 21 (9.5%); 20 (9.0%); 20 (9.0%); 19 (8.6%); 14 (6.3%); 14 (6.3%); 12 (5.4%); 11 (5.0%); 21 (9.5%) | | 778 (77.8%) |
| 9 | samp.wgts [numeric] | Mean (sd): 1 (0.1); min < med < max: 0.9 < 1 < 1.1; IQR (CV): 0.2 (0.1) | 0.86!: 267 (26.7%); 1.04!: 249 (24.9%); 1.05!: 324 (32.4%); 1.06!: 160 (16.0%); ! rounded | | 0 (0%) |


Using Other Formatting Packages

As explained in the introductory vignette, tb() can be used to convert summarytools objects created with freq() and descr() to simple tibbles that packages specialized in table formatting will be able to process. This is particularly helpful with stby objects:
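A minimal sketch of that workflow, assuming the built-in iris data and knitr::kable() as the formatting function. Any table package that accepts a data frame could take its place, and the name iris_stats is illustrative.

```r
library(summarytools)

# Five-number summaries of each numeric column, split by species.
by_species <- stby(data = iris, INDICES = iris$Species,
                   FUN = descr, stats = "fivenum")

# tb() flattens the stby object into a tibble; order = 3 sorts by
# variable and places the variable column first.
iris_stats <- tb(by_species, order = 3)

# Hand the tibble to a table-formatting function of your choice.
knitr::kable(iris_stats, format = "html", digits = 2)
```

With four numeric columns and three species, the tibble holds one row per variable and group.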

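| variable     | Species    | min |  q1 |  med |  q3 | max |
|:-------------|:-----------|----:|----:|-----:|----:|----:|
| Petal.Length | setosa     | 1.0 | 1.4 | 1.50 | 1.6 | 1.9 |
|              | versicolor | 3.0 | 4.0 | 4.35 | 4.6 | 5.1 |
|              | virginica  | 4.5 | 5.1 | 5.55 | 5.9 | 6.9 |
| Petal.Width  | setosa     | 0.1 | 0.2 | 0.20 | 0.3 | 0.6 |
|              | versicolor | 1.0 | 1.2 | 1.30 | 1.5 | 1.8 |
|              | virginica  | 1.4 | 1.8 | 2.00 | 2.3 | 2.5 |
| Sepal.Length | setosa     | 4.3 | 4.8 | 5.00 | 5.2 | 5.8 |
|              | versicolor | 4.9 | 5.6 | 5.90 | 6.3 | 7.0 |
|              | virginica  | 4.9 | 6.2 | 6.50 | 6.9 | 7.9 |
| Sepal.Width  | setosa     | 2.3 | 3.2 | 3.40 | 3.7 | 4.4 |
|              | versicolor | 2.0 | 2.5 | 2.80 | 3.0 | 3.4 |
|              | virginica  | 2.2 | 2.8 | 3.00 | 3.2 | 3.8 |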

This Vignette’s Setup

This vignette uses theme rmarkdown::html_vignette. Its yaml section looks like this:

# ---
# title: "Recommendations for Using summarytools With Rmarkdown"
# author: "Dominic Comtois"
# date: "2020-02-10"
# output: 
#   rmarkdown::html_vignette: 
#     css: 
#     - !expr system.file("rmarkdown/templates/html_vignette/resources/vignette.css", package = "rmarkdown")
# vignette: >
#   %\VignetteIndexEntry{Recommendations for Rmarkdown}
#   %\VignetteEngine{knitr::rmarkdown}
#   %\VignetteEncoding{UTF-8}
# ---

The following summarytools global options have been set. More of them can be useful, but this is a good starting point.
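A sketch of what such a setup chunk could look like. The first two options are the ones this document relies on throughout; footnote = NA is an illustrative assumption and not necessarily a value used to build this vignette.

```r
library(summarytools)

st_options(
  plain.ascii = FALSE,        # let pander emit markdown markup in Rmd documents
  style       = "rmarkdown",  # default style for freq(), descr() and ctable()
  footnote    = NA            # assumption: omit the footnote under rendered tables
)
```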

Also, the following knitr chunk options were set this way:
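A hedged sketch of the knitr side, typically placed in the first chunk of the document. results = 'asis' is the setting that matters for summarytools output; comment and prompt are illustrative assumptions.

```r
# results = "asis" makes knitr pass the markdown or HTML produced by
# summarytools straight through instead of wrapping it as console output.
knitr::opts_chunk$set(
  results = "asis",
  comment = NA,     # assumption: no "##" prefix on remaining text output
  prompt  = FALSE   # assumption: no R prompt shown in echoed code
)
```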

Finally, summarytools’ CSS has been included in the following manner, with chunk option echo = FALSE:
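A minimal sketch, assuming the chunk uses echo = FALSE together with results = 'asis' so that the emitted stylesheet reaches the HTML output untouched.

```r
library(summarytools)

# st_css() emits summarytools' stylesheet so that classes such as
# 'st-small' take effect on rendered tables.
st_css()
```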


Final Notes

This is by no means a definitive guide; depending on the themes you use, you could find that other settings yield better results. If you are looking to create a Word or a PDF document, you might want to try different combinations of options.