Decks are a way to save a set of analyses so that you or your team can refer back to them later, export to Excel or PowerPoint, or create a Crunch Dashboard. Each deck is made up of a set of slides, and each slide contains an analysis and optionally a title and subtitle.
While a good slide generally appears simple to the viewer, creating one requires setting the attributes of its analysis precisely. These attributes include the query, filter, weight, dimension transforms, display settings, and viz specs.
The recipes in this cookbook all start from the pets example dataset (available from newExampleDataset()).
suppressPackageStartupMessages({
    library(purrr)
    library(crunch)
})
options("crunch.show.progress" = FALSE)
login()
ds <- newExampleDataset()
You want to create a new deck and add slides to it.
A deck is created on a dataset with the newDeck() command. It takes the dataset and a title for the deck, and optionally is_public = TRUE if the deck should be visible to other users of the dataset (the default is FALSE).
After the deck is created, the newSlide() function adds a slide to it.
deck <- newDeck(ds, "Q3 Pets Deck", is_public = TRUE)
private_deck <- newDeck(ds, "Private Deck")
# If no `vizType` is specified, defaults to a table
slide <- newSlide(deck, ~q1, title = "Table of Favorite Pet")
# Example of setting a vizType and filter
slide <- newSlide(
    deck, mean(ndogs) ~ country,
    title = "Dot Plot of Mean Dogs by Country",
    display_settings = list(vizType = "dotplot"),
    filter = ds$q1 == "Dog"
)
deck <- refresh(deck)
You want to add a slide to a deck that has already been made.
The decks() function accesses the decks catalog for a dataset. You can select a deck by name or position and then add slides to it.
ds <- refresh(ds)
decks(ds)
private_deck <- decks(ds)[["Private Deck"]]
slide <- newSlide(
    private_deck, ~q1,
    title = "Bar Plot of Favorite Pet",
    display_settings = list(vizType = "groupedBarPlot")
)
You want to edit a slide that’s already been created.
Slides can be accessed from the deck’s slide catalog, available from the slides() command. You can retrieve them by their title or position.
The helper functions title<-, subtitle<-, query<-, weight<-, filter<-, transforms<-, displaySettings<- and vizSpecs<- help set options on a slide.
# Move title to subtitle and change the title
slide <- slides(deck)[["Table of Favorite Pet"]]
subtitle(slide) <- title(slide)
title(slide) <- "Cats are the most popular"
# Rename a category
slide <- slides(deck)[[2]]
transforms(slide) <- list(
    rows_dimension = makeDimTransform(rename = c("AUS" = "Australia"))
)
You want to delete a slide from a deck.
Access a slide from the slide catalog and then use the delete() command to delete it. It will ask for confirmation before deleting unless you wrap the call in with_consent().
slide <- slides(deck)[[1]]
if (FALSE) { # Not actually run for example
    delete(slide)
}
You want to change a deck from being private to public (or vice-versa).
The is.public<- function sets the deck’s status.
is.public(private_deck) <- TRUE # now public
Queries define the variables and summary measures used for the slide’s analysis. They use the formula notation of the crunch function crtabs(), which is based on base R’s xtabs().
You want to get the frequencies of a single categorical or multiple response variable.
A univariate count query puts the variable on the right-hand side of a formula (for example, ~var).
slide <- newSlide(
    deck, ~q1,
    title = "Univariate frequency: Favorite Pet"
)
You want to get a crosstab (or a frequency from two variables’ joint distribution).
The query for a multivariate frequency uses + to separate the variables on the right-hand side of the formula (for example, ~var1 + var2).
slide <- newSlide(
    deck, ~q1 + country,
    title = "Bivariate frequency: Favorite Pet by country"
)
# A third dimension is possible, which will usually result in a tabbed result:
slide <- newSlide(
    deck, ~q1 + country + wave,
    title = "Trivariate frequency: Favorite Pet by country by wave"
)
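Because the query notation follows xtabs(), the shape of a count query can be tried locally in base R before sending it to Crunch. The sketch below uses a small made-up data frame (the values are hypothetical, not the real pets data) and base R's xtabs() rather than crtabs(), so it only illustrates how the formula maps to dimensions.

```r
# Hypothetical toy data standing in for the pets dataset (not the real data).
pets <- data.frame(
  q1 = c("Cat", "Dog", "Dog", "Bird", "Cat", "Dog"),
  country = c("AUS", "AUS", "USA", "USA", "USA", "AUS"),
  stringsAsFactors = TRUE
)

# One variable on the right-hand side: a univariate frequency.
uni <- xtabs(~q1, data = pets)

# Two variables joined by `+`: a two-way table, q1 on rows and country on columns.
bi <- xtabs(~ q1 + country, data = pets)

print(uni)
print(bi)
```

The first variable in the formula becomes the first dimension of the result, which is the same ordering the slide queries above rely on.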
You want to get the frequencies from a categorical array variable.
A categorical array contributes two dimensions to the analysis: a “categories” dimension and a “subvariables” dimension. If your query just specifies the variable, by default the subvariables dimension is used first and the categories second, but you can specify the order by using the categories() and subvariables() functions in your query.
slide <- newSlide(
    deck, ~allpets,
    title = "Categorical array: default order"
)
slide <- newSlide(
    deck, ~categories(allpets) + subvariables(allpets),
    title = "Categorical array: categories on rows dimension"
)
You want to get the mean from a Numeric (or Numeric Array) variable.
A numeric summary measure like a mean goes on the left-hand side of the formula in a query. The right-hand side cannot be empty, so to get the mean across the whole dataset, put 1 there.
slide <- newSlide(
    deck, mean(ndogs) ~ 1,
    title = "Mean Number of Dogs"
)
slide <- newSlide(
    deck, mean(ndogs) ~ country,
    title = "Mean Number of Dogs by Country"
)
You want to compare the frequencies of a set of Multiple Response variables that share the same items (responses).
A scorecard is a rectangular grid of different Multiple Response variables with their items aligned. The query for a scorecard can be created using the scorecard() function.
# There's only one MR available on this dataset,
# so we repeat the same one twice to illustrate
slide <- newSlide(
    deck, ~scorecard(allpets, allpets),
    title = "Scorecard"
)
Query results have “dimensions”: the enumerated sets over which the calculation’s results are arranged, such as the categories of a categorical variable or the items of a multiple response variable. Their behavior in the slide can be customized using dimension transforms.
A query result generally has up to three dimensions. The first is the “rows_dimension”, the second is the “columns_dimension”, and the third is the “tabs_dimension”. When using the transform argument of newSlide() or setting the transforms<- of a slide directly, you supply a named list with these dimensions as the names. The helper function makeDimTransform() helps create the changes for each dimension.
You want to make the colors of a dashboard tile use a pre-defined palette.
Each Crunch dataset has a set of color palettes associated with its account and folder. You can access the palettes using the palettes() or defaultPalette() functions, and then pass one to the colors argument of makeDimTransform(). The colors are used in the order they appear; if more colors are needed than the palette provides, the default colors are used.
slide <- newSlide(
    deck, ~q1,
    title = "Favorite pet using default palette",
    display_settings = list(vizType = "groupedBarPlot"),
    transform = list(
        rows_dimension = makeDimTransform(colors = defaultPalette(ds))
    )
)
graph_pal <- palettes(ds)[["purple palette"]]
slide <- newSlide(
    deck, ~categories(petloc) + subvariables(petloc),
    title = "Pets by location using another palette",
    display_settings = list(vizType = "horizontalBarPlot"),
    transform = list(
        rows_dimension = makeDimTransform(colors = graph_pal)
    )
)
You want to make the colors of a dashboard tile use a set of colors you specify in the script.
If you want to specify the colors manually, you can also use a character vector of RGB hex codes.
slide <- newSlide(
    deck, ~q1,
    title = "Favorite pet using colors from R",
    display_settings = list(vizType = "groupedBarPlot"),
    transform = list(
        rows_dimension = makeDimTransform(colors = c("#af8dc3", "#f7f7f7", "#7fbf7b"))
    )
)
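If the colors start out as RGB triplets or need to be interpolated, base R's grDevices can produce the hex codes. The sketch below is one possible way to prepare such a vector and check its format before passing it to the colors argument. The endpoint colors are taken from the recipe above, and the choice of a three-step ramp is only an illustration.

```r
# Convert a 0-255 RGB triplet to a hex code with base R's grDevices::rgb().
purple <- grDevices::rgb(175, 141, 195, maxColorValue = 255)

# Interpolate evenly spaced colors between two endpoints.
ramp <- grDevices::colorRampPalette(c("#af8dc3", "#7fbf7b"))
slide_colors <- ramp(3)

# Sanity check: every entry is a six-digit hex code.
is_hex <- grepl("^#[0-9A-Fa-f]{6}$", slide_colors)
stopifnot(all(is_hex))

print(purple)
print(slide_colors)
```

The resulting character vector has the same form as the hand-written one in the recipe, so it could be supplied in its place.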
You want to hide a dimension item (a category or subvariable) from the slide.
The hide argument of makeDimTransform() takes a category name or id if the dimension is made from categories, or a subvariable name or alias if the dimension is made from subvariables (as in a Multiple Response variable or a subvariables dimension of a Categorical Array or Numeric Array).
slide <- newSlide(
    deck, ~q1,
    title = "Favorite pet excluding birds",
    display_settings = list(vizType = "groupedBarPlot"),
    transform = list(
        rows_dimension = makeDimTransform(hide = "Bird")
    )
)
You want to create a slide with a display type other than table.
The default display of a tile is the table, but the vizType display setting chooses between other options. The most commonly used vizTypes are:
- table (always available)
- groupedBarPlot, stackedBarPlot, horizontalBarPlot, horizontalStackedBarPlot (available for queries based on a count in any number of dimensions)
- timeplot (available when the second dimension has a time component)
- dotplot (available for displays of means)
- donut (available only for 1 dimensional count queries)
slide <- newSlide(
    deck, ~q1 + country,
    title = "Favorite pet by country horizontal bar plot",
    display_settings = list(vizType = "horizontalBarPlot")
)
slide <- newSlide(
    deck, ~q1 + wave,
    title = "Favorite pet over time timeplot",
    display_settings = list(vizType = "timeplot")
)
You want to use the settings from an existing slide to create a new one (or modify an existing one).
The functions displaySettings() and vizSpecs() give access to the settings on an existing slide. This slide can be one you’ve created from R or from the web app, so you can use the visual editor to perfect the look of one slide and then reuse it for a whole set of slides. You can either set the attributes directly, or use dput() to print out the object in a way that you can copy and paste into your code.
template_deck <- newDeck(ds, "Templates", is_public = TRUE)
slide <- newSlide(
    template_deck, ~q1,
    title = "Donut with value labels",
    display_settings = list(vizType = "donut", showValueLabels = TRUE),
    viz_specs = list(
        default = list(
            format = list(
                decimal_places = list(percentages = 0L, other = 2L),
                show_empty = FALSE
            )
        )
    )
)
# Setting the slide `display_settings` and `viz_specs` directly:
slide <- newSlide(
    deck, ~country,
    title = "Country donut with value labels",
    display_settings = displaySettings(template_deck[["Donut with value labels"]]),
    viz_specs = vizSpecs(template_deck[["Donut with value labels"]])
)
# How to print out the structure in a format that can be copied and pasted into your code
print(dput(displaySettings(template_deck[["Donut with value labels"]])))
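To see why dput() output is safe to paste, the sketch below round-trips a plain list through dput() without touching the server. The list mirrors the settings written out in the recipe above. It assumes the objects returned for a real slide behave like plain lists, which may not hold exactly.

```r
# A plain list mirroring the settings used in the recipe (no server needed).
settings <- list(vizType = "donut", showValueLabels = TRUE)
specs <- list(default = list(format = list(
  decimal_places = list(percentages = 0L, other = 2L),
  show_empty = FALSE
)))

# dput() writes R code that recreates the object; capture that code as text.
settings_code <- paste(capture.output(dput(settings)), collapse = "\n")
specs_code <- paste(capture.output(dput(specs)), collapse = "\n")
cat(settings_code, "\n")

# Evaluating the text rebuilds an identical object, which is why pasting works.
restored_settings <- eval(parse(text = settings_code))
restored_specs <- eval(parse(text = specs_code))
stopifnot(identical(restored_settings, settings))
stopifnot(identical(restored_specs, specs))
```

Integer values such as 0L keep their L suffix in the printed code, so the pasted settings keep the same types as the template.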
Sometimes you want to make many slides with related formatting to create a document that gives a good high-level overview of a dataset. The tabBook() function is designed to create a basic “top line” report of simple crosstabs from a multitable, and is probably the first thing to check if you’re thinking of making bulk analyses. However, tabBook() does not allow for all of the customization possible in a slide.
The trickiest part of bulk creating slides from R is iterating over the variables. The general approach behind all of these cookbook recipes is to get a list of variable aliases and iterate over them, using each alias to look up other variable metadata. Creating a query formula from a string can also be awkward, but the as.formula() function helps with this. This cookbook uses the base R functions lapply() and paste0(), but the “tidyverse” functions purrr::walk() and glue::glue() are also well suited to this task.
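The string-to-formula step can be tried on its own before any slides are created. The sketch below wraps it in build_query(), a hypothetical helper that is not part of crunch. The aliases are the ones used elsewhere in this cookbook, and nothing here contacts the server.

```r
# build_query() is a hypothetical helper, not a crunch function.
# rhs_aliases: character vector of aliases for the right-hand side.
# measure: optional string for the left-hand side, such as "mean(ndogs)".
build_query <- function(rhs_aliases, measure = NULL) {
  rhs <- paste(rhs_aliases, collapse = " + ")
  lhs <- if (is.null(measure)) "" else measure
  as.formula(paste0(lhs, " ~ ", rhs))
}

univariate <- build_query("q1")
crosstab <- build_query(c("q1", "country"))
mean_query <- build_query("wave", measure = "mean(ndogs)")

print(univariate)
print(crosstab)
print(mean_query)
```

Aliases that are not syntactic R names would need to be wrapped in backticks before being pasted into the string. This sketch does not handle that case.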
You want to create a simple report for every variable in a dataset.
Use the variables() function to get the variables from a dataset, and the aliases() function to get their aliases. Then use lapply() to iterate over the variable aliases and construct each slide using paste0() and as.formula().
deck <- newDeck(ds, "Full Dataset Topline Deck", is_public = TRUE)
var_aliases <- aliases(variables(ds))
slides <- lapply(var_aliases, function(alias) {
    slide_query <- as.formula(paste0("~", alias))
    slide_title <- paste0("Topline - ", name(ds[[alias]]))
    newSlide(deck, slide_query, title = slide_title)
})
You want to create a simple report for every variable in a particular folder.
The variables() function can also work on a folder, so we can make a deck from the variables in a folder in a similar way to making one for a whole dataset.
deck <- newDeck(ds, "Folder Topline Deck", is_public = TRUE)
folder <- cd(ds, "Key Pet Indicators")
var_aliases <- aliases(variables(folder))
slides <- lapply(var_aliases, function(alias) {
    slide_query <- as.formula(paste0("~", alias))
    slide_title <- paste0("Topline - ", name(ds[[alias]]))
    newSlide(deck, slide_query, title = slide_title)
})
You want to create crosstabs for many variables across a set of variables.
You can use lapply() to iterate over both the row and column variables of the crosstab.
deck <- newDeck(ds, "Crosstabs Deck", is_public = TRUE)
demo_vars <- aliases(variables(cd(ds, "Dimensions")))
var_aliases <- setdiff(aliases(variables(ds)), demo_vars) # don't cross demo vars with themselves
slides <- lapply(var_aliases, function(alias) {
    # Add a slide before crosstabs of the univariate frequency
    all_query <- as.formula(paste0("~", alias))
    all_title <- paste0("Frequency - ", name(ds[[alias]]))
    newSlide(deck, all_query, title = all_title)
    lapply(demo_vars, function(demo_alias) {
        crosstab_query <- as.formula(paste0("~", demo_alias, " + ", alias))
        crosstab_title <- paste0("Crosstab - ", name(ds[[alias]]), " by ", name(ds[[demo_alias]]))
        newSlide(deck, crosstab_query, title = crosstab_title)
    })
})
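An alternative to nesting lapply() calls is to enumerate the row and column combinations first and then iterate once. The sketch below does this with base R's expand.grid() and Map(). The alias vectors are hypothetical stand-ins for the folder lookups above, and the titles use aliases rather than variable names so that no dataset is needed.

```r
demo_vars <- c("country", "wave")   # hypothetical demographic aliases
var_aliases <- c("q1", "allpets")   # hypothetical aliases to cross against them

# One row per (demo, variable) combination; the first column varies fastest.
pairs <- expand.grid(
  demo = demo_vars,
  var = setdiff(var_aliases, demo_vars),
  stringsAsFactors = FALSE
)

# Build one crosstab formula and one title per row of the grid.
queries <- Map(
  function(demo, var) as.formula(paste0("~", demo, " + ", var)),
  pairs$demo, pairs$var
)
titles <- paste0("Crosstab - ", pairs$var, " by ", pairs$demo)

print(pairs)
print(titles)
```

With the grid in hand, the number of slides a run will create is known up front, and the queries and titles can be inspected before any are sent.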
You want to create a report with slides that vary based on the variable’s type.
You can create functions that create slides for a particular variable type and then choose which function to use based on the variable’s type while iterating.
cat_slide <- function(alias, ds, deck) {
    slide_query <- as.formula(paste0("~", alias))
    slide_title <- paste0(name(ds[[alias]]))
    newSlide(
        deck,
        slide_query,
        title = slide_title,
        display_settings = list(vizType = "donut")
    )
}
mr_slide <- function(alias, ds, deck) {
    slide_query <- as.formula(paste0("~", alias))
    slide_title <- paste0(name(ds[[alias]]))
    newSlide(
        deck,
        slide_query,
        title = slide_title,
        display_settings = list(vizType = "groupedBarPlot")
    )
}
numeric_slide <- function(alias, ds, deck) {
    slide_query <- as.formula(paste0("mean(", alias, ") ~ wave"))
    slide_title <- paste0(name(ds[[alias]]), " over time")
    newSlide(
        deck,
        slide_query,
        title = slide_title,
        display_settings = list(vizType = "timeplot")
    )
}
deck <- newDeck(ds, "Slides Customized by Variable Type", is_public = TRUE)
var_aliases <- c("q1", "allpets", "ndogs")
slides <- lapply(var_aliases, function(alias) {
    # Choose the slide-building function based on the variable's type
    switch(
        type(ds[[alias]]),
        "categorical" = cat_slide(alias, ds, deck),
        "multiple_response" = mr_slide(alias, ds, deck),
        "numeric" = numeric_slide(alias, ds, deck)
    )
})
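When a type has no matching branch, switch() returns NULL invisibly, so a variable of an unexpected type would be skipped without warning. The sketch below shows the same dispatch pattern with an explicit default, using plain strings so it runs without a dataset. The viz_for_type() helper and the "datetime" entry are hypothetical illustrations.

```r
# viz_for_type() is a hypothetical helper mapping a type string to a vizType.
viz_for_type <- function(var_type) {
  switch(
    var_type,
    "categorical" = "donut",
    "multiple_response" = "groupedBarPlot",
    "numeric" = "timeplot",
    "table" # unnamed last argument: the default for any other type
  )
}

# Hypothetical alias-to-type lookup standing in for type(ds[[alias]]).
types <- c(
  q1 = "categorical",
  allpets = "multiple_response",
  ndogs = "numeric",
  wave = "datetime"
)

viz <- vapply(types, viz_for_type, character(1))
print(viz)
```

The same idea carries over to the slide-building functions: an unnamed final argument to switch() could call a fallback that makes a plain table slide.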