Alluvial plots are similar to Sankey diagrams and visualise categorical data over multiple dimensions as flows (Rosval et al. 2010). Their graphical grammar, however, is somewhat more complex than that of a regular x/y plot. The ggalluvial package does a great job of translating that grammar into ggplot2 syntax and gives you many options to tweak the appearance of an alluvial plot. However, there still remains a multi-layered complexity that makes it difficult to use ‘ggalluvial’ for exploratory data analysis. ‘easyalluvial’ provides a simple interface to this package that allows you to produce a decent alluvial plot from any dataframe, in either long or wide format, with a single line of code, while also handling continuous data. It is meant to allow quick visualisation of entire dataframes, with a focus on different colouring options that can make alluvial plots a great tool for data exploration.
To learn about all the features and how they can be useful, check out the following tutorials:
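# Load the packages quietly; library() stops with an error if one is missing
suppressPackageStartupMessages( library(tidyverse) )
suppressPackageStartupMessages( library(easyalluvial) )
data = as_tibble(mtcars)
# Note: 'cyl' is listed in both vectors; it is converted to a factor below
categoricals = c('cyl', 'vs', 'am', 'gear', 'carb')
numericals = c('mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec')
# Convert the categorical columns to factors
data = data %>%
mutate( across( all_of(categoricals), as.factor ) )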
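The prepared data can then be passed to the wide-format plotting function. The following is a minimal sketch rather than a definitive recipe: it rebuilds the data with base R so that it runs on its own, and the values chosen for `max_variables` and `fill_by` are illustrative assumptions.

```r
library(easyalluvial)

# Rebuild the prepared data so this block is self-contained.
categoricals <- c("cyl", "vs", "am", "gear", "carb")
data <- tibble::as_tibble(mtcars)
data[categoricals] <- lapply(data[categoricals], as.factor)

# Wide format: one row per observation, one column per variable.
# max_variables limits how many columns are drawn (value chosen for illustration);
# fill_by = "first_variable" colours each flow by its level in the first plotted column.
p <- alluvial_wide(data, max_variables = 5, fill_by = "first_variable")
p
```

The result is a regular ggplot object, so it can be printed, saved, or further modified like any other ggplot.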
Continuous variables will be automatically binned as follows.
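To give a feel for what binning a continuous variable means, here is a rough sketch using base R's `cut()`. It is not easyalluvial's internal implementation, which may preprocess the values before cutting; the five labels are an assumption meant to resemble the package's defaults.

```r
# Illustrative only: equal-width binning with base R, not the package's own code.
mpg <- mtcars$mpg

# Assumed labels for five ordered bins, from low to high.
bin_labels <- c("LL", "ML", "M", "MH", "HH")

# cut() splits the range of mpg into 5 equal-width intervals
# and returns a factor with one level per interval.
mpg_binned <- cut(mpg, breaks = 5, labels = bin_labels)

# Count how many cars fall into each bin.
table(mpg_binned)
```

After binning, the continuous variable behaves like any other categorical variable and can be drawn as a stratum in the plot.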
tailnum | carrier | origin | dest | qu | mean_arr_delay |
---|---|---|---|---|---|
N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q1 | on_time |
N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q2 | on_time |
N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q3 | on_time |
N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q4 | on_time |
N11150 EWR MCI EV | EV | EWR | MCI | Q1 | late |
N11150 EWR MCI EV | EV | EWR | MCI | Q2 | late |
alluvial_long( quarterly_flights
, key = qu
, value = mean_arr_delay
, id = tailnum
, fill = carrier )
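To make the difference between long and wide format concrete, the sketch below rebuilds the six rows shown in the table as a toy data frame (`toy` is a hypothetical name, not part of the package) and reshapes it with `tidyr::pivot_wider()`. In the long format, `qu` plays the role of the key and `mean_arr_delay` the value; in the wide format each quarter becomes its own column.

```r
library(tidyr)

# Toy long-format data copied from the rows shown in the table above.
toy <- data.frame(
  tailnum = c(rep("N0EGMQ LGA BNA MQ", 4), rep("N11150 EWR MCI EV", 2)),
  carrier = c(rep("MQ", 4), rep("EV", 2)),
  origin = c(rep("LGA", 4), rep("EWR", 2)),
  dest = c(rep("BNA", 4), rep("MCI", 2)),
  qu = c("Q1", "Q2", "Q3", "Q4", "Q1", "Q2"),
  mean_arr_delay = c(rep("on_time", 4), rep("late", 2)),
  stringsAsFactors = FALSE
)

# Long -> wide: each level of the key column (qu) becomes its own column,
# filled with the corresponding value (mean_arr_delay).
toy_wide <- pivot_wider(toy, names_from = qu, values_from = mean_arr_delay)
toy_wide
```

In this toy subset the second aircraft has no rows for Q3 and Q4, so those cells are `NA` after reshaping.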