The pivottabler package consists of R6 classes. This section briefly describes the main classes, their functions and the relationships between them.
The PivotTable class represents a single pivot table. In all of the example code, the pivot table instance is held in the pt variable, on which various methods are invoked, e.g. a pivot table is created by:
pt <- PivotTable$new()
pt$addData(bhmtrains)
...
The PivotData class contains the references to the source data (i.e. data frames) used to build a pivot table and is accessible via pt$data.
A pivot table can be built from more than one data frame, though this is typically only viable if the variable names and values in the data frames are consistent - e.g. if a given variable is present in multiple data frames, it should have the same name in all of them.
A pivot table has a set of column headings and a set of row headings, each of which is an instance of the PivotDataGroup class. A data group can (and normally does) have child groups.
A single invisible data group, pt$rowGroup, acts as the top-level parent for the row groups, and another single invisible data group, pt$columnGroup, acts as the top-level parent for the column groups.
Calling pt$addColumnDataGroups(...) or pt$addRowDataGroups(...) generates a new level of child data groups, each of which is a PivotDataGroup.
The children of a given PivotDataGroup can be accessed using the childGroups property, e.g. the first level of row headings can be accessed via pt$rowGroup$childGroups. The child data groups of a data group in this level can then be accessed via pt$rowGroup$childGroups[[1]]$childGroups.
The display value of a data group is accessible via the caption property, e.g. pt$rowGroup$childGroups[[1]]$caption.
See the Data Groups vignette for more details.
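The traversal described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it uses only the properties named in this section, and the choice of TOC and PowerType as row variables (and the TotalTrains calculation name) is an assumption made for the example.

```r
library(pivottabler)

# Build a small pivot table with two levels of row headings.
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")        # first level of row headings
pt$addRowDataGroups("PowerType")  # second level, nested under each TOC
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# Captions of the first level of row headings.
firstLevel <- pt$rowGroup$childGroups
firstLevelCaptions <- sapply(firstLevel, function(grp) grp$caption)
print(firstLevelCaptions)

# Captions of the child groups under the first row heading.
secondLevel <- firstLevel[[1]]$childGroups
secondLevelCaptions <- sapply(secondLevel, function(grp) grp$caption)
print(secondLevelCaptions)
```

The same pattern applies to the column headings, starting from pt$columnGroup instead of pt$rowGroup.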
A pivot table normally has one or more calculations. Each calculation definition is an instance of the PivotCalculation class. Calculation definitions are grouped together by a PivotCalculationGroup, of which there is normally only a single default group in a pivot table. If this calculation group contains three calculation definitions, then in the populated pivot table there will be three cells underneath each data group (one for each calculation), e.g.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
                     summariseExpression="n()")
pt$defineCalculation(calculationName="MinimumSpeedMPH", caption="Minimum Speed (MPH)",
                     summariseExpression="min(SchedSpeedMPH, na.rm=TRUE)")
pt$defineCalculation(calculationName="MaximumSpeedMPH", caption="Maximum Speed (MPH)",
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()
It is possible (though not typical) for a pivot table to contain multiple calculation groups. See the Irregular Layout vignette for an example.
pt$calculationGroups contains the calculation groups present in a pivot table. The calculation definitions within the default group are accessed via pt$calculationGroups$defaultGroup$calculations.
See the Calculations vignette for more details.
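As a rough sketch of how the calculation definitions can be inspected, the example below lists the names held in the default calculation group and checks that each top-level column heading has one cell per calculation. It assumes that each PivotCalculation object exposes a calculationName property; the names MinSpeed and MaxSpeed are chosen here purely for illustration.

```r
library(pivottabler)

pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
                     summariseExpression="n()")
pt$defineCalculation(calculationName="MinSpeed", caption="Minimum Speed (MPH)",
                     summariseExpression="min(SchedSpeedMPH, na.rm=TRUE)")
pt$defineCalculation(calculationName="MaxSpeed", caption="Maximum Speed (MPH)",
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$evaluatePivot()

# Calculation definitions in the default calculation group.
calcs <- pt$calculationGroups$defaultGroup$calculations
# Assumption: each PivotCalculation has a calculationName property.
calcNames <- unname(sapply(calcs, function(calc) calc$calculationName))
print(calcNames)

# Each top-level column heading should have one column per calculation.
topLevelColumns <- length(pt$columnGroup$childGroups)
print(pt$columnCount == length(calcs) * topLevelColumns)
```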
A pivot table contains a set of cells represented by the PivotCells class. The cells of a pivot table are accessed via pt$cells. The dimensions of a pivot table (excluding data group headings) can be checked with pt$rowCount and pt$columnCount.
Individual cells are represented by the PivotCell class. The easiest way to access individual cells (i.e. individual PivotCell objects) is pt$cells$getCell(r, c).
Within the PivotCells object is a list of rows, each of which is a list of PivotCell objects. So a more direct (but less safe) way to access individual cells is pt$cells$rows[[r]][[c]].
Each cell has a raw value (typically of data type numeric) and a formatted value (typically of data type character). These can be accessed via cell$rawValue and cell$formattedValue respectively.
It is possible to find the leaf-level data groups (i.e. the right-most row heading and the bottom column heading) that relate to a cell using cell$rowLeafGroup and cell$columnLeafGroup. If the data group has other parent groups, then these can be accessed recursively via cell$rowLeafGroup, cell$rowLeafGroup$parentGroup, cell$rowLeafGroup$parentGroup$parentGroup, etc.
The Finding and Formatting vignette describes other ways of accessing cells and data group headings.
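The cell access described above might look like the following in practice. This is a sketch under stated assumptions: the pivot table layout (TOC and PowerType on rows, a single TotalTrains calculation) is chosen for illustration, and the loop assumes that the invisible top-level group pt$rowGroup has a NULL parentGroup, which is used as the stopping condition.

```r
library(pivottabler)

pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("PowerType")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# Dimensions of the cell grid (data group headings are not counted).
cat("Cells:", pt$rowCount, "rows x", pt$columnCount, "columns\n")

# Fetch the top-left cell and inspect its raw and formatted values.
cell <- pt$cells$getCell(1, 1)
print(cell$rawValue)
print(cell$formattedValue)

# Walk up from the leaf row group to collect the row headings for this cell.
# Assumption: the invisible top-level group has no parent (parentGroup is NULL).
grp <- cell$rowLeafGroup
rowCaptions <- character(0)
while (!is.null(grp$parentGroup)) {
  rowCaptions <- c(grp$caption, rowCaptions)  # prepend so outermost comes first
  grp <- grp$parentGroup
}
print(rowCaptions)
```

Starting from cell$columnLeafGroup instead gives the corresponding column headings.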
Each data group typically acts as a filter for the data in that row/column of the pivot table. For example, a heading of “France” typically implies “Country=France”. This filter condition is represented by the PivotFilter class. Every cell in a pivot table has a set of filters (one filter from each row/column heading) and these filters are represented by the PivotFilters class. See the Cell Context vignette for more details.
The PivotFilterOverrides class provides a mechanism for individual calculation definitions to override the default filters associated with a cell. See the Calculations vignette for more details.
The PivotCalculator class provides much of the functionality for calculating the values in a pivot table. Users of the pivottabler package typically do not interact directly with this internal class.
The PivotBatch, PivotBatchCalculator and PivotBatchStatistics classes provide functionality for calculating the values of batches of cells in one, or a small number of, dplyr or data.table calculations. These classes are also internal.
The PivotStyle class represents a single style declaration in the form of a single name-value pair (similar to a CSS style declaration).
The PivotStyles class is a set of style declarations that would be applied to a cell (or a set of cells).
The PivotOpenXlsxStyle and PivotOpenXlsxStyles classes are similar, except these are specific to Excel export.
The PivotThemes class represents a set of styles to apply to the different parts (headings, cells, totals, etc.) of a pivot table.
See the Styling vignette for more details.
The PivotHtmlRenderer, PivotLatexRenderer and PivotOpenXlsxRenderer classes provide rendering logic for each of the formats that a pivot table can be output in.
See the Outputs vignette for more details.
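The renderer classes are normally reached through methods on the pivot table rather than used directly. The sketch below is an illustration under assumptions, not a description of the renderers' internals: it calls pt$getHtml() and pt$getLatex(), which are not covered in this section, and the pivot table layout is chosen only for the example.

```r
library(pivottabler)

pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# HTML output: getHtml() returns an htmltools tag, converted to text here.
html <- as.character(pt$getHtml())
cat(substr(html, 1, 200), "\n")

# LaTeX output: getLatex() returns the LaTeX source as character data.
latex <- pt$getLatex()
cat(latex, "\n")
```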
The PivotTable class represents a single pivot table, typically named pt in the package examples.
pt$data is an instance of the PivotData class that wraps the data frames used to build the pivot table.
pt$rowGroup and pt$columnGroup are the invisible top-level data groups - each is an instance of the PivotDataGroup class.
Child data groups are accessed via pt$rowGroup$childGroups, pt$rowGroup$childGroups[[1]]$childGroups, etc.
pt$calculationGroups contains the calculation groups present in a pivot table. The calculation definitions within the default calculation group are accessed via pt$calculationGroups$defaultGroup$calculations.
The dimensions of a pivot table can be checked with pt$rowCount and pt$columnCount.
Individual cells are represented by the PivotCell class and can be accessed via pt$cells$getCell(r, c).
The data groups that relate to a cell can be accessed via cell$rowLeafGroup, cell$rowLeafGroup$parentGroup, cell$rowLeafGroup$parentGroup$parentGroup, etc.
The full set of vignettes is: