# Basic theory and concepts

This document was built in Markdown in R 4.0.2, and covers package lefko3 version 2.4.1.

## CHAPTER I: INTRODUCTION TO PACKAGE LEFKO3

R package lefko3 is devoted to the analysis of demographic data through matrix projection models (MPMs). It is intended to serve as a one-stop destination for the construction of size-classified MPMs. It was originally developed specifically to estimate and analyze historical matrix projection models (hMPMs), which are matrices designed to include more individual history than standard population projection matrices. Such matrices are large, typically having a dimensionality at least an order of magnitude higher than their standard, ahistorical counterparts (the latter will be hereafter refered to as ahistorical MPMs, or ahMPMs, while the acronym MPM will be used to refer to all matrix projection models, whether historical or not). As this package has developed, we have prioritized the development of core algorithms and methods to construct these matrices quickly, efficiently, and at least relatively painlessly. To make comparison possible and straightforward, the package contains methods covering the estimation of both ahistorical and historical MPMs, using both raw matrix and function-based formats. It can even be used to create integral projection models (IPMs), and we are currently planning to include a set of functions to estimate age × stage matrices in both raw and function-based formats in the near future.

Package lefko3 introduces a complete suite of functions covering the MPM workflow, from dataset management to the construction of MPMs themselves to their analysis. Dataset management functions standardize demographic datasets from the dominant formats into a format that allows estimation of complete historical MPMs while accounting for individual identity and other parameters. Demographic functions may be estimated using standardized demographic datasets, and these functions take the form of generalized linear or mixed linear models using a variety of response distributions. Matrix estimation functions produce all necessary matrices from a single dataset, including all called for years, patches, and even populations at a single shot, and do so quickly through core binaries engineered for speed.

This tutorial was written as a basic introduction to the concepts and methods underlying lefko3. The target audience is everyone from beginners with little knowledge of population ecology and even less of R, to experienced ecologists with advanced working knowledge of R and the analysis of population dynamics. The tutorial is divided into three core vignettes. First, this section introduces the concepts, background, and workflow underlying the package. Second and third, we include two vignettes showcasing lefko3 on two datasets included in the package, one from a Swedish population of the perennial herb Lathyrus vernus and another from an American population of the terrestrial orchid Cypripedium candidum.

## SECTION I: CONCEPTS, BACKGROUND, AND WORKFLOW

### BASIC CONCEPTS BEHIND POPULATION MATRIX PROJECTION

#### Life history models and population projection matrices

Matrix projection models are representations of the dynamics of a population across all life history stages deemed relevant, across the most relevant time interval (typically one year, but sometimes differing). They require a complete model of the organism’s life history prior to construction, and this model must explicitly show all life stages and all life history transitions to be modeled. Each stage is mutually exclusive, and is represented in the matrix by a single column and a single row. Matrix elements (akj) show either the probability of transition for an individual in stage j at time t (along the columns), to stage k at time t+1 (along the rows), or the mean rate of production of new recruits into stage k at time t+1 (along the rows) by individuals in reproductive stage j in time t (along the columns). Conceptually, each individual is in a particular stage in the instance of monitoring or observation, and then transitions in the interval between that time step’s observation and the next time step’s observation. Death is not an explicit life stage and so is not actually modeled as such, instead becoming a potential endpoint of each transition.

Figure 1 is an example of a model and matrix for one organism. This is a highly simplified stage-classified model based on a terrestrial orchid species, in this case Cypripedium parviflorum (Shefferson et al. 2001). Here, each stage is shown as a node, and each transition is shown as an arrow (a). The rates and probabilities are shown as mathematical symbols in (b), with Skj denoting survival-transition probability from stage j in time t to stage k in time t+1, and Fkj denoting the fecundity of reproductive stage j into recruit stage k in this life history. Note that arrows are never bidirectional, since transition rates in one direction are unlikely to equal those in the reverse direction.

Figure 1. Simple life history model (a) and ahistorical MPM (b) for Cypripedium candidum, a North American herbaceous plant species. Here, S is the dormant seed stage, J is the seedling stage, D is vegetative dormancy, V is vegetative sprouting, and F is flowering. Stages D, V, and F are mature stages, while S and J are immature.

#### Ahistorical vs. historical matrix models

Figure 1 is an example of an ahistorical MPM (ahMPM), meaning that individual history is not incorporated into the matrix. It may sound strange to say that individual history is not incorporated into the matrices given that demographic datasets are composed of records of individual histories. However, construction of a typical MPM requires these individual histories to be broken up into pairs of consecutive states across time, with each pair treated as independent of every other pair. For example, if individual 1 is in stage A in time 1, stage B in time 2, stage E in time 3, and dead in time 4, and individual 2 is in stage B in time 1, stage C in time 2, and Dead in time 3, then these individual histories are broken up into A-B, B-E, E-Dead, B-C, and C-Dead, with each pair assumed to be independent of every other pair regardless of the individual of origin. This reflects a central assumption in MPM analysis: the state of an individual in the next time step is only influenced by its current state. Conceptually, if an organism’s state in the next time step is entirely determined by its current state, then its previous states do not influence these transitions. Thus, standard MPMs are two dimensional and reflect only the current and next immediate state of individuals, given by the columns and rows, respectively. This is ultimately an extension of the all important iid assumption in statistics - that the states of individuals are independent and originate from identically-distributed random variables. Population projection matrices have been estimated for approximately a full century now. We have never done a meta-analysis of all of these studies, but nonetheless it is safe to say that well over 99% of published MPM analyses have been conducted on ahistorical MPMs, and so have assumed independence of stage transitions across time even when from the same individual. In fact, at the time of writing, we can only think of five examples of studies using a historical approach, meaning that they incorporated some degree of individual history into matrix estimation and analysis (Ehrlén 2000; Shefferson, Warren II, and Pulliam 2014; Shefferson, Mizuta, and Hutchings 2017; Shefferson et al. 2018; Vries and Caswell 2018).

The historical MPM is an extension of the matrix projection model that incorporates information on one extra previous time step into the determination of vital rates. Thus, the expected survival probability of an individual in stage j at time t to stage k at time t+1 also depends on its stage in time t-1. Population ecologists considering this problem analytically might be inclined to add an extra dimension to the matrix to deal with this, thus creating a 3d array or cube. However, this approach is computationally intensive and makes analysis challenging. Instead, we utilize the approach developed by Ehrlén (2000), in which life history stages are paired in representation in the matrix. Thus, columns now represent the From pair of stages (state in times t and t-1), and rows now represent the To pair of stages (state in times t and t+1), as in the following diagram.

Figure 2. Historical MPM for Cypripedium candidum, a North American herbaceous plant.

Here, we can refer to matrix elements as akjl, where the element represents the rate at which individuals become stage k in time t+1 after having been stage j in time t and stage l in time t-1. If nkjl is the number of individuals making this transition, and n.jl is the total number of individuals in stage j in time t and stage l in time t-1 regardless of stage (or even status as alive or dead) in time t+1, then:

$$$\tag{1} a _{kjl} = \frac{n _{kjl}}{n _{.jl}} = \frac{n _{kjl}}{\sum_{i=1}^{m} n _{ijl}}$$$

Furthermore, these values can be used to compute the elements of matrices in ahistorical MPMs, as follows:

$$$\tag{2} a _{kj} = \frac{\sum_{l=1}^{m} n _{kjl}}{\sum_{l=1}^{m} \sum_{i=1}^{m} n _{ijl}}$$$

where m is the number of stages in the life history model.

Historical MPMs are more data-hungry than ahistorical MPMs. In the simplest terms, the former can be vastly bigger than the latter, leading to many more elements to estimate. Figure 2 illustrates this issue, but a more realistic example might illustrate the issue more completely. A life history with 10 stages would yield ahistorical matrices that have 10 columns and 10 rows, and so 100 elements. A historical version of such a MPM might have 100 columns and 100 rows, consisting of 10,000 elements, because the column and row stages represent pairs of stages. Clearly, 10,000 is a much bigger number than 100 - in fact it is two orders of magnitude bigger! If the life history includes 25 stages, then an ahistorical matrix has 25 rows and 25 columns, yielding 252 or 625 elements, while the associated historical matrix has 252 = 625 rows and 625 columns, yielding 254 or 390,625 elements, which is three orders of magnitude larger.

The vast increase in the number of matrix elements can lead to natural worry that historical MPMs may be overparameterized (and, indeed, in a short dataset of a small population that may very well be the case!). However, most elements in a historical MPM are structural 0s. Since the To pair of stages and the From pair of stages includes state in time t, most matrix transitions are logically impossible, because transition elements in historical MPMs are only estimable if stage at time t is equal in the column and row stage pairs. For example, the transition probability between stage A in time t-1 and stage B in time t (column stage), to stage C in time t and stage D in time t+1 (row stage) equals 0, because an organism cannot be in both stages B and C in time t. In fact, if there are m stages in a life history, yielding m2 elements in an ahistorical matrix, then although there will be m4 elements in the historical matrix, only m3 will be potentially estimable while (m-1)m3 are structural 0s (we say potentially because some of the logically possible transitions may still be biologically impossible, and so may also be structural 0s).

#### Raw vs. function-based matrix models, including integral projection models

There are two main kinds of matrix projection models: raw and function-based. The most common is the raw MPM, in which transition probabilities in the matrix are estimated by counting all of the number of individuals alive in a particular stage at time t, and then dividing that number into the number of individuals from that set that are still alive in each possible stage in time t+1 (equations 1 and 2). For example, if 100 individuals are alive in stage D in 2010, and 20 of these are alive in stage D in year 2011, as are a further 40 in stage V and a further 25 in stage F, then the associated transition probabilities are 0.20, 0.40, and 0.25, respectively. Further, the overall survival probability for stage D is the sum of these transitions, or 0.85. Methods for estimating fecundity in raw matrices vary from system to system. In this example, which is of an herbaceous plant that produces untrackable seeds, we simply count the number of fruits in one year and then multiply by the mean number of seeds per fruit and the mean germination probability. In other systems in which offspring may be observed and tracked, counts of actual recruits in the next year may be possible, such as in studies of nesting birds.

Function-based MPMs are more recent inventions, and although they have been much less utilized than raw matrices, they are becoming increasingly common in the literature. In function-based MPMs, most matrix elements are populated by kernels that link together functions representing key demographic processes governing each transition. Typically, demographic datasets are analyzed via linear models to estimate demographic parameters such as survival probability, the probability of becoming a certain size assuming survival, and fecundity rate. Elements in these matrices are then estimated as products of these linear functions set to particular inputs, which must involve all parameters governing the construction of the matrix and must be estimated as conditional rates and probabilities. The most common examples of function-based matrix models are integral projection projection models (IPMs), which break up a life history into many fine-scale size classes using a continuous measure of size, and then estimate survival-transitions and fecundity according to these fine-scale size classes (Ellner and Rees 2006).

The example from Figures 1 and 2 may help to illustrate. In Cypripedium candidum, the simple life history model shown in Figure 1 includes one adult stage that is unobservable (D - vegetative dormancy), one adult stage that is observable but not reproductive (V - vegetative sprouting), and one adult stage that is both observable and reproductive (F - flowering). If $$\sigma _{.jl}$$ is the probability of survival from stage j in time t and stage l in time t-1 to time t+1, $$\psi _{.jl}$$ is the probability of sprouting in time t+1 after being stage j in time t and stage l in time t-1, and $$\rho _{.jl}$$ is the probability of flowering in time t+1 after being stage j in time t and stage l in time t-1, then:

$$$\tag{3} a _{DDD} = \sigma _{.DD} (1 - \psi _{.DD})$$$

$$$\tag{4} a _{VDD} = \sigma _{.DD} \psi _{.DD} (1 - \rho _{.DD})$$$

$$$\tag{5} a _{FDD} = \sigma _{.DD} \psi _{.DD} \rho _{.DD}$$$

In both raw and function-based MPMs, most estimated elements are survival-transition probabilities. In ahistorical MPMs, these give the probability of an individual in stage j at time t surviving and transitioning to stage k at time t+1. Only some elements devoted to fecundity. In Figure 1, fecundity is shown in the top-right of the matrix, as the mean production of seeds either dormant and germinating in the next time interval. Ignoring the fecundity elements, we have a survival-transition matrix (symbolized as either U or T), in which the column sums correspond to the expected survival probabilities of individuals in each stage from time t to time t+1. Ignoring the survival-transition terms, we have the fecundity matrix (symbolized as F), in which the column sums correspond to the expected overall fecundity of individuals in these respective stages.

### WORKFLOW

#### Step 1: Data characterization and reorganization

The basic workflow to analysis with package lefko3 starts with the development of a life history model that encapsulates all of the appropriate life stages relevant to population dynamics. We leave it to the reader to figure out how to do this properly, though many resources exist in the literature. Once this is done, we need to characterize the main life stages in a way that is relevant to the dataset. We do this with the sf_create() function.

Function sf_create() creates what we have termed a stageframe object. This term is a combination of the terms ‘life history stage’ and ‘data frame’, because it creates a data frame in R that describes all of the life history stages. This object should be created in a way that shows how stage relates to size, reproductive status, observation status, propagule status, maturity status, and presence in the dataset, and in which the description of each life stage is unique. It describes stages in ways that matrix-estimating functions utilize to conduct proper computations, and also allows the user to stipulate extra information to remind him or herself of what exactly the stages mean. If the user wishes to create an IPM, then this function also allows that to be specified.

Once the stageframe has been created, the demographic data needs to be organized into a format that lefko3 can use. We have developed two functions for this purpose, depending on the format of the data. If the data is arranged horizontally, meaning that individual life histories are recorded in a spreadsheet with rows corresponding to unique individuals and columns corresponding to their states at different times, then the verticalize3() function can turn this dataset into the proper format. If the dataset is in vertical, ahistorical format, in which individuals states are recorded in across rows in a spreadsheet, with a single row corresponding to a single observation, then the historicalize3() function can organize the data properly. The case studies included in lefko3 illustrate the useage of these functions on real data.

#### Step 2: Develop supplemental information for matrix estimation

Matrices are often estimated only partially from available demographic datasets. Some information is sometimes provided from other studies to parameterize key transitions. For example, if fecundity is a function of seed production and germination probability, then germination probability might be estimated via a separate field germination study, and then used to develop the MPM. In lefko3, this information is provided in two ways.

First, the user may specify specific rates and probabilities to incorporate directly into the matrix via the overwrite() function. This function provides an opportunity to show the specific transition, in either historical or ahistorical format, along with the exact value to use. It also allows the user to specify whether a specific transition should be equal to another transition to be estimated within the matrix estimation procedure. For example, if there are no juveniles in a study, then it may still be valid in some circumstances to set transitions from juvenile to adult stages as equal to transitions from the smallest adult class. Such proxy estimations can be specified with overwrite(). Additionally, overwrite() allows large groups of transitions to be altered with a single line, for example all reproductive transitions.

Second, the user may provide detailed information on fecundity, including which stage yields what recruit stages, and also any modifiers on fecundity estimates. This is accomplished by creating a numeric matrix equal in both dimension and row/column designation to the ahistorical matrix associated with the study. Assuming the same order of stages as in the associated stageframe object for analysis, a zero matrix is created and fecundity elements can be changed appropriately. Changing all fecundity elements to 1 simply yields fecundity estimated as equal to whatever variable corresponds to fecundity in the dataset. Multiplying by, say, 0.25 will yield estimates 1/4 of that magnitude.

The next step in the process depends on whether we wish to build raw matrices or function-based matrices. If the former, then we can skip ahead to Step 4, but if the latter, then we need to develop models of demographic parameters.

#### Step 3: Develop models of demographic parameters (for function-based matrices only)

function-based matrices are generated using models of key demographic parameters. Package lefko3 can create models to estimate up to nine different vital rates:

1. Survival probability - This is the probability of surviving from time step t to time step t+1, given that the individual is in some stage j in time t (and, if historical, in stage l in time t-1). In lefko3, this parameter may be modeled as a function of size, reproductive status, patch, year, and individual identity in times t and t-1. This parameter is required in all function-based matrices.

2. Observation probability - This is an optional parameter denoting the probability of observation in time step t+1 of an individual in stage k given survival from time t to time t+1. This parameter is only used when at least one stage is technically unobservable. For example, some plants are capable of vegetative dormancy, in which case they are alive do not necessarily sprout in all years. In these cases, the probability of sprouting can be estimated as the observation probability. Note that this probability does NOT refer to observer effort, and so should ONLY be used to differentiate completely unobservable stages where the observation status refers to an important biological phenomenon, such as when individuals may be alive but have a size of 0. In lefko3, this parameter may be modeled as a function of size, reproductive status, patch, year, and individual identity in times t and t-1.

3. Size transition probability - This is the probability of becoming size k in time step t+1 assuming survival from time t to time step t+1 and observation in that time. In lefko3, this parameter may be modeled as a function of size, reproductive status, patch, year, and individual identity in times t and t-1. This parameter is required in all function-based size-classified matrices.

4. Reproduction probability - This is an optional parameter denoting the probability of reproducing in time step t+1 given survival from time t to time t+1, and observation in that time. Note that this should be used only if the researcher wishes to separate breeding from non-breeding mature stages. If all adult stages are potentially reproductive, then this parameter is not needed. In lefko3, this parameter may be modeled as a function of size, reproductive status, patch, year, and individual identity in times t and t-1.

5. Fecundity rate - This is the rate of successful production of offspring in time t by individuals alive, observable, and reproductive in that time, and the survival of those offspring into time t+1 in whatever juvenile class is possible. Thus, the fecundity rate of seed-producing plants might be split into seedlings, which are plants that germinated within a year of seed production, and dormant seed. In lefko3, this parameter may be modeled as a function of size, reproductive status, patch, year, and individual identity in times t and t-1. This parameter is required in all function-based matrices.

6. Juvenile survival probability - This is the probability of surviving from juvenile stage j in time step t to a mature stage in time step t+1. In lefko3, this parameter may be modeled as a function of size, patch, year, and individual identity in times t and t-1.

7. Juvenile observation probability - This is a parameter denoting the probability of observation in time step t+1 of an individual in mature stage k given survival from a juvenile stage in time t to time t+1. In lefko3, this parameter may be modeled as a function of size, patch, year, and individual identity in times t and t-1, and all other caveats noted in (2) above apply.

8. Juvenile size transition probability - This is the probability of becoming mature size k in time step t+1 assuming survival from juvenile stage j in time t to time step t+1 and observation in that time. In lefko3, this parameter may be modeled as a function of size, patch, year, and individual identity in times t and t-1.

9. Juvenile reproduction probability - This is a parameter denoting the probability of reproducing in mature stage k in time step t+1 given survival from juvenile stage j in time t to time t+1, and observation in that time. In lefko3, this parameter may be modeled as a function of size, patch, year, and individual identity in times t and t-1, and all other caveats in (4) apply. Note that this parameter denotes transition from the juvenile stages of life to maturity.

Of these nine vital rates, most users will estimate at least parameters (1) survival probability, (3) size transition probability, and (5) fecundity. Parameters (2) observation probability and (4) reproduction probability may be used when some stages are included that are completely unobservable (and so do not have any size), or that are mature but non-reproductive, respectively. Parameters (6) through (9) should only be added if the dataset contains juvenile individuals, making the estimation of these parameters possible.

Package lefko3 includes one main function to handle model estimation: modelsearch(). Function modelsearch() handles the entire modeling process, including the development of global models, exhaustive model building, and the selection of best-fit models. Users need to provide this function with information about the following.

1. Individual history - Are the matrices to be built historical or ahistorical? If the former, then the state of the individual in time t-1 will be included in modeling.

2. Modeling approach - Should the models be estimated as generalized linear models (GLMs) or mixed linear models? While most function-based matrix models are estimated as the former, the latter approach can correct for pseudo-replication by treating individual identity as a random factor. Mixed models also allow time and patch to be treated as random variables.

3. Suite of factors - Should both size and reproductive status be tested as causal factors? Or just one? Or is just a constant required? Should two-way interactions be included?

4. Suite of vital rates - Which demographic parameters should be estimated? The defaults are (1) survival, (3) size, and (5) fecundity.

5. Juvenile stage estimation - Should parameters (6) through (9) also be modeled?

6. Best-fit criterion - If a model with fewer parameters exists within 2.0 AICc units of the model with the lowest AICc, then should this model be used as the best-fit model, or should the model with the lowest AICc always be chosen?

7. Size distribution - Should size be modeled as a continuous variable under a Gaussian distribution, or as a count variable under either a Poisson or negative binomial distribution?

8. Fecundity distribution - Should fecundity be modeled as a continuous variable under a Gaussian distribution, or as a count variable under either a Poisson or negative binomial distribution?

9. Timing of fecundity - Modelsearch() assumes that linear models of fecundity use a metric counted or measured in time t as the response. This applies well with plants, where flowers or seeds produced might be the fecundity response. However, users not wishing to follow this default behavior can use the fectime option to stipulate a fecundity metric measured in time t+1.

10. Censoring - Should data points marked as questionable be used or eliminated?

11. Variable names - The names of all relevant variables in the dataset need to be specified. Note that the default behavior assumes variable names produced via the historicalize() or verticalize() functions, which produce standardized vertical datasets.

Once all inputs are provided, modelsearch() goes to work. The result will be a lefkoMod object, which is essentially a list in which the first several elements are the best-fit models developed for each vital rate. These are followed by an equivalent number of elements showing the full model tables developed and tested, followed by an element detailing the best-fit criterion used, and ending on a quality control data frame showing the number of individuals and number of unique transitions used in the estimation of each model. Model selection proceeds through the dredge() function in package MuMIn (Bartoń 2014).

Advanced users of lefko3 may wish to create their own models without the package’s automated model selection function. In that case, lefko3’s matrix functions can accommodate single models developed using base R, package lme4 (Bates et al. 2015), and package glmmTMB (Brooks et al. 2017). Plans are in the works to add nlme compatibility, as well as non-linear models (e.g. general additive models).

#### Step 4: Estimate matrices

MPM creation can be accomplished with four different functions:

1. rlefko2() - This function creates raw ahistorical MPMs given a dataset, a stageframe, a reproductive matrix, and an overwrite matrix.

2. rlefko3() - This function creates raw historical MPMs given a dataset, a stageframe, a reproductive matrix, and an overwrite matrix.

3. flefko2() - This function creates function-based ahistorical MPMs given a dataset, a set of models, a stageframe, a reproductive matrix, and an overwrite matrix.

4. flefko3() - This function creates function-based historical MPMs given a dataset, a set of models, a stageframe, a reproductive matrix, and an overwrite matrix.

These functions incorporate binary kernels developed to handle the estimation of matrix elements quickly and efficiently. A single run of flefko3(), for example, should be able to yield all annual matrices for all patches for the Cypripedium candidum dataset provided with lefko3 in under a minute on most machines (30s or so on RPS’ MacBook Pro 2019 with 2.3 GHz 8-Core Intel Core i9). Parallel computing should not be necessary, even with the slowest of current machines, provided that the machine is current enough to handle at least R 4.0. We are currently developing approaches to increase the speed further, and envision noticeable increases in speed in future versions of lefko3.

### MPM ANALYSES

Package lefko3 includes a number of functions to aid analysis once matrices are created. These may be of greater utility in some circumstances than functions such as eigen(), because our functions are made to handle even extremely large, sparse matrices. Currently, we include functions to estimate arithmetic mean matrices, discrete population growth rate, stable stage structure, and reproductive value. Plans are in the works for further analysis functions in the future.

The function lmean() estimates mean matrices using lefkoMat objects as input. This function can estimate arithmetic mean matrices under sparse and non-sparse settings even for extremely large matrices with speed. It separates these means as grand mean population and patch-level matrices. The arithmetic mean matrix may yield overestimates of lambda, so we encourage users to estimate the population growth rate using annual matrices.

One consideration with regards to mean matrices is the sparseness of the matrices themselves relative to the data. Particularly when raw matrices have been estimated, are there cases in which matrix elements are occasionally 0 simply because in some years no individual moves through that particular stage? Such situations may cause difficulties in mean matrix estimation because they result in 0 values for matrix elements when no data exists to properly estimate them. The impact is to drag the population growth rate down. The best method to deal with this situation is to design the life history model appropriately enough that all life history stages are occupied in all years, but this may not be possible in some cases when datasets are particularly small. Thus, lmean() also includes the capacity to ignore 0 values unless they are structural (the sparse option), where structural zeroes are defined as those elements occurring as zeroes in ALL matrices. Note, however, that this may also have unintended consequences, such as resulting in artificially high population growth rate estimates under some scenarios, and estimated survival probabilities of more than 1.0 for some stages.

#### Stable stage distribution and reproductive value

In ahistorical analyses, the stable stage distribution and the reproductive value of stages are estimated as standardized right and left eigenvectors associated with the dominant eigenvalue of the matrix. Standardization in the stable stage distribution is handled by dividing each respective element of the right eigenvector by the sum of the elements in that eigenvector. Standardization in the reproductive value case is handled by dividing each element in the left eigenvector by the value of the first non-zero element in that eigenvector.

The methods mentioned above apply to historical matrices as well. However, as described, they only provide the stable stage distribution and reproductive values of stage pairs. We provide two functions, stablestage3() and repvalue3() to allow the estimation of these vectors, and also to estimate the associated stable stage distributions and reproductive values of the original stages in the associated life history model. In the stable stage distribution case, this is handled simply as the sum of all stable stage proportions for stage j in time t across all stages in time t-1, as in:

$$$\tag{6} SS _j = \sum _{l=1}^{m} SS _{jl}$$$

where l is stage in time t-1, j is stage in time t, m is the number of stages, SSj is the stable stage proportion of stage j, and SSjl is the stable stage proportion of stage pair j - l. In the reproductive value case, this is calculated as the sum of all reproductive values for stage j in time t across all stages in time t-1, weighted by the stable stage proportion of the respective stage pair divided by the stable stage proportion of the respective stage at time t (Ehrlén 2000), as in:

$$$\tag{7} RV _j = \sum _{l=1}^{m} RV _{jl} (SS _{jl} / SS _{j})$$$

where RV refers to reproductive value. The influence of history can make these numbers differ quite dramatically from those produced by ahistorical matrices.

#### Further analyses

Users can take the matrices produced by matrix creation functions in package lefko3 and plug them into matrix analysis functions in other packages. Technically, matrices are produced and stored within lefkoMat objects, which are S3 objects structured as lists. These lists include a number of elements, but among the most important are elements $A, $U, and $F, which are lists of complete projection matrices, survival-transition matrices, and fecundity matrices, respectively. For example, code such as matobject$A[[1]] would access the first complete projection matrix in a lefkoMat object named matobject. The \$labels element is a dataset giving a description of each matrix in order, including its population, patch, and year designations. This allows the use of all functions that work with matrices, including function in base R such as eigen(), as well as in packages such as popbio (Stubben and Milligan 2007) and popdemo (Stott, Hodgson, and Townley 2012). We encourage users to explore whether the packages and functions they wish to use can handle sparse matrices, as well as large matrices - some were not designed to and can fail or yield unpredictable behavior when applied to matrices produced by lefko3.

Now let’s turn to some case studies using two datasets included in the package.

## Acknowledgements

The project resulting in this package and this tutorial was funded by Grant-In-Aid 19H03298 from the Japan Society for the Promotion of Science.

## Literature cited

Bartoń, Kamil A. 2014. “MuMIn: Multi-Model Inference.” https://CRAN.R-project.org/package=MuMIn.

Bates, Douglas, Martin Maechler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using Lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.

Brooks, Mollie E., Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Mächler, and Benjamin M. Bolker. 2017. “glmmTMB Balances Speed and Flexibility Among Packages for Zero-Inflated Generalized Linear Mixed Modeling.” The R Journal 9 (2): 378–400. https://journal.r-project.org/archive/2017/RJ-2017-066/index.html.

Ehrlén, Johan. 2000. “The Dynamics of Plant Populations: Does the History of Individuals Matter?” Ecology 81 (6): 1675–84. https://doi.org/10.1890/0012-9658(2000)081[1675:TDOPPD]2.0.CO;2.

Ellner, Stephen P., and Mark Rees. 2006. “Integral Projection Models for Species with Complex Demography.” The American Naturalist 167 (3): 410–28. https://doi.org/10.1086/499438.

Shefferson, Richard P., Tiiu Kull, Michael J. Hutchings, Marc-Andre Selosse, Hans Jacquemyn, Kimberly M. Kellett, Eric S. Menges, et al. 2018. “Drivers of Vegetative Dormancy Across Herbaceous Perennial Plant Species.” Ecology Letters 21 (5): 724–33. https://doi.org/10.1111/ele.12940.

Shefferson, Richard P., Ryo Mizuta, and Michael J. Hutchings. 2017. “Predicting Evolution in Response to Climate Change: The Example of Sprouting Probability in Three Dormancy-Prone Orchid Species.” Royal Society Open Science 4 (1): 160647. https://doi.org/10.1098/rsos.160647.

Shefferson, Richard P., Brett K. Sandercock, Joyce Proper, and Steven R. Beissinger. 2001. “Estimating Dormancy and Survival of a Rare Herbaceous Perennial Using Mark-Recapture Models.” Ecology 82 (1): 145–56. https://doi.org/10.1890/0012-9658(2001)082[0145:EDASOA]2.0.CO;2.

Shefferson, Richard P., Robert J. Warren II, and H. Ronald Pulliam. 2014. “Life History Costs Make Perfect Sprouting Maladaptive in Two Herbaceous Perennials.” Journal of Ecology 102 (5): 1318–28. https://doi.org/10.1111/1365-2745.12281.

Stott, Iain, David J. Hodgson, and Stuart Townley. 2012. “Popdemo: An R Package for Population Demography Using Projection Matrix Analysis.” Methods in Ecology and Evolution 3 (5): 797–802. https://doi.org/10.1111/j.2041-210X.2012.00222.x.

Stubben, Chris J., and Brook G. Milligan. 2007. “Estimating and Analyzing Demographic Models Using the Popbio Package in R.” Journal of Statistical Software 22: 11. https://doi.org/10.18637/jss.v022.i11.

Vries, Charlotte de, and Hal Caswell. 2018. “Demography When History Matters: Construction and Analysis of Second-Order Matrix Population Models.” Theoretical Ecology 11 (2): 129–40. https://doi.org/10.1007/s12080-017-0353-0.