Advanced climwin

Liam D. Bailey and Martijn van de Pol

2016-06-01

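In the previous vignette we described the basic features available in the climwin package. Below, we look in more detail at the more advanced features available to users. We will cover climate thresholds, biological measurements that encompass two years, spatial replication, and the weightwin function.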


Climate thresholds

Many studies may be interested in testing climate windows using climatic thresholds. When testing climatic thresholds, we assume that a biological response is driven by the total number of days that surpass a particular climatic value. For example, seed germination may be influenced by the number of days over 30 degrees. Alternatively, temperature may only influence organism survival when it falls below freezing. A common example of such a climate threshold is the use of growing degree days in plant studies.

Statistics like these can be achieved in slidingwin using the three parameters ‘upper’, ‘lower’ and ‘binary’. When a value is provided for the parameter ‘upper’, slidingwin will create a new climate dataset where all values equal to or below this threshold are set at 0. Similarly, when a value is set for ‘lower’, all values equal to or above this threshold will be set at 0. When values are provided for both ‘upper’ and ‘lower’, only values that fall between the two thresholds are retained; all values at or outside the thresholds will be set at 0.

upper = 30

Date          Original temperature   Threshold temperature
01/06/2015    25.9                   0
02/06/2015    24.0                   0
03/06/2015    32.5                   32.5
04/06/2014    28.1                   0
05/06/2014    30.5                   30.5
06/06/2014    30.0                   0
07/06/2014    31.2                   31.2
08/06/2014    27.0                   0

upper = 30, lower = 25

Date          Original temperature   Threshold temperature
01/06/2015    25.9                   25.9
02/06/2015    24.0                   0
03/06/2015    32.5                   0
04/06/2014    28.1                   28.1
05/06/2014    30.5                   0
06/06/2014    30.0                   0
07/06/2014    31.2                   0
08/06/2014    27.0                   27.0

In some circumstances we may assume that all values past the climatic threshold will have an equally large impact on the biological response. In this case, we would set ‘binary’ to TRUE so that all non-zero values are set at 1. By default, however, ‘binary’ is set at FALSE, so that all values past the climatic threshold keep their original value.

upper = 30, binary = TRUE

Date          Original temperature   Threshold temperature
01/06/2015    25.9                   0
02/06/2015    24.0                   0
03/06/2015    32.5                   1
04/06/2014    28.1                   0
05/06/2014    30.5                   1
06/06/2014    30.0                   0
07/06/2014    31.2                   1
08/06/2014    27.0                   0
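
The three tables above can be reproduced with a few lines of base R. The sketch below only illustrates the thresholding rule as shown in those tables; apply_threshold is a hypothetical helper written for this example, not a climwin function, and the internal implementation in slidingwin may differ.

```r
# Temperatures from the tables above
temp <- c(25.9, 24.0, 32.5, 28.1, 30.5, 30.0, 31.2, 27.0)

# Hypothetical helper (not part of climwin) that mimics the tables
apply_threshold <- function(x, upper = NULL, lower = NULL, binary = FALSE) {
  if (!is.null(upper) && !is.null(lower)) {
    keep <- x > lower & x < upper   # both set: keep values between the thresholds
  } else if (!is.null(upper)) {
    keep <- x > upper               # only 'upper': keep values above it
  } else if (!is.null(lower)) {
    keep <- x < lower               # only 'lower': keep values below it
  } else {
    keep <- rep(TRUE, length(x))    # no threshold: keep everything
  }
  # binary = TRUE returns 0/1, otherwise retained values keep their original value
  if (binary) as.numeric(keep) else ifelse(keep, x, 0)
}

apply_threshold(temp, upper = 30)
apply_threshold(temp, upper = 30, lower = 25)
apply_threshold(temp, upper = 30, binary = TRUE)

# Summing the binary series gives the number of days past the threshold
days_over_30 <- sum(apply_threshold(temp, upper = 30, binary = TRUE))
```

Summing the binary series, as in the last line, is what turns a thresholded climate variable into a count of days.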

A worked example

Below we provide a worked example using the Mass and MassClimate datasets. In this example, let's imagine we are interested in testing the impact of the number of days above freezing on our mass response variable. To do this we set both the ‘upper’ and ‘binary’ parameters.

upper = 0 binary = TRUE

As we are interested in measuring the number of days above freezing, we set our ‘stat’ parameter to ‘sum’. Otherwise, we use model parameter values identical to those in the earlier vignette.

library(climwin)
MassWin <- slidingwin(xvar = list(Temp = MassClimate$Temp),
                      cdate = MassClimate$Date,
                      bdate = Mass$Date,
                      baseline = lm(Mass ~ 1, data = Mass),
                      cinterval = "day",
                      range = c(150, 0),
                      upper = 0, binary = TRUE,
                      type = "absolute", refday = c(20, 05),
                      stat = "sum",
                      func = "lin")

When we examine the best model data, we can see that our climate data is now count data.

head(MassWin[[1]]$BestModelData)

Yvar   climate
140    0
138    0
136    1
135    2
134    0
134    0

Biological measurements that encompass two years

Many long-term datasets that will be suitable for climwin are likely to be measured during Northern hemisphere spring/summer (e.g. breeding data). These biological records are measured during the middle of the year, meaning that biological records can be easily grouped by year. Yet in other circumstances biological measurements will fall across two years, particularly in Southern hemisphere species where spring/summer falls across the new year period [e.g., 1].

This can cause issues when fitting ‘absolute’ climate windows. As described in the introductory vignette, ‘absolute’ windows use a set reference day for all biological records. Where biological measurements cross two years, however, measurements from the same season can be split up. In the table below, where a reference day of November 1st is used, all measurements taken at the start of the breeding season are given a reference date of November 1st 2014, while all measurements following the new year are given November 1st 2015. This is obviously unrealistic, as biological measurements in January 2015 cannot be impacted by climatic conditions that occurred later in the same year.

Date          Reference Date
05/11/2014    01/11/2014
10/11/2014    01/11/2014
01/12/2014    01/11/2014
12/12/2014    01/11/2014
01/01/2015    01/11/2015
07/01/2015    01/11/2015

As a solution, climwin includes a ‘cohort’ parameter that allows users to specify which biological measurements should be grouped together (e.g. when they are from the same breeding season). Each biological record should be given a cohort level (see below), which is taken into account when setting the reference day for climate window analyses.

Date          Cohort   Reference Date
05/11/2014    2014     01/11/2014
10/11/2014    2014     01/11/2014
01/12/2014    2014     01/11/2014
12/12/2014    2014     01/11/2014
01/01/2015    2014     01/11/2014
07/01/2015    2014     01/11/2014
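
One way to build such a cohort variable is sketched below in base R, using the dates from the table above. It assumes, purely for illustration, that any record dated January to June belongs to the season that began in the previous calendar year; the cut-off month is an assumption and should be adapted to the study system.

```r
# Biological record dates from the table above (dd/mm/yyyy)
bdate <- as.Date(c("05/11/2014", "10/11/2014", "01/12/2014",
                   "12/12/2014", "01/01/2015", "07/01/2015"),
                 format = "%d/%m/%Y")

yr  <- as.integer(format(bdate, "%Y"))   # calendar year of each record
mon <- as.integer(format(bdate, "%m"))   # calendar month of each record

# Assumed rule: records from January to June are assigned to the previous year
cohort <- ifelse(mon < 7, yr - 1L, yr)

data.frame(Date = bdate, Cohort = cohort)
```

The resulting vector would then be supplied to slidingwin through the ‘cohort’ parameter, alongside the usual ‘bdate’ values.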

Spatial replication

Detecting climate signals using climwin can often require large amounts of data, particularly if the relationship between climate and biological response is weak [2]. Obtaining the required data through temporal replication can require data collection over multiple years, often decades; however, spatial replication may allow users to expand their sample size over a shorter period by collecting data from multiple sites/populations.

Using spatial replication assumes that the relationship between the biological response and climatic predictor is consistent across the different measured populations. Where this assumption is valid, spatial replication can help expand the amount of data available for climwin analyses.

A worked example

Spatially replicated data can be analysed using the slidingwin function with the addition of the ‘spatial’ parameter. As with regular slidingwin analysis, analysis with spatial replication requires separate biological and climate datasets. However, these datasets should now contain an additional variable that specifies the site at which the biological and climate data were collected. Below, we have called this variable ‘SiteID’.

Date          Mass (g)   SiteID
04/06/2015    120        A
05/06/2015    123        A
07/06/2015    110        B
07/06/2015    140        A
06/06/2015    138        B

Date          Temperature   SiteID
01/06/2015    15            A
02/06/2015    16            A
03/06/2015    12            A
04/06/2015    18            A
05/06/2015    20            A
06/06/2015    23            A
07/06/2015    21            A
01/06/2015    10            B
02/06/2015    12            B
03/06/2015    9             B
04/06/2015    5             B
05/06/2015    13            B
06/06/2015    10            B
07/06/2015    11            B

NOTE: The climate dataset for spatially replicated climwin analysis will often include duplication of dates. In a regular climwin analysis this will lead to errors.

With these new datasets, we can carry out a slidingwin analysis with the addition of a ‘spatial’ parameter.

MassWin <- slidingwin(xvar = list(Temp = Climate$Temp),
                      cdate = Climate$Date,
                      bdate = Biol$Date,
                      baseline = lm(Mass ~ 1, data = Biol),
                      cinterval = "day",
                      range = c(150, 0),
                      type = "absolute", refday = c(20, 05),
                      stat = "mean",
                      func = "lin",
                      spatial = list(Biol$SiteID, Climate$SiteID))

The ‘spatial’ parameter is a list item that includes the SiteID variable for the biological and climate datasets respectively. When slidingwin fits individual climate windows, climate data will be subset so that each biological record will be matched with the corresponding climate data.
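
The idea of matching each biological record to climate data from its own site can be sketched in base R with the two example tables above. This is only an illustration of the matching step: site_mean is a hypothetical helper, the three-day window is an arbitrary choice, and none of this is climwin's internal code.

```r
# Data copied from the two tables above
Biol <- data.frame(
  Date   = c("04/06/2015", "05/06/2015", "07/06/2015", "07/06/2015", "06/06/2015"),
  Mass   = c(120, 123, 110, 140, 138),
  SiteID = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)
Climate <- data.frame(
  Date   = rep(sprintf("%02d/06/2015", 1:7), times = 2),
  Temp   = c(15, 16, 12, 18, 20, 23, 21, 10, 12, 9, 5, 13, 10, 11),
  SiteID = rep(c("A", "B"), each = 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean temperature at the record's own site over the
# measurement day and the (ndays - 1) days before it
site_mean <- function(date, site, ndays = 3) {
  d     <- as.Date(date, format = "%d/%m/%Y")
  cdate <- as.Date(Climate$Date, format = "%d/%m/%Y")
  keep  <- Climate$SiteID == site & cdate <= d & cdate > d - ndays
  mean(Climate$Temp[keep])
}

# One site-specific climate value per biological record
Biol$WindowTemp <- mapply(site_mean, Biol$Date, Biol$SiteID, USE.NAMES = FALSE)
Biol
```

Note that the two records from 07/06/2015 receive different climate values because they come from different sites, which is why duplicated dates are acceptable in the spatially replicated climate dataset.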


weightwin function

When we run regular slidingwin analyses we assume that all days within the climate window are evenly weighted. While this is often a convenient assumption, this may be biologically unrealistic as we create a strict cut-off for when climate data is considered (see below).

In certain cases we may be interested in looking for climate windows where the importance of climate decays slowly over time. The function weightwin allows users to fit either Weibull (below left) or generalised extreme value (GEV; below right) weight distributions to climate data.

Rather than varying the start and end date of climate windows as slidingwin does, weightwin uses an optimisation function to vary the shape, scale and location of either of these weight functions. Each weight function is then used to weight the climate data, and the weighted climate data are used to produce a climate model and a \(\Delta AICc\) value. Therefore, although the method of optimising climate data is different, the ultimate output (i.e. \(\Delta AICc\) of a climate window compared to a null model) is the same.
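
To make the idea of a weighted window concrete, the base R sketch below builds a set of Weibull weights over a 151-day range and uses them to compute a weighted mean of a climate series. It is an illustration only: the rescaling of days to the interval (0, 1], the shape and scale values, and the simulated temperatures are all assumptions made for this example, and the parametrisation used inside weightwin may differ.

```r
# Days before the reference day (0 = reference day)
days <- 0:150

# Assumed rescaling of the time axis to (0, 1] for this sketch
x <- (days + 1) / length(days)

# Weibull density used as a weight function (shape and scale chosen arbitrarily)
w <- dweibull(x, shape = 3, scale = 0.2)
w <- w / sum(w)                      # normalise so the weights sum to 1

# Made-up daily temperatures, one per day in the range
set.seed(1)
temp <- rnorm(length(days), mean = 10, sd = 3)

weighted_mean <- sum(w * temp)       # days contribute according to their weight
uniform_mean  <- mean(temp)          # equal weighting of every day in the range

c(weighted = weighted_mean, uniform = uniform_mean)
```

An optimiser would then adjust the shape, scale and location of the weight function and refit the climate model each time, rather than moving hard window boundaries.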

A worked example

weightwin is often useful for climate data in which we have already identified a climate window using slidingwin. Here, we will use weightwin to further investigate the Mass and MassClimate data included with the climwin package.

The basic parameters in weightwin are the same as in slidingwin (though note the absence of the ‘stat’ parameter). In addition, however, we must designate which weight distribution we want to use. In this case we consider a Weibull function.

weightfunc = "W"

Next, we must set the shape, scale and location values for the starting distribution that will be used to begin the optimisation procedure. The default values (3, 0.2, 0) are often appropriate for fitting Weibull distributions. However, you can explore different parameter values using the explore function.

weight <- weightwin(xvar = list(Temp = MassClimate$Temp), cdate = MassClimate$Date, 
                    bdate = Mass$Date, 
                    baseline = lm(Mass ~ 1, data = Mass), 
                    range = c(150, 0), 
                    func = "lin", type = "absolute", 
                    refday = c(20, 5), 
                    weightfunc = "W", cinterval = "day",
                    par = c(3, 0.2, 0))

As part of the weightwin function a plot will be generated showing the progress of the optimisation function. Most of this information is useful for assessing the effectiveness of the optimisation function. For our purposes however, we will focus only on the final weighted window function (top left).

In the above plot, we can see that the importance of temperature declines rapidly as we near May 20th. However, temperature later in time declines less rapidly. We can extract the \(\Delta AICc\) value for this weight function below.

weight$WeightedOutput$deltaAICc

If we compare this value to the delta AICc obtained from the slidingwin function we can see that the weightwin function is better able to explain variation in our mass parameter.

slidingwin   weightwin
-64.81       -68.22

Pros and cons

weightwin can provide greater detail on the relationship between climate and the biological response, such as the occurrence of exponential functions. Additionally, by using more diverse weight distributions, weightwin will often generate models with better \(\Delta AICc\) values, which may be especially important when users are most interested in achieving high explanatory power. Furthermore, by using an optimisation routine weightwin often tests far fewer models than slidingwin, allowing for more rapid analysis.

Despite these benefits, weightwin will not always be the most appropriate function for all scenarios. Firstly, the nature of the fitted weight distributions means that weightwin can only detect single climate signals, which forces users to detect and compare potential climate signals with separate analyses.

Secondly, weightwin can be a more technical process. While the above example works easily, optimisation procedures can get stuck on false optima or fail to converge. In these cases, users may be required to test different starting parameters and adjust optimisation characteristics such as step size. This procedure can often be prohibitive for users with less technical knowledge.
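
The value of trying several starting values can be illustrated with base R's optim on a toy function that has two minima. Nothing in this sketch is climwin code: the objective f and the starting values are made up for the example. With weightwin, the analogous step would be re-running the analysis with different values of ‘par’ and comparing the resulting \(\Delta AICc\).

```r
# Toy objective with a local minimum near x = 1 and a lower one near x = -1
f <- function(x) (x^2 - 1)^2 + 0.3 * x

# Run the optimiser from several (arbitrary) starting values;
# different starts may end in different minima
starts <- c(-1, 0.5, 1)
fits <- lapply(starts, function(s) optim(par = s, fn = f, method = "BFGS"))

# Compare the objective value reached by each run and keep the lowest
values <- vapply(fits, function(fit) fit$value, numeric(1))
best <- fits[[which.min(values)]]

data.frame(start = starts, value = values)
best$par
```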

Finally, weightwin can only be used for testing mean climate, with no capacity to consider other aggregate statistics. Therefore, whether one chooses to use weightwin or slidingwin will largely depend on the summary statistic used, the level of detail desired, and one's technical knowledge.


To ask additional questions or report bugs please e-mail:

liam.bailey@anu.edu.au