Animal trajectory analysis with trajr

Jim McLean

2018-05-10

Introduction

The trajr R package is a toolkit for the numerical characterisation and analysis of the trajectories of moving animals. Trajectory analysis is applicable in fields as diverse as optimal foraging theory, migration, and behavioural mimicry (e.g. for verifying similarities in locomotion). A trajectory is simply a record of the path followed by a moving animal. Trajr operates on trajectories in the form of a series of locations (as x, y coordinates) with times. Trajectories may be obtained by any method which provides this information, including manual tracking, radio telemetry, GPS tracking, and motion tracking from videos.

The goal of this package (and this document) is to aid biological researchers, who may not have extensive experience with R, to analyse trajectories without being handicapped by a limited knowledge of R or programming. However, a basic understanding of R is required.

Within this document, the plots are generated by running the code examples immediately before them. For that reason, most code examples start with a call to TrajGenerate, which generates a random trajectory to be plotted. These calls would not occur in an analysis of real animal trajectories.

If you use trajr in your publications, please cite McLean DJ, Skowron Volponi MA. trajr: An R package for characterisation of animal trajectories. Ethology. 2018;00:1–9. https://doi.org/10.1111/eth.12739.

Installation and setup

Before using any R package, it is first necessary to install and load it. To install the latest released version of trajr, simply run:

To install the latest development version, which is available on GitHub, run:

Once the package has been installed, you must load it, then it’s ready for use:

Packages only need to be installed once, but they must be loaded in every session before they can be used.

Once the trajr package has been installed and loaded, you have access to the set of R functions it provides. All trajr functions (other than the generic functions plot, lines and points) start with the prefix Traj. Several of the functions provided by trajr do not provide additional analytic functionality, rather they make it easier to work with multiple trajectories. These convenience functions are prefixed with Trajs.

Trajectories in trajr

Trajr works with trajectories, where a trajectory is a simplification of a real path travelled by an animal, and is a set of 2-dimensional spatial coordinates with a third temporal dimension. A trajectory can also be thought of as a series of steps, each comprised of a length (Li), a turning angle (\(\Delta\)i), and a time. While trajectories have been modelled in various ways, trajr works with two theoretical models: correlated random walks in which the turning angle of each step is the direction of the previous step ± some error; and directed walks, or compass-based navigation, in which the angular errors at each step are added to the “ideal” or compass direction. Trajr generally assumes correlated random walks, but provides some support for directed walks.

A correlated random walk with 5 steps. The direction of step i is the direction of step (i - 1) + \Deltai.

A correlated random walk with 5 steps. The direction of step \(i\) is the direction of step \((i - 1) + \Delta\)i.

A directed walk with 5 steps. The direction of step i is the compass direction (here 0^\circ) + \Deltai.

A directed walk with 5 steps. The direction of step \(i\) is the compass direction (here 0\(^\circ\)) + \(\Delta\)i.

Within trajr, trajectories are represented by R objects with class Trajectory. The TrajFromCoords function is used to create a Trajectory object from a set of x-y coordinates, with optional times.

# Define x, y, and time coordinates
coords <- data.frame(x = c(1, 1.5, 2, 2.5, 3, 4), 
                     y = c(0, 0, 1, 1, 2, 1), 
                     times = c(0, 1, 2, 3, 4, 5))
# Create a trajectory from the coordinates
trj <- TrajFromCoords(coords)

# Plot it
plot(trj)
Trajectory created from coordinates.

Trajectory created from coordinates.

Creating Trajectories

In practice, trajectory coordinates are typically read from a data file such as a CSV file, and then passed in to the TrajFromCoords function.

TrajFromCoords assumes that the first column in coords contains x values, the second contains y values, and the third contains time values. The help page for TrajFromCoords describes how to alter these defaults - run ?TrajFromCoords to access the help page. Many functions (including TrajFromCoords) contain code examples as part of their help information. You can also display and run an example by calling the example function: i.e. example("TrajFromCoords").

Scaling Trajectories

Trajectories may need to be scaled before analysis. For instance, when a trajectory is digitised from video, the spatial units will be pixels, and require conversion to more meaningful units such as metres. Trajectories are scaled by calling TrajScale, specifying the scale factor and the abbreviation of the transformed units. In the case of a trajectory digitised from a video, you can calculate scale as \(width^{m}/width^{p}\), where \(width^{m}\) is the width of an object in metres, and \(width^{p}\) is the width of the same object in pixels, as measured from the video. A commonly used method to determine the scale is to include a measuring tape in the video, then measure the number of pixels in a length of the tape.

Smoothing trajectories

Many trajectories will suffer from noise which will bias the results of some analyses. The TrajSmoothSG function reduces high frequency noise while preserving the shape of the trajectory, by applying a Savitzky-Golay smoothing filter. It requires two arguments (in addition to the trajectory), n, the filter order, and m, the filter length (which must be odd). Refer to the help for the function sgolayfilt (run ?signal::sgolayfilt) for details of the filter function.

Effect of smoothing a trajectory.

Effect of smoothing a trajectory.

Resampling trajectories

Some functions require a Trajectory with a fixed step length. The process of resampling a trajectory to a fixed step length is called rediscretization. The function TrajRediscretize implements trajectory resampling using the algorithm described by Bovet & Benhamou (1988). There are no clear guidelines for choosing a suitable step length for rediscretization: too small a step length leads to oversampling, leading to high autocorrelation between steps and high variability; too large a step length results in undersampling and loss of information. See Turchin (1998) section 5.2.2 for a discussion.

Rediscretization of trajectory with \mu_{L} = 2 to step length 1.

Rediscretization of trajectory with \(\mu_{L} = 2\) to step length \(1\).

Other trajectory operations

This table lists trajectory operations not otherwise addressed by this document. Refer to the help for each function for information.

Function Description
TrajRotate Rotates a trajectory
TrajTranslate Translates a trajectory
TrajReverse Reverses a trajectory
TrajGetFPS Returns the frames-per-second of a trajectory
TrajGetNCoords Returns the number of coordinates of a trajectory
TrajGetUnits Returns the spatial units of a trajectory
TrajGetTimeUnits Returns the temporal units of a trajectory
TrajStepLengths Returns the lengths of each step within the trajectory
TrajLength Returns the total length of the trajectory (or a portion)
TrajDistance Returns the straight-line distance from the start to the end of the trajectory (or a portion)
TrajDuration Returns the temporal duration of the trajectory (or a portion)
TrajMeanVelocity Returns the mean velocity vector of the trajectory (or a portion)
TrajAngles Returns the turning angles (radians) of a trajectory
TrajMeanVectorOfTurningAngles Returns the mean vector of the turning angles
TrajExpectedSquareDisplacement Returns the expected square displacement of a correlated random walk

Trajectory analysis

Once you have obtained a correctly scaled Trajectory object, you may call various functions to analyse the trajectory. Broadly speaking, analysis may be divided into measures of speed or acceleration, and measures of straightness or tortuosity.

Analysing speed

The TrajDerivatives function calculates linear speed and acceleration along a Trajectory. If the trajectory is noisy, it should be smoothed before the derivatives are calculated (see Smoothing trajectories).

Speed and acceleration over time.

Speed and acceleration over time.

Once the trajectory speed and acceleration have been obtained, it is simple to calculate values such as mean, maximum, minimum or standard deviation of speed (or acceleration): mean(derivs$speed), max(derivs$speed), min(derivs$speed), or sd(derivs$speed). Similarly, calculations excluding periods when speed drops below (or above) some value are also straightforward, for example: mean(derivs$speed[deriv$speed > STOPPED_SPEED]). In addition, the function TrajSpeedIntervals allows you to determine the time intervals within the trajectory that satisfy a speed constraint, either slower than some speed, faster than a speed, or both. This may be useful, for example, when testing how often, or for how long, a flying insect hovers, or a running insect stops, i.e. when its speed drops below some threshold. The object returned by a call to TrajSpeedIntervals can be plotted.

#>   startFrame startTime stopFrame  stopTime  duration
#> 1         11 0.2337088        27 0.5488033 0.3150945
#> 2         85 1.7073675        94 1.8866781 0.1793107
Speed over time, hovering intervals (speed < 2 m/s) highlighted.

Speed over time, hovering intervals (speed < 2 m/s) highlighted.

Analysing straightness

Straightness index

Various methods to measure the straightness, or conversely, tortuosity, of trajectories are available within trajr. The simplest is \(D/L\), where \(D\) is the distance from the start to the end of the trajectory, and \(L\) is the length of the trajectory. This straightness index is calculated by the function TrajStraightness, and is a number ranging from 0 to 1, where 1 indicates a straight line. Benhamou (2004) considers the straightness index to be a reliable measure of the efficiency of a directed walk, but inapplicable to random trajectories. Batschelet (1981) considers this straightness index to be an approximation of r, which is the length of the mean vector of turning angles after rediscretizing to a constant step length. r can be calculated by calling Mod(TrajMeanVectorOfTurningAngles(trj)), assuming trj is a Trajectory with constant step length.

Within this section, most of the figures show the appropriate statistic plotted for multiple randomly generated correlated random walks which vary in the standard deviations of angular errors, \(\sigma_{\Delta}\) (higher values of \(\sigma_{\Delta}\) produce more tortuous trajectories).

Two straightness indices (r, dots, and D/L, crosses) as a function of \sigma_{\Delta}, plotted for several different step sizes (linear axes).

Two straightness indices (r, dots, and D/L, crosses) as a function of \(\sigma_{\Delta}\), plotted for several different step sizes (linear axes).

Sinuosity

The sinuosity index defined by Benhamou (2004) may be an appropriate measure of the tortuosity of a random search path. Sinuosity is a function of the mean cosine of turning angles, and is a corrected form of the original sinuosity index defined by Bovet & Benhamou (1988). The function TrajSinuosity2 calculates the corrected form while TrajSinuosity calculates the original index. Sinuosity should be calculated for a trajectory with a constant step length, so trajectories may first require rediscretization. In the absence of a biologically meaningful step size, rediscretizing to the mean step length of the trajectory will produce a trajectory with approximately the same shape and number of steps as the original.

Sinuosity as a function of \sigma_{\Delta}, plotted for several different step sizes (linear axes).

Sinuosity as a function of \(\sigma_{\Delta}\), plotted for several different step sizes (linear axes).

Emax

\(E^{a}_{max}\) is a dimensionless estimate of the maximum expected displacement of a trajectory. Larger \(E^{a}_{max}\) values (approaching infinity) represent straighter paths (Cheung, Zhang, Stricker, & Srinivasan, 2007). \(E^{b}_{max}\) is \(E^{a}_{max}\) multiplied by the mean step length, so gives the maximum possible displacement in spatial units. TrajEmax returns the value of \(E^{a}_{max}\) for a trajectory, or \(E^{b}_{max}\) if the argument eMaxB is TRUE. TrajEmax differentiates between random and directed walks; see ?TrajEmax for details.

Emax as a function of \sigma_{\Delta} (logarithmic axes).

Emax as a function of \(\sigma_{\Delta}\) (logarithmic axes).

Fractal dimension

Fractal dimension has been considered a promising measure of straightness or tortuosity, varying between 1 (straight line movement) and 2 (Brownian motion). However, several studies have found it inappropriate for use with animal trajectories, as animal trajectories are not fractal curves, and the fractal dimension of a non-fractal curve depends critically on the range of step sizes used to calculate it (Nams, 2006; Turchin, 1996). Nonetheless, fractal dimension continues to be used, and the TrajFractalDimension function, by default, calculates fractal dimension using the modified dividers method to account for truncation error (Nams, 2006). See ?TrajFractalDimension and ?TrajFractalDimensionValues for more information.

Fractal dimension as a function of \sigma_{\Delta}.

Fractal dimension as a function of \(\sigma_{\Delta}\).

Directional change (DC and SDDC)

Directional change is defined as the change in direction over time, and has been used to assess locomotor mimicry in butterflies (Kitamura & Imafuku, 2015). Directional change is defined for each pair of steps, so a trajectory (or a portion thereof) may be characterised by the mean (DC) and standard deviation (SDDC) of all directional changes. DC may be used as an index of nonlinearity, and SDDC as a measure of irregularity. The directional changes for a trajectory are calculated by the function TrajDirectionalChange, so SD may be calculated as mean(TrajDirectionalChange(trj)), and SDDC as sd(TrajDirectionalChange(trj)). Note that directional change cannot be calculated on rediscretized trajectories, as rediscretizing discards time information.

DC and SDDC as a function of \sigma_{\Delta}.

DC and SDDC as a function of \(\sigma_{\Delta}\).

Direction autocorrelation

The direction autocorrelation function was devised to detect and quantify regularities within trajectories, in the context of locomotor mimicry of ant-mimics (Shamble, Hoy, Cohen, & Beatus, 2017). It detects wave-like periodicities, and provides an indication of their wavelength and amplitude, so can be used to detect, for example, the winding movements of trail-following ants. The function \(C(\Delta s)\) is applied to a rediscretized trajectory, and calculates the differences in step angles at all steps separated by \(\Delta\), for a range of \(\Delta s\). The position of the first local minimum in \(C(\Delta s)\) (i.e. the 2-dimensional position \((\Delta s, c(\Delta s))\)) may be used to characterise the periodicity within a trajectory.

The function is calculated by TrajDirectionAutocorrelations, and the result may be plotted. The position of the first local minimum (or maximum) is calculated by TrajDAFindFirstMinimum (or TrajDAFindFirstMaximum). Some trajectories will not have a first local minimum (or maximum), which indicates that they do not contain regular changes in direction.

Direction autocorrelation of a random trajectory with first local minimum.

Direction autocorrelation of a random trajectory with first local minimum.

Working with multiple trajectories

Building multiple trajectories

As trajectory studies are likely to involve multiple trajectories per individual, per species, or trajectories for multiple species, trajr provides some functions to simplify the manipulation of multiple trajectories. TrajsBuild assumes that you have a set of trajectory files (such as CSV files), each of which may have a scale and a frames-per-second value. The function reads each of the files specified in a list of file names, optionally using a custom function, then passes the result to TrajFromCoords, which assumes that the first column is x, the second is y, and there is no time column. If these settings are incorrect, the csvStruct argument can be specified to identify the x, y and time columns (see the example below). TrajsBuild optionally scales the trajectories with the specified scale (by calling TrajScale), and optionally smooths them with the specified smoothing parameters (by calling TrajSmoothSG). The value returned is a list of Trajectory objects.

Characterising multiple trajectories

TrajsMergeStats simplifies the construction of a data frame of values, with one row per trajectory. To use it, you need a list of trajectories (which you will probably obtain from a call to TrajsBuild), and a function which calculates statistics for a single trajectory. The following example demonstrates how you might write and use such a function.

#>         mean_speed    sd_speed  sinuosity       Emax min_deltaS
#> deltaS 0.061296301 0.009069424  0.9833243 3180.72322         33
#> 2      0.010573501 0.005718966 20.1799815  100.43632         50
#> 3      0.014175447 0.005206138 10.5781943  692.21342         NA
#> 4      0.006245555 0.003005911  6.3442130   67.52479          7
#> 5      0.005344287 0.007936181 60.5580878   14.90601         NA
#> 6      0.019378799 0.005363627  3.3069095   74.33277         35
#> 7      0.014796829 0.002573456  2.4142475  112.93798         NA
#>              min_C
#> deltaS 0.899013229
#> 2      0.303049396
#> 3               NA
#> 4      0.930992488
#> 5               NA
#> 6      0.002917922
#> 7               NA

PCA analysis of multiple trajectories

Principal components analysis (PCA) is a commonly used statistical technique which can be used to visualise similarities and differences between groups of trajectories. In R, the function prcomp is usually used to perform a PCA. Prcomp will not process data containing NAs (in R, the value NA is used to indicate a ‘Not Available’, or missing, value). However, NAs can be potentially informative, especially when using the first local minimum of the direction autocorrelation function - in this case, an NA indicates that there was no periodicity detected in the trajectory. To avoid discarding this information, Trajr provides a utility function, TrajsStatsReplaceNAs. It is used to replace NA values with some other value, and optionally adds an additional column to the data which flags trajectories which originally contained NA values. It operates on a single column at a time.

PCA of trajectory characterisations.

PCA of trajectory characterisations.

The plot above shows that the trajectory of the non-mimetic spider (red dot) seems quite different from the trajectories of ant mimicking bugs, ant-mimicking spiders (blue dots) and ants (black dots). The arrows indicate the importance and direction of the parameters that separate the trajectories. From the plot, it seems that the non-mimetic spider varies mainly in speed, (it is faster and more variable) and Emax (it walks in a straighter path). While these trajectories have been digitised from videos of walking animals, clearly this example does not contain enough data to be meaningful.

Other operations

The function TrajsStepLengths returns a vector containing all of the step lengths in a list of trajectories.

#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
#> 1.000e-10 7.813e-05 2.266e-04 2.266e-04 3.243e-04 1.550e-03

Random trajectory generation

Trajr allows you to generate random trajectories with the TrajGenerate function. Random trajectories may be used in simulation studies, or simply for experimenting with trajectory analysis, and it has been used to generate the majority of trajectories in this document. By default, TrajGenerate returns a correlated random walk, however, by specifying different arguments, a variety of trajectory types can be created.

# Correlated random walk
trj <- TrajGenerate(n = 200)
# Plot it
plot(trj)
mtext("a)", 3, -1.3, adj = .05)

# Directed walk
trj <- TrajGenerate(n = 20, random = FALSE)
plot(trj)
mtext("b)", 3, -1.3, adj = .05)

# Brownian motion
trj <- TrajGenerate(n = 500, angularErrorDist = function(n) stats::runif(n, -pi, pi))
plot(trj)
mtext("c)", 3, -1.3, adj = .05)

# Levy walk - path lengths follow a cauchy distribution
trj <- TrajGenerate(linearErrorDist = stats::rcauchy)
plot(trj)
mtext("d)", 3, -1.3, adj = .05)
Some generated trajectories. a) correlated walk, b) directed walk, c) brownian motion, d) Levy walk.

Some generated trajectories. a) correlated walk, b) directed walk, c) brownian motion, d) Levy walk.

And finally…

Trajr is open source, with the source available on Github. If you have any problems or suggestions, email jim_mclean@optusnet.com.au.

References

Batschelet, E. (1981). Circular statistics in biology. ACADEMIC PRESS, 111 FIFTH AVE., NEW YORK, NY 10003, 1981, 388.

Benhamou, S. (2004). How to reliably estimate the tortuosity of an animal’s path. Journal of Theoretical Biology, 229(2), 209-220.

Benhamou, S. (2006). Detecting an orientation component in animal paths when the preferred direction is individual-dependent. Ecology, 87(2), 518-528.

Bovet, P., & Benhamou, S. (1988). Spatial analysis of animals’ movements using a correlated random walk model. Journal of Theoretical Biology, 131(4), 419-433.

Cheung, A., Zhang, S., Stricker, C., & Srinivasan, M. V. (2007). Animal navigation: the difficulty of moving in a straight line. Biological Cybernetics, 97(1), 47-61.

Kitamura, T., & Imafuku, M. (2015). Behavioural mimicry in flight path of Batesian intraspecific polymorphic butterfly Papilio polytes. Proceedings of the Royal Society B: Biological Sciences, 282(1809).

Nams, V. O. (2006). Improving Accuracy and Precision in Estimating Fractal Dimension of Animal movement paths. Acta Biotheoretica, 54(1), 1-11.

Shamble, P. S., Hoy, R. R., Cohen, I., & Beatus, T. (2017). Walking like an ant: a quantitative and experimental approach to understanding locomotor mimicry in the jumping spider Myrmarachne formicaria. Proceedings of the Royal Society B: Biological Sciences, 284(1858).

Turchin, P. (1996). Fractal Analyses of Animal Movement: A Critique. Ecology, 77(7), 2086-2090.

Turchin, P. (1998). Quantitative analysis of movement. Sunderland (Mass.): Sinauer assoc.