The basic work-flow behind the PACE approach for sparse functional data is as follows (see eg. (Yao, Müller, and Wang 2005; Liu and Müller 2009) for more information):
As a working assumption a dataset is treated as sparse if it has on average less than 20, potentially irregularly sampled, measurements per subject. A user can manually change the automatically determined dataType
if that is necessary. For densely observed functional data simplified procedures are available to obtain the eigencomponents and associated functional principal components scores (see eg. (Castro, Lawton, and Sylvestre 1986) for more information). In particular in this case we:
In the case of sparse FPCA the most computational intensive part is the smoothing of the sample’s raw covariance function. For this, we employ a local weighted bilinear smoother.
A sibling MATLAB package for fdapace
can be found in here.
The simplest scenario is that one has two lists yList
and tList
where yList
is a list of vectors, each containing the observed values \(Y_{ij}\) for the \(i\)th subject and tList
is a list of vectors containing corresponding time points. In this case one uses:
FPCAobj <- FPCA(Ly=yList, Lt=tList)
The generated FPCAobj
will contain all the basic information regarding the desired FPCA.
library(fdapace)
# Set the number of subjects (N) and the
# number of measurements per subjects (M)
N <- 200;
M <- 100;
set.seed(123)
# Define the continuum
s <- seq(0,10,length.out = M)
# Define the mean and 2 eigencomponents
meanFunct <- function(s) s + 10*exp(-(s-5)^2)
eigFunct1 <- function(s) +cos(2*s*pi/10) / sqrt(5)
eigFunct2 <- function(s) -sin(2*s*pi/10) / sqrt(5)
# Create FPC scores
Ksi <- matrix(rnorm(N*2), ncol=2);
Ksi <- apply(Ksi, 2, scale)
Ksi <- Ksi %*% diag(c(5,2))
# Create Y_true
yTrue <- Ksi %*% t(matrix(c(eigFunct1(s),eigFunct2(s)), ncol=2)) + t(matrix(rep(meanFunct(s),N), nrow=M))
L3 <- MakeFPCAInputs(IDs = rep(1:N, each=M), tVec=rep(s,N), t(yTrue))
FPCAdense <- FPCA(L3$Ly, L3$Lt)
# Plot the FPCA object
plot(FPCAdense)
# Find the standard deviation associated with each component
sqrt(FPCAdense$lambda)
## [1] 5.050606 1.999073
# Create sparse sample
# Each subject has one to five readings (median: 3)
set.seed(123)
ySparse <- Sparsify(yTrue, s, sparsity = c(1:5))
# Give your sample a bit of noise
ySparse$yNoisy <- lapply(ySparse$Ly, function(x) x + 0.5*rnorm(length(x)))
# Do FPCA on this sparse sample
# Notice that sparse FPCA will smooth the data internally (Yao et al., 2005)
# Smoothing is the main computational cost behind sparse FPCA
FPCAsparse <- FPCA(ySparse$yNoisy, ySparse$Lt, list(plot = TRUE))