The LUCIDus R package is an integrative tool for jointly estimating latent (unknown) clusters or subgroups from multi-omics data and phenotypic traits. It implements the statistical method proposed in the paper “A Latent Unknown Clustering Integrating Multi-Omics Data (LUCID) with Phenotypic Traits”, published in Bioinformatics.
Cheng Peng, Jun Wang, Isaac Asante, Stan Louie, Ran Jin, Lida Chatzi, Graham Casey, Duncan C Thomas, David V Conti. A Latent Unknown Clustering Integrating Multi-Omics Data (LUCID) with Phenotypic Traits. Bioinformatics, btz667. https://doi.org/10.1093/bioinformatics/btz667
You can install the released version of LUCIDus from CRAN directly with:
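As a minimal sketch, a CRAN release is normally installed with base R's `install.packages()`. The lines below assume network access and a configured CRAN mirror:

```r
# Install the released version from CRAN (requires network access)
install.packages("LUCIDus")

# Load the package into the current session
library(LUCIDus)
```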
Or, it can be installed from GitHub using the following code:
Three functions, est_lucid(), boot_lucid(), and tune_lucid(), are currently available for model fitting and selection. The model outputs can be summarized and visualized using summary_lucid() and plot_lucid(), respectively. Predictions can be made with pred_lucid().
est_lucid()
Estimates latent clusters from multi-omics data. Missing values in the biomarker data are allowed, and information from the outcome of interest can be integrated.
For a testing dataset with 10 genetic features (5 causal) and 4 biomarkers (2 causal)
summary_lucid()
plot_lucid()
boot_lucid()
Bootstrap method to obtain standard errors (SEs) for LUCID parameter estimates
set.seed(10)
boot_lucid(G = G1, CoG = CoG, Z = Z1, Y = Y1, CoY = CoY, useY = TRUE, family = "binary", K = 2, R=500)
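To illustrate the general idea behind a bootstrap standard error, here is a minimal base-R sketch. It is not the package's internal procedure: the simulated data, the logistic-regression slope used as the estimator, and the helper `fit_slope()` are all hypothetical stand-ins chosen so the example runs without LUCIDus.

```r
# Nonparametric bootstrap of a standard error, base R only.
# Assumption: the estimator is a logistic regression slope, used as a
# stand-in for a model parameter; this is not the LUCID model.
set.seed(10)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 * x))
dat <- data.frame(x = x, y = y)

# Hypothetical helper: fit the model on one data set and return the slope
fit_slope <- function(d) {
  unname(coef(glm(y ~ x, family = binomial(), data = d))["x"])
}

R <- 200  # number of bootstrap replicates (kept small so the sketch runs fast)
boot_est <- vapply(seq_len(R), function(r) {
  idx <- sample.int(nrow(dat), replace = TRUE)  # resample rows with replacement
  fit_slope(dat[idx, , drop = FALSE])           # refit on the resampled data
}, numeric(1))

# The bootstrap SE is the standard deviation of the replicate estimates
boot_se <- sd(boot_est)
boot_se
```

A larger `R` gives a more stable SE estimate at the cost of more model fits, which is why the replicate count is usually the main runtime driver.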
tune_lucid()
Grid search for tuning parameters using parallel computing
# Best run on a server or HPC cluster
set.seed(10)
GridSearch <- tune_lucid(G=G1, Z=Z1, Y=Y1, K=2, Family="binary", USEY = TRUE,
                         LRho_g = 0.008, URho_g = 0.012, NoRho_g = 3,
                         LRho_z_invcov = 0.04, URho_z_invcov = 0.06, NoRho_z_invcov = 3,
                         LRho_z_covmu = 90, URho_z_covmu = 110, NoRho_z_covmu = 2)
GridSearch$Results
GridSearch$Optimal
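The size of such a search can be estimated before running it. The sketch below assumes, without confirmation from the package documentation, that each lower bound, upper bound, and count triple describes an evenly spaced grid and that all combinations are evaluated. The column names are illustrative only.

```r
# Assumption: each (lower, upper, number) triple defines an evenly spaced grid
rho_g        <- seq(0.008, 0.012, length.out = 3)
rho_z_invcov <- seq(0.04, 0.06, length.out = 3)
rho_z_covmu  <- seq(90, 110, length.out = 2)

# All combinations of the three tuning parameters (illustrative column names)
grid <- expand.grid(Rho_G = rho_g,
                    Rho_Z_InvCov = rho_z_invcov,
                    Rho_Z_CovMu = rho_z_covmu)

# Number of candidate models a full grid search would fit
nrow(grid)
```

Under these assumptions the search covers 3 x 3 x 2 = 18 candidate settings, so runtime grows multiplicatively with each count.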
Run LUCID with best tuning parameters and select informative features
set.seed(10)
IntClusFit <- est_lucid(G = G1, Z = Z1, Y = Y1, K = 2, family = "binary", Pred = TRUE,
                        tunepar = def_tune(Select_G = TRUE, Select_Z = TRUE,
                                           Rho_G = 0.01, Rho_Z_InvCov = 0.06, Rho_Z_CovMu = 90))
# Identify selected features (summarize the fit once and reuse the result)
FitSummary <- summary_lucid(IntClusFit)
FitSummary$No0G; FitSummary$No0Z
colnames(G1)[FitSummary$select_G]; colnames(Z1)[FitSummary$select_Z]
# Select the features; drop = FALSE keeps a matrix even if only one column is selected
if (any(FitSummary$select_G)) {
  G_select <- G1[, FitSummary$select_G, drop = FALSE]
}
if (any(FitSummary$select_Z)) {
  Z_select <- Z1[, FitSummary$select_Z, drop = FALSE]
}
IntClusCoFit <- est_lucid(G = G1, CoG = CoG, Z = Z1, Y = Y1, K = 2, family = "binary", Pred = TRUE,
                          initial = def_initial(), itr_tol = def_tol(),
                          tunepar = def_tune(Select_G = TRUE, Select_Z = TRUE,
                                             Rho_G = 0.02, Rho_Z_InvCov = 0.1, Rho_Z_CovMu = 93))
summary_lucid(IntClusCoFit)
For more details, see the documentation for each function in the R package.
The current version is 1.0.0.
For the available versions, see the Releases page of this repository.
This project is licensed under the GPL-2 License.