The R package bigstatsr provides functions for fast statistical analysis of large-scale data encoded as matrices. The package can handle matrices that are too large to fit in memory thanks to memory-mapping to binary files on disk. This format is very similar to the big.matrix format provided by the R package bigmemory, which is no longer used by this package.
Introduction to package bigstatsr
LIST OF FEATURES
Note that most of the algorithms of this package don’t handle missing values.
# For the CRAN version
install.packages("bigstatsr")

# For the current development version
devtools::install_github("privefl/bigstatsr")

# For the first version (depending on package bigmemory)
devtools::install_github("privefl/bigstatsr", ref = "v-bigmemory")
As inputs, package bigstatsr uses Filebacked Big Matrices (FBM).
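As a rough illustration of what working with an FBM looks like, the sketch below creates a small one backed by a temporary file and accesses it like a standard matrix. The dimensions and values are arbitrary choices made for this example only.

```r
library(bigstatsr)

# Create a 10 x 5 FBM of doubles, initialized with zeros.
# By default the data is stored in a temporary file on disk.
X <- FBM(10, 5, init = 0)

# Write to and read from the FBM with the usual matrix accessors
X[, 1] <- 1:10
first_values <- X[1:3, 1]

# Dimensions and location of the binary file that holds the data
dim(X)
X$backingfile
```

Only the blocks that are accessed are read into memory, which is what makes this approach workable for matrices larger than the available RAM.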
Please open an issue if you find a bug. If you want help using bigstatsr, please post on Stack Overflow with the tag bigstatsr (not yet created). See "How to make a great R reproducible example?" for guidance on writing your question.
Parallelization: package bigstatsr uses package foreach for its parallelization tasks. Learn more about parallelism with foreach in this tutorial.
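In practice, parallelism is usually requested through the `ncores` parameter of the package's functions. The following minimal sketch computes column sums block-wise with `big_apply()`; the matrix size and the choice of two cores are assumptions made for the example, so adjust `ncores` to your machine.

```r
library(bigstatsr)

# Small FBM filled with ones (illustrative data only)
X <- FBM(100, 20, init = 1)

# Apply a function to blocks of columns and combine the results with c().
# Each block is read into memory as a standard matrix before colSums() runs.
sums <- big_apply(
  X,
  a.FUN = function(X, ind) colSums(X[, ind, drop = FALSE]),
  a.combine = "c",
  ncores = 2
)
```

For a matrix this small the overhead of starting workers outweighs any gain; the benefit only shows on large matrices.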
Computing the null space of a bigmatrix in R (works if one dimension is not too large)
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.