Parallel Modes of sperrorest

Patrick Schratz

Introduction

sperrorest is parallelized by default from v2.0.0 and higher.

Most users are not familiar with parallelization and have no time/motivation to wrap their head around it. Instead, they just accept to wait “a bit” longer until the process finishes.

While this is no problem for “quick” cross-validation (CV) cases with a low number of repetitions and models which converge quickly, in some cases processing may take up to several months. For example, running a spatial cross-validation using a Generalized Linear Mixed Model (GLMM) with both random effects and a spatial autocorrelation structure on around 1000 observations takes roughly this time, if executed sequentially. Most of the fitting time hereby is devoted to the integration of the spatial autocorrelation structure.

sperrorest comes with four different parallelization modes and also offers sequential execution.

Unless specified otherwise, all cores of the machine are used. Limiting the number of cores makes sense in cases when you want to do other work on your machine while running a cross-validation so that your system stays responsive. Also, if you are working on a server and have, let’s say, 48 cores available and want to do a 100 repetition CV. Since most models take roughly the same time to fit, it would be smart to use 34 cores. Taking this number of cores is faster than using 48 because

  1. You need 3 iterations (34 in the first, 68 in the second and finishing in the 3rd) to process all repetitions. During the third iteration, a lot of cores would do nothing else but just wait for the others to finish.

  2. The parallelization overhead, which is mainly caused by splitting and combining all jobs to the workers, would be higher for the case with 48 cores than for 34 cores. Hence, 34 cores will finish faster than 48 cores on 100 repetitions. Of course, when taking 50 cores it would only need 2 worker iterations to process everything which would again speed up the process.

The future backend

All modes expect "apply" (including the sequential one) are running on the parallel API of the future package. It offers a unified, cross-platform API combining all other existing parallel approaches of R into one package. Besides the variety of parallel options to choose from (multiprocess, multisession, multicore, cluster, etc.) it also provides a sequential option. Every options is initiated in the same way:

library(future)
registerDoFuture()

plan("sequential") # sequential
plan("multicore") # parallel (Unix only)
plan("multisession") # parallel
plan("multiprocess") # parallel
plan("cluster") # parallel

Every option has its advantages and disadvantages. Check the future package vignettes for more information.

Mode “foreach”

Unless specified otherwise, the default parallel mode uses foreach with the "cluster" option of the future package. Package doFuture takes care that foreach works with the parallel initialization of the future package.

This option is taken as default because it works cross-platform and provides progress output to the console. Unfortunately, on Windows this output is not shown to the console but needs to be written to a file (default to the current working directory). Another downside is that the global environment needs to copied to every worker before processing starts. Workers are started sequentially and therefore the startup of > 10 workers may take some seconds.

Mode “apply”

This mode is also cross-platform but uses different functions on Unix/non-Unix systems for actual processing. On Unix, it uses the pbmcapply package which combines the pbapply package (provides progress bar for ‘apply’ functions) and the future package to speed up processing. On Windows, pbapply is used which in the end uses parApply() to setup a cluster like parallelization including a progress bar.

Mode “future”

This modes entirely uses the future package in combination with future_lapply() as the working horse. It can be used with any future plan specified via par_option. It is the fastest mode but provides no progress output.

Mode “sequential”

This mode executes sperrorest() sequentially. It also runs on the future API using foreach/doFuture which provide the possibility of sequential execution using plan("sequential").

Performance comparison

Example setup:

Note that the only argument which needs to be changed is par_mode here. Subsequently, par_mode = "foreach", par_mode = "apply" and par_mode = "future" were used.

All default settings of each mode were used. par_mode = "foreach" runs on plan("cluster") while par_mode = "future" runs on plan("multiprocess"). Mode "apply" used pbmcapply in the end since the test was running on a Unix System.

data(ecuador)
fo <- slides ~ dem + slope + hcurv + vcurv + log.carea + cslope

sperrorest(data = ecuador, formula = fo,
           model_fun = glm, model_args = list(family = "binomial"),
           pred_args = list(type = "response"),
           smp_fun = partition_cv,
           smp_args = list(repetition = 1:100, nfold = 5),
           par_args = list(par_mode = "foreach", par_units = 20),
           benchmark = TRUE, progress = FALSE,
           importance = TRUE, imp_permutations = 100)
foreach apply future
runtime (min) 52.33 51.67 49.54