baguette

Lifecycle: experimental CRAN status Codecov test coverage R build status R-CMD-check

Introduction

The goal of baguette is to provide efficient functions for bagging (aka bootstrap aggregating) ensemble models.

The model objects produced by baguette are kept smaller than they would otherwise be through two operations:

Installation

You can install the released version of baguette from CRAN with:

install.packages("baguette")

Install the development version from GitHub with:

require("devtools")
install_github("tidymodels/baguette")

Example

Let’s build a bagged decision tree model to predict a continuous outcome.

library(baguette)
#> Loading required package: parsnip

bag_tree() %>% 
  set_engine("rpart") # C5.0 is also available here
#> Bagged Decision Tree Model Specification (unknown)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

set.seed(123)
bag_cars <- 
  bag_tree() %>% 
  set_engine("rpart", times = 25) %>% # 25 ensemble members 
  set_mode("regression") %>% 
  fit(mpg ~ ., data = mtcars)

bag_cars
#> parsnip model object
#> 
#> Fit time:  3.6s 
#> Bagged CART (regression with 25 members)
#> 
#> Variable importance scores include:
#> 
#> # A tibble: 10 x 4
#>    term  value std.error  used
#>    <chr> <dbl>     <dbl> <int>
#>  1 disp  905.       51.9    25
#>  2 wt    889.       56.8    25
#>  3 hp    814.       48.7    25
#>  4 cyl   581.       42.9    25
#>  5 drat  540.       54.1    25
#>  6 qsec  281.       53.2    25
#>  7 vs    150.       51.2    20
#>  8 carb   84.4      30.6    25
#>  9 gear   80.0      35.8    23
#> 10 am     51.5      22.9    18

The models also return aggregated variable importance scores.

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.