This vignette gives you a short introductory glance at the key features of mlr.
A more detailed and continuously updated tutorial can be found on the GitHub project page.
The main goal of mlr is to provide a unified interface for machine learning tasks such as classification, regression, cluster analysis and survival analysis in R.
Without a common interface it becomes a hassle to carry out standard procedures like cross-validation and hyperparameter tuning for different learners.
Hence, mlr offers this common interface, together with infrastructure for resampling, performance evaluation and hyperparameter tuning that works uniformly across learners.
To highlight the main principles of mlr we give a quick introduction to the package.
We demonstrate how to perform a classification analysis with stratified cross-validation, which illustrates some of the major building blocks of the mlr workflow, namely tasks and learners.
library(mlr)
## Loading required package: BBmisc
## Loading required package: ggplot2
## Loading required package: ParamHelpers
data(iris)
## Define the task:
task = makeClassifTask(id = "tutorial", data = iris, target = "Species")
print(task)
## Supervised task: tutorial
## Type: classif
## Target: Species
## Observations: 150
## Features:
## numerics factors ordered
## 4 0 0
## Missings: FALSE
## Has weights: FALSE
## Has blocking: FALSE
## Classes: 3
## setosa versicolor virginica
## 50 50 50
## Positive class: NA
## Define the learner:
lrn = makeLearner("classif.lda")
print(lrn)
## Learner classif.lda from package MASS
## Type: classif
## Name: Linear Discriminant Analysis; Short name: lda
## Class: classif.lda
## Properties: twoclass,multiclass,numerics,factors,prob
## Predict-Type: response
## Hyperparameters:
## Define the resampling strategy:
rdesc = makeResampleDesc(method = "CV", stratify = TRUE)
## Do the resampling:
r = resample(learner = lrn, task = task, resampling = rdesc)
## [Resample] cross-validation iter: 1
## [Resample] cross-validation iter: 2
## [Resample] cross-validation iter: 3
## [Resample] cross-validation iter: 4
## [Resample] cross-validation iter: 5
## [Resample] cross-validation iter: 6
## [Resample] cross-validation iter: 7
## [Resample] cross-validation iter: 8
## [Resample] cross-validation iter: 9
## [Resample] cross-validation iter: 10
## [Resample] Result: mmce.test.mean=0.02
print(r)
## Resample Result
## Task: tutorial
## Learner: classif.lda
## mmce.aggr: 0.02
## mmce.mean: 0.02
## mmce.sd: 0.03
## Runtime: 0.153026
## Get the mean misclassification error:
r$aggr
## mmce.test.mean
## 0.02
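Beyond resampling, the same task and learner objects can be reused to fit a single model and generate predictions. The following sketch (not part of the original example) uses mlr's `train()`, `predict()` and `performance()` functions; predicting on the training data here is purely for illustration, since the resampled estimate above is the honest one:

```r
## Fit the learner on the full task:
mod = train(lrn, task)

## Predict on the training data (illustration only; use
## resampling, as above, for an unbiased performance estimate):
pred = predict(mod, task = task)

## Compute the mean misclassification error of these predictions:
performance(pred, measures = mmce)
```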
The previous example demonstrated only a tiny fraction of the capabilities of mlr.
More features are covered in the tutorial, which can be found online on the mlr project page.
Among other topics, it covers benchmarking, preprocessing, imputation, feature selection, ROC analysis, how to implement your own learner, and the list of all supported learners.
Reading it is highly recommended!
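As a small taste of the benchmarking functionality mentioned above, the following sketch (assuming the `task` and `rdesc` objects from the example, and that the rpart package is installed) compares LDA against a classification tree on the same task and resampling strategy:

```r
## Define two competing learners:
lrns = list(makeLearner("classif.lda"),
            makeLearner("classif.rpart"))

## Run both learners on the task with the same
## stratified cross-validation and compare results:
bmr = benchmark(lrns, tasks = task, resamplings = rdesc)
print(bmr)
```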
We would like to thank the authors of all packages that mlr uses under the hood.