Dockerize R Markdown Documents

Nan Xiao <http://nanx.me>

2016-08-05

1 Add liftr Metadata

To dockerize your R Markdown document, the first step is to add liftr options to the document's YAML front matter. For example:

---
title: "The Missing Example of liftr"
author: "Author Name"
date: "2016-08-05"
output:
  html_document:
    highlight: haddock
    theme: readable
liftr:
  maintainer: "Author Name"
  maintainer_email: "name@example.com"
  from: "rocker/r-base:latest"
  latex: false
  pandoc: true
  syslib:
    - gfortran
    - samtools
  cranpkg:
    - randomForest
  biocpkg:
    - Gviz
    - ggbio
  ghpkg:
    - "road2stat/liftr"
  rabix: true
  rabix_json: "https://s3.amazonaws.com/rabix/rabix-test/bwa-mem.json"
  rabix_d: "~/liftr_rabix/bwa/"
  rabix_args:
    - reference: "https://s3.amazonaws.com/rabix/rabix-test/chr20.fa"
    - reads: "https://s3.amazonaws.com/rabix/rabix-test/example_human_Illumina.pe_1.fastq"
    - reads: "https://s3.amazonaws.com/rabix/rabix-test/example_human_Illumina.pe_2.fastq"
---

All available options are explained below.

1.1 Required options

1.2 Optional options

1.3 Rabix options

The Rabix options are optional. Set rabix: true only when you need Rabix support.

2 Use lift() and drender()

After adding proper liftr metadata to the document YAML data block, we can use lift() to parse the document and generate a Dockerfile (it will also generate a Rabixfile if necessary).

We will use docker.Rmd included in the package as an example. First, we create a new directory and copy the example document to the directory:

dir_docker = "~/liftr_docker/"
dir.create(dir_docker)
file.copy(system.file("docker.Rmd", package = "liftr"), dir_docker)

Then, we use lift() to parse the document and generate the Dockerfile:

library("liftr")
docker_input = paste0(dir_docker, "docker.Rmd")
lift(docker_input)

After successfully running lift() on docker.Rmd, the Dockerfile will be in the ~/liftr_docker/ directory.
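
A quick way to confirm this from the shell is to print the generated file. The helper below is a minimal sketch: the name show_dockerfile is hypothetical (it is not part of liftr), and the default directory assumes the ~/liftr_docker/ location used above.

```shell
# show_dockerfile: hypothetical helper, not part of liftr.
# Prints the Dockerfile that lift() is expected to write next to the .Rmd file.
show_dockerfile() {
  dir="${1:-$HOME/liftr_docker}"
  if [ -f "$dir/Dockerfile" ]; then
    cat "$dir/Dockerfile"
  else
    echo "No Dockerfile in $dir; did lift() finish without errors?" >&2
    return 1
  fi
}
```

Calling show_dockerfile with no argument inspects ~/liftr_docker/; pass another directory if the document was copied elsewhere.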

Now we can use drender() on docker.Rmd to render the document to an HTML file inside a Docker container:

drender(docker_input)

The drender() function will parse the Dockerfile, build a new Docker image, and run a container to render the input document. If the document renders successfully, the output docker.html will be in the ~/liftr_docker/ directory. You can also pass additional arguments for rmarkdown::render to this function.

In order to share the dockerized R Markdown document, simply share the .Rmd file. Other users can use the lift() and drender() functions to render the document as above.
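
A recipient who prefers the terminal can wrap both calls in a small shell function. This is only a sketch: the function name render_shared is hypothetical, and it assumes that R (with Rscript on the PATH), the liftr package, and Docker are already installed on the recipient's machine.

```shell
# render_shared: hypothetical helper, not part of liftr.
# Runs lift() and then drender() on a shared .Rmd file via Rscript.
render_shared() {
  rmd="$1"
  if [ ! -f "$rmd" ]; then
    echo "File not found: $rmd" >&2
    return 1
  fi
  # The path is pasted into R code inside single quotes,
  # so it must not contain a single quote itself.
  Rscript -e "library(liftr)" -e "lift('$rmd')" -e "drender('$rmd')"
}
```

For example, render_shared docker.Rmd should leave the Dockerfile and docker.html next to docker.Rmd, just as the interactive R session does.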

3 Rabix Support

Rabix is an open source implementation of the Common Workflow Language specification for building portable bioinformatics pipelines. Users can write JSON-based tools/workflows and run them with Rabix.

We will use rabix.Rmd included in the package as an example. As before, we create a new directory and copy the example document to the directory:

dir_rabix = "~/liftr_rabix/"
dir.create(dir_rabix)
file.copy(system.file("rabix.Rmd", package = "liftr"), dir_rabix)

Use lift() and drender() as before:

library("liftr")
rabix_input = paste0(dir_rabix, "rabix.Rmd")
lift(rabix_input)
drender(rabix_input)

The Rabix tools/workflows run first, and the document is rendered afterwards. In this way, we can use the output of the bioinformatics pipelines for further analysis in our R Markdown document. See rabix.Rmd for details.

If successfully rendered, the output rabix.html will be in the ~/liftr_rabix/ directory.

4 System Requirements

Linux is currently the preferred host platform, because running Docker on other platforms has certain limitations and performance issues.

4.1 Docker

We need Docker installed to render the documents.

To install Docker on Ubuntu:

sudo apt-get install docker.io

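Docker should be configured to run without sudo. To avoid sudo when using the docker command, create a group named docker (if it does not already exist) and add yourself to it:

sudo groupadd docker
sudo usermod -aG docker your-username

Log out and log back in for the new group membership to take effect.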

Docker provides detailed installation guides for most platforms. In any case, make sure you can run docker from the shell.
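
One way to check that docker is on the PATH and usable without sudo is sketched below. It is not an official procedure, and have_cmd is a hypothetical helper name. The check relies on docker info, which has to contact the Docker daemon and therefore fails when the daemon is not running or the current user lacks permission to reach it.

```shell
# have_cmd: hypothetical helper that succeeds if a command is on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
  # docker info contacts the daemon, so it also tests permissions
  # (for example, membership in the docker group).
  if docker info >/dev/null 2>&1; then
    echo "docker: ready"
  else
    echo "docker: installed, but the daemon is unreachable for this user"
  fi
else
  echo "docker: not found on PATH"
fi
```

If the second message appears right after adding yourself to the docker group, logging out and back in is usually the first thing to try.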

4.2 Rabix

Rabix needs to be installed if you want to run Rabix tools/workflows before rendering the documents. After installation, make sure you can run rabix from the shell.

To install Rabix on Ubuntu:

sudo apt-get install python-dev python-pip docker.io phantomjs libyaml-dev
sudo pip install rabix

A more detailed guide is available for installing Rabix on other platforms.


Project website: liftr.me