cwbtools: Tools to create, modify and manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB') offers a classic and mature approach to working with large, linguistically and structurally annotated corpora. The 'CWB' is memory efficient, and its design makes running queries fast (Evert and Hardie 2011). The 'cwbtools' package offers pure R tools to create indexed corpus files as well as high-level wrappers for the original C implementation of 'CWB' as exposed by the 'RcppCWB' package. Additional functionality to add and modify annotations of corpora from within R makes working with 'CWB' indexed corpora much more flexible and convenient. In combination with the R packages 'RcppCWB' and 'polmineR', the 'cwbtools' package offers a lightweight infrastructure to support the combination of quantitative and qualitative approaches for working with textual data.

Version: 0.1.2
Imports: data.table, R6, xml2, stringi, curl, RcppCWB (≥ 0.2.8), pbapply, methods
Suggests: tm (≥ 0.7.3), knitr, tokenizers (≥ 0.2.1), tidytext, SnowballC, janeaustenr, devtools, polmineR, NLP
Published: 2019-12-17
Author: Andreas Blaette [aut, cre], Christoph Leonhardt [ctb]
Maintainer: Andreas Blaette <andreas.blaette at>
License: GPL-3
NeedsCompilation: no
Language: en-US
Citation: cwbtools citation info
Materials: NEWS
CRAN checks: cwbtools results


Reference manual: cwbtools.pdf
Vignettes: Europarl, Introducing 'cwbtools'
Package source: cwbtools_0.1.2.tar.gz
Windows binaries: r-devel:, r-devel-gcc8:, r-release:, r-oldrel:
OS X binaries: r-release: cwbtools_0.1.2.tgz, r-oldrel: cwbtools_0.1.2.tgz
Old sources: cwbtools archive
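
The source and binary packages listed above are normally not downloaded by hand. A minimal sketch of installing the package from within R, assuming a configured CRAN mirror and a working network connection, could look like this. It uses only base R functions; the choice of `dependencies = TRUE` is an assumption about what a user might want, since it also installs the packages listed under "Suggests".

```r
# Install cwbtools from CRAN. With dependencies = TRUE, the packages
# listed under "Suggests" (e.g. tm, tokenizers, polmineR) are installed
# in addition to those under "Imports".
install.packages("cwbtools", dependencies = TRUE)

# Attach the package to the current session.
library(cwbtools)

# Report the version that was actually installed.
packageVersion("cwbtools")

# List the vignettes shipped with the installed package.
vignette(package = "cwbtools")
```

Dropping the `dependencies` argument keeps the installation smaller; the suggested packages are then only needed for optional functionality and for building the vignettes.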

