eulerr generates area-proportional Euler diagrams that display set relationships (intersections, unions, and disjoints) with circles. Euler diagrams are Venn diagrams without the requirement that all set interactions be present (whether they are empty or not). Depending on the input, eulerr will therefore sometimes produce a Venn diagram and sometimes not.
R features a number of packages that produce Euler and/or Venn diagrams; some of the more prominent ones (on CRAN) are
venneuler, the last of these, serves as the primary inspiration for this package, along with the refinements that Ben Frederickson has presented on his blog and made available in his JavaScript library venn.js.
venneuler, however, is written in Java, which prevents R users from browsing the source code (unless they are also literate in Java) or contributing. Furthermore, venneuler is known to produce imperfect output for set relationships that have perfect Euler diagram solutions. Consider, for instance
venn_fit <- venneuler::venneuler(c(A = 75, B = 50, "A&B" = 0))
par(mar = c(0, 0, 0, 0))
plot(venn_fit)
which reasonably should not display any intersection between A and B.
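To see why a perfect solution exists for this input, note that two disjoint sets only need two non-overlapping circles whose areas match the set sizes. The following is a minimal base R sketch of that reasoning; it uses neither venneuler nor eulerr, and the 5% gap between the circles is an arbitrary choice made for illustration.

```r
# Areas requested in the example above; "A&B" = 0 means the sets are disjoint.
areas <- c(A = 75, B = 50)

# Radius of a circle with a given area: area = pi * r^2.
radii <- sqrt(areas / pi)

# Place A at the origin and B on the x-axis, slightly further away than
# the sum of the radii (the 5% gap is an illustrative assumption).
centers <- data.frame(
  x = c(0, 1.05 * sum(radii)),
  y = c(0, 0),
  r = unname(radii),
  row.names = names(areas)
)

# Two circles overlap only if the distance between their centers is
# smaller than the sum of their radii.
d <- sqrt(diff(centers$x)^2 + diff(centers$y)^2)
overlap <- d < sum(radii)

overlap
```

Because the circles reproduce both areas exactly and do not overlap, every requested value is met and the residuals of such a layout are zero.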
eulerr is based around the improvements to venneuler that Ben Frederickson introduced with venn.js but with rewritten code, different optimizers, and methods to calculate stress statistics. It also provides a highly customizable interface for its plotting function.
Currently, it is possible to provide input to eulerr as either a named numeric vector or a matrix of logicals.
library(eulerr)
# Input in the form of a named numeric vector
fit1 <- eulerr(c("A" = 25, "B" = 5, "C" = 5,
                 "A&B" = 5, "A&C" = 5, "B&C" = 3,
                 "A&B&C" = 3))
# Input as a matrix of logicals
set.seed(1)
mat <- cbind(
  A = sample(c(TRUE, TRUE, FALSE), size = 50, replace = TRUE),
  B = sample(c(TRUE, FALSE), size = 50, replace = TRUE),
  C = sample(c(TRUE, FALSE, FALSE, FALSE), size = 50, replace = TRUE)
)
fit2 <- eulerr(mat)
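The two input forms carry the same information: a matrix of logicals can be reduced to a named vector of counts. The sketch below shows one way to do that reduction in base R. The helper `tabulate_sets()` is hypothetical and not part of eulerr, and it assumes that each count includes every row in which all of the named sets are TRUE, which is what the original values printed for fit2 suggest. A small hand-made matrix is used so that the counts are easy to verify.

```r
# Hypothetical helper (not part of eulerr): for every combination of sets,
# count the rows of a logical matrix in which all of those sets are TRUE.
tabulate_sets <- function(mat) {
  sets <- colnames(mat)
  counts <- numeric(0)
  for (k in seq_along(sets)) {
    # All combinations of k set names, e.g. c("A", "B") for k = 2.
    combos <- utils::combn(sets, k, simplify = FALSE)
    for (combo in combos) {
      in_all <- rowSums(mat[, combo, drop = FALSE]) == length(combo)
      counts[paste(combo, collapse = "&")] <- sum(in_all)
    }
  }
  counts
}

# Toy data: four observations and three sets.
toy <- cbind(
  A = c(TRUE, TRUE, TRUE, FALSE),
  B = c(TRUE, TRUE, FALSE, FALSE),
  C = c(TRUE, FALSE, FALSE, TRUE)
)

toy_counts <- tabulate_sets(toy)
toy_counts
```

Rows that belong to no set at all contribute to none of the counts.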
We can inspect our results by printing the eulerr object
fit2
## $coefficients
## x y r
## A 34.398248 19.1381437 33.09853
## B 18.887664 26.7291801 32.01503
## C 7.988108 0.5417698 21.44077
##
## $original.values
## A B C A&B A&C B&C A&B&C
## 31 29 13 20 6 7 5
##
## $fitted.values
## A B C A&B A&C B&C A&B&C
## 31.005893 29.009133 13.010910 19.977026 5.894665 6.927797 5.191156
##
## $residuals
## A B C A&B A&C
## -0.005892980 -0.009132989 -0.010909949 0.022973583 0.105335028
## B&C A&B&C
## 0.072202508 -0.191155861
##
## $stress
## [1] 2.941349e-05
##
## attr(,"class")
## [1] "eulerr" "list"
or directly access the residuals and plot them using standard methods.
resid(fit2)
## A B C A&B A&C
## -0.005892980 -0.009132989 -0.010909949 0.022973583 0.105335028
## B&C A&B&C
## 0.072202508 -0.191155861
# Cleveland dot plot of the residuals
graphics::dotchart(resid(fit2))
abline(v = 0, lty = 3)
This shows us that the A&B&C intersection is somewhat overrepresented in fit2. However, given that these residuals are on the scale of the original values, they are arguably small.
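The sign convention can be checked by hand from the printed output: the residuals are the original values minus the fitted values, so a negative residual means that the diagram gives the region more area than requested. The sketch below copies the numbers from the output above and adds a relative residual, which is an extra diagnostic introduced here for illustration rather than something eulerr reports.

```r
# Values copied from the printed fit2 object.
set_names <- c("A", "B", "C", "A&B", "A&C", "B&C", "A&B&C")
original <- setNames(c(31, 29, 13, 20, 6, 7, 5), set_names)
fitted_values <- setNames(
  c(31.005893, 29.009133, 13.010910, 19.977026, 5.894665, 6.927797, 5.191156),
  set_names
)

# Residuals as printed: original minus fitted.
res <- original - fitted_values

# Relative residuals (illustrative addition): the error as a share of
# the requested size.
rel_res <- res / original

# The combination with the largest absolute residual.
worst <- names(which.max(abs(res)))

round(cbind(residual = res, relative = rel_res), 4)
```

Here `worst` is "A&B&C", and its residual is negative, in line with the reading that this intersection is overrepresented.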
For an overall measure of the fit of the solution, we use the same stress statistic that Leland Wilkinson presented in his paper on venneuler (Wilkinson 2012), which is given by the sum of squared residuals divided by the total sum of squares: \[\frac{\sum \limits_{i=1}^n (f_i - y_i)^2}{\sum \limits_{i=1}^n (y_i - \bar{y})^2}\] where \(y_i\) are the original values, \(f_i\) the fitted values, and \(\bar{y}\) the mean of the original values.
For our solution, the stress is
fit2$stress
## [1] 2.941349e-05
This is quite low.
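The stress formula can also be evaluated by hand. The sketch below applies it, exactly as written above, to the original and fitted values copied from the printed output. The function name `stress_stat()` is hypothetical. The result is likewise small (well below 0.001), but it is not guaranteed to reproduce `fit2$stress` digit for digit, since the package may evaluate the statistic on a different breakdown of the areas.

```r
# Hypothetical helper: sum of squared residuals divided by the total sum
# of squares of the original values.
stress_stat <- function(original, fitted_values) {
  rss <- sum((fitted_values - original)^2)
  tss <- sum((original - mean(original))^2)
  rss / tss
}

# Values copied from the printed fit2 object, in the order
# A, B, C, A&B, A&C, B&C, A&B&C.
original <- c(31, 29, 13, 20, 6, 7, 5)
fitted_values <- c(31.005893, 29.009133, 13.010910, 19.977026,
                   5.894665, 6.927797, 5.191156)

manual_stress <- stress_stat(original, fitted_values)
manual_stress
```

A perfect fit gives a stress of exactly zero, and the statistic grows as the fitted areas drift away from the requested ones.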
We can now be confident that eulerr provides a reasonable representation of our input. Were it otherwise, we would do best to stop here and look for another way to visualize our data. (I suggest the excellent UpSetR package.)
Now we get to the fun part: plotting our Euler fit. This is easy, as well as highly customizable, with eulerr.
par(mar = c(0, 0, 0, 0))
plot(fit2)
# Change fill colors, border type (remove) and fontface.
plot(fit2,
     polygon_args = list(col = c("dodgerblue4", "darkgoldenrod1", "cornsilk4"),
                         border = "transparent"),
     text_args = list(font = 8))
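To make the link between the fitted coefficients and the picture concrete, the circles can also be drawn directly with base graphics. This is only a rough sketch of the idea, not how eulerr's plot method is implemented: the helper `draw_circles()` is hypothetical, and the coefficients are copied from the printed fit2 object.

```r
# Circle centers (x, y) and radii (r), copied from the printed coefficients.
coefs <- data.frame(
  x = c(34.398248, 18.887664, 7.988108),
  y = c(19.1381437, 26.7291801, 0.5417698),
  r = c(33.09853, 32.01503, 21.44077),
  row.names = c("A", "B", "C")
)

# Hypothetical helper (not part of eulerr): draw one circle per set.
draw_circles <- function(coefs, fills) {
  # inches = FALSE puts the radii on the scale of the x axis and
  # asp = 1 keeps the circles round; the limits are set explicitly so
  # that no circle is clipped.
  symbols(coefs$x, coefs$y, circles = coefs$r, inches = FALSE, asp = 1,
          bg = fills, xlab = "", ylab = "",
          xlim = range(coefs$x - coefs$r, coefs$x + coefs$r),
          ylim = range(coefs$y - coefs$r, coefs$y + coefs$r))
  text(coefs$x, coefs$y, labels = rownames(coefs))
  invisible(coefs)
}

# Semi-transparent fills so that the overlaps remain visible.
fills <- adjustcolor(c("dodgerblue4", "darkgoldenrod1", "cornsilk4"),
                     alpha.f = 0.6)
draw_circles(coefs, fills)

# The circle areas are proportional to the fitted set sizes.
circle_area <- pi * coefs$r^2
circle_area / circle_area[1]
```

The ratios in the last line match the ratios of the fitted values for A, B, and C, which is what makes the diagram area-proportional.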
eulerr’s default color palette is taken from qualpalr – another package that I have developed – which uses color difference algorithms to generate distinct qualitative color palettes.
Details of the implementation will be left for a future vignette but almost completely resemble the approach documented here.
eulerr would not be possible without Ben Frederickson’s work on venn.js or Leland Wilkinson’s venneuler.
Wilkinson, L. 2012. “Exact and Approximate Area-Proportional Circular Venn and Euler Diagrams.” IEEE Transactions on Visualization and Computer Graphics 18 (2): 321–31. doi:10.1109/TVCG.2011.56.