Thematic choropleth maps display quantities of a variable within areas, such as median income across a city’s neighborhoods. However, we often think in bivariate terms: “how do race and income vary together?” Maps that capture this, known as bivariate choropleth maps, are often perceived as difficult to create and interpret. The goal of biscale is to implement a consistent approach to bivariate mapping entirely within R. The package’s workflow is based on a recent tutorial written by Timo Grossenbacher and Angelo Zehr, and supports both two-by-two and three-by-three bivariate maps.
Since the package does not directly use functions from sf, it is a suggested dependency rather than a required one. However, the most direct approach to using biscale is with sf objects, and we therefore recommend that users install sf before using biscale. Windows users should be able to install sf without significant issues, but macOS and Linux users will need to install several open source spatial libraries to get sf itself up and running. The easiest approach for macOS users is to install the GDAL 2.0 Complete framework from KyngChaos.
For Linux users, steps will vary based on the flavor being used. Our configuration file for Travis CI and its associated bash script should be useful in determining the necessary components to install.
Once sf is installed, the easiest way to get biscale is to install it from CRAN:
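A minimal sketch using base R's install.packages(), assuming a CRAN mirror is already configured for the session:

```r
# install the released version of biscale from CRAN
install.packages("biscale")
```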
Alternatively, the development version of biscale can be accessed from GitHub with remotes:
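A sketch of the GitHub installation, using remotes::install_github(). The repository path shown is an assumption and should be checked against the package's GitHub page before use:

```r
# install remotes if it is not already available
install.packages("remotes")

# install the development version of biscale from GitHub
# NOTE: "slu-openGIS/biscale" is an assumed repository path; substitute
# the path shown on the package's GitHub page if it differs
remotes::install_github("slu-openGIS/biscale")
```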
All functions within biscale use the prefix bi_ to leverage the auto-completion features of RStudio and other IDEs.
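Outside of an IDE, the same prefix can be used to list the package's functions from the console. This is a small sketch using base R's ls(); the object name bi_functions is only illustrative:

```r
library(biscale)

# list everything exported by biscale whose name starts with "bi_"
bi_functions <- ls("package:biscale", pattern = "^bi_")
print(bi_functions)
```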
biscale contains a data set of U.S. Census tracts for the City of St. Louis in Missouri. It includes both median income and the percentage of white residents, which can be used to demonstrate the package’s functionality.
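As a quick check before classifying, the sample data can be inspected directly. This sketch assumes the data set is the stl_race_income object used in the examples that follow, with the pctWhite and medInc columns referenced there:

```r
library(biscale)
library(sf)

# column names in the sample data
names(stl_race_income)

# distribution of the two variables used for mapping
summary(stl_race_income$pctWhite)
summary(stl_race_income$medInc)
```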
Once data are loaded, bivariate classes can be applied with the bi_class() function. There is an important caveat, however:
> # load dependencies
> library(biscale)
>
> # create classes
> data <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "quantile", dim = 3)
Warning message:
In bi_class(stl_race_income, x = pctWhite, y = medInc, dim = 3) :
The 'sf' package is not loaded, and the class 'sf' attribute of the given data set has been lost. Load 'sf' to retain the class when using 'bi_class'.
If you are going to be using sf objects, it is important that you load the sf package as well so that its methods can be correctly applied to your data:
> # load dependencies
> library(biscale)
> library(sf)
>
> # create classes
> data <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "quantile", dim = 3)
This warning will not be generated if sf is loaded, or if you are using bi_class() on a non-sf object (which leaves room for folks who use other tools like sp, or who are applying classes to tibbles or data frames).
The dim argument is used to control the extent of the legend: do you want to produce a two-by-two map (dim = 2) or a three-by-three map (dim = 3)?
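The two settings can be compared side by side. The calls below reuse the arguments from the earlier example; the object names data_2x2 and data_3x3 are only illustrative:

```r
library(biscale)
library(sf)

# two-by-two classification: one break per variable
data_2x2 <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "quantile", dim = 2)

# three-by-three classification: two breaks per variable
data_3x3 <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "quantile", dim = 3)
```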
Classes can be applied with the style parameter using four approaches for calculating breaks: "quantile" (default), "equal", "fisher", and "jenks". The default "quantile" approach will create relatively equal “buckets” of data for mapping, with a break created at the median (50th percentile) for a two-by-two map or at the 33rd and 66th percentiles for a three-by-three map.
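To make the quantile approach concrete, the following base R sketch computes breaks at the one-third and two-thirds quantiles for a made-up vector and assigns each value to one of three classes. It does not call biscale and is not the package's internal code; the vector x and the use of quantile() with cut() are illustrative assumptions:

```r
# hypothetical values standing in for one mapped variable
x <- c(2, 5, 8, 12, 15, 21, 30, 44, 60)

# breaks at the minimum, the 1/3 and 2/3 quantiles, and the maximum
breaks <- quantile(x, probs = c(0, 1/3, 2/3, 1), names = FALSE)

# assign each value to class 1, 2, or 3
classes <- cut(x, breaks = breaks, labels = FALSE, include.lowest = TRUE)

# number of observations in each class
table(classes)
```

Each of the three classes receives three of the nine values, which is the "relatively equal buckets" behavior described above.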
With the sample data, this creates a very broad range for the percent white measure in particular. Using one of the other approaches to calculating breaks yields a narrower range for the breaks and produces a map that does not overstate the percent of white residents living on the north side of St. Louis:
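A sketch of switching the break style, here to "fisher", one of the alternatives listed above. The object name data_fisher is illustrative, and the tabulation assumes that bi_class() stores its result in a column named bi_class:

```r
library(biscale)
library(sf)

# same classification as before, but with Fisher breaks instead of quantiles
data_fisher <- bi_class(stl_race_income, x = pctWhite, y = medInc, style = "fisher", dim = 3)

# see how tracts are distributed across the nine classes
# (assumes the output column is named bi_class)
table(data_fisher$bi_class)
```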