tidycensus is an R package that allows users to interface with the US Census Bureau’s decennial Census and five-year American Community Survey APIs and return tidyverse-ready data frames, optionally with simple feature geometry included. Install from CRAN with the following command:
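As a minimal sketch, assuming a standard R session with a CRAN mirror configured, the install step would typically look like this:

```r
# Install the released version of tidycensus from CRAN.
# Assumes a CRAN mirror is already configured for the session.
install.packages("tidycensus")

# Load the package to confirm the install succeeded.
library(tidycensus)
```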
tidycensus now defaults to the new 2012-2016 five-year ACS estimates in
Passing a named vector to the variables parameter in get_decennial() lets you define your own variable names in place of the Census ID codes. For example:
    racevars <- c(White = "P0050003",
                  Black = "P0050004",
                  Asian = "P0050006",
                  Hispanic = "P0040003")

    harris <- get_decennial(geography = "tract", variables = racevars,
                            state = "TX", county = "Harris County", geometry = TRUE,
                            summary_var = "P0010001")

    head(harris)

    ## Simple feature collection with 6 features and 5 fields
    ## geometry type:  MULTIPOLYGON
    ## dimension:      XY
    ## bbox:           xmin: -95.37457 ymin: 29.74486 xmax: -95.32409 ymax: 29.80907
    ## epsg (SRID):    4269
    ## proj4string:    +proj=longlat +datum=NAD83 +no_defs
    ## # A tibble: 6 x 6
    ##   GEOID       NAME              variable value summary_value
    ##   <chr>       <chr>             <chr>    <dbl>         <dbl>
    ## 1 48201100000 Census Tract 1000 White     2082          4690
    ## 2 48201100000 Census Tract 1000 Black     1047          4690
    ## 3 48201100000 Census Tract 1000 Asian      134          4690
    ## 4 48201100000 Census Tract 1000 Hispanic  1070          4690
    ## 5 48201210900 Census Tract 2109 White       35          1620
    ## 6 48201210900 Census Tract 2109 Black     1195          1620
    ## # ... with 1 more variables: geometry <S3: sfc_MULTIPOLYGON>
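Because summary_var adds the tract total as summary_value on every row, the named variables can be turned into percentages directly. The sketch below is illustrative only: it rebuilds the six rows printed above as a plain data frame (the name harris_sample is hypothetical and the geometry column is omitted) so that it runs without an API key.

```r
# Rows copied from the printed head(harris) output; geometry column omitted.
# harris_sample is a hypothetical stand-in for the real harris object.
harris_sample <- data.frame(
  GEOID = c(rep("48201100000", 4), rep("48201210900", 2)),
  variable = c("White", "Black", "Asian", "Hispanic", "White", "Black"),
  value = c(2082, 1047, 134, 1070, 35, 1195),
  summary_value = c(4690, 4690, 4690, 4690, 1620, 1620),
  stringsAsFactors = FALSE
)

# Each group's share of the tract's total population (P0010001), in percent.
harris_sample$pct <- 100 * harris_sample$value / harris_sample$summary_value

print(harris_sample)
```

The same calculation applies unchanged to the full harris object, for example as a dplyr mutate() step.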
state parameters now work when geography is set to
moe_sum() function now avoids inflating the derived margin of error when multiple zero estimates are involved.
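To illustrate why zero estimates matter, here is a rough base R sketch of the idea. It is not the package's implementation: moe_sum_sketch is a hypothetical helper, and it assumes the rule follows Census Bureau guidance that when several zero estimates are summed, only one of their margins of error (the largest) enters the root-sum-of-squares formula.

```r
# Hypothetical helper, not tidycensus code: derive the margin of error for a
# sum of estimates, counting only one zero-estimate MOE (the largest).
moe_sum_sketch <- function(moe, estimate) {
  stopifnot(length(moe) == length(estimate))
  zero <- estimate == 0
  if (sum(zero) > 1) {
    # Keep the non-zero MOEs plus the single largest zero-estimate MOE.
    moe <- c(moe[!zero], max(moe[zero]))
  }
  sqrt(sum(moe^2))
}

# Made-up inputs: three zero estimates and one non-zero estimate.
estimates <- c(0, 0, 0, 250)
moes <- c(12, 12, 12, 30)

naive <- sqrt(sum(moes^2))                    # every zero-estimate MOE counted
adjusted <- moe_sum_sketch(moes, estimates)   # only one zero-estimate MOE counted

print(c(naive = naive, adjusted = adjusted))
```

With these inputs the adjusted value is smaller than the naive one, which is the inflation the change is meant to avoid.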
Variables without associated margins of error (e.g. B00001_001) can now be obtained with
census_api_key("KEY", install = TRUE):
    library(tidycensus)

    df <- get_acs(geography = "state", table = "B01001", survey = "acs1", year = 2016)
The table parameter fetches a variable list from the Census Bureau website to perform table lookup. To cache the variable list on your computer for faster use of the table parameter in the future, set cache_table = TRUE the first time you fetch a table for a particular dataset.
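A minimal sketch of the caching step, assuming a Census API key has already been installed and reusing the table, survey, and year from the earlier example:

```r
library(tidycensus)

# First request for this dataset: cache_table = TRUE stores the variable list
# locally so later table lookups do not need to download it again.
# Assumes a Census API key is already installed for this R session.
df <- get_acs(geography = "state", table = "B01001",
              survey = "acs1", year = 2016,
              cache_table = TRUE)

# Later requests against the same dataset can then use the cached list.
df_county <- get_acs(geography = "county", table = "B01001",
                     survey = "acs1", year = 2016)
```

The second call is only an illustration of a follow-up request; the county geography is an arbitrary choice.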
My work heavily involves the use of data from the US Census Bureau, and like many R users, I do most of my work within the tidyverse. Beyond this, the sf package now allows R users to work with spatial data in an integrated way with tidyverse tools, and updates to the tigris package provide access to Census boundary data as sf objects. Recently, I’ve found myself writing the same routines over and over to get Census data ready for use with tidyverse packages and sf. This motivated me to wrap these functions in a package and open-source it in case other R users find them useful.
tidycensus is designed to help R users get Census data that is pre-prepared for exploration within the tidyverse, and optionally spatially with sf. To learn more about how the package works, I encourage you to read the following articles:
To keep up with ongoing development of tidycensus and get even more examples of how to use the package, subscribe to my email list by clicking here (no spam, I promise!). You’ll also get updates on the development of my upcoming book with CRC Press, Analyzing the US Census with R.
You can also follow my blog at https://walkerke.github.io.
My development focus is on making the current datasets as accessible as possible; if you need other approaches or datasets, I recommend the censusapi and acs packages.
If you find this project useful, you can support package development in the following ways:
Note: This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.