Statistics Netherlands (CBS) is the office that produces all official statistics of the Netherlands.
For a long time CBS has published its data on the web in its online database StatLine. Since 2014 this database has had an open data web API based on the OData protocol. The cbsodata package retrieves this data directly into R.
A list of tables can be retrieved using the get_table_list function.
library(cbsodata)
library(dplyr) # provides %>% and select()
tables <- get_table_list(Language="en") # retrieve only English tables
tables %>%
select(Identifier, ShortTitle) %>%
head
## Identifier ShortTitle
## 1 80783eng Agriculture; general farm type, region
## 2 80784eng Agriculture; labour force, region
## 3 7100eng Arable crops; production
## 4 70671ENG Fruit culture; area fruit orchards
## 5 37738ENG Vegetables; yield per kind of vegetable
## 6 71509ENG Yield apples and pears
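Because the result is an ordinary data frame, it can be searched with standard R tools. The following is a minimal sketch that uses a hand-copied excerpt of the listing above rather than a live download; tables_excerpt is a stand-in name introduced here for illustration only.

```r
# Hand-copied excerpt of the table list shown above (not a live download).
tables_excerpt <- data.frame(
  Identifier = c("80783eng", "7100eng", "71509ENG"),
  ShortTitle = c("Agriculture; general farm type, region",
                 "Arable crops; production",
                 "Yield apples and pears"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose short title mentions "apples" (case-insensitive).
hits <- tables_excerpt[grepl("apples", tables_excerpt$ShortTitle,
                             ignore.case = TRUE), ]
hits$Identifier
```

The same subsetting would apply to the full result of get_table_list, assuming it keeps the Identifier and ShortTitle columns shown above.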
Using an “Identifier” from get_table_list, information on the table can be retrieved with get_meta:
m <- get_meta('71509ENG')
m
## 71509ENG: 'Yield apples and pears', 2013
## FruitFarmingRegions: 'Fruit farming regions'
## Periods: 'Periods'
The meta object contains all metadata properties that the CBS open data API provides for the table (see the original documentation) in the form of data.frames. Each data.frame describes one set of properties of the CBS table.
names(m)
## [1] "TableInfos" "DataProperties" "FruitFarmingRegions"
## [4] "Periods"
Data can be retrieved with get_data. By default, all data for the table is downloaded to a temporary directory.
get_data('71509ENG') %>%
select(2:5) %>% # select columns 2 to 5 (for demonstration purposes)
head
## Source: local data frame [6 x 4]
##
## FruitFarmingRegions Periods TotalAppleVarieties_1 CoxSOrangePippin_2
## (fctr) (fctr) (chr) (chr)
## 1 Total Netherlands 1997 420 43
## 2 Total Netherlands 1998 518 40
## 3 Total Netherlands 1999 568 39
## 4 Total Netherlands 2000 461 27
## 5 Total Netherlands 2001 408 30
## 6 Total Netherlands 2002 354 17
The category codes are automatically recoded to their titles. If needed, the original codes can be retained with recode = FALSE:
get_data('71509ENG', recode = FALSE) %>%
select(2:5) %>%
head
## Source: local data frame [6 x 4]
##
## FruitFarmingRegions Periods TotalAppleVarieties_1 CoxSOrangePippin_2
## (chr) (chr) (chr) (chr)
## 1 1 1997JJ00 420 43
## 2 1 1998JJ00 518 40
## 3 1 1999JJ00 568 39
## 4 1 2000JJ00 461 27
## 5 1 2001JJ00 408 30
## 6 1 2002JJ00 354 17
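To illustrate what the recoding step amounts to, the following minimal sketch maps raw period keys to their titles with a small lookup table. The lookup data frame is a hand-made stand-in; that the metadata data.frames (for example m$Periods) have Key and Title columns is an assumption, and recode_keys is a hypothetical helper, not a function of cbsodata.

```r
# Hand-made stand-in for a metadata table such as m$Periods (assumed layout).
periods_meta <- data.frame(
  Key   = c("1997JJ00", "1998JJ00", "1999JJ00"),
  Title = c("1997", "1998", "1999"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace each key by the title found in the lookup table.
# The result is a factor whose levels follow the order of the lookup table.
recode_keys <- function(keys, meta) {
  factor(keys, levels = meta$Key, labels = meta$Title)
}

raw <- c("1997JJ00", "1999JJ00", "1998JJ00")
recoded <- recode_keys(raw, periods_meta)
as.character(recoded)
# "1997" "1999" "1998"
```

Returning a factor matches the (fctr) column type shown in the recoded output earlier; keys missing from the lookup table would become NA in this sketch.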
It is possible to restrict the download using filter statements. This may shorten the download time considerably.
get_data('71509ENG', Periods='2000JJ00') %>%
select(2:5) %>%
head
## Source: local data frame [5 x 4]
##
## FruitFarmingRegions Periods TotalAppleVarieties_1 CoxSOrangePippin_2
## (fctr) (fctr) (chr) (chr)
## 1 Total Netherlands 2000 461 27
## 2 Region North 2000 87 5
## 3 Region West 2000 105 10
## 4 Region Central 2000 215 10
## 5 Region South 2000 53 2
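Since the API is based on OData, a filter such as Periods='2000JJ00' presumably ends up as an OData $filter expression in the request. The sketch below builds such an expression from named values. build_filter is a hypothetical helper, and the exact query that get_data sends is an assumption, not taken from the package source.

```r
# Hypothetical helper: turn named filter values into an OData $filter expression.
# Values for one column are combined with "or"; columns are combined with "and".
build_filter <- function(...) {
  filters <- list(...)
  if (length(filters) == 0) return("")
  clauses <- vapply(names(filters), function(column) {
    values <- filters[[column]]
    terms <- paste0(column, " eq '", values, "'")
    paste0("(", paste(terms, collapse = " or "), ")")
  }, character(1))
  paste(unname(clauses), collapse = " and ")
}

single <- build_filter(Periods = "2000JJ00")
single
# "(Periods eq '2000JJ00')"

combined <- build_filter(Periods = c("2000JJ00", "2001JJ00"),
                         FruitFarmingRegions = "1")
combined
# "(Periods eq '2000JJ00' or Periods eq '2001JJ00') and (FruitFarmingRegions eq '1')"
```

Note that the filter values are the raw keys (as seen with recode = FALSE), not the recoded titles.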