spocc


spocc = SPecies OCCurrence data

At rOpenSci, we have been writing R packages to interact with many sources of species occurrence data, including GBIF, VertNet, BISON, iNaturalist, the Berkeley ecoengine, and AntWeb. spocc is an R package that queries and collects species occurrence data from many sources through a single interface. The goal is to wrap functions from those other R packages so that users get a seamless experience across data sources.

The inspiration for this comes from users requesting a more seamless experience across data sources, and from our work on a similar package for taxonomy data (taxize).

BEWARE: When you request data from multiple providers, especially when GBIF is one of them, you may get duplicate records, because many providers' data eventually ends up in GBIF. See ?spocc_duplicates, after installation, for more.
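As a rough illustration of the duplicate problem, the sketch below drops repeated records from a combined table using base R only. The data frame, its values, and the choice of key columns (name, coordinates, date) are assumptions made for this example; this is not the approach documented in ?spocc_duplicates, and real duplicates may differ in rounding or formatting between providers.

```r
# Hypothetical combined records from two providers (illustrative values only).
records <- data.frame(
  name      = c("Accipiter striatus", "Accipiter striatus", "Accipiter striatus"),
  longitude = c(-71.19554, -71.19554, -97.80459),
  latitude  = c(42.31845, 42.31845, 30.41678),
  date      = as.Date(c("2014-05-01", "2014-05-01", "2014-06-12")),
  prov      = c("gbif", "inat", "gbif"),
  stringsAsFactors = FALSE
)

# Assumption: two records are duplicates when name, coordinates, and date
# all match exactly, regardless of which provider returned them.
key_cols <- c("name", "longitude", "latitude", "date")
is_dup <- duplicated(records[, key_cols])

# Keep the first occurrence of each key combination.
deduped <- records[!is_dup, ]
nrow(deduped)
```

Exact matching is the simplest possible rule; a stricter or looser key (for example, rounding coordinates first) may suit a given analysis better.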

Contributing

See CONTRIBUTING.md

Installation

Stable version from CRAN

install.packages("spocc", dependencies = TRUE)

Or the development version from GitHub

install.packages("devtools")
devtools::install_github("ropensci/spocc")
library("spocc")

Basic use

Get data from GBIF

(out <- occ(query='Accipiter striatus', from='gbif', limit=100))
#> Searched: gbif
#> Occurrences - Found: 447,905, Returned: 100
#> Search type: Scientific
#>   gbif: Accipiter striatus (100)
out$gbif # just gbif data
#> Species [Accipiter striatus (100)] 
#> First 10 rows of [Accipiter_striatus]
#> 
#>                  name  longitude latitude prov
#> 1  Accipiter striatus    0.00000  0.00000 gbif
#> 2  Accipiter striatus         NA       NA gbif
#> 3  Accipiter striatus -104.88120 21.46585 gbif
#> 4  Accipiter striatus  -71.19554 42.31845 gbif
#> 5  Accipiter striatus  -78.15051 37.95521 gbif
#> 6  Accipiter striatus  -97.80459 30.41678 gbif
#> 7  Accipiter striatus  -75.17209 40.34000 gbif
#> 8  Accipiter striatus -122.20175 37.88370 gbif
#> 9  Accipiter striatus  -99.47894 27.44924 gbif
#> 10 Accipiter striatus -135.32701 57.05420 gbif
#> ..                ...        ...      ...  ...
#> Variables not shown: issues (chr), key (int), datasetKey (chr),
#>      publishingOrgKey (chr), publishingCountry (chr), protocol (chr),
#>      lastCrawled (chr), lastParsed (chr), extensions (chr), basisOfRecord
#>      (chr), sex (chr), establishmentMeans (chr), taxonKey (int),
#>      kingdomKey (int), phylumKey (int), classKey (int), orderKey (int),
#>      familyKey (int), genusKey (int), speciesKey (int), scientificName
#>      (chr), kingdom (chr), phylum (chr), order (chr), family (chr), genus
#>      (chr), species (chr), genericName (chr), specificEpithet (chr),
#>      taxonRank (chr), continent (chr), stateProvince (chr), year (int),
#>      month (int), day (int), eventDate (time), modified (chr),
#>      lastInterpreted (chr), references (chr), identifiers (chr), facts
#>      (chr), relations (chr), geodeticDatum (chr), class (chr), countryCode
#>      (chr), country (chr), startDayOfYear (chr), verbatimEventDate (chr),
#>      preparations (chr), institutionID (chr), verbatimLocality (chr),
#>      nomenclaturalCode (chr), higherClassification (chr), rights (chr),
#>      higherGeography (chr), occurrenceID (chr), type (chr), collectionCode
#>      (chr), occurrenceRemarks (chr), gbifID (chr), accessRights (chr),
#>      institutionCode (chr), endDayOfYear (chr), county (chr),
#>      catalogNumber (chr), otherCatalogNumbers (chr), occurrenceStatus
#>      (chr), locality (chr), language (chr), identifier (chr), disposition
#>      (chr), dateIdentified (chr), informationWithheld (chr),
#>      http...unknown.org.occurrenceDetails (chr), rightsHolder (chr),
#>      taxonID (chr), datasetName (chr), recordedBy (chr), identificationID
#>      (chr), eventTime (chr), georeferencedDate (chr), georeferenceSources
#>      (chr), identifiedBy (chr), identificationVerificationStatus (chr),
#>      samplingProtocol (chr), georeferenceVerificationStatus (chr),
#>      individualID (chr), locationAccordingTo (chr),
#>      verbatimCoordinateSystem (chr), previousIdentifications (chr),
#>      georeferenceProtocol (chr), identificationQualifier (chr),
#>      dynamicProperties (chr), georeferencedBy (chr), lifeStage (chr),
#>      elevation (dbl), elevationAccuracy (dbl), waterBody (chr),
#>      recordNumber (chr), samplingEffort (chr), locationRemarks (chr),
#>      infraspecificEpithet (chr), collectionID (chr), ownerInstitutionCode
#>      (chr), datasetID (chr), verbatimElevation (chr), vernacularName (chr)
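The output above includes one record at longitude 0, latitude 0 and one with missing coordinates. A minimal sketch of screening such rows before mapping or analysis follows. It works on a hypothetical data frame (`occ_df`) shaped like the first four columns of the printed result, and it assumes that a (0, 0) coordinate is a placeholder rather than a genuine observation, which you should verify for your own data.

```r
# Hypothetical data frame mirroring the name/longitude/latitude/prov columns.
occ_df <- data.frame(
  name      = rep("Accipiter striatus", 4),
  longitude = c(0, NA, -104.88120, -71.19554),
  latitude  = c(0, NA, 21.46585, 42.31845),
  prov      = rep("gbif", 4),
  stringsAsFactors = FALSE
)

# Rows where both coordinates are present.
has_coords <- !is.na(occ_df$longitude) & !is.na(occ_df$latitude)

# Rows that are not exactly at (0, 0); assumed here to be placeholders.
not_origin <- !(occ_df$longitude == 0 & occ_df$latitude == 0)

# which() guards against any NA left in the logical index.
usable <- occ_df[which(has_coords & not_origin), ]
nrow(usable)
```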

Pass options to each data source

Get fine-grained control over each data source by passing parameters through to the underlying package, rebird in this example.

out <- occ(query='Setophaga caerulescens', from='ebird', ebirdopts=list(region='US'))
out$ebird # just ebird data
#> Species [Setophaga caerulescens (500)] 
#> First 10 rows of [Setophaga_caerulescens]
#> 
#>                      name longitude latitude  prov
#> 1  Setophaga caerulescens -72.00877 43.44167 ebird
#> 2  Setophaga caerulescens -71.71499 44.82760 ebird
#> 3  Setophaga caerulescens -69.24381 46.10138 ebird
#> 4  Setophaga caerulescens -83.48799 35.02802 ebird
#> 5  Setophaga caerulescens -71.73941 44.76401 ebird
#> 6  Setophaga caerulescens -86.01660 44.75613 ebird
#> 7  Setophaga caerulescens -73.99459 44.30380 ebird
#> 8  Setophaga caerulescens -80.27513 38.19799 ebird
#> 9  Setophaga caerulescens -74.03718 42.20338 ebird
#> 10 Setophaga caerulescens -73.88611 44.51111 ebird
#> ..                    ...       ...      ...   ...
#> Variables not shown: comName (chr), howMany (dbl), locID (chr), locName
#>      (chr), locationPrivate (lgl), obsDt (time), obsReviewed (lgl),
#>      obsValid (lgl)

Many data sources at once

Get data from many sources in a single call

ebirdopts <- list(region='US')
gbifopts <- list(country='US')
out <- occ(query='Setophaga caerulescens', from=c('gbif','bison','inat','ebird'),
  gbifopts=gbifopts, ebirdopts=ebirdopts, limit=50)
head(occ2df(out)); tail(occ2df(out))
#> Error in mapply(FUN = f, ..., SIMPLIFY = FALSE): object 'id' not found
#> Error in mapply(FUN = f, ..., SIMPLIFY = FALSE): object 'id' not found
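To make the idea of collapsing several providers' results into one table concrete, here is a small base R sketch. It is not how occ2df is implemented; the per-provider data frames, their values, and the extra columns are hypothetical, and the sketch simply keeps the columns that every provider shares.

```r
# Hypothetical per-provider results with different extra columns
# (illustrative values only).
gbif_df <- data.frame(
  name          = c("Setophaga caerulescens", "Setophaga caerulescens"),
  longitude     = c(-80.1, -77.3),
  latitude      = c(35.2, 38.9),
  prov          = "gbif",
  basisOfRecord = "HUMAN_OBSERVATION",
  stringsAsFactors = FALSE
)
ebird_df <- data.frame(
  name      = "Setophaga caerulescens",
  longitude = -72.00877,
  latitude  = 43.44167,
  prov      = "ebird",
  howMany   = 1,
  stringsAsFactors = FALSE
)
per_provider <- list(gbif = gbif_df, ebird = ebird_df)

# Columns present in every provider's table, in the order of the first table.
shared <- Reduce(intersect, lapply(per_provider, names))

# Stack the shared columns into one data frame.
combined <- do.call(rbind, lapply(per_provider, function(df) df[, shared]))
rownames(combined) <- NULL
combined
```

Restricting to shared columns loses provider-specific fields, which is the trade-off for getting a single rectangular table.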

Make maps

All mapping functionality is now in a separate package spoccutils, to make spocc easier to maintain.
