rgeospatialquality provides native R access to the methods of the Geospatial Data Quality REST API. With this API, users can perform some basic quality checks on the geospatial aspect of biodiversity data. See the API documentation for more information on the API rationale, development, methods and usage.
Latest version is 0.3.2
Although we plan to make this package available via CRAN, currently the only way to install it is with the install_github function from the devtools package. Important note: Windows users will need to install Rtools first.
install.packages("devtools")
devtools::install_github("ropenscilabs/rgeospatialquality")
rgeospatialquality depends on three packages:
- httr, to perform the GET and POST requests to the API URL
- jsonlite, to transform input data to JSON and to parse JSON responses to R native formats (list and data.frame)
- plyr, to perform some dataset transformation operations
It also suggests installing ROpenSci's rgbif (https://github.com/ropensci/rgbif) to run the examples.
library(rgeospatialquality)
There are two ways to assess single records and get information on their spatial quality: by providing a list-type object with named elements or by passing the required data via function arguments. In either case, flags are calculated with the function parse_record, and the result is a named list with the quality information.
The API works on four specific fields, which should be present to provide the most complete answer: decimalLatitude, decimalLongitude, countryCode and scientificName. None of them is mandatory, but the more complete the provided information, the better the result set will be. See the API documentation.
rec <- list(decimalLatitude=42.1833,
decimalLongitude=-1.8332,
countryCode="ES",
scientificName="Puma concolor")
parse_record(record=rec)
parse_record(decimalLatitude=42.1833,
decimalLongitude=-1.8332,
countryCode="ES",
scientificName="Puma concolor")
The response is a list of named elements, each element being the result of a single test. For more info on the meaning of these flags, please check out the API documentation.
This is what either of the two calls above would return:
## $hasCoordinates
## [1] TRUE
##
## $validCountry
## [1] TRUE
##
## $validCoordinates
## [1] TRUE
##
## $hasCountry
## [1] TRUE
##
## $coordinatesInsideCountry
## [1] TRUE
##
## $hasScientificName
## [1] TRUE
##
## $highPrecisionCoordinates
## [1] TRUE
##
## $coordinatesInsideRangeMap
## [1] FALSE
##
## $nonZeroCoordinates
## [1] TRUE
##
## $distanceToRangeMapInKm
## [1] 6874.023
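Because the result is an ordinary named list, base R is enough to pull out the checks that did not pass. The sketch below does not call the API: it rebuilds the output shown above by hand so it can run offline, and the variable names (res, flags, failed) are only illustrative.

```r
# Hand-built copy of the result shown above, so the example runs offline.
res <- list(
  hasCoordinates = TRUE,
  validCountry = TRUE,
  validCoordinates = TRUE,
  hasCountry = TRUE,
  coordinatesInsideCountry = TRUE,
  hasScientificName = TRUE,
  highPrecisionCoordinates = TRUE,
  coordinatesInsideRangeMap = FALSE,
  nonZeroCoordinates = TRUE,
  distanceToRangeMapInKm = 6874.023
)

# Keep only the logical flags; distanceToRangeMapInKm is numeric.
is_flag <- vapply(res, is.logical, logical(1))
flags <- unlist(res[is_flag])

# Names of the checks that came back FALSE.
failed <- names(flags)[!flags]
print(failed)
```

For this record the only failed check is coordinatesInsideRangeMap, which matches the large distanceToRangeMapInKm value.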
Apart from assessing records one by one, the API also allows sending a set of records to evaluate them all with a single call, using the add_flags function. Records must be provided in the form of a data.frame. Just as before, each record should have the four key fields (decimalLatitude, decimalLongitude, countryCode and scientificName) to give a response as accurate as possible, although none is mandatory. This time, however, the function returns the provided data.frame with a new column, called flags, consisting of a list of all geospatial quality assessment results.
rec1 <- list(decimalLatitude=42.1833, decimalLongitude=-1.8332, countryCode="ES", scientificName="Puma concolor", ...)
rec2 <- list(...)
...
df <- rbind(rec1, rec2, ...)
df2 <- add_flags(df)
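As a rough sketch of how such a data.frame could be assembled in base R, assuming add_flags accepts a plain data.frame with the four key columns as described above. The values of the second record are hypothetical, made up only for illustration, and the call to add_flags is left commented out because it needs the package installed and network access.

```r
# Two records as named lists; the second one is a hypothetical example.
rec1 <- list(decimalLatitude = 42.1833, decimalLongitude = -1.8332,
             countryCode = "ES", scientificName = "Puma concolor")
rec2 <- list(decimalLatitude = 40.4168, decimalLongitude = -3.7038,
             countryCode = "ES", scientificName = "Lynx pardinus")

# Convert each list to a one-row data.frame, then stack the rows.
rows <- lapply(list(rec1, rec2), function(r) {
  as.data.frame(r, stringsAsFactors = FALSE)
})
df <- do.call(rbind, rows)
str(df)

# With the package loaded, the flags would then be requested with:
# df2 <- add_flags(df)
```

Converting each record first keeps every column as a proper typed vector, whereas calling rbind directly on lists produces a matrix of list elements rather than a data.frame.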
One easy way to directly get occurrences with the right format is to use the occ_data function in ROpenSci's rgbif package (https://github.com/ropensci/rgbif). There is a vignette (rgbif-synergy) illustrating how to integrate the two packages to improve the workflow of biodiversity data analysis.
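A minimal sketch of that combination might look like the following. It is not taken from the vignette: the helper name flag_gbif_occurrences is hypothetical, the conversion with as.data.frame is a precaution in case add_flags does not accept the table type returned by rgbif, and running it requires both packages plus network access.

```r
# Sketch only: needs rgbif, rgeospatialquality and network access.
# flag_gbif_occurrences is a hypothetical helper, not part of either package.
flag_gbif_occurrences <- function(name, limit = 50) {
  stopifnot(requireNamespace("rgbif", quietly = TRUE),
            requireNamespace("rgeospatialquality", quietly = TRUE))
  # occ_data() returns a list; the occurrence records are in $data.
  occ <- rgbif::occ_data(scientificName = name, limit = limit)
  # Convert to a plain data.frame before asking for the flags.
  rgeospatialquality::add_flags(as.data.frame(occ$data))
}

# Example call (not run here):
# flagged <- flag_gbif_occurrences("Puma concolor")
```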
citation(package = 'rgeospatialquality')