Because the stations provide geospatial location information, it is possible to create a spatial file. GeoPackage files are a open, standards-based, platform-independent, portable, self-describing compact format for transferring geospatial information, which handle vector files much like shapefiles do, but eliminate many of the issues that shapefiles have with field names and the number of files.
Simple features are an ISO defined standard that now have support in R. From the sf vignette:
Simple features or simple feature access refers to a formal standard (ISO 19125-1:2004) that describes how objects in the real world can be represented in computers, with emphasis on the spatial geometry of these objects. It also describes how such objects can be stored in and retrieved from databases, and which geometrical operations should be defined for them. The standard is widely implemented in spatial databases (such as PostGIS), commercial GIS (e.g., ESRI ArcGIS) and forms the vector data basis for libraries such as GDAL. A subset of simple features forms the GeoJSON standard.
For this example, we convert the GSOD data for Australia in 2017 to an sf object.
library(GSODR)
library(future)
library(sf)
plan("multisession")
GSOD <- get_GSOD(years = 2017, country = "AUS")
GSOD_SF <- st_as_sf(x = GSOD,
coords = c("LONGITUDE", "LATITUDE"),
crs = "+proj=longlat +datum=WGS84")
str(GSOD_SF)
## Classes 'sf', 'data.table' and 'data.frame': 182508 obs. of 43 variables:
## $ STNID : chr "941000-99999" "941000-99999" "941000-99999" "941000-99999" ...
## $ NAME : chr "KALUMBURU" "KALUMBURU" "KALUMBURU" "KALUMBURU" ...
## $ CTRY : chr "AS" "AS" "AS" "AS" ...
## $ STATE : chr "" "" "" "" ...
## $ ELEVATION : num 24 24 24 24 24 24 24 24 24 24 ...
## $ BEGIN : int 20010912 20010912 20010912 20010912 20010912 20010912 20010912 20010912 20010912 20010912 ...
## $ END : int 20190831 20190831 20190831 20190831 20190831 20190831 20190831 20190831 20190831 20190831 ...
## $ YEARMODA : Date, format: "2017-01-01" "2017-01-02" ...
## $ YEAR : int 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
## $ MONTH : int 1 1 1 1 1 1 1 1 1 1 ...
## $ DAY : int 1 2 3 4 5 6 7 8 9 10 ...
## $ YDAY : int 1 2 3 4 5 6 7 8 9 10 ...
## $ TEMP : num 29.3 29 26.8 27.3 28.9 29.2 29.1 30.1 30.4 28.4 ...
## $ TEMP_ATTRIBUTES : chr "16" "16" "16" "16" ...
## $ DEWP : num 22.8 23.2 23.8 22.7 23.2 22.6 22.3 23.2 23.1 22.8 ...
## $ DEWP_ATTRIBUTES : chr "16" "16" "16" "16" ...
## $ SLP : num 1003 1002 1000 999 1003 ...
## $ SLP_ATTRIBUTES : chr "16" "16" "16" "16" ...
## $ STP : num NA 1000 998 996 NA ...
## $ STP_ATTRIBUTES : chr "16" "16" "16" "16" ...
## $ VISIB : num NA NA NA NA NA NA NA NA NA NA ...
## $ VISIB_ATTRIBUTES: chr " 0" " 0" " 0" " 0" ...
## $ WDSP : num 2.3 2 1.5 3.7 3.3 2.4 2.2 2.2 2.3 2 ...
## $ WDSP_ATTRIBUTES : chr "16" "16" "16" "16" ...
## $ MXSPD : num 4.6 3.6 3.1 5.7 6.2 5.1 5.1 4.1 5.1 4.1 ...
## $ GUST : num NA NA NA NA NA NA NA NA NA NA ...
## $ MAX : num 34.7 34.7 31.4 31.6 32.5 34.2 34.5 35.1 35.5 35.4 ...
## $ MAX_ATTRIBUTES : chr "*" "*" "*" "*" ...
## $ MIN : num 24.7 24.6 24.5 24.2 25.2 25.2 24.6 24.5 25.2 24.2 ...
## $ MIN_ATTRIBUTES : chr "*" NA "*" NA ...
## $ PRCP : num 5.1 6.1 2 64 15 0.3 0 0 0 25.9 ...
## $ PRCP_ATTRIBUTES : chr "A" "G" "G" "G" ...
## $ SNDP : num NA NA NA NA NA NA NA NA NA NA ...
## $ I_FOG : int 1 1 1 1 1 1 1 1 1 1 ...
## $ I_RAIN_DRIZZLE : int NA NA NA NA NA NA NA NA NA NA ...
## $ I_SNOW_ICE : int NA NA NA NA NA NA NA NA NA NA ...
## $ I_HAIL : int NA NA NA NA NA NA NA NA NA NA ...
## $ I_THUNDER : int NA NA NA NA NA NA NA NA NA NA ...
## $ I_TORNADO_FUNNEL: int NA NA NA NA NA NA NA NA NA NA ...
## $ EA : num 2.8 2.8 2.9 2.8 2.8 2.7 2.7 2.8 2.8 2.8 ...
## $ ES : num 4.1 4 3.5 3.6 4 4.1 4 4.3 4.3 3.9 ...
## $ RH : num 68.3 70 82.9 77.8 70 65.9 67.5 65.1 65.1 71.8 ...
## $ geometry :sfc_POINT of length 182508; first list element: 'XY' num 126.6 -14.3
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr "STNID" "NAME" "CTRY" "STATE" ...
Writing a shapefile is a simple matter.
## Warning in abbreviate_shapefile_names(obj): Field names abbreviated for
## ESRI Shapefile driver
As is writing a GeoPackage from the sf object.
After getting weather stations for Australia and creating a GeoPackage file, sf can import the data into R again in a spatial format.
## Reading layer `GSOD' from data source `/private/var/folders/_x/gqh2xrvn4qb0qs5d8795n8vr0000gn/T/RtmpS2DFky/GSOD.gpkg' using driver `GPKG'
## Simple feature collection with 182508 features and 42 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 113.583 ymin: -54.499 xmax: 159.079 ymax: -10.05
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## [1] "sf" "data.frame"
Since GeoPackage files are formatted as SQLite databases you can also use other R tools for SQLite files (J. Stachelek 2016). One easy way is using dplyr. This option is much faster to load since it does not load the geometry.
library(dplyr)
AUS_sqlite <- tbl(src_sqlite(file.path(tempdir(), "GSOD.gpkg")), "GSOD")
class(AUS_sqlite)
## [1] "tbl_SQLiteConnection" "tbl_dbi" "tbl_sql"
## [4] "tbl_lazy" "tbl"
Stachelek, J. (2016) Using the Geopackage Format with R. URL: https://jsta.github.io/2016/07/14/geopackage-r.html