Using hydroscoper’s data

Konstantinos Vantas

2018-03-16

This vignette shows how to use the package’s internal data sets.

Load libraries

library(hydroscoper)
library(tibble)
library(ggplot2)

Data sets

The package stores three data sets. stations contains each station’s id, name, water basin, water division, owner, longitude, latitude, altitude and subdomain.

stations
#> # A tibble: 2,322 x 9
#>    station_id name    water_basin  water_division owner longitude latitude
#>         <int> <chr>   <chr>        <chr>          <chr>     <dbl>    <dbl>
#>  1     501032 AG. BA~ "KOURTALIOT~ GR13           min_~      NA       NA  
#>  2     200246 GEPH. ~ "ALPHEIOS P~ GR01           min_~      22.0     37.5
#>  3     200237 TROPAIA "ALPHEIOS P~ GR01           min_~      22.0     37.7
#>  4     200231 BYTINA  "ALPHEIOS P~ GR01           min_~      22.2     37.7
#>  5     200200 LYKOUR~ "ALPHEIOS P~ GR01           min_~      22.2     37.9
#>  6     200236 MEGALO~ "ALPHEIOS P~ GR01           min_~      22.1     37.4
#>  7     200244 ODOG. ~ "REMA CHORA~ GR01           min_~      21.8     37.0
#>  8     200204 TRIPOT~ "ALPHEIOS P~ GR01           min_~      21.9     37.9
#>  9     200198 KASTEL~ "ALPHEIOS P~ GR01           min_~      22.0     37.9
#> 10     200239 PERDIK~ "ALPHEIOS P~ GR01           min_~      22.0     37.7
#> # ... with 2,312 more rows, and 2 more variables: altitude <dbl>,
#> #   subdomain <chr>

timeseries contains each time series’ id, the corresponding station, variable type, time step, units, start and end dates, and subdomain.

timeseries
#> # A tibble: 10,804 x 8
#>    time_id station_id variable   timestep units start_date     end_date   
#>      <int>      <int> <chr>      <chr>    <chr> <chr>          <chr>      
#>  1    2248     501049 temperatu~ <NA>     °     2009-02-01T00~ 2010-08-31~
#>  2     430     200103 wind_dire~ <NA>     °     1950-10-26T08~ 1997-07-19~
#>  3     905     200247 wind_dire~ <NA>     °     1967-01-01T08~ 1997-12-31~
#>  4    2243     501058 temperatu~ <NA>     °     1999-01-01T00~ 2010-08-31~
#>  5     438     200105 wind_dire~ <NA>     °     1950-06-05T08~ 1997-07-31~
#>  6     553     200135 wind_dire~ <NA>     °     1964-11-21T08~ 1997-08-31~
#>  7     966     200265 wind_dire~ <NA>     °     1967-01-01T08~ 1997-03-31~
#>  8     775     200203 wind_dire~ <NA>     °     1964-05-20T08~ 1997-06-30~
#>  9    2245     501046 temperatu~ <NA>     °     2007-07-01T01~ 2010-07-07~
#> 10     247     200034 wind_dire~ <NA>     °     1969-11-25T08~ 1997-09-18~
#> # ... with 10,794 more rows, and 1 more variable: subdomain <chr>
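As a quick way to get an overview of timeseries, the sketch below tabulates how many time series exist per variable type and per subdomain. It relies only on base R and on the variable and subdomain columns shown in the printout above; no output is listed because the counts depend on the installed version of the data.

```r
library(hydroscoper)

# Number of time series per variable type, most frequent first
sort(table(timeseries$variable), decreasing = TRUE)

# Number of time series per subdomain
table(timeseries$subdomain)
```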

greece_borders is a data frame of border coordinates for use with the function geom_polygon from the ggplot2 package.

Stations location

stations and greece_borders can be used to create a map of all Hydroscope’s stations. Unfortunately, a number of stations have erroneous coordinates (they fall over the sea or far from Greece). In addition, 120 stations have missing coordinates.

ggplot() + 
  geom_polygon(data = greece_borders,
               aes(long, lat, group = group),
               fill = "grey",
               color = NA) +
  geom_point(data = stations,
             aes(x = longitude, y = latitude, color = subdomain)) +
  scale_color_manual(values=c("#E64B35FF", "#4DBBD5FF", "#00A087FF", 
                              "#3C5488FF"))+
  coord_fixed(ratio=1) +
  theme_bw()
#> Warning: Removed 120 rows containing missing values (geom_point).
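The stations with missing or implausible coordinates can be separated out before plotting. The following is a minimal sketch, not part of the package: the bounding box limits (lon_range, lat_range) are rough values assumed here for illustration, and stations_ok is a hypothetical name. A bounding box only catches points far from Greece; stations placed over the sea inside the box would need a point-in-polygon check instead.

```r
library(hydroscoper)

# Stations whose longitude or latitude is missing
missing_xy <- is.na(stations$longitude) | is.na(stations$latitude)
sum(missing_xy)

# Rough bounding box around Greece; these limits are an assumption
# chosen for illustration, not values defined by hydroscoper
lon_range <- c(19, 30)
lat_range <- c(34, 42)

# TRUE only for stations with non-missing coordinates inside the box
in_box <- !missing_xy &
  stations$longitude >= lon_range[1] & stations$longitude <= lon_range[2] &
  stations$latitude  >= lat_range[1] & stations$latitude  <= lat_range[2]

# Stations with plausible coordinates
stations_ok <- stations[in_box, ]
nrow(stations_ok)
```

Passing stations_ok instead of stations to geom_point should remove both the missing-value warning and the far-off points from the map.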

Stations with available time series

The locations of the stations with time series available to download are shown on the following map.

stations_ts <- subset(stations, station_id %in% timeseries$station_id &
                        subdomain %in% c("kyy", "ypaat"))


ggplot() + 
  geom_polygon(data = greece_borders,
               aes(long, lat, group = group),
               fill = "grey",
               color = NA) +
  geom_point(data = stations_ts,
             aes(x = longitude, y = latitude, color = subdomain)) +
  scale_color_manual(values=c("#00A087FF", "#3C5488FF"))+
  coord_fixed(ratio=1) +
  theme_bw()
#> Warning: Removed 10 rows containing missing values (geom_point).

Although a large number of stations have available data, they do not cover the country evenly.
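One way to put numbers on the coverage is to count the stations with downloadable time series in each water division. This is only a sketch using base R and the water_division column of stations; the subset is repeated so the chunk runs on its own, and the counts are not listed because they depend on the installed version of the data.

```r
library(hydroscoper)

# Same subset as above: stations that have time series in the
# "kyy" and "ypaat" subdomains
stations_ts <- subset(stations, station_id %in% timeseries$station_id &
                        subdomain %in% c("kyy", "ypaat"))

# Stations with downloadable time series per water division,
# most stations first
sort(table(stations_ts$water_division), decreasing = TRUE)
```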