echor is an R package to search and download data from the US Environmental Protection Agency (EPA) Enforcement and Compliance History Online (ECHO). echor uses the ECHO API to download data directly into R as dataframes or simple features. ECHO provides information about facilities permitted to emit air pollutants or discharge into water bodies. ECHO also provides data reported by permitted facilities as volume or concentration of pollutants during reporting time periods (typically annually for air emissions and monthly or quarterly for water discharges).
ECHO provides data for:
echor currently provides functions to retrieve information about permitted air dischargers, water dischargers, and public drinking water supply systems. It also provides functions to download discharge reports for permitted air and water dischargers. echor does not currently provide functionality to retrieve RCRA data.
See https://echo.epa.gov/tools/web-services for information about ECHO web services and API functions.
This vignette documents a few key functions to get started.
There are three types of functions:
Retrieve metadata from ECHO to narrow the data returned or to look up parameter codes.
echoAirGetMeta() - Returns variable names and descriptions for parameters returned in air facility queries.
echoSDWGetMeta() - Returns variable names and descriptions for parameters returned in public water system queries.
echoWaterGetMeta() - Returns variable names and descriptions for parameters returned in water discharge facility queries (e.g., facilities with an NPDES permit).
echoWaterGetParams() - Searches parameter codes for constituent pollutants regulated under NPDES permits.
Search and return facility information based on lookup parameters.
echoAirGetFacilityInfo() - Returns a dataframe of permitted air discharge facilities and associated information based on lookup parameters specified by the user.
echoSDWGetSystems() - Returns a dataframe of public water systems and associated information based on lookup parameters specified by the user.
echoWaterGetFacilityInfo() - Returns a dataframe of permitted water discharge facilities and associated information based on lookup parameters specified by the user.
Search and return discharge and emissions reports for specified facilities.
echoGetCAAPR() - Returns a dataframe with reported annual air emissions from permitted facilities.
echoGetEffluent() - Returns a dataframe with reported water effluent discharges from permitted facilities.
Suppose we want to find facilities permitted under the Clean Air Act requirements.
Step 1 - Identify the information we need returned from the query:
library(echor)
meta <- echoAirGetMeta()
meta
#> # A tibble: 131 x 6
#> ColumnName DataType DataLength ColumnID ObjectName Description
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 AIR_NAME VARCHAR2 200 1 AIRName The name of the A~
#> 2 SOURCE_ID VARCHAR2 30 2 SourceID Unique Identifier~
#> 3 AIR_STREET VARCHAR2 200 3 AIRStreet The street addres~
#> 4 AIR_CITY VARCHAR2 100 4 AIRCity <NA>
#> 5 AIR_STATE CHAR 2 5 AIRState The state where t~
#> 6 LOCAL_CONT~ CHAR 3 6 LocalContr~ Code for regions ~
#> 7 AIR_ZIP VARCHAR2 10 7 AIRZip The five-digit zi~
#> 8 REGISTRY_ID VARCHAR2 50 8 RegistryID An internal 12-di~
#> 9 AIR_COUNTY VARCHAR2 100 9 AIRCounty The name of the c~
#> 10 AIR_EPA_RE~ CHAR 2 10 AIREPARegi~ The EPA region wh~
#> # ... with 121 more rows
The dataframe includes ColumnID. Pass a comma-separated string of these IDs through the qcolumns argument to specify which columns are returned, for example: qcolumns = "1,2,3,22,23"
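As a minimal sketch of assembling such a string programmatically, the snippet below uses a small hand-typed stand-in for the metadata table (the first five rows shown above, so it runs without contacting ECHO) and collapses the selected ColumnID values into one comma-separated string. The names meta_example, wanted, and qcolumns_value are illustrative and are not part of echor.

```r
# Stand-in for the first rows of the echoAirGetMeta() output shown above.
# Hand-typed for illustration only; the real table has 131 rows.
meta_example <- data.frame(
  ObjectName = c("AIRName", "SourceID", "AIRStreet", "AIRCity", "AIRState"),
  ColumnID   = c("1", "2", "3", "4", "5"),
  stringsAsFactors = FALSE
)

# Columns we would like returned, identified by ObjectName
wanted <- c("SourceID", "AIRCity", "AIRState")

# Look up the matching ColumnID values and collapse them into the
# comma-separated string format used by the qcolumns argument
ids <- meta_example$ColumnID[match(wanted, meta_example$ObjectName)]
qcolumns_value <- paste(ids, collapse = ",")
qcolumns_value
#> [1] "2,4,5"
```

With a live connection, the same lookup could presumably be run against the full table returned by echoAirGetMeta() instead of the stand-in.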
Step 2 - Create the query. The ECHO API provides numerous search arguments that are not documented in this package. I recommend exploring the documentation here: https://echo.epa.gov/tools/web-services/facility-search-air#!/Facilities/get_air_rest_services_get_facility_info. In this example, we will search by a geographic bounding box and specify the returned information with the qcolumns argument. Each argument should be passed to ECHO as echoAirGetFacilityInfo(parameter = "value"). echor will URL encode strings automatically. Please note that any date argument needs to be entered as "mm/dd/yyyy".
library(echor)
## Retrieve information about facilities within a geographic location
df <- echoAirGetFacilityInfo(output = "df",
                             xmin = '-96.387509',
                             ymin = '30.583572',
                             xmax = '-96.281422',
                             ymax = '30.640008',
                             qcolumns = "1,2,3,22,23")
AIRName | SourceID | AIRStreet | FacLat | FacLong |
---|---|---|---|---|
AGGIE CLEANERS | 06000000480416E020 | 111 COLLEGE MAIN | 30.61869 | -96.34588 |
ALL SEASONS 1 HR CLEANERS | 06000000480416E015 | 2501 TEXAS AVENUE SOUTH #D100 | 30.60704 | -96.30875 |
BLUEBONNET PAVING | TX0000004877700147 | HWY. 60, WEST OF | 30.61337 | -96.32098 |
BRAZOS PAVING TRENCH BURNER IN1633 | TX0000004877702288 | TURN S ON VICTORIA AVE OFF OF HWY 40 TAKE A R ON W | 30.61337 | -96.32098 |
BRYAN CERAMICS PLANT | TX0000004804100027 | 1500 INDEPENDENCE AVE | 30.63760 | -96.36235 |
BRYAN CLEANERS & LAUNDRY | 06000000480416E012 | 1803 HOLLEMAN DRIVE | 30.61225 | -96.31750 |
Some example arguments are listed below:
p_fn string Facility Name Filter.
One or more case-insensitive facility names.
Provide multiple values as comma-delimited list
ex:
p_fn = "Aggie Cleaners, City of Bryan, TEXAS A&M UNIVERSITY COLLEGE STATION CAMPUS"
p_sa string Facility Street Address
ex:
p_sa = "WELLBORN ROAD & UNIVERSITY DR"
p_ct string Facility City
Provide a single case-insensitive city name
ex:
p_ct = "College Station"
p_co string Facility County
Provide a single county name, in combination with a state value
provided through p_st
ex:
p_co = "Brazos", p_st = "TX"
p_fips string FIPS Code
Single 5-character Federal Information Processing Standards (FIPS)
state+county value
p_st string Facility State or State Equivalent Filter
Provide one or more USPS postal abbreviations
ex:
p_st = "TX, NC"
p_zip string Facility 5-Digit Zip Code
Provide one or more 5-digit postal zip codes
ex:
p_zip = "77843, 77845"
xmin string Minimum longitude value in decimal degrees
ymin string Minimum latitude value in decimal degrees
xmax string Maximum longitude value in decimal degrees
ymax string Maximum latitude value in decimal degrees
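The bounding box arguments are passed as strings in the examples in this vignette. The following sketch shows one way to build them from numeric coordinates with some basic sanity checks. bbox_args() is a hypothetical helper written for this illustration, not a function provided by echor, and the commented-out call at the end assumes a network connection to ECHO.

```r
# Hypothetical helper (not part of echor): validate numeric bounding box
# coordinates and convert them to character strings.
bbox_args <- function(xmin, ymin, xmax, ymax) {
  # Longitude runs west to east, latitude south to north
  stopifnot(xmin < xmax, ymin < ymax)
  stopifnot(xmin >= -180, xmax <= 180, ymin >= -90, ymax <= 90)
  args <- list(xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax)
  # Fixed six decimal places avoids scientific notation in the strings
  lapply(args, function(v) sprintf("%.6f", v))
}

bbox <- bbox_args(-96.387509, 30.583572, -96.281422, 30.640008)
str(bbox)

# With a network connection, the list could then be supplied to the query:
# df <- do.call(echor::echoAirGetFacilityInfo, c(list(output = "df"), bbox))
```

Checking the coordinate order before the request is sent can catch a swapped minimum and maximum early.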
Step 3 - Download the emission inventory report for a permitted facility:
Name | SourceID | Street | City | State | Zip | County | Region | Latitude | Longitude | Pollutant | UnitsOfMeasure | Program | Year | Discharge |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | 1,2-Dichloroethane | Pounds | NEI | 2008 | 47.95 |
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | Acetophenone | Pounds | NEI | 2008 | 17.98 |
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | Anthracene | Pounds | NEI | 2008 | 0.25 |
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | Dimethyl sulfate | Pounds | NEI | 2008 | 63.19 |
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | Hexane | Pounds | NEI | 2008 | 80.31 |
CP&L - SUTTON PLANT | 110000350174 | 801 SUTTON STEAM PLANT ROAD | WILMINGTON | NC | 28401 | NEW HANOVER | 04 | 34.28332 | -77.98523 | Manganese | Pounds | NEI | 2008 | 290.51 |
There are only two valid arguments for echoGetCAAPR.
p_id string EPA Facility Registry Service's REGISTRY_ID.
p_units string Units of measurement. Default is 'lbs'.
Enter "TPWE" for toxic weighted pounds equivalents.
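Once a report is in hand, it can be summarised with ordinary data frame operations. The sketch below is a rough illustration that re-types the six rows shown in the table above into a stand-in data frame (so it runs offline) and ranks pollutants by reported pounds. The commented-out call is an assumption about how the report would be requested, based on the p_id argument and the ID shown in the table.

```r
# Assumed form of the request (needs a network connection):
# emissions <- echor::echoGetCAAPR(p_id = "110000350174")

# Stand-in for the six rows shown above (NEI, 2008, units: pounds)
emissions <- data.frame(
  Pollutant = c("1,2-Dichloroethane", "Acetophenone", "Anthracene",
                "Dimethyl sulfate", "Hexane", "Manganese"),
  Discharge = c(47.95, 17.98, 0.25, 63.19, 80.31, 290.51),
  stringsAsFactors = FALSE
)

# Rank pollutants from largest to smallest reported discharge
ranked <- emissions[order(emissions$Discharge, decreasing = TRUE), ]
top_pollutant <- ranked$Pollutant[1]

# Sum of these six rows only, not the full facility report
total_lbs <- sum(emissions$Discharge)

top_pollutant
#> [1] "Manganese"
```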
Find facilities with NPDES permits to discharge wastewater:
df <- echoWaterGetFacilityInfo(xmin = '-96.407563', ymin = '30.554395',
                               xmax = '-96.25947', ymax = '30.751984',
                               output = 'df')
CWPName | SourceID | CWPStreet | CWPCity | CWPState | CWPStateDistrict | CWPZip | MasterExternalPermitNmbr | RegistryID | CWPCounty | CWPEPARegion | FacDerivedHuc | FacLat | FacLong | CWPTotalDesignFlowNmbr | CWPActualAverageFlowNmbr | ReceivingMs4Name | AssociatedPollutant | MsgpPermitType | CWPPermitStatusDesc | CWPPermitTypeDesc | CWPIssueDate | CWPEffectiveDate | CWPExpirationDate | CWPSNCStatusDate | CWPStateWaterBodyCode |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ACE TOWNHOME | TXR15667I | 2136 CHESTNUT OAK CIR | COLLEGE STATION | TX | | 77845-4168 | TXR150000 | 110070364352 | | 06 | 12070103 | 30.61411 | -96.28490 | NA | NA | | | | Effective | General Permit Covered Facility | 2018-06-01 | 2018-06-01 | 2023-03-05 | 2018-09-30 | |
AGGIE ACRES WWTP | TX0132187 | 800 FT SE OF N DOWLING RD APPROX 600 FT SW OF WALN | COLLEGE STATION | TX | | 77845 | | 110064633829 | Brazos | 06 | | 30.55565 | -96.29110 | NA | NA | | | | Not Needed | NPDES Individual Permit | NA | NA | NA | 2018-09-30 | |
AGRIVEST SWINE FEEDLOT | TX0121240 | SWISHER COUNTY | BRYAN | TX | | 00000 | | 110039193271 | Swisher | 06 | | 30.66658 | -96.36552 | NA | NA | | | | Terminated | NPDES Individual Permit | 2000-01-14 | 2000-01-14 | 2004-07-27 | 2018-09-30 | |
ALENCO WINDOWS | TXR05U808 | 615 W CARSON ST | BRYAN | TX | | 77801-1102 | TXR050000 | 110000464284 | Brazos | 06 | 12070103 | 30.64457 | -96.37270 | NA | NA | | | | Effective | General Permit Covered Facility | 2016-10-28 | 2016-11-01 | 2021-08-13 | 2018-09-30 | |
ANDREWS ORTHODONTIST AND RETAIL BUILDING | TXR15193R | 1098 ARRINGTON RD | COLLEGE STATION | TX | | 77845 | TXR150000 | 110070368658 | | 06 | 12070103 | 30.55800 | -96.26566 | NA | NA | | | | Effective | General Permit Covered Facility | 2018-08-27 | 2018-09-01 | 2023-03-05 | 2018-09-30 | |
ASTIN AVIATION | TXR05CE76 | 1770 GEORGE BUSH DR W | COLLEGE STATION | TX | | 77845-4761 | TXR050000 | 110070360445 | | 06 | 12070101 | 30.59288 | -96.35277 | NA | NA | | | | Effective | General Permit Covered Facility | 2017-07-10 | 2017-08-01 | 2021-08-13 | 2018-09-30 | |
Again, there are a ton of possible arguments to query ECHO with. All arguments are described here: https://echo.epa.gov/tools/web-services/facility-search-water#!/Facility_Information/get_cwa_rest_services_get_facility_info
Commonly used arguments are provided below:
p_fn string Facility Name Filter.
One or more case-insensitive facility names.
Provide multiple values as comma-delimited list
ex:
p_fn = "Aggie Cleaners, City of Bryan, TEXAS A&M UNIVERSITY COLLEGE STATION CAMPUS"
p_sa string Facility Street Address
ex:
p_sa = "WELLBORN ROAD & UNIVERSITY DR"
p_ct string Facility City
Provide a single case-insensitive city name
ex:
p_ct = "College Station"
p_co string Facility County
Provide a single county name, in combination with a state value
provided through p_st
ex:
p_co = "Brazos", p_st = "TX"
p_fips string FIPS Code
Single 5-character Federal Information Processing Standards (FIPS)
state+county value
p_st string Facility State or State Equivalent Filter
Provide one or more USPS postal abbreviations
ex:
p_st = "TX, NC"
p_zip string Facility 5-Digit Zip Code
Provide one or more 5-digit postal zip codes
ex:
p_zip = "77843, 77845"
xmin string Minimum longitude value in decimal degrees
ymin string Minimum latitude value in decimal degrees
xmax string Maximum longitude value in decimal degrees
ymax string Maximum latitude value in decimal degrees
p_huc string 2-, 4-, 6-, or 8-digit watershed code.
May contain comma-separated values
Download discharge monitoring reports (DMRs) from ECHO for a specified facility:
activity_id | npdes_id | version_nmbr | perm_feature_id | perm_feature_nmbr | perm_feature_type_code | perm_feature_type_desc | limit_set_id | limit_set_schedule_id | limit_id | limit_type_code | limit_begin_date | limit_end_date | nmbr_of_submission | parameter_code | parameter_desc | monitoring_location_code | monitoring_location_desc | stay_type_code | stay_type_desc | limit_value_id | limit_value_type_code | limit_value_type_desc | limit_value_nmbr | limit_unit_code | limit_unit_desc | standard_unit_code | standard_unit_desc | limit_value_standard_units | statistical_base_code | statistical_base_short_desc | statistical_base_type_code | statistical_base_type_desc | limit_value_qualifier_code | stay_value_nmbr | dmr_event_id | monitoring_period_end_date | dmr_form_value_id | value_type_code | value_type_desc | dmr_value_id | dmr_value_nmbr | dmr_unit_code | dmr_unit_desc | dmr_value_standard_units | dmr_value_qualifier_code | value_received_date | days_late | nodi_code | nodi_desc | exceedence_pct | npdes_violation_id | violation_code | violation_desc | rnc_detection_code | rnc_detection_desc | rnc_detection_date | rnc_resolution_code | rnc_resolution_desc | rnc_resolution_date | violation_severity |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
echoGetEffluent() retrieves data for a single facility per call. The following arguments are available from ECHO:
p_id string EPA Facility Registry Service's REGISTRY_ID.
outfall string Three-character code identifying the point of discharge.
parameter_code string Five-digit numeric code identifying the parameter.
start_date string Start date of interest. Must be entered as "mm/dd/yyyy"
end_date string End date of interest. Must be entered as "mm/dd/yyyy"
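Because start_date and end_date must be strings in "mm/dd/yyyy" form, it can help to convert from R Date objects rather than typing the strings by hand. A minimal sketch using base R follows. to_echo_date() is a hypothetical helper name and the dates are illustrative.

```r
# Hypothetical helper (not part of echor): format a date as "mm/dd/yyyy"
to_echo_date <- function(d) format(as.Date(d), "%m/%d/%Y")

start_date <- to_echo_date("2015-10-01")
end_date   <- to_echo_date(as.Date("2016-09-30"))

start_date
#> [1] "10/01/2015"
end_date
#> [1] "09/30/2016"
```

The resulting strings could then be supplied as the start_date and end_date arguments listed above.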
Parameter codes can be searched using echoWaterGetParams().
echoWaterGetParams(term = "Oxygen, dissolved")
#> # A tibble: 5 x 2
#> ValueCode ValueDescription
#> <chr> <chr>
#> 1 00300 Oxygen, dissolved [DO]
#> 2 51646 Oxygen, dissolved [DO] maximum
#> 3 51645 Oxygen, dissolved [DO] minimum
#> 4 00301 Oxygen, dissolved percent saturation
#> 5 00399 Oxygen, dissolved, % of time violated
Available arguments include:
term string partial or complete search phrase or word
code string partial or complete code value
Enter either the term argument or the code argument, not both.
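A search term usually returns several candidate codes, so a second step is needed to pick the one to use as parameter_code. The sketch below re-types the five rows of output shown above into a stand-in data frame (so it runs offline) and selects the code by exact description. The object names are illustrative only.

```r
# Stand-in for the echoWaterGetParams(term = "Oxygen, dissolved") output
# shown above. Codes are kept as character so leading zeros survive.
params <- data.frame(
  ValueCode = c("00300", "51646", "51645", "00301", "00399"),
  ValueDescription = c("Oxygen, dissolved [DO]",
                       "Oxygen, dissolved [DO] maximum",
                       "Oxygen, dissolved [DO] minimum",
                       "Oxygen, dissolved percent saturation",
                       "Oxygen, dissolved, % of time violated"),
  stringsAsFactors = FALSE
)

# Select the code whose description matches exactly
do_code <- params$ValueCode[params$ValueDescription == "Oxygen, dissolved [DO]"]
do_code
#> [1] "00300"
```

Storing the code as a number would drop the leading zeros and no longer match the five-digit code format described above.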
Multiple DMRs can be downloaded using a helper function, downloadDMRs:
df <- tibble::tibble(permit = c('TX0119407', 'TX040237'))
df <- downloadDMRs(df, idColumn = permit)
df <- tidyr::unnest(df)
tibble::glimpse(df)
#> Observations: 880
#> Variables: 62
#> $ permit <chr> "TX0119407", "TX0119407", "TX01194...
#> $ activity_id <chr> "3600178396", "3600178396", "36001...
#> $ npdes_id <chr> "TX0119407", "TX0119407", "TX01194...
#> $ version_nmbr <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
#> $ perm_feature_id <chr> "3600049681", "3600049681", "36000...
#> $ perm_feature_nmbr <chr> "001", "001", "001", "001", "001",...
#> $ perm_feature_type_code <chr> "EXO", "EXO", "EXO", "EXO", "EXO",...
#> $ perm_feature_type_desc <chr> "External Outfall", "External Outf...
#> $ limit_set_id <dbl> 3600061722, 3600061722, 3600061722...
#> $ limit_set_schedule_id <dbl> 3600073706, 3600073706, 3600073706...
#> $ limit_id <dbl> 3600437315, 3600437315, 3600437315...
#> $ limit_type_code <chr> "ENF", "ENF", "ENF", "ENF", "ENF",...
#> $ limit_begin_date <date> 2015-08-01, 2015-08-01, 2015-08-0...
#> $ limit_end_date <date> 2020-03-01, 2020-03-01, 2020-03-0...
#> $ nmbr_of_submission <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
#> $ parameter_code <chr> "00300", "00300", "00300", "00300"...
#> $ parameter_desc <chr> "Oxygen, dissolved [DO]", "Oxygen,...
#> $ monitoring_location_code <chr> "1", "1", "1", "1", "1", "1", "1",...
#> $ monitoring_location_desc <chr> "Effluent Gross", "Effluent Gross"...
#> $ stay_type_code <chr> "", "", "", "", "", "", "", "", ""...
#> $ stay_type_desc <chr> "", "", "", "", "", "", "", "", ""...
#> $ limit_value_id <chr> "3600678121", "3600678121", "36006...
#> $ limit_value_type_code <chr> "C1", "C1", "C1", "C1", "C1", "C1"...
#> $ limit_value_type_desc <chr> "Concentration1", "Concentration1"...
#> $ limit_value_nmbr <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4...
#> $ limit_unit_code <chr> "19", "19", "19", "19", "19", "19"...
#> $ limit_unit_desc <chr> "mg/L", "mg/L", "mg/L", "mg/L", "m...
#> $ standard_unit_code <chr> "19", "19", "19", "19", "19", "19"...
#> $ standard_unit_desc <chr> "mg/L", "mg/L", "mg/L", "mg/L", "m...
#> $ limit_value_standard_units <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4...
#> $ statistical_base_code <chr> "MO", "MO", "MO", "MO", "MO", "MO"...
#> $ statistical_base_short_desc <chr> "MO MIN", "MO MIN", "MO MIN", "MO ...
#> $ statistical_base_type_code <chr> "MIN", "MIN", "MIN", "MIN", "MIN",...
#> $ statistical_base_type_desc <chr> "Minimum", "Minimum", "Minimum", "...
#> $ limit_value_qualifier_code <chr> ">=", ">=", ">=", ">=", ">=", ">="...
#> $ stay_value_nmbr <chr> "", "", "", "", "", "", "", "", ""...
#> $ dmr_event_id <chr> "3403423185", "3403423200", "34034...
#> $ monitoring_period_end_date <date> 2015-10-31, 2015-11-30, 2015-12-3...
#> $ dmr_form_value_id <chr> "3442281413", "3442281617", "34422...
#> $ value_type_code <chr> "C1", "C1", "C1", "C1", "C1", "C1"...
#> $ value_type_desc <chr> "Concentration1", "Concentration1"...
#> $ dmr_value_id <chr> "3614856990", "3616166174", "36168...
#> $ dmr_value_nmbr <dbl> 5.21, 7.90, 6.50, 7.80, 5.36, 4.60...
#> $ dmr_unit_code <chr> "19", "19", "19", "19", "19", "19"...
#> $ dmr_unit_desc <chr> "mg/L", "mg/L", "mg/L", "mg/L", "m...
#> $ dmr_value_standard_units <dbl> 5.21, 7.90, 6.50, 7.80, 5.36, 4.60...
#> $ dmr_value_qualifier_code <chr> "=", "=", "=", "=", "=", "=", "=",...
#> $ value_received_date <date> 2015-11-20, 2015-12-29, 2016-01-1...
#> $ days_late <int> NA, 9, NA, NA, 1, NA, NA, NA, NA, ...
#> $ nodi_code <chr> "", "", "", "", "", "", "", "", ""...
#> $ nodi_desc <chr> "", "", "", "", "", "", "", "", ""...
#> $ exceedence_pct <int> NA, NA, NA, NA, NA, NA, NA, NA, NA...
#> $ npdes_violation_id <chr> "", "", "", "", "", "", "", "", ""...
#> $ violation_code <chr> "", "", "", "", "", "", "", "", ""...
#> $ violation_desc <chr> "", "", "", "", "", "", "", "", ""...
#> $ rnc_detection_code <chr> "", "", "", "", "", "", "", "", ""...
#> $ rnc_detection_desc <chr> "", "", "", "", "", "", "", "", ""...
#> $ rnc_detection_date <date> NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ rnc_resolution_code <int> NA, NA, NA, NA, NA, NA, NA, NA, NA...
#> $ rnc_resolution_desc <chr> "", "", "", "", "", "", "", "", ""...
#> $ rnc_resolution_date <date> NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ violation_severity <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
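The limit and reported-value columns can be compared directly once the reports are downloaded. The following sketch uses a stand-in data frame built from the first six dissolved oxygen values shown in the output above, so it runs offline. It assumes that limit_value_qualifier_code describes how a reported value should relate to the limit (here, at least 4 mg/L); that reading is an interpretation of the column names and is not confirmed by the ECHO documentation quoted in this vignette.

```r
# Stand-in built from the first six values in the glimpse() output above
dmr <- data.frame(
  parameter_code             = "00300",
  limit_value_nmbr           = 4,
  limit_value_qualifier_code = ">=",
  dmr_value_nmbr             = c(5.21, 7.90, 6.50, 7.80, 5.36, 4.60),
  stringsAsFactors = FALSE
)

# Compare a reported value with its limit using the qualifier code.
# Only ">=" and "<=" are handled here; any other code yields NA.
compare <- function(value, op, limit) {
  switch(op,
         ">=" = value >= limit,
         "<=" = value <= limit,
         NA)
}

meets_limit <- mapply(compare,
                      dmr$dmr_value_nmbr,
                      dmr$limit_value_qualifier_code,
                      dmr$limit_value_nmbr,
                      USE.NAMES = FALSE)

# TRUE when every reported value satisfies its limit
all(meets_limit)
#> [1] TRUE
```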
echor can also return spatial data frames known as simple features (https://r-spatial.github.io/sf/) to facilitate the creation of maps. Both echoAirGetFacilityInfo() and echoWaterGetFacilityInfo() include an output argument that returns simple feature dataframes when set to 'sf'.
Using sf, ggmap, and the current development version of ggplot2 (devtools::install_github("tidyverse/ggplot2")), we can quickly create a map of downloaded data.
## Sample code only,
## This example requires the development
## version of ggplot2 with support for
## geom_sf()
library(ggplot2)
library(ggmap)
library(dplyr)
library(purrr)
library(sf)
library(ggrepel)
## Download data as a simple feature
df <- echoWaterGetFacilityInfo(xmin = '-96.407563', ymin = '30.554395',
                               xmax = '-96.25947', ymax = '30.751984',
                               output = 'sf')
## Download a basemap with ggmap
collegestation <- get_map(location = c(-96.387509, 30.583572,
                                       -96.281422, 30.640008),
                          zoom = 14, maptype = "toner")
## Use coordinates to create label locations
df <- df %>%
  mutate(
    coords = map(geometry, st_coordinates),
    coords_x = map_dbl(coords, 1),
    coords_y = map_dbl(coords, 2)
  )
## Make the map
ggmap(collegestation) +
  geom_sf(data = df, inherit.aes = FALSE, shape = 21,
          color = "darkred", fill = "darkred",
          size = 2, alpha = 0.25) +
  geom_label_repel(data = df, aes(x = coords_x, y = coords_y, label = SourceID),
                   point.padding = .5, min.segment.length = 0.1,
                   size = 2, color = "dodgerblue") +
  labs(x = "Longitude", y = "Latitude",
       title = "NPDES permits near Texas A&M",
       caption = "Source: EPA ECHO database")