Title: | Parallel Processing Options for Package 'dataRetrieval' |
---|---|
Description: | Provides methods for retrieving United States Geological Survey (USGS) water data using sequential and parallel processing (Bengtsson, 2022 <doi:10.32614/RJ-2021-048>). In addition to parallel methods, data wrangling and additional statistical attributes are provided. |
Authors: | Josh Erickson [aut, cre, cph] |
Maintainer: | Josh Erickson <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.3 |
Built: | 2024-10-25 05:03:22 UTC |
Source: | https://github.com/joshualerickson/whitewater |
Delay
delay_setup()
a number for amount of time to delay
A subset of USGS stations in HUC 17
pnw_wy
A data frame with 18934 rows and 30 variables:
name of USGS station
station site id number
water year
peak flow value
peak flow date
drainage area in sq.miles
latitude
longitude
altitude in meters
observations per water year per site
water year count per site
Sum of Flow
Maximum of Flow
Minimum of Flow
Mean of Flow
Median of Flow
Standard Deviation of Flow
Coefficient of Variation of Flow
Maximum of Flow normalized by drainage area
Minimum of Flow normalized by drainage area
Mean of Flow normalized by drainage area
Median of Flow normalized by drainage area
Maximum of Flow normalized by standard deviation
Minimum of Flow normalized by standard deviation
Mean of Flow normalized by standard deviation
Median of Flow normalized by standard deviation
Standard Deviation of Flow normalized by standard deviation
decade
comid of site
dam index
a tibble
Get Current Conditions
ww_current_conditions()
a tibble with current conditions and attributes from the USGS dashboard.
The time zone used in the URL call is the R session time zone. Also, the time is 1 hour behind. The returned data.frame has the following attributes: AgencyCode, SiteNumber, SiteName, SiteTypeCode, Latitude, Longitude, CurrentConditionID, ParameterCode, TimeLocal, TimeZoneCode, Value, ValueFlagCode, RateOfChangeUnitPerHour, StatisticStatusCode, FloodStageStatusCode.
## Not run:
current_conditions <- ww_current_conditions()
## End(Not run)
This function is a wrapper around readNWISdv but includes added variables like water year, lat/lon, station name, altitude and tidied dates.
ww_dvUSGS( sites, parameter_cd = "00060", start_date = "", end_date = "", stat_cd = "00003", parallel = FALSE, wy_month = 10, verbose = TRUE, ... )
sites |
A vector of USGS NWIS sites |
parameter_cd |
A USGS code for metric, default is "00060". |
start_date |
A character of date format, e.g. |
end_date |
A character of date format, e.g. |
stat_cd |
character USGS statistic code. This is usually 5 digits. Daily mean (00003) is the default. |
parallel |
|
wy_month |
|
verbose |
|
... |
arguments to pass on to future_map. |
A tibble with daily metrics and added meta-data.
Use it the same way you would use readNWISdv.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

# parallel
# get sites
huc17_sites <- dataRetrieval::whatNWISdata(huc = 17, siteStatus = 'active', service = 'dv', parameterCd = '00060')

library(future)
# need to call future::plan()
plan(multisession(workers = availableCores()-1))

pnw_dv <- ww_dvUSGS(huc17_sites$site_no, parameter_cd = '00060', wy_month = 10, parallel = TRUE)
## End(Not run)
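The water year that ww_dvUSGS adds depends on wy_month. The sketch below shows one common convention (a date in or after the starting month belongs to the next calendar year's water year). It is an illustration only: the helper name water_year is hypothetical and the package's internal logic may differ.

```r
# Hypothetical helper, not part of whitewater: assign a water year to dates,
# assuming the water year is labelled by the calendar year in which it ends.
water_year <- function(dates, wy_month = 10) {
  dates <- as.Date(dates)
  yr <- as.integer(format(dates, "%Y"))
  mo <- as.integer(format(dates, "%m"))
  # A water year starting in January is just the calendar year.
  if (wy_month == 1) return(yr)
  ifelse(mo >= wy_month, yr + 1L, yr)
}

# 30 Sep 2021 falls in water year 2021, 1 Oct 2021 in water year 2022
wy <- water_year(c("2021-09-30", "2021-10-01"), wy_month = 10)
print(wy)
```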
This function generates instantaneous NWIS data from https://waterservices.usgs.gov/ and then floors to a user defined interval with wwOptions ('1 hour' is default) by taking the mean.
ww_floorIVUSGS( procDV, sites = NULL, parameter_cd = NULL, options = wwOptions(), parallel = FALSE, verbose = TRUE, ... )
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
parameter_cd |
A USGS parameter code, only if using
options |
A wwOptions call. |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map. |
A tibble with a user defined interval time step.
For performance reasons, with multi-site retrievals you may retrieve data since October 1, 2007 only. If a previously created ww_dvUSGS object is not used then the user needs to provide a sites vector. This will run ww_dvUSGS in the background.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv)

# change floor method
yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(floor_iv = '6-hour'))

# change number of days
yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(floor_iv = '2-hour', period = 365))

# get by date range
yaak_river_wy <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(date_range = 'date_range', dates = c('2022-03-01', '2022-05-11')))

# parallel
# get sites
huc17_sites <- dataRetrieval::whatNWISdata(huc = 17, siteStatus = 'active', service = 'dv', parameterCd = '00060')

library(future)
# need to call future::plan()
plan(multisession(workers = availableCores()-1))

pnw_dv <- ww_dvUSGS(huc17_sites$site_no, parameter_cd = '00060', wy_month = 10, parallel = TRUE)

pnw_iv <- ww_floorIVUSGS(pnw_dv, parallel = TRUE)
## End(Not run)
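To make the "floor to an interval, then take the mean" idea concrete, here is a minimal base R sketch on made-up 15-minute readings. The data frame, the column names (dateTime, Flow, floored) and the fixed 3600-second interval are assumptions for illustration. It does not call the package and is not necessarily how ww_floorIVUSGS is implemented.

```r
# Made-up instantaneous readings at 15-minute spacing (illustrative values).
iv <- data.frame(
  dateTime = as.POSIXct(
    c("2022-03-01 00:00:00", "2022-03-01 00:15:00", "2022-03-01 00:30:00",
      "2022-03-01 00:45:00", "2022-03-01 01:00:00", "2022-03-01 01:15:00"),
    tz = "UTC"
  ),
  Flow = c(100, 110, 120, 130, 200, 220)
)

# Floor each timestamp to the start of its 1-hour interval.
interval_secs <- 3600
iv$floored <- as.POSIXct(
  floor(as.numeric(iv$dateTime) / interval_secs) * interval_secs,
  origin = "1970-01-01", tz = "UTC"
)

# Average all readings that fall into the same floored interval.
hourly <- aggregate(Flow ~ floored, data = iv, FUN = mean)
print(hourly)
```

Changing interval_secs (for example to 6 * 3600) gives a coarser time step, which is the role floor_iv plays in wwOptions.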
This function generates Instantaneous NWIS data from https://waterservices.usgs.gov/.
ww_instantaneousUSGS( procDV, sites = NULL, parameter_cd = NULL, options = wwOptions(), parallel = FALSE, verbose = TRUE, ... )
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
parameter_cd |
A USGS parameter code, only if using
options |
A wwOptions call. |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map. |
A tibble with instantaneous values.
For performance reasons, with multi-site retrievals you may retrieve data since October 1, 2007 only. If a previously created ww_dvUSGS object is not used then the user needs to provide a sites vector. This will run ww_dvUSGS in the background.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_iv <- ww_instantaneousUSGS(yaak_river_dv)

# change number of days
yaak_river_iv <- ww_instantaneousUSGS(yaak_river_dv, options = wwOptions(period = 365))

# get by date range
yaak_river_wy <- ww_instantaneousUSGS(yaak_river_dv, options = wwOptions(date_range = 'date_range', dates = c('2022-03-01', '2022-05-11')))

# get most recent
yaak_river_wy <- ww_instantaneousUSGS(yaak_river_dv, options = wwOptions(date_range = 'recent'))

# parallel
# get sites
huc17_sites <- dataRetrieval::whatNWISdata(huc = 17, siteStatus = 'active', service = 'dv', parameterCd = '00060')

library(future)
# need to call future::plan()
plan(multisession(workers = availableCores()-1))

pnw_dv <- ww_dvUSGS(huc17_sites$site_no, parameter_cd = '00060', wy_month = 10, parallel = TRUE)

pnw_iv <- ww_instantaneousUSGS(pnw_dv, parallel = TRUE)
## End(Not run)
This function uses the results of the ww_dvUSGS object to generate mean, maximum, median, standard deviation and coefficient of variation per month only.
ww_monthUSGS(procDV, sites = NULL, parallel = FALSE, verbose = TRUE, ...)
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map and ww_dvUSGS. |
A tibble filtered by month and added meta-data.
If a previously created ww_dvUSGS object is not used then the user needs to provide a sites vector. This will run ww_dvUSGS in the background.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_month <- ww_monthUSGS(yaak_river_dv)
## End(Not run)
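The monthly summaries named above (mean, maximum, median, standard deviation and coefficient of variation) can be sketched in base R as follows. The input data and the output column names are made up for illustration, and the coefficient of variation is taken here as standard deviation divided by mean, which is the usual definition but an assumption about what the package computes.

```r
# Made-up daily flows for two months (illustrative values only).
dv <- data.frame(
  month = c(1, 1, 1, 2, 2, 2),
  Flow  = c(10, 20, 30, 5, 5, 5)
)

# Summarise each month; column names below are hypothetical.
monthly <- do.call(rbind, lapply(split(dv, dv$month), function(d) {
  data.frame(
    month       = d$month[1],
    Flow_mean   = mean(d$Flow),
    Flow_max    = max(d$Flow),
    Flow_median = median(d$Flow),
    Flow_sd     = sd(d$Flow),
    Flow_cv     = sd(d$Flow) / mean(d$Flow)  # coefficient of variation
  )
}))
print(monthly)
```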
Get Peak Flows
ww_peakUSGS(sites, parallel = FALSE, wy_month = 10, verbose = TRUE, ...)
sites |
A vector of USGS NWIS sites |
parallel |
|
wy_month |
|
verbose |
|
... |
arguments to pass on to future_map. |
a tibble with peaks by water year
This function uses readNWISstat to gather daily, monthly or yearly percentiles.
ww_statsUSGS( procDV, sites = NULL, temporalFilter = "daily", parameter_cd = NULL, days = 10, parallel = FALSE, verbose = TRUE, ... )
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
temporalFilter |
A |
parameter_cd |
A USGS parameter code, only if using
days |
A |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map. |
a tibble with associated site statistics.
Be aware that the parameter values ('Flow', 'Wtemp', etc.) are calculated from the ww_floorIVUSGS function by taking the daily mean of the hourly data. Thus, the instantaneous values will look different from the daily mean values, as they should.
The temporalFilter argument is used to generate the window of percentiles.
## Not run:
# get daily values
yaak_river_dv <- ww_dvUSGS('12304500')

# daily
yaak_river_stats <- ww_statsUSGS(yaak_river_dv, temporalFilter = 'daily', days = 10)

# monthly
yaak_river_stats <- ww_statsUSGS(yaak_river_dv, temporalFilter = 'monthly', days = 10)

# yearly
yaak_river_stats <- ww_statsUSGS(yaak_river_dv, temporalFilter = 'yearly', days = 10)
## End(Not run)
This function uses the results of the ww_dvUSGS object to generate mean, maximum, median, standard deviation and coefficient of variation per water year per month.
ww_wymUSGS(procDV, sites = NULL, parallel = FALSE, verbose = TRUE, ...)
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map and ww_dvUSGS. |
A tibble filtered by water year and month with added meta-data.
If a previously created ww_dvUSGS object is not used then the user needs to provide a sites vector. This will run ww_dvUSGS in the background.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_wym <- ww_wymUSGS(yaak_river_dv)
## End(Not run)
This function uses the results of the ww_dvUSGS object to generate mean, maximum, median, standard deviation and some normalization methods (drainage area, scaled by log and standard deviation) per water year.
ww_wyUSGS(procDV, sites = NULL, parallel = FALSE, verbose = TRUE, ...)
procDV |
A previously created ww_dvUSGS object. |
sites |
A |
parallel |
|
verbose |
|
... |
arguments to pass on to future_map and/or ww_dvUSGS. |
A tibble filtered by water year with added meta-data.
If a previously created ww_dvUSGS object is not used then the user needs to provide a sites vector. This will run ww_dvUSGS in the background.
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_wy <- ww_wyUSGS(yaak_river_dv)

# parallel
# get sites
huc17_sites <- dataRetrieval::whatNWISdata(huc = 17, siteStatus = 'active', service = 'dv', parameterCd = '00060')

library(future)
# need to call future::plan()
plan(multisession(workers = availableCores()-1))

pnw_dv <- ww_dvUSGS(huc17_sites$site_no, parameter_cd = '00060', wy_month = 10, parallel = TRUE)

pnw_wy <- ww_wyUSGS(pnw_dv, parallel = TRUE)
## End(Not run)
Options
wwOptions( date_range = "pfn", period = 11, dates = NULL, site_status = "all", floor_iv = "1 hour", ... )
date_range |
A |
period |
A |
dates |
A |
site_status |
A |
floor_iv |
A |
... |
other options used for options. |
A list with API options.
A site is considered active if it has collected time-series (automated) data within the last 183 days (6 months) or it has collected discrete (manually collected) data within the last 397 days (13 months).
## Not run:
library(whitewater)

yaak_river_dv <- ww_dvUSGS('12304500', parameter_cd = '00060', wy_month = 10)

yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv)

# change floor method
yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(floor_iv = '6-hour'))

# change number of days
yaak_river_iv <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(floor_iv = '2-hour', period = 365))

# get by date range
yaak_river_wy <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(date_range = 'date_range', dates = c('2022-03-01', '2022-05-11')))

# site status as 'active'
yaak_river_wy <- ww_floorIVUSGS(yaak_river_dv, options = wwOptions(site_status = 'active', date_range = 'date_range', dates = c('2022-03-01', '2022-05-11')))
## End(Not run)
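The "active" rule described for site_status can be written out as a small check. This is only a sketch of the stated thresholds (183 and 397 days). The function name is_active and its arguments are hypothetical, and the actual filtering is done by the USGS service, not by this code.

```r
# Hypothetical helper, not part of whitewater: apply the 183-day and 397-day
# thresholds to the dates of the most recent automated and manual records.
is_active <- function(last_timeseries, last_discrete, today = Sys.Date()) {
  ts_age <- as.numeric(today - as.Date(last_timeseries))  # days since automated data
  d_age  <- as.numeric(today - as.Date(last_discrete))    # days since manual data
  # isTRUE() treats a missing date (NA) as "not within the window".
  isTRUE(ts_age <= 183) || isTRUE(d_age <= 397)
}

ref_day <- as.Date("2024-06-01")
recent_timeseries <- is_active("2024-01-01", NA, today = ref_day)
stale_site        <- is_active("2022-01-01", "2022-01-01", today = ref_day)
print(c(recent_timeseries, stale_site))
```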