Evaluate random holdouts from timestep grid following imputation
hold_eval.Rd
Evaluates random holdouts from a timestep grid produced using hold_grid() against data imputed using impute_grid().
Arguments
- true
timestep grid used to generate holdout data with known "true" values.
- imp
imputed timestep grid.
- hold
holdout grid in which known "true" values were coerced to NA (not available) values.
- PI_upr
timestep grid containing upper prediction intervals. Default is NULL.
- PI_lwr
timestep grid containing lower prediction intervals. Default is NULL.
- norm
logical flag to normalize the returned root mean square error by the standard deviation of observations. Default is FALSE.
Value
named list containing:
- comp
data frame containing holdout comparison results. Fields include:
Site - observed holdout sites
Timestep - observed holdout timesteps
Observed - observed holdout values
Imputed - imputed holdout values
- diff
vector of differences (imputed minus observed values), with NAs where holdouts were not imputed.
- rmse
root mean square error for observed vs. imputed values. Normalized by the standard deviation of observations if norm is TRUE.
- CR
coverage rate, computed as the proportion of observed holdout values that fall within the modeled prediction interval. Defaults to NULL if prediction intervals are not supplied.
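The returned metrics can be illustrated with a minimal base R sketch. It is not the package's internal code: the toy vectors and the names observed, imputed, pi_lwr, pi_upr, nrmse and cr are hypothetical, and normalizing by the standard deviation of only those observations that have an imputed value is an assumption made here for illustration.

```r
# Toy holdout values, made up for illustration only
observed <- c(10.2, 11.5,  9.8, 12.1, 10.9)
imputed  <- c(10.0, 11.9,   NA, 12.4, 10.5)  # NA marks a holdout that was not imputed
pi_lwr   <- c( 9.5, 11.0,   NA, 11.6, 11.0)  # hypothetical lower prediction bounds
pi_upr   <- c(10.6, 12.3,   NA, 12.9, 11.2)  # hypothetical upper prediction bounds

# differences: imputed minus observed, NA where no imputed value exists
diff <- imputed - observed

# root mean square error over the holdouts that were imputed
rmse <- sqrt(mean(diff^2, na.rm = TRUE))

# normalized RMSE: divide by the standard deviation of the observations
# (assumption: only observations with an imputed value are used)
ok <- !is.na(diff)
nrmse <- rmse / sd(observed[ok])

# coverage rate: share of observed holdouts inside the prediction interval
covered <- observed >= pi_lwr & observed <= pi_upr
cr <- mean(covered, na.rm = TRUE)
```

In this toy case the last observation falls below its lower bound, so three of the four imputed holdouts are covered. A coverage rate close to the nominal level of the prediction interval suggests the interval width is reasonable.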
Author
Maintainer: Zeno F. Levy zlevy@usgs.gov
Examples
if (FALSE) { # \dontrun{
# load example Long Island dataset
data(LI_data)
# aggregate data at monthly timestep using median observed values
grid <- timestep_grid(data = LI_data,
                      timestep = "monthly",
                      agg_method = "median")
# trim grid to remove sites that are less than 35 percent complete
grid <- trim_grid(grid, data_thresh = 0.35)
# set seed for reproducibility
set.seed(123)
# hold out a random 5 percent of observed values
hold <- hold_grid(input_grid = grid, p = 0.05)
# impute holdout grid using top 10 most correlated reference sites
out <- impute_grid(input_grid = hold,
                   n_refwl = 10,
                   bootstrap_PI = TRUE)
# evaluate imputation of holdout data
eval <- hold_eval(true = grid,
                  imp = out$imputed_grid,
                  hold = hold,
                  PI_upr = out$PI_upr,
                  PI_lwr = out$PI_lwr)
# view root mean squared error of imputed holdouts
eval$rmse
# view coverage rate
eval$CR
# plot observed vs imputed values with 1:1 line
plot(eval$comp$Observed, eval$comp$Imputed,
     xlab = "Observed", ylab = "Imputed")
abline(0, 1, lty = 2, col = "red")
} # }