R/uncertainty_quantification.R
bootstrap_trendfilter.RdGenerate a bootstrap ensemble of trend filtering estimates in order to quantify the uncertainty in the optimized trend filtering estimate.
bootstrap_trendfilter( obj, algorithm = c("nonparametric", "parametric", "wild"), B = 100L, x_eval = NULL, edf = NULL, mc_cores = parallel::detectCores() - 4, ... )
| obj | An object of class ' |
|---|---|
| algorithm | A string specifying which variation of the bootstrap to use. One of
|
| B | The number of bootstrap samples to be drawn to generate the trend filtering
ensemble. Defaults to |
| edf | The desired number of effective degrees of freedom in each bootstrap
estimate. When |
| mc_cores | Number of cores to utilize for parallel computing. Defaults to the number of cores detected, minus 4. |
| ... | Additional named arguments. Currently only a few experimental arguments may be passed by experts. |
An object of class 'bootstrap_trendfilter' and subclass
'trendfilter'. Generic functions such as predict(),
fitted, and residuals() may also be called on bootstrap_trendfilter
objects, with the same effect as if they were called on the obj argument
originally passed to bootstrap_trendfilter(). A bootstrap_trendfilter
object is a list containing the follow elements:
x_evalInput grid that each bootstrap trend filtering estimate was evaluated on.
ensembleThe full trend filtering bootstrap ensemble as a matrix
with length(x_eval) rows and B columns.
algorithmString specifying which variation of the bootstrap was used to generate the ensemble.
edf_optNumber of effective degrees of freedom that each bootstrap
trend filtering fit should approximately possess in our fixed-edf bootstrap
procedure. Identical to the value passed to edf, or its default.
edf_bootsVector of the estimated number of effective degrees of
freedom of each trend filtering bootstrap estimate. These are unlikely to
all be exactly equal to edf_opt, but should be relatively close.
n_iter_bootsVector of the number of iterations taken by the ADMM algorithm before reaching a stopping criterion, for each bootstrap estimate.
lambda_bootsVector of the hyperparameter values used for each bootstrap fit. In general, these are not all equal because our bootstrap implementation instead seeks to hold the number of effective degrees of freedom constant across all bootstrap estimates.
lambdaVector of the original grid of candidate hyperparameter
values, inherited from obj.
edfNumber of effective degrees of freedom in the trend filtering
estimator, for every hyperparameter value in lambda.
fitted_valuesFitted values of all trend filtering point
estimates with hyperparameter values in lambda, inherited from obj.
xVector of observed values for the input variable, inherited from
obj.
yVector of observed values for the output variable, inherited from
obj.
weightsVector of weights for the observed outputs, inherited from
obj.
kDegree of the trend filtering point estimate (and bootstrap
estimates), inherited from obj.
callThe function call.
scale_xyFor internal use.
One of three possible bootstrap algorithms should be chosen
according to the criteria below. Pointwise variability bands are then
obtained by passing the 'bootstrap_trendfilter' object to vbands(),
along with the desired level (e.g. level = 0.95). Bootstrapping trend
filtering estimators tends to yield more accurate uncertainties when, for
each bootstrap estimate, we fix the number of effective degrees of freedom,
edf (a reparametrization of the hyperparameter lambda), instead of fixing
lambda itself. Thus, bootstrap_trendfilter() has an edf argument
instead of lambda. See the edf argument description and Examples
section for guidance on how edf can be chosen.
Our recommendations for when to use each of the possible settings
for the algorithm argument are shown in the table below. See
Politsch et al. (2020a) for more details.
| Scenario | Uncertainty quantification |
x is unevenly sampled | algorithm = "nonparametric" |
x is evenly sampled and measurement variances for y are available | algorithm = "parametric" |
x is evenly sampled and measurement variances for y are not available | algorithm = "wild" |
For our purposes, an evenly sampled data set with some discarded pixels
(either sporadically or in large consecutive chunks) is still considered to
be evenly sampled. When the inputs are evenly sampled on a transformed scale,
we recommend transforming to that scale and carrying out the full trend
filtering analysis on that scale. See Example 2 below for a case when the
inputs are evenly sampled on the log10(x) scale.
Politsch et al. (2020a). Trend filtering – I. A modern statistical tool for time-domain astronomy and astronomical spectroscopy. MNRAS, 492(3), p. 4005-4018. [Publisher] [arXiv].
Politsch et al. (2020b). Trend Filtering – II. Denoising astronomical signals with varying degrees of smoothness. MNRAS, 492(3), p. 4019-4032. [Publisher] [arXiv].
# Example 1: Phase-folded light curve of an eclipsing binary star system # # The apparent brightness over time of a star system that has two suns # that regularly eclipse one another from our vantage point on Earth. Here, # the time series is stacked according to the orbital period of the binary # system, with the primary eclipse occuring at `phase = 0` and the input # domain ranging from -0.5 to 0.5. data("eclipsing_binary") head(eclipsing_binary)#> # A tibble: 6 × 3 #> phase flux std_err #> <dbl> <dbl> <dbl> #> 1 -0.499 0.938 0.0102 #> 2 -0.498 0.930 0.0102 #> 3 -0.496 0.944 0.0102 #> 4 -0.494 0.963 0.0102 #> 5 -0.494 0.934 0.0102 #> 6 -0.494 0.949 0.0102x <- eclipsing_binary$phase y <- eclipsing_binary$flux weights <- 1 / eclipsing_binary$std_err^2 cv_tf <- cv_trendfilter( x = x, y = y, weights = weights, max_iter = 1e4, obj_tol = 1e-6 ) boot_tf <- bootstrap_trendfilter( obj = cv_tf, algorithm = "nonparametric", edf = cv_tf$edf_min["MAE"] ) # Example 2: The "Lyman-alpha forest" in the spectrum of a distant quasar data("quasar_spectrum") head(quasar_spectrum)#> # A tibble: 6 × 3 #> log10_wavelength flux weights #> <dbl> <dbl> <dbl> #> 1 3.55 0.424 0.0417 #> 2 3.55 -2.11 0.125 #> 3 3.55 -3.78 0.128 #> 4 3.55 3.60 0.121 #> 5 3.55 -4.89 0.134 #> 6 3.55 1.76 0.132x <- quasar_spectrum$log10_wavelength y <- quasar_spectrum$flux weights <- quasar_spectrum$weights sure_tf <- sure_trendfilter(x, y, weights) boot_tf <- bootstrap_trendfilter( obj = sure_tf, algorithm = "parametric", edf = sure_tf$edf_min )