| Title: | SelectBoost-Style Variable Selection for Functional Data Analysis |
|---|---|
| Description: | Implements 'SelectBoost'-style variable selection workflows for functional data analysis. The package provides FDA-native design and preprocessing objects for raw curves, spline-basis expansions, functional principal component analysis (FPCA) scores, and scalar covariates; grouped stability-selection routines based on repeated subject-level subsampling; multiple selector backends including lasso, group lasso, and sparse-group lasso; FDA-aware grouping functions and calibration helpers for 'SelectBoost'; method-comparison utilities; a formula interface; simulation, benchmarking, and validation helpers with mapped ground truth; targeted sensitivity-study utilities and shipped benchmark summaries for mean 'F1' comparisons between FDA-aware and plain 'SelectBoost' workflows; small example datasets; and an optional adapter to the native stability-selection interface from the 'FDboost' package. |
| Authors: | Frederic Bertrand [cre, aut] |
| Maintainer: | Frederic Bertrand <[email protected]> |
| License: | GPL-3 |
| Version: | 0.5.1 |
| Built: | 2026-05-11 07:16:08 UTC |
| Source: | https://github.com/fbertran/selectboost.fda |
Applies a fitted preprocessor to new functional predictors and optional
scalar covariates, returning an fda_matrix object compatible with the
selection routines.
apply_fda_preprocessor(object, predictors, scalar_covariates = NULL, ...)
object |
A fitted |
predictors |
New functional predictors. |
scalar_covariates |
Optional scalar covariates. |
... |
Not used. |
An object of class fda_matrix.
Accepts a standard numeric matrix/data frame or a named list of functional blocks. List inputs are column-bound while preserving the original block membership of each coefficient, which is later reused for grouped stability selection and FDA-aware SelectBoost grouping.
as_functional_matrix(x, center = FALSE, scale = FALSE)
x |
A numeric matrix/data frame, an |
center, scale |
Passed to |
An object of class fda_matrix with elements x, blocks, and
positions.
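The column-binding step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `bind_blocks()` is a hypothetical helper, and the exact contents of the `blocks` and `positions` elements are assumed here to be a per-column block label and a per-column index within the block.

```r
# Hypothetical helper: bind a named list of blocks and remember block membership.
bind_blocks <- function(blocks) {
  stopifnot(is.list(blocks), !is.null(names(blocks)))
  mats <- lapply(blocks, as.matrix)
  n_cols <- vapply(mats, ncol, integer(1))
  list(
    x = do.call(cbind, mats),
    # Assumed layout: one block label per column
    blocks = rep(names(mats), times = n_cols),
    # Assumed layout: index of each column inside its own block
    positions = unlist(lapply(n_cols, seq_len), use.names = FALSE)
  )
}

set.seed(1)
fm <- bind_blocks(list(
  signal = matrix(rnorm(12), nrow = 3),   # 3 observations, 4 grid points
  nuisance = matrix(rnorm(6), nrow = 3)   # 3 observations, 2 grid points
))
table(fm$blocks)
```

Keeping the labels next to the flattened matrix is what lets later steps map a selected column back to its predictor.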
Runs compare_selection_methods() on a simulated dataset and evaluates the
fitted objects against the mapped truth.
benchmark_selection_methods(
  data,
  methods = c("stability", "interval", "selectboost", "plain_selectboost"),
  levels = c("feature", "group"),
  stability_args = list(),
  interval_args = list(),
  selectboost_args = list(),
  plain_selectboost_args = list(),
  fdboost_model = NULL,
  fdboost_args = list(),
  keep_comparison = TRUE
)
data |
An object returned by |
methods |
Methods passed to |
levels |
Evaluation levels. |
stability_args, interval_args, selectboost_args, plain_selectboost_args |
Additional arguments passed to |
fdboost_model, fdboost_args |
Optional |
keep_comparison |
Should the fitted comparison object be stored? |
An object of class fda_benchmark.
sim <- simulate_fda_scenario(n = 24, grid_length = 16, seed = 1)
bench <- benchmark_selection_methods(
  sim,
  methods = c("selectboost", "plain_selectboost"),
  selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE),
  plain_selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE)
)
head(bench$metrics)
Runs interval stability selection over candidate interval widths.
calibrate_interval_width(
  design,
  widths,
  step = NULL,
  overlap = FALSE,
  selector = "lasso",
  keep_fits = FALSE,
  seed = NULL,
  ...
)
design |
An |
widths |
Candidate interval widths. |
step |
Optional step size. Defaults to |
overlap |
Should the interval groups overlap? |
selector |
Base selector passed to |
keep_fits |
Should the fitted objects be stored in the result? |
seed |
Optional seed used to create deterministic per-grid seeds. |
... |
Additional arguments passed to |
An object of class fda_calibration_grid.
Runs FDA-SelectBoost on a user-provided or suggested c0 grid.
calibrate_selectboost(
  design,
  selector = "msgps",
  c0_grid = NULL,
  grid_method = c("quantile", "linear"),
  association_method = c("correlation", "neighborhood", "hybrid", "interval"),
  keep_fit = TRUE,
  ...
)
design |
An |
selector |
Base selector passed to |
c0_grid |
Optional explicit |
grid_method |
Rule used by |
association_method |
Passed to |
keep_fit |
Should the fitted object be stored in the result? |
... |
Additional arguments passed to |
An object of class fda_calibration_grid.
Runs grouped stability selection over a grid of subsampling fractions and cutoff values.
calibrate_stability_selection(
  design,
  selector = "group_lasso",
  sample_fraction_grid = c(0.5, 0.632, 0.75),
  cutoff_grid = c(0.6, 0.75, 0.9),
  keep_fits = FALSE,
  seed = NULL,
  ...
)
design |
An |
selector |
Base selector passed to |
sample_fraction_grid |
Candidate subsampling fractions. |
cutoff_grid |
Candidate cutoff values. |
keep_fits |
Should the fitted objects be stored in the result? |
seed |
Optional seed used to create deterministic per-grid seeds. |
... |
Additional arguments passed to |
An object of class fda_calibration_grid.
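To make the grid idea concrete, here is a minimal base-R sketch of stability selection evaluated over the default subsampling fractions and cutoffs. It is not the package's algorithm: `top_k_selector()` is a hypothetical stand-in for a lasso-type backend, `selection_frequency()` is a hypothetical helper, and the data are simulated for the illustration.

```r
set.seed(1)
n <- 60
p <- 10
x <- matrix(rnorm(n * p), nrow = n)
y <- 2 * x[, 1] - 2 * x[, 2] + rnorm(n, sd = 0.5)

# Stand-in base selector (hypothetical): keep the k features most correlated with y
top_k_selector <- function(x, y, k = 3) {
  score <- abs(cor(x, y))[, 1]
  seq_len(ncol(x)) %in% order(score, decreasing = TRUE)[seq_len(k)]
}

# Selection frequency of each feature over B subject-level subsamples
selection_frequency <- function(x, y, sample_fraction, B = 50) {
  m <- floor(sample_fraction * nrow(x))
  hits <- replicate(B, {
    idx <- sample.int(nrow(x), m)   # subsample rows (subjects) without replacement
    top_k_selector(x[idx, , drop = FALSE], y[idx])
  })
  rowMeans(hits)
}

# Evaluate every (sample_fraction, cutoff) pair of the default grids
grid <- expand.grid(
  sample_fraction = c(0.5, 0.632, 0.75),
  cutoff = c(0.6, 0.75, 0.9)
)
grid$n_selected <- mapply(
  function(f, cut) sum(selection_frequency(x, y, f) >= cut),
  grid$sample_fraction, grid$cutoff
)
grid
```

Higher cutoffs can only shrink the stable set for a given set of frequencies, which is why the two grids are usually inspected together.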
Runs multiple selection workflows on the same fda_design object and
returns both the fitted objects and a comparison table.
compare_selection_methods(
  design,
  methods = c("stability", "interval", "selectboost"),
  stability_args = list(),
  interval_args = list(),
  selectboost_args = list(),
  plain_selectboost_args = list(),
  fdboost_model = NULL,
  fdboost_args = list()
)
design |
An |
methods |
Methods to run. Supported values are |
stability_args, interval_args, selectboost_args, plain_selectboost_args |
Named lists of arguments passed to the corresponding fitting functions. |
fdboost_model |
Optional fitted |
fdboost_args |
Additional arguments passed to
|
An object of class fda_method_comparison.
sim <- simulate_fda_scenario(n = 24, grid_length = 16, seed = 1)
comparison <- compare_selection_methods(
  sim$design,
  methods = c("selectboost", "plain_selectboost"),
  selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE),
  plain_selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE)
)
summary(comparison)
Computes support-recovery metrics for fitted selection objects against the
truth generated by simulate_fda_scenario().
evaluate_selection(x, truth, level = c("feature", "group", "basis"), ...)
x |
A fitted selection object or an |
truth |
Ground-truth object, typically the value returned by
|
level |
Evaluation level: |
... |
Additional arguments passed to the relevant method. |
A data frame with recovery metrics.
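As an illustration of what support-recovery metrics typically look like, the sketch below computes precision, recall, and F1 from two logical vectors. The helper `support_metrics()` and its column names are hypothetical; the columns actually returned by `evaluate_selection()` are not listed in this section.

```r
# Hypothetical helper: precision, recall and F1 for a selected support.
support_metrics <- function(selected, truth) {
  stopifnot(is.logical(selected), is.logical(truth),
            length(selected) == length(truth))
  tp <- sum(selected & truth)    # true positives
  fp <- sum(selected & !truth)   # false positives
  fn <- sum(!selected & truth)   # false negatives
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1 <- if (2 * tp + fp + fn > 0) 2 * tp / (2 * tp + fp + fn) else NA_real_
  data.frame(tp = tp, fp = fp, fn = fn,
             precision = precision, recall = recall, f1 = f1)
}

truth <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
selected <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
m <- support_metrics(selected, truth)
m
```

In this toy case two of the three true features are recovered with one false positive, so precision, recall, and F1 all equal 2/3.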
Constructor for a functional predictor represented by basis coefficients or FPCA scores.
fda_basis(
  coefficients,
  basis_type = c("generic", "spline", "wavelet", "fpca"),
  argvals = NULL,
  component_names = NULL,
  name = NULL,
  unit = NULL
)
coefficients |
Numeric matrix with one row per observation. |
basis_type |
Label describing the representation. |
argvals |
Optional labels or positions for basis functions/components. |
component_names |
Optional names for coefficient columns. |
name |
Optional predictor name. |
unit |
Optional unit for the basis domain. |
An object of class fda_basis.
Spline-Basis Preprocessing Spec
fda_bspline(
  df = 6L,
  degree = 3L,
  intercept = TRUE,
  center = FALSE,
  scale = FALSE
)
df |
Degrees of freedom used by |
degree |
Spline degree. |
intercept |
Should the spline basis include an intercept column? |
center, scale |
Logical flags controlling column-wise centering and scaling of the resulting coefficients. |
An object of class fda_preprocess_spec.
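A spline-basis expansion of raw curves can be sketched with the `splines` package that ships with R. This is an assumption about the general technique, not a description of what the spec does internally: the sketch evaluates a B-spline basis with the default `df`, `degree`, and `intercept` values on the observation grid and projects each curve onto it by least squares.

```r
library(splines)

set.seed(1)
grid <- seq(0, 1, length.out = 30)
# Five noisy sine curves, one row per observation
curves <- t(sapply(1:5, function(i) {
  sin(2 * pi * grid * i / 3) + rnorm(length(grid), sd = 0.05)
}))

# B-spline basis evaluated on the grid (df = 6, cubic, with intercept column)
basis <- bs(grid, df = 6, degree = 3, intercept = TRUE)
B <- matrix(basis, nrow = length(grid))   # plain numeric matrix, 30 x 6

# Least-squares projection of each curve onto the basis
coefs <- t(qr.solve(B, t(curves)))
dim(coefs)   # one row per curve, one column per basis function
```

The resulting coefficient matrix has the "one row per observation" shape that `fda_basis()` expects for its `coefficients` argument.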
Bundles the response, functional predictors, family, and a reversible feature map. This is the FDA-native entry point for the higher-level fitting functions.
fda_design(
  response = NULL,
  predictors,
  scalar_covariates = NULL,
  family = c("gaussian", "binomial"),
  id = NULL,
  center = FALSE,
  scale = FALSE,
  transforms = NULL,
  scalar_transform = NULL,
  preprocessor = NULL
)
response |
Response vector. |
predictors |
A single predictor or a named list of predictors. Elements
can be |
scalar_covariates |
Optional scalar covariates supplied separately from the functional predictors. |
family |
Model family. |
id |
Optional observation identifiers. |
center, scale |
Backward-compatible shortcuts for applying an identity transform with centering and scaling to the functional predictors. |
transforms |
Optional preprocessing specs for the functional predictors. |
scalar_transform |
Optional preprocessing specs for scalar covariates. |
preprocessor |
Optional fitted |
An object of class fda_design.
data("spectra_example", package = "SelectBoost.FDA")
idx <- 1:20
design <- fda_design(
  response = spectra_example$response[idx],
  predictors = list(
    signal = fda_grid(
      spectra_example$predictors$signal[idx, ],
      argvals = spectra_example$grid,
      name = "signal"
    ),
    nuisance = fda_grid(
      spectra_example$predictors$nuisance[idx, ],
      argvals = spectra_example$grid,
      name = "nuisance"
    )
  ),
  scalar_covariates = spectra_example$scalar_covariates[idx, ],
  scalar_transform = fda_standardize(),
  family = "gaussian"
)
summary(design)
Supports additive formulas of the form y ~ signal + noise + age + batch,
where functional terms are supplied as matrices, fda_grid, or fda_basis
objects in data, and scalar terms are expanded through
stats::model.matrix().
fda_design_formula(
  formula,
  data,
  family = c("gaussian", "binomial"),
  transforms = NULL,
  scalar_transform = NULL,
  preprocessor = NULL,
  center = FALSE,
  scale = FALSE
)
formula |
An additive formula with a single response. |
data |
A list or data frame containing the variables referenced in
|
family, transforms, scalar_transform, preprocessor, center, scale |
Passed to
|
An object of class fda_design.
FPCA Preprocessing Spec
fda_fpca(
  n_components = 3L,
  variance_explained = NULL,
  center = TRUE,
  scale = FALSE
)
n_components |
Number of principal components to retain. |
variance_explained |
Optional cumulative explained variance target in
|
center, scale |
Passed to |
An object of class fda_preprocess_spec.
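The relationship between a component count and a cumulative-variance target can be illustrated with `stats::prcomp()` on discretized curves. This is a sketch under assumptions: using `prcomp()` and picking the smallest number of components that reaches the target are choices made for the illustration, and how the spec reconciles `n_components` with `variance_explained` is not stated here.

```r
set.seed(1)
grid <- seq(0, 1, length.out = 40)
n <- 25
# Curves built from two smooth modes of variation plus a little noise
curves <- outer(rnorm(n), sin(2 * pi * grid)) +
  outer(rnorm(n, sd = 0.5), cos(2 * pi * grid)) +
  matrix(rnorm(n * length(grid), sd = 0.05), nrow = n)

# PCA of the discretized curves (centered, unscaled)
pca <- prcomp(curves, center = TRUE, scale. = FALSE)

# Cumulative proportion of variance explained by the leading components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components reaching a 95% variance target (assumed rule)
k <- which(cum_var >= 0.95)[1]
scores <- pca$x[, seq_len(k), drop = FALSE]
dim(scores)
```

The score matrix plays the same role as spline coefficients: one row per observation, one column per retained component.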
Constructor for one discretized functional predictor sampled on a common grid.
fda_grid(values, argvals = NULL, name = NULL, unit = NULL)
values |
Numeric matrix with one row per observation. |
argvals |
Optional vector of grid values. Defaults to
|
name |
Optional predictor name. |
unit |
Optional unit for the grid axis. |
An object of class fda_grid.
Identity Preprocessing Spec
fda_identity(center = FALSE, scale = FALSE)
center, scale |
Logical flags controlling column-wise centering and scaling of the transformed features. |
An object of class fda_preprocess_spec.
Wraps scalar covariates so they can participate in the same feature-mapping and preprocessing machinery as functional predictors.
fda_scalar(values, name = NULL, unit = NULL)
values |
Numeric vector or matrix with one row per observation. |
name |
Optional predictor name. |
unit |
Optional unit label. |
An object of class fda_scalar.
Standardization Preprocessing Spec
fda_standardize(center = TRUE, scale = TRUE)
center, scale |
Logical flags controlling column-wise centering and
scaling. Both default to |
An object of class fda_preprocess_spec.
Thin adapter to the stabsel.FDboost() method. This is the native route when
the model itself is already fitted with FDboost.
fdboost_stability_selection(model, ...)
model |
A fitted |
... |
Additional arguments forwarded to |
A stabsel object.
Learns train/test-safe preprocessing transforms for functional predictors and
optional scalar covariates. The fitted object can be reused to create
compatible fda_design objects on new data.
fit_fda_preprocessor(
  predictors,
  scalar_covariates = NULL,
  transforms = NULL,
  scalar_transform = NULL
)
predictors |
One predictor or a named list of predictors. |
scalar_covariates |
Optional scalar covariates supplied as a vector,
matrix/data frame, |
transforms |
Optional preprocessing specs for functional predictors. |
scalar_transform |
Optional preprocessing specs for scalar covariates. |
An object of class fda_preprocessor.
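"Train/test-safe" preprocessing means that every constant is learned on the training rows only and then reused unchanged on new data. The base-R sketch below shows this fit/apply split for plain standardization; `prep` and `apply_prep()` are hypothetical names and do not reflect the structure of an `fda_preprocessor` object.

```r
set.seed(1)
train <- matrix(rnorm(40, mean = 5, sd = 2), nrow = 10)
test <- matrix(rnorm(20, mean = 5, sd = 2), nrow = 5)

# "Fit": learn centering and scaling constants on the training rows only
prep <- list(center = colMeans(train), scale = apply(train, 2, sd))

# "Apply": reuse the stored constants on any data with the same columns
apply_prep <- function(prep, x) {
  sweep(sweep(x, 2, prep$center, "-"), 2, prep$scale, "/")
}

train_std <- apply_prep(prep, train)
test_std <- apply_prep(prep, test)

colMeans(train_std)   # numerically zero on the training data
colMeans(test_std)    # generally not zero: test rows did not inform the constants
```

Re-estimating the constants on the test rows would leak information across the split, which is what the fitted object avoids.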
Fit SelectBoost from an FDA Design
fit_selectboost(design, ...)
design |
An |
... |
Additional arguments forwarded to |
An object inheriting from selectboost_fda_result.
sim <- simulate_fda_scenario(n = 24, grid_length = 16, seed = 1)
fit <- fit_selectboost(
  sim$design,
  mode = "fast",
  steps.seq = 0.5,
  c0lim = FALSE,
  B = 3
)
head(selection_map(fit, c0 = colnames(fit$feature_selection)[1]))
Fit FDA-SelectBoost from a Formula
fit_selectboost_formula(
  formula,
  data,
  family = c("gaussian", "binomial"),
  transforms = NULL,
  scalar_transform = NULL,
  preprocessor = NULL,
  center = FALSE,
  scale = FALSE,
  ...
)
formula, data, family, transforms, scalar_transform, preprocessor, center, scale |
Passed to |
... |
Additional arguments passed to |
An object inheriting from selectboost_fda_result.
Fit Grouped Stability Selection from an FDA Design
fit_stability(design, ...)
design |
An |
... |
Additional arguments forwarded to |
An object inheriting from fda_stability_selection.
sim <- simulate_fda_scenario(n = 24, grid_length = 16, seed = 1)
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- fit_stability(
    sim$design,
    selector = "lasso",
    B = 4,
    cutoff = 0.4,
    seed = 1
  )
  head(selection_map(fit))
}
Fit Stability Selection from a Formula
fit_stability_formula(
  formula,
  data,
  family = c("gaussian", "binomial"),
  transforms = NULL,
  scalar_transform = NULL,
  preprocessor = NULL,
  center = FALSE,
  scale = FALSE,
  ...
)
formula, data, family, transforms, scalar_transform, preprocessor, center, scale |
Passed to |
... |
Additional arguments passed to |
An object inheriting from fda_stability_selection.
Computes or post-processes an absolute association matrix for discretized or basis-expanded functional predictors.
functional_association(
  x,
  association = NULL,
  method = c("correlation", "neighborhood", "hybrid", "interval"),
  within_blocks = TRUE,
  bandwidth = NULL,
  interval_groups = NULL,
  width = NULL,
  step = width,
  decay = 1
)
x |
Any input accepted by |
association |
Optional square association matrix supplied by the user.
When omitted, |
method |
Association structure. |
within_blocks |
Should cross-block associations be zeroed out? |
bandwidth |
Optional maximum within-block lag retained in the association matrix. |
interval_groups |
Optional interval grouping used when
|
width, step |
Interval parameters used when |
decay |
Positive exponent controlling the neighborhood kernel. |
A square absolute association matrix with unit diagonal.
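The effect of `within_blocks` and `bandwidth` on a correlation-based association matrix can be sketched in base R. This is an illustration under assumptions, not the package's code: the block labels and grid positions are made up, and the `decay` kernel is left out because its functional form is not given in this section.

```r
set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20)
blocks <- rep(c("signal", "nuisance"), each = 3)   # block label per column
positions <- rep(1:3, times = 2)                   # grid index within each block

# Start from the absolute correlation matrix
assoc <- abs(cor(x))

# within_blocks = TRUE: zero out associations between different blocks
same_block <- outer(blocks, blocks, "==")
assoc[!same_block] <- 0

# bandwidth = 1: keep only pairs at most one grid step apart
lag <- abs(outer(positions, positions, "-"))
assoc[lag > 1] <- 0

# Unit diagonal, as in the documented return value
diag(assoc) <- 1
round(assoc, 2)
```

The result is block-diagonal and banded, so any grouping derived from it can only join neighboring grid points of the same predictor.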
Returns one group label per column, with each functional block defining a group.
functional_block_groups(x)
x |
Any input accepted by |
An integer vector of group memberships.
Creates interval groups within each functional block; the groups are non-overlapping by default. This is useful when one wants region-level stability summaries instead of pointwise selection frequencies.
functional_interval_groups(x, width, step = width, overlap = FALSE)
x |
Any input accepted by |
width |
Positive integer interval width within each block. |
step |
Step size between interval starts. Only non-overlapping intervals are supported by default. |
overlap |
Logical; should intervals be allowed to overlap? When |
Either an integer group vector with an interval_table attribute or
an overlapping group structure of class fda_group_list.
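For the non-overlapping case (`step = width`), the grouping amounts to cutting each block into consecutive runs of `width` columns. The base-R sketch below uses a hypothetical helper, `interval_groups()`, that takes per-column block labels; it does not produce the `interval_table` attribute and does not cover overlapping groups.

```r
# Hypothetical helper: non-overlapping interval groups within each block.
interval_groups <- function(blocks, width) {
  stopifnot(width >= 1)
  # Position of each column inside its own block
  pos <- ave(seq_along(blocks), blocks, FUN = seq_along)
  # Interval index inside the block
  interval <- (pos - 1) %/% width + 1
  # One integer label per (block, interval) pair, in order of appearance
  key <- paste(blocks, interval, sep = ":")
  match(key, unique(key))
}

blocks <- rep(c("signal", "nuisance"), times = c(5, 4))
interval_groups(blocks, width = 2)
# 1 1 2 2 3 4 4 5 5
```

Intervals never cross a block boundary: the fifth `signal` column forms a short group of its own rather than being merged with the first `nuisance` column.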
Convenience wrapper around stability_selection_fda() that first creates
interval groups (non-overlapping by default) within each functional block.
interval_stability_selection(
  x,
  y = NULL,
  width,
  step = width,
  overlap = FALSE,
  ...
)
x |
Any input accepted by |
y |
Response vector. Leave as |
width |
Positive interval width. |
step |
Step size between interval starts. |
overlap |
Logical; should the interval groups overlap? |
... |
Additional arguments forwarded to |
An object of class fda_interval_stability_selection.
Builds a closure that can be passed directly to group= in
SelectBoost::fastboost() or SelectBoost::autoboost(). The returned
grouping function respects functional block boundaries and can optionally
restrict groups to local neighborhoods along the observation grid.
make_functional_grouping_function(
  x,
  association = NULL,
  method = c("threshold", "community"),
  association_method = c("correlation", "neighborhood", "hybrid", "interval"),
  within_blocks = TRUE,
  bandwidth = NULL,
  interval_groups = NULL,
  width = NULL,
  step = width,
  decay = 1
)
x |
Any input accepted by |
association |
Optional square association matrix. When omitted, the
correlation matrix supplied by |
method |
Grouping strategy. |
association_method |
Association structure passed to
|
within_blocks |
Should groups be restricted to features coming from the same functional block? |
bandwidth |
Optional maximum within-block lag retained in groups. |
interval_groups, width, step, decay |
Additional arguments passed to
|
A function with signature (absXcor, c0) compatible with
SelectBoost.
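The closure pattern behind a `(absXcor, c0)` grouping function can be sketched as follows. Everything here is an assumption made for illustration: `make_block_threshold_grouping()` is a hypothetical factory, and the 0/1 membership matrix it returns is a guess at a convenient format, since the structure SelectBoost expects from a grouping function is not described in this section.

```r
# Hypothetical closure factory; block labels are captured at construction time.
make_block_threshold_grouping <- function(blocks) {
  same_block <- outer(blocks, blocks, "==")
  function(absXcor, c0) {
    # Threshold rule: i and j are grouped when their association reaches c0
    # and both features come from the same functional block
    groups <- (absXcor >= c0) & same_block
    diag(groups) <- TRUE   # every feature belongs to its own group
    groups * 1             # assumed output: numeric 0/1 membership matrix
  }
}

set.seed(1)
x <- matrix(rnorm(30 * 4), nrow = 30)
x[, 2] <- x[, 1] + rnorm(30, sd = 0.1)   # columns 1 and 2 strongly correlated
x[, 3] <- x[, 1] + rnorm(30, sd = 0.1)   # column 3 too, but in another block
group_fun <- make_block_threshold_grouping(c("a", "a", "b", "b"))
group_fun(abs(cor(x)), c0 = 0.8)
```

Columns 1 and 3 are highly correlated, yet the block mask keeps them apart, which is the behavior the text attributes to block-respecting groups.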
Simulated smooth trajectories used to demonstrate spline-basis and FPCA preprocessing from raw curves.
motion_example
A list with four components:
Numeric vector of observation times.
Numeric response vector.
Named list of functional predictor matrices.
Data frame with scalar covariates.
Simulated for package examples.
Runs SelectBoost directly on the flattened predictor matrix without the
FDA-specific grouping heuristics used by selectboost_fda(). This is useful
as a benchmark against the FDA-aware variant.
plain_selectboost(
  x,
  y = NULL,
  mode = c("fast", "auto"),
  selector = "msgps",
  selector_fun = NULL,
  selector_args = list(),
  groups = NULL,
  family = c("gaussian", "binomial"),
  association = NULL,
  group_method = c("threshold", "community"),
  ...
)
x |
Any input accepted by |
y |
Response vector. Leave as |
mode |
|
selector |
Base selector used inside SelectBoost. Choose from
|
selector_fun |
Optional custom base selector. It must return a
coefficient vector of length |
selector_args |
Optional named list of arguments forwarded to the base selector. |
groups |
Optional feature groups used by grouped base selectors such as
|
family |
Model family passed to built-in selectors. |
association |
Optional absolute association matrix used directly by the raw SelectBoost grouping function. |
group_method |
Functional grouping backend: threshold-based or community-based. |
... |
Additional arguments passed to |
An object of class plain_selectboost_result.
Plots feature-, group-, interval-, and basis-level summaries derived from
selection_map(). The available views depend on the fitted object:
## S3 method for class 'fda_stability_selection'
plot(
  x,
  type = c("feature", "group", "interval", "basis"),
  value = c("group", "mean", "max"),
  facet = c("none", "predictor"),
  palette = selection_palette(),
  show_legend = TRUE,
  legend_title = NULL,
  legend_n_ticks = 3L,
  legend_digits = 2L,
  legend_cex = 0.75,
  cutoff = x$cutoff,
  ...
)

## S3 method for class 'selectboost_fda_result'
plot(
  x,
  type = c("feature", "group", "basis"),
  value = c("max", "mean"),
  palette = selection_palette(),
  show_legend = TRUE,
  legend_title = NULL,
  legend_n_ticks = 3L,
  legend_digits = 2L,
  legend_cex = 0.75,
  c0 = NULL,
  ...
)
x |
An object returned by |
type |
Summary view to plot. Stability-selection fits support
|
value |
Quantity summarized in group, interval, and basis views.
Stability-selection fits accept |
facet |
Faceting mode for interval heatmaps. Currently only
|
palette |
Vector of colors used for heatmaps. |
show_legend |
Logical; should heatmap views draw a legend? |
legend_title |
Optional legend title for heatmap views. By default an
informative title is chosen from |
legend_n_ticks |
Approximate number of tick marks used in the heatmap legend. |
legend_digits |
Number of significant digits used for heatmap legend labels. |
legend_cex |
Character expansion used for heatmap legend text. |
cutoff |
Stability threshold. Only used for |
... |
Additional graphical parameters passed to bar-plot-based views. |
c0 |
Optional SelectBoost correlation threshold. When omitted,
SelectBoost heatmaps are drawn across all available |
fda_stability_selection supports type = "feature", "group",
"interval", and "basis".
selectboost_fda_result supports type = "feature", "group", and
"basis".
Heatmap-based views are used for interval summaries and for SelectBoost
summaries over multiple c0 values. Bar-plot views are used otherwise.
Invisibly returns the helper output used to build the plot.
Repeats the FDA benchmark over a grid of simulation settings and a grid of
FDA-aware SelectBoost settings. This is intended to answer the specific
benchmark question of when selectboost_fda() improves on plain
SelectBoost.
run_selectboost_sensitivity_study(
  n_rep = 10L,
  simulate_grid = expand.grid(
    scenario = c("localized_dense", "confounded_blocks"),
    confounding_strength = c(0.4, 0.9),
    active_region_scale = c(1, 0.7),
    local_correlation = c(0, 2),
    stringsAsFactors = FALSE
  ),
  selectboost_grid = expand.grid(
    association_method = c("correlation", "neighborhood", "hybrid", "interval"),
    bandwidth = c(NA, 4, 8),
    stringsAsFactors = FALSE
  ),
  simulate_args = list(),
  benchmark_args = list(),
  seed = NULL,
  keep_results = FALSE
)
n_rep |
Number of replications per setting combination. |
simulate_grid |
Data frame of simulation-setting combinations. Columns
are merged into |
selectboost_grid |
Data frame of |
simulate_args |
Named list forwarded to |
benchmark_args |
Named list forwarded to |
seed |
Optional seed used to derive deterministic per-replication and per-setting seeds. |
keep_results |
Should the individual benchmark objects be returned? |
An object inheriting from fda_selectboost_sensitivity_study and
fda_simulation_study.
grid <- data.frame(
  scenario = "confounded_blocks",
  confounding_strength = 0.9,
  active_region_scale = 0.7,
  local_correlation = 2,
  stringsAsFactors = FALSE
)
methods <- data.frame(
  association_method = c("correlation", "hybrid"),
  bandwidth = c(NA, 4),
  stringsAsFactors = FALSE
)
study <- run_selectboost_sensitivity_study(
  n_rep = 1,
  simulate_grid = grid,
  selectboost_grid = methods,
  simulate_args = list(n = 24, grid_length = 16),
  benchmark_args = list(
    methods = c("selectboost", "plain_selectboost"),
    levels = "feature",
    selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE),
    plain_selectboost_args = list(B = 3, steps.seq = 0.5, c0lim = FALSE)
  ),
  seed = 1
)
summarise_benchmark_advantage(
  study,
  target = "selectboost",
  reference = "plain_selectboost",
  level = "feature"
)
Repeats simulate_fda_scenario() and benchmark_selection_methods() over
multiple replications and aggregates the resulting recovery metrics.
run_simulation_study(
  n_rep = 10L,
  simulate_args = list(),
  benchmark_args = list(),
  seed = NULL,
  keep_results = FALSE
)
n_rep |
Number of simulation replications. |
simulate_args |
Named list forwarded to |
benchmark_args |
Named list forwarded to |
seed |
Optional seed used to derive deterministic per-replication seeds. |
keep_results |
Should the individual benchmark objects be returned? |
An object of class fda_simulation_study.
Wraps SelectBoost::fastboost() or SelectBoost::autoboost() while adding
FDA-specific structure through block-aware and region-aware grouping.
selectboost_fda(
  x,
  y = NULL,
  mode = c("fast", "auto"),
  selector = "msgps",
  selector_fun = NULL,
  selector_args = list(),
  groups = NULL,
  family = c("gaussian", "binomial"),
  association = NULL,
  group_method = c("threshold", "community"),
  association_method = c("correlation", "neighborhood", "hybrid", "interval"),
  within_blocks = TRUE,
  bandwidth = NULL,
  interval_groups = NULL,
  width = NULL,
  step = width,
  decay = 1,
  ...
)
x |
Any input accepted by |
y |
Response vector. Leave as |
mode |
|
selector |
Base selector used inside SelectBoost. Choose from
|
selector_fun |
Optional custom base selector. It must return a
coefficient vector of length |
selector_args |
Optional named list of arguments forwarded to the base selector. |
groups |
Optional feature groups used by grouped base selectors such as
|
family |
Model family passed to built-in selectors. |
association |
Optional custom association matrix used to define FDA-aware groups. |
group_method |
Functional grouping backend: threshold-based or community-based. |
association_method |
Association structure used to build FDA-aware groups. |
within_blocks |
Should SelectBoost groups stay within functional blocks? |
bandwidth |
Optional maximum within-block lag retained in groups. |
interval_groups, width, step, decay |
Additional arguments passed to
|
... |
Additional arguments passed to |
An object of class selectboost_fda_result.
Returns the selected rows from selection_map() for stability-selection or
SelectBoost fits.
selected(x, ...)
x |
A fitted selection object. |
... |
Additional arguments passed to the relevant method. |
A data frame.
Returns a feature map augmented with selection summaries from a fit object.
selection_map(x, level = c("feature", "group", "basis"), ...)
x |
An |
level |
Summary level. |
... |
Additional arguments passed to the relevant method. |
A data frame.
Generates raw functional predictors, scalar covariates, a response, and the mapped ground truth for the transformed design matrix.
simulate_fda_scenario(n = 80L, grid_length = 60L, family = c("gaussian", "binomial"), representation = c("grid", "basis", "fpca"), transforms = NULL, basis_df = 7L, n_components = 5L, scenario = c("localized_dense", "distributed_smooth", "confounded_blocks"), confounding_strength = NULL, active_region_scale = 1, local_correlation = 0, include_scalar = TRUE, noise_sd = 0.4, seed = NULL)
n |
Number of observations. |
grid_length |
Number of grid points per functional predictor. |
family |
Model family used to generate the response. |
representation |
Representation used when building the returned
|
transforms |
Optional transform list passed to |
basis_df |
Degrees of freedom used when |
n_components |
Number of FPCA components used when
|
scenario |
Benchmark scenario. |
confounding_strength |
Strength of cross-block confounding injected into
the nuisance curve. Higher values make plain |
active_region_scale |
Positive multiplier applied to the width of the
active regions. Values below |
local_correlation |
Non-negative smoothing parameter applied to the simulated curves. Larger values increase local correlation along the grid. |
include_scalar |
Should scalar covariates be included in the design and truth object? |
noise_sd |
Observation noise level. |
seed |
Optional random seed. |
An object of class fda_simulation_data.
sim <- simulate_fda_scenario(n = 24, grid_length = 16, seed = 1)
sim
head(sim$truth$active_features)
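To make the notion of a localized scenario with mapped ground truth concrete, here is a base-R sketch that does not use the package. The sine-based curve generator, the active region between 0.35 and 0.65, and the unit coefficients are all assumptions chosen for illustration; they are not the generator used by `simulate_fda_scenario()`.

```r
set.seed(1)
n <- 24
grid_length <- 16
grid <- seq(0, 1, length.out = grid_length)

# Assumed curve generator: random combinations of three sine terms,
# which yields smooth curves observed on the grid.
basis <- sapply(1:3, function(k) sin(k * pi * grid))  # grid_length x 3
signal <- matrix(rnorm(n * 3), n, 3) %*% t(basis)     # n x grid_length

# Localized effect: only grid points inside the assumed region are active.
active <- which(grid >= 0.35 & grid <= 0.65)
beta <- numeric(grid_length)
beta[active] <- 1

# Gaussian response with observation noise.
y <- drop(signal %*% beta) + rnorm(n, sd = 0.4)

# Ground truth expressed as column indices of the design matrix.
truth <- list(active_features = active)
```

Because the truth is stored as column indices of the design, a selected feature set can be scored against it directly.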
Simulated dense spectra with one signal block, one nuisance block, and two scalar covariates. The response is continuous and depends on localized regions of the signal spectrum plus the scalar covariates.
spectra_example
A list with four components:
Numeric vector of wavelength locations.
Numeric response vector.
Named list of functional predictor matrices.
Data frame with scalar covariates.
Simulated for package examples.
Repeatedly subsamples observations, refits a sparse base selector, and computes exact feature- and group-level selection frequencies. This is the generic FDA recipe for basis expansions, discretized curves, or FPCA scores.
stability_selection_fda(x, y = NULL, selector = "group_lasso", selector_fun = NULL, groups = NULL, family = c("gaussian", "binomial"), B = 100L, sample_fraction = 0.5, cutoff = 0.75, seed = NULL, keep_subsamples = FALSE, ...)
x |
Any input accepted by |
y |
Response vector. Leave as |
selector |
Either |
selector_fun |
Optional custom selector. It must accept |
groups |
Optional grouping structure. Defaults to block-level groups when
|
family |
Model family passed to the built-in selectors. |
B |
Number of subsampling replicates. |
sample_fraction |
Fraction of observations drawn without replacement in each subsample. |
cutoff |
Stability threshold used to define |
seed |
Optional random seed. |
keep_subsamples |
Should the sampled row indices be returned? |
... |
Additional arguments forwarded to the built-in or custom selector. |
An object of class fda_stability_selection.
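The subsample, refit, and count recipe can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: `top_k_selector()` is a hypothetical stand-in for a sparse selector, `stability_sketch()` is a hypothetical name, and a group counts as selected whenever any of its features is selected.

```r
# Hypothetical selector, not a package backend: keep the k features with the
# largest absolute marginal correlation with the response.
top_k_selector <- function(x, y, k = 2L) {
  score <- abs(stats::cor(x, y))[, 1]
  keep <- order(score, decreasing = TRUE)[seq_len(k)]
  seq_len(ncol(x)) %in% keep
}

stability_sketch <- function(x, y, groups, B = 100L, sample_fraction = 0.5,
                             cutoff = 0.75, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(x)
  m <- floor(sample_fraction * n)
  feature_hits <- numeric(ncol(x))
  group_hits <- numeric(length(unique(groups)))
  for (b in seq_len(B)) {
    idx <- sample.int(n, m)  # subject-level subsample without replacement
    sel <- top_k_selector(x[idx, , drop = FALSE], y[idx])
    feature_hits <- feature_hits + sel
    # A group is hit when at least one of its features is selected.
    group_hits <- group_hits + as.vector(tapply(sel, groups, any))
  }
  frequency <- feature_hits / B
  list(
    frequency = frequency,
    group_frequency = group_hits / B,
    selected = which(frequency >= cutoff)
  )
}

set.seed(42)
n <- 120
x <- matrix(rnorm(n * 6), n, 6)
y <- 3 * x[, 1] - 3 * x[, 2] + rnorm(n, sd = 0.5)
groups <- c(1, 1, 2, 2, 3, 3)

fit <- stability_sketch(x, y, groups, B = 50L, seed = 1)
fit$frequency
fit$group_frequency
fit$selected
```

Selection frequencies are plain proportions over the `B` subsamples, so `cutoff` is read directly as the minimum share of subsamples in which a feature must appear.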
Builds a data-driven c0 grid from an FDA-aware association matrix.
suggest_c0_grid(x, n = 5L, method = c("quantile", "linear"), association_method = c("correlation", "neighborhood", "hybrid", "interval"), within_blocks = TRUE, bandwidth = NULL, interval_groups = NULL, width = NULL, step = width, decay = 1)
x |
Any input accepted by |
n |
Number of grid values to return. |
method |
Grid construction rule: |
association_method |
Association structure passed to
|
within_blocks, bandwidth, interval_groups, width, step, decay
|
Passed to
|
A decreasing numeric vector of c0 values.
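One plausible way to derive a decreasing grid from an association matrix is sketched below in base R. The helper `c0_grid_sketch()` is hypothetical and uses plain absolute correlations; the rule applied by `suggest_c0_grid()` and its FDA-aware association options are not reproduced here.

```r
# Hypothetical helper, not the package implementation: build a decreasing
# c0 grid from the off-diagonal absolute correlations of the design.
c0_grid_sketch <- function(x, n = 5L, method = c("quantile", "linear")) {
  method <- match.arg(method)
  assoc <- abs(stats::cor(x))
  values <- assoc[upper.tri(assoc)]
  if (method == "quantile") {
    # Evenly spaced quantile levels, taken from 1 down to 0.
    stats::quantile(values, probs = seq(1, 0, length.out = n), names = FALSE)
  } else {
    # Evenly spaced values between the largest and smallest association.
    seq(max(values), min(values), length.out = n)
  }
}

set.seed(3)
# Random-walk curves: neighbouring grid points are correlated.
x <- t(replicate(40, cumsum(rnorm(12))))

grid_quantile <- c0_grid_sketch(x, n = 5L, method = "quantile")
grid_linear   <- c0_grid_sketch(x, n = 5L, method = "linear")
```

The quantile rule places more grid values where associations are dense, while the linear rule spaces them evenly between the observed extremes.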
Computes the per-scenario and per-level gain of a target method over one or
more reference methods. This is intended to make the benchmark story explicit
when comparing FDA-aware SelectBoost to existing baselines.
summarise_benchmark_advantage(x, target = "selectboost", reference = c("plain_selectboost", "stability"), level = c("feature", "group", "basis"), metric = "f1", optimize = c("max", "min"), select_c0 = c("best", "all"))
x |
An |
target |
Method whose gain should be assessed. |
reference |
One or more baseline methods. |
level |
Evaluation level. |
metric |
Metric used both for best- |
optimize |
Should larger or smaller values of |
select_c0 |
Keep all |
A data frame.
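The per-scenario gain of a target over its references amounts to a difference of mean metric values, which the following base-R sketch illustrates. The data frame `bench`, its column names, and its numbers are made-up toy values for illustration only; they are not package output or shipped benchmark results.

```r
# Made-up toy benchmark rows (two replications per scenario and method).
bench <- data.frame(
  scenario = rep(c("localized_dense", "confounded_blocks"), each = 6),
  method   = rep(rep(c("selectboost", "plain_selectboost", "stability"),
                     each = 2), times = 2),
  f1       = c(0.80, 0.84, 0.70, 0.74, 0.60, 0.68,
               0.66, 0.70, 0.50, 0.58, 0.62, 0.66)
)

# Mean metric per scenario and method.
means <- aggregate(f1 ~ scenario + method, data = bench, FUN = mean)

# Split the target from the references and join them by scenario.
target <- means[means$method == "selectboost", c("scenario", "f1")]
refs   <- means[means$method != "selectboost", ]
adv <- merge(refs, target, by = "scenario",
             suffixes = c("_reference", "_target"))

# Positive gain means the target beats the reference on this metric.
adv$gain <- adv$f1_target - adv$f1_reference
adv <- adv[order(adv$scenario, adv$method), ]
adv[, c("scenario", "method", "gain")]
```

For a metric where smaller is better, the sign of the difference would be reversed so that a positive gain still favours the target.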
Collapses raw benchmark rows into method-level performance summaries, with an
option to retain only the best c0 per method and replication.
summarise_benchmark_performance(x, level = c("feature", "group", "basis"), metric = "f1", optimize = c("max", "min"), select_c0 = c("best", "all"))
x |
An |
level |
Evaluation level. |
metric |
Metric used to pick the best |
optimize |
Should larger or smaller values of |
select_c0 |
Keep all |
A data frame.
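The difference between keeping the best `c0` per method and replication and keeping all `c0` values can be sketched in base R. The data frame `raw`, its column names (`rep_id` in particular), and its values are made-up toy inputs, not the structure or content of the package's benchmark objects.

```r
# Made-up toy rows: two methods, two replications, three c0 values each.
raw <- data.frame(
  method = rep(c("selectboost", "plain_selectboost"), each = 6),
  rep_id = rep(rep(1:2, each = 3), times = 2),
  c0     = rep(c(0.9, 0.7, 0.5), times = 4),
  f1     = c(0.60, 0.80, 0.70,  0.65, 0.75, 0.85,
             0.50, 0.55, 0.45,  0.60, 0.58, 0.52)
)

# "best" with larger-is-better: keep the top f1 per method and replication,
# then average over replications.
best <- aggregate(f1 ~ method + rep_id, data = raw, FUN = max)
perf_best <- aggregate(f1 ~ method, data = best, FUN = mean)

# "all": average over every c0 value without picking a winner.
perf_all <- aggregate(f1 ~ method, data = raw, FUN = mean)

perf_best
perf_all
```

Selecting the best `c0` per replication gives an optimistic summary, so comparing it with the all-`c0` average shows how much a method depends on tuning.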