Title: | Bootstrap Hyperparameter Selection for PLS Models and Extensions |
---|---|
Description: | Several implementations of non-parametric stable bootstrap-based techniques to determine the number of components for Partial Least Squares linear or generalized linear regression models, as well as sparse Partial Least Squares linear or generalized linear regression models. The package collects techniques that were published in a book chapter (Magnanensi et al. 2016, 'The Multiple Facets of Partial Least Squares and Related Methods', <doi:10.1007/978-3-319-40643-5_18>) and two articles (Magnanensi et al. 2017, 'Statistics and Computing', <doi:10.1007/s11222-016-9651-4>) and (Magnanensi et al. 2021, 'Frontiers in Applied Mathematics and Statistics', <doi:10.3389/fams.2021.693126>). |
Authors: | Frederic Bertrand [cre, aut] , Jeremy Magnanensi [aut], Myriam Maumy-Bertrand [aut] |
Maintainer: | Frederic Bertrand <[email protected]> |
License: | GPL-3 |
Version: | 1.0.1 |
Built: | 2025-01-22 05:28:30 UTC |
Source: | https://github.com/fbertran/bootpls |
Bootstrap (Y,X) for the coefficients with number of components updated for each resampling.
coefs.plsR.adapt.ncomp( dataset, i, R = 1000, ncpus = 1, parallel = "no", verbose = FALSE )
dataset |
Dataset to use. |
i |
Vector of resampling. |
R |
Number of resamplings to find the number of components. |
ncpus |
integer: number of processes to be used in parallel operation: typically one would choose this to be the number of available CPUs. |
parallel |
The type of parallel operation to be used (if any). If missing, the default is taken from the option "boot.parallel" (and if that is not set, "no"). |
verbose |
Suppress information messages. |
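As a rough illustration of how the parallel and ncpus settings described above might be chosen before a call, the following base R sketch reads the "boot.parallel" option and counts the available cores. The variable names are illustrative only, and the package may resolve these settings differently.

```r
# Use the "boot.parallel" option if it is set, otherwise fall back to "no",
# as the description of the parallel argument suggests.
parallel_type <- getOption("boot.parallel", default = "no")

# detectCores() can return NA on some platforms, so guard against it.
n_cores <- parallel::detectCores()
ncpus <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)

cat("parallel =", parallel_type, "| ncpus =", ncpus, "\n")
```

Leaving one core free is only a common habit, not a requirement of the function.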
Numeric vector: the first value is the number of components, the remaining values are the coefficients of the variables computed for that number of components.
Jérémy Magnanensi, Frédéric Bertrand
[email protected]
https://fbertran.github.io/homepage/
A new bootstrap-based stopping criterion in PLS component construction, J. Magnanensi, M. Maumy-Bertrand, N. Meyer and F. Bertrand (2016), in The Multiple Facets of Partial Least Squares and Related Methods, doi:10.1007/978-3-319-40643-5_18.
A new universal resample-stable bootstrap-based stopping criterion for PLS component construction, J. Magnanensi, F. Bertrand, M. Maumy-Bertrand and N. Meyer (2017), Statistics and Computing, 27, 757–774, doi:10.1007/s11222-016-9651-4.
New developments in Sparse PLS regression, J. Magnanensi, M. Maumy-Bertrand, N. Meyer and F. Bertrand (2021), Frontiers in Applied Mathematics and Statistics, doi:10.3389/fams.2021.693126.
set.seed(314)
ncol <- 5
xran <- matrix(rnorm(30*ncol), 30, ncol)
coefs.plsR.adapt.ncomp(xran, sample(1:30))
coefs.plsR.adapt.ncomp(xran, sample(1:30), ncpus=2, parallel="multicore")
Bootstrap (Y,T) functions for PLSR
coefs.plsR.CSim(dataset, i)
dataset |
Dataset with tt |
i |
Index for resampling |
Coefficient of the last variable in the linear regression lm(dataset[i,1] ~ dataset[,-1] - 1) computed using bootstrap resampling.
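The statistic described above can be sketched in base R. The helper name last_coef_yt is hypothetical and simply mirrors the documented formula; it is not taken from the package source.

```r
# Hypothetical helper mirroring the documented formula; not the package source.
# Column 1 holds the response, the remaining columns hold the components tt.
last_coef_yt <- function(dataset, i) {
  fit <- lm(dataset[i, 1] ~ dataset[, -1] - 1)   # no intercept, as documented
  unname(coef(fit)[ncol(dataset) - 1])           # coefficient of the last column
}

set.seed(314)
xran <- matrix(rnorm(150), 30, 5)

# With the identity index the statistic is the plain least-squares coefficient.
original <- last_coef_yt(xran, 1:30)

# A bootstrap draw uses row indices sampled with replacement.
boot_draw <- last_coef_yt(xran, sample(1:30, replace = TRUE))
```

Taking the data first and the index vector second is the calling convention that boot expects from a statistic when stype = "i".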
set.seed(314)
xran <- matrix(rnorm(150), 30, 5)
coefs.plsR.CSim(xran, sample(1:30))
A function passed to boot to perform bootstrap.
coefs.plsRglm.CSim( dataRepYtt, ind, nt, modele, family = NULL, maxcoefvalues, ifbootfail )
dataRepYtt |
Dataset with tt components to resample |
ind |
indices for resampling |
nt |
number of components to use |
modele |
type of model to use, see plsRglm. Not used, please specify the family instead. |
family |
glm family to use, see plsRglm |
maxcoefvalues |
maximum values allowed for the estimates of the coefficients to discard those coming from singular bootstrap samples |
ifbootfail |
value to return if the estimation fails on a bootstrap sample |
Estimates on a bootstrap sample, or the ifbootfail value if the bootstrap computation fails.
Numeric vector of the components computed using a bootstrap resampling.
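The roles of maxcoefvalues and ifbootfail can be illustrated with a small base R sketch. The helper safe_glm_coefs and the toy data are hypothetical, and the rejection rule (any coefficient larger in absolute value than maxcoefvalues) is an assumption about how such a guard could work, not the package implementation.

```r
# Hypothetical statistic with a failure fallback; a sketch under assumptions,
# not the package implementation.
safe_glm_coefs <- function(data, ind, family, maxcoefvalues, ifbootfail) {
  resampled <- data[ind, , drop = FALSE]
  est <- tryCatch(
    suppressWarnings(coef(glm(y ~ . - 1, data = resampled, family = family))),
    error = function(e) NULL
  )
  # Discard fits that failed, are not available, or exceed the allowed magnitude.
  if (is.null(est) || anyNA(est) || any(abs(est) > maxcoefvalues)) {
    return(ifbootfail)
  }
  unname(est)
}

set.seed(314)
toy <- data.frame(y = rbinom(40, 1, 0.5), t1 = rnorm(40), t2 = rnorm(40))

ok <- safe_glm_coefs(toy, sample(nrow(toy), replace = TRUE), binomial,
                     maxcoefvalues = 10, ifbootfail = rep(0, 2))

# A bound of zero rejects every fit, so the fallback value is returned.
rejected <- safe_glm_coefs(toy, 1:40, binomial,
                           maxcoefvalues = 0, ifbootfail = rep(0, 2))
```

Returning a fixed-length fallback keeps every replicate the same length, which boot needs in order to stack the replicates into a matrix.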
set.seed(314)
library(plsRglm)
data(aze_compl, package="plsRglm")
Xaze_compl <- aze_compl[,2:34]
yaze_compl <- aze_compl$y
dataset <- cbind(y=yaze_compl, Xaze_compl)
modplsglm <- plsRglm::plsRglm(y~., data=dataset, 4, modele="pls-glm-family", family=binomial)
dataRepYtt <- cbind(y = modplsglm$RepY, modplsglm$tt)
coefs.plsRglm.CSim(dataRepYtt, sample(1:nrow(dataRepYtt)), 4, family = binomial, maxcoefvalues=10, ifbootfail=0)
A function passed to boot to perform bootstrap.
coefs.sgpls.CSim( dataRepYtt, ind, nt, modele, family = binomial, maxcoefvalues, ifbootfail )
dataRepYtt |
Dataset with tt components to resample |
ind |
indices for resampling |
nt |
number of components to use |
modele |
type of model to use, see plsRglm. Not used, please specify the family instead. |
family |
glm family to use, see plsRglm |
maxcoefvalues |
maximum values allowed for the estimates of the coefficients to discard those coming from singular bootstrap samples |
ifbootfail |
value to return if the estimation fails on a bootstrap sample |
Numeric vector of the components computed using a bootstrap resampling, or the ifbootfail value if the bootstrap computation fails.
set.seed(4619)
xran <- cbind(rbinom(30,1,.2), matrix(rnorm(150),30,5))
coefs.sgpls.CSim(xran, ind=sample(1:nrow(xran)), maxcoefvalues=1e5, ifbootfail=rep(NA,3))
This dataset provides a simulated dataset for gamma family based PLSR that was created with the simul_data_UniYX_gamma function.
A data frame with 200 observations on the following 8 variables.
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
data(datasim)
X_datasim_train <- datasim[1:140,2:8]
y_datasim_train <- datasim[1:140,1]
X_datasim_test <- datasim[141:200,2:8]
y_datasim_test <- datasim[141:200,1]
rm(X_datasim_train, y_datasim_train, X_datasim_test, y_datasim_test)
Provides a wrapper for the bootstrap function boot from the boot R package. Implements non-parametric bootstraps for PLS Regression models by (Y,T) resampling to select the number of components.
nbcomp.bootplsR( Y, X, R = 500, sim = "ordinary", ncpus = 1, parallel = "no", typeBCa = TRUE, verbose = TRUE )
Y |
Vector of response. |
X |
Matrix of predictors. |
R |
The number of bootstrap replicates. Usually this will be a single
positive integer. For importance resampling, some resamples may use one set
of weights and others use a different set of weights. In this case |
sim |
A character string indicating the type of simulation required.
Possible values are |
ncpus |
integer: number of processes to be used in parallel operation: typically one would choose this to be the number of available CPUs. |
parallel |
The type of parallel operation to be used (if any). If missing, the default is taken from the option "boot.parallel" (and if that is not set, "no"). |
typeBCa |
Compute BCa type intervals? |
verbose |
Display info during the run of algorithm? |
More details on bootstrap techniques are available in the help of the boot function.
A numeric, the number of components selected by the bootstrap.
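To make the idea of wrapping boot concrete, here is a minimal sketch that bootstraps the coefficient of the last component and inspects a percentile interval. It assumes whole rows of (Y,T) are resampled and that a component is judged by whether its interval excludes zero; the helper last_coef and the toy data are hypothetical, and the package's actual selection rule may differ.

```r
library(boot)

# Statistic in the form boot() expects: data first, resampling indices second.
# Hypothetical helper; column 1 is the response, the others are components.
last_coef <- function(dataset, i) {
  d <- dataset[i, , drop = FALSE]
  unname(coef(lm(d[, 1] ~ d[, -1] - 1))[ncol(d) - 1])
}

set.seed(314)
tt <- matrix(rnorm(60), 30, 2)
y <- 2 * tt[, 1] + rnorm(30, sd = 0.5)   # toy response driven by the first component
dat <- cbind(y, tt)

b <- boot(dat, last_coef, R = 500, sim = "ordinary")
ci <- boot.ci(b, conf = 0.95, type = "perc")

# Columns 4 and 5 of the percentile entry hold the interval endpoints.
limits <- ci$percent[4:5]
excludes_zero <- limits[1] > 0 || limits[2] < 0
```

boot.ci also accepts type = "bca", which is the kind of interval the typeBCa argument refers to; the percentile type is used here only because it is cheaper to compute.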
data(pine, package="plsRglm")
Xpine <- pine[,1:10]
ypine <- log(pine[,11])
res <- nbcomp.bootplsR(ypine, Xpine)
nbcomp.bootplsR(ypine, Xpine, typeBCa=FALSE)
nbcomp.bootplsR(ypine, Xpine, typeBCa=FALSE, verbose=FALSE)
try(nbcomp.bootplsR(ypine, Xpine, sim="permutation"))
nbcomp.bootplsR(ypine, Xpine, sim="permutation", typeBCa=FALSE)
Provides a wrapper for the bootstrap function boot from the boot R package. Implements non-parametric bootstraps for PLS Generalized Linear Regression models by (Y,T) resampling to select the number of components.
nbcomp.bootplsRglm( object, typeboot = "boot_comp", R = 250, statistic = coefs.plsRglm.CSim, sim = "ordinary", stype = "i", stabvalue = 1e+06, ... )
object |
An object of class |
typeboot |
The type of bootstrap. ( |
R |
The number of bootstrap replicates. Usually this will be a single
positive integer. For importance resampling, some resamples may use one set
of weights and others use a different set of weights. In this case |
statistic |
A function which when applied to data returns a vector
containing the statistic(s) of interest. |
sim |
A character string indicating the type of simulation required.
Possible values are |
stype |
A character string indicating what the second argument of
|
stabvalue |
A value to hard threshold bootstrap estimates computed from atypical resamplings. Especially useful for Generalized Linear Models. |
... |
Other named arguments for |
More details on bootstrap techniques are available in the help of the boot function.
An object of class "boot". See the Value part of the help of the function boot.
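The stabvalue argument is described as a hard threshold for estimates coming from atypical resamplings. One way to picture this, assuming the threshold simply clamps values to the range from -stabvalue to +stabvalue (the package may handle such values differently), is the following toy example with made-up replicates.

```r
# Toy bootstrap replicates with two runaway estimates (illustrative numbers only).
boot_estimates <- c(0.8, -1.2, 3.5e9, 0.4, -7.1e7)
stabvalue <- 1e6

# Hard threshold: anything beyond +/- stabvalue is pulled back to the bound.
stabilised <- pmax(pmin(boot_estimates, stabvalue), -stabvalue)
print(stabilised)
```

Bounding the replicates this way keeps a handful of diverging GLM fits from dominating summaries such as boxplots or interval endpoints.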
set.seed(314)
library(plsRglm)
data(aze_compl, package="plsRglm")
Xaze_compl <- aze_compl[,2:34]
yaze_compl <- aze_compl$y
dataset <- cbind(y=yaze_compl, Xaze_compl)
modplsglm <- plsRglm::plsRglm(y~., data=dataset, 10, modele="pls-glm-family", family = binomial)

comp_aze_compl.bootYT <- nbcomp.bootplsRglm(modplsglm, R=250)
boxplots.bootpls(comp_aze_compl.bootYT)
confints.bootpls(comp_aze_compl.bootYT)
plots.confints.bootpls(confints.bootpls(comp_aze_compl.bootYT), typeIC = "BCa")

comp_aze_compl.permYT <- nbcomp.bootplsRglm(modplsglm, R=250, sim="permutation")
boxplots.bootpls(comp_aze_compl.permYT)
confints.bootpls(comp_aze_compl.permYT, typeBCa=FALSE)
plots.confints.bootpls(confints.bootpls(comp_aze_compl.permYT, typeBCa=FALSE))
Number of components for SGPLS using (Y,T) bootstrap
nbcomp.bootsgpls( x, y, fold = 10, eta, R, scale.x = TRUE, maxnt = 10, plot.it = TRUE, br = TRUE, ftype = "iden", typeBCa = TRUE, stabvalue = 1e+06, verbose = TRUE )
x |
Matrix of predictors. |
y |
Vector or matrix of responses. |
fold |
Number of folds for cross-validation. |
eta |
Thresholding parameter. eta should be between 0 and 1. |
R |
Number of resamplings. |
scale.x |
Scale predictors by dividing each predictor variable by its sample standard deviation? |
maxnt |
Maximum number of components allowed in a spls model. |
plot.it |
Plot the results. |
br |
Apply Firth's bias reduction procedure? |
ftype |
Type of Firth's bias reduction procedure. Alternatives are "iden" (the approximated version) or "hat" (the original version). Default is "iden". |
typeBCa |
Include computation for BCa type interval. |
stabvalue |
A value to hard threshold bootstrap estimates computed from atypical resamplings. |
verbose |
Additional information on the algorithm. |
List of four: error matrix, eta optimal, K optimal and the matrix of results.
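The effect of the thresholding parameter eta can be pictured with a small sketch. It assumes the usual sparse PLS idea of soft-thresholding a direction vector relative to its largest absolute entry; the helper soft_threshold and the vector w are hypothetical and only illustrate why a larger eta gives a sparser model.

```r
# Soft-threshold a direction vector relative to its largest absolute entry.
# Hypothetical helper for illustration; eta is the thresholding parameter.
soft_threshold <- function(w, eta) {
  cutoff <- eta * max(abs(w))
  sign(w) * pmax(abs(w) - cutoff, 0)
}

w <- c(0.9, -0.5, 0.1, -0.05)
soft_threshold(w, eta = 0.2)   # the two smallest entries are set to zero
soft_threshold(w, eta = 0.6)   # a larger eta keeps only the largest entry
```

With eta close to 0 almost every predictor is kept, while eta close to 1 leaves only the predictors with the largest weights.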
set.seed(4619)
data(prostate, package="spls")
nbcomp.bootsgpls((prostate$x)[,1:30], prostate$y, R=250, eta=0.2, maxnt=1, typeBCa = FALSE)

set.seed(4619)
data(prostate, package="spls")
nbcomp.bootsgpls(prostate$x, prostate$y, R=250, eta=c(0.2,0.6), typeBCa = FALSE)
Number of components for SGPLS using (Y,T) bootstrap (parallel version)
nbcomp.bootsgpls.para( x, y, fold = 10, eta, R, scale.x = TRUE, maxnt = 10, br = TRUE, ftype = "iden", ncpus = 1, plot.it = TRUE, typeBCa = TRUE, stabvalue = 1e+06, verbose = TRUE )
x |
Matrix of predictors. |
y |
Vector or matrix of responses. |
fold |
Number of folds for cross-validation. |
eta |
Thresholding parameter. eta should be between 0 and 1. |
R |
Number of resamplings. |
scale.x |
Scale predictors by dividing each predictor variable by its sample standard deviation? |
maxnt |
Maximum number of components allowed in a spls model. |
br |
Apply Firth's bias reduction procedure? |
ftype |
Type of Firth's bias reduction procedure. Alternatives are "iden" (the approximated version) or "hat" (the original version). Default is "iden". |
ncpus |
Number of cpus for parallel computing. |
plot.it |
Plot the results. |
typeBCa |
Include computation for BCa type interval. |
stabvalue |
A value to hard threshold bootstrap estimates computed from atypical resamplings. |
verbose |
Additional information on the algorithm. |
List of four: error matrix, eta optimal, K optimal and the matrix of results.
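The scaling described for scale.x, dividing each predictor by its sample standard deviation, can be reproduced in base R as follows. The toy matrix is made up, and whether the function also centres the predictors is not stated here, so no centring is applied in this sketch.

```r
set.seed(4619)
x <- matrix(rnorm(40, sd = 5), 10, 4)

# Divide every column by its sample standard deviation (no centring here).
x_scaled <- sweep(x, 2, apply(x, 2, sd), "/")

# Each column of the scaled matrix now has unit standard deviation.
apply(x_scaled, 2, sd)
```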
set.seed(4619)
data(prostate, package="spls")
nbcomp.bootsgpls.para((prostate$x)[,1:30], prostate$y, R=250, eta=0.2, maxnt=1, typeBCa = FALSE)

set.seed(4619)
data(prostate, package="spls")
nbcomp.bootsgpls.para(prostate$x, prostate$y, R=250, eta=c(0.2,0.6), typeBCa = FALSE)
Number of components for SPLS using (Y,T) bootstrap
nbcomp.bootspls( x, y, fold = 10, eta, R = 500, maxnt = 10, kappa = 0.5, select = "pls2", fit = "simpls", scale.x = TRUE, scale.y = FALSE, plot.it = TRUE, typeBCa = TRUE, verbose = TRUE )
x |
Matrix of predictors. |
y |
Vector or matrix of responses. |
fold |
Number of folds for cross-validation. |
eta |
Thresholding parameter. eta should be between 0 and 1. |
R |
Number of resamplings. |
maxnt |
Maximum number of components allowed in a spls model. |
kappa |
Parameter to control the effect of the concavity of the objective function and the closeness of original and surrogate direction vectors. kappa is relevant only when responses are multivariate. kappa should be between 0 and 0.5. Default is 0.5. |
select |
PLS algorithm for variable selection. Alternatives are "pls2" or "simpls". Default is "pls2". |
fit |
PLS algorithm for model fitting. Alternatives are "kernelpls", "widekernelpls", "simpls", or "oscorespls". Default is "simpls". |
scale.x |
Scale predictors by dividing each predictor variable by its sample standard deviation? |
scale.y |
Scale responses by dividing each response variable by its sample standard deviation? |
plot.it |
Plot the results. |
typeBCa |
Include computation for BCa type interval. |
verbose |
Displays information on the algorithm. |
List of three: mspemat (matrix of results), eta.opt (numeric value) and K.opt (numeric value).
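As a hedged illustration of how an optimal pair could be read off a matrix of results, the sketch below assumes a matrix with one row per eta value and one column per number of components K, filled with made-up error values. The layout and the "smallest error wins" rule are assumptions for illustration, not a statement about how the function fills mspemat.

```r
# Made-up error values purely for illustration: rows are eta, columns are K.
mspemat <- matrix(c(1.20, 0.95, 1.10,
                    1.05, 0.80, 0.90),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("eta=0.2", "eta=0.6"), paste0("K=", 1:3)))
eta_grid <- c(0.2, 0.6)

# Locate the cell with the smallest error and translate it back to (eta, K).
best <- which(mspemat == min(mspemat), arr.ind = TRUE)[1, ]
eta_opt <- eta_grid[best["row"]]
K_opt <- unname(best["col"])
```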
set.seed(314)
data(pine, package = "plsRglm")
Xpine <- pine[,1:10]
ypine <- log(pine[,11])
nbcomp.bootspls(x=Xpine, y=ypine, eta=.2, maxnt=1)

set.seed(314)
data(pine, package = "plsRglm")
Xpine <- pine[,1:10]
ypine <- log(pine[,11])
nbcomp.bootspls.para(x=Xpine, y=ypine, eta=c(.2,.6))
Number of components for SPLS using (Y,T) bootstrap (parallel version)
nbcomp.bootspls.para( x, y, fold = 10, eta, R = 500, maxnt = 10, kappa = 0.5, select = "pls2", fit = "simpls", scale.x = TRUE, scale.y = FALSE, plot.it = TRUE, typeBCa = TRUE, ncpus = 1, verbose = TRUE )
x |
Matrix of predictors. |
y |
Vector or matrix of responses. |
fold |
Number of folds for cross-validation. |
eta |
Thresholding parameter. eta should be between 0 and 1. |
R |
Number of resamplings. |
maxnt |
Maximum number of components allowed in a spls model. |
kappa |
Parameter to control the effect of the concavity of the objective function and the closeness of original and surrogate direction vectors. kappa is relevant only when responses are multivariate. kappa should be between 0 and 0.5. Default is 0.5. |
select |
PLS algorithm for variable selection. Alternatives are "pls2" or "simpls". Default is "pls2". |
fit |
PLS algorithm for model fitting. Alternatives are "kernelpls", "widekernelpls", "simpls", or "oscorespls". Default is "simpls". |
scale.x |
Scale predictors by dividing each predictor variable by its sample standard deviation? |
scale.y |
Scale responses by dividing each response variable by its sample standard deviation? |
plot.it |
Plot the results. |
typeBCa |
Include computation for BCa type interval. |
ncpus |
Number of cpus for parallel computing. |
verbose |
Displays information on the algorithm. |
List of three: mspemat (matrix of results), eta.opt (numeric value) and K.opt (numeric value).
set.seed(314)
data(pine, package = "plsRglm")
Xpine <- pine[,1:10]
ypine <- log(pine[,11])
nbcomp.bootspls.para(x=Xpine, y=ypine, eta=.2, maxnt=1)

set.seed(314)
data(pine, package = "plsRglm")
Xpine <- pine[,1:10]
ypine <- log(pine[,11])
nbcomp.bootspls.para(x=Xpine, y=ypine, eta=c(.2,.6))
Permutation bootstrap (Y,T) function for PLSR
permcoefs.plsR.CSim(dataset, i)
dataset |
Dataset with tt |
i |
Index for resampling |
Coefficient of the last variable in the linear regression lm(dataset[i,1] ~ dataset[,-1] - 1) computed using permutation resampling.
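The difference between ordinary bootstrap indices and permutation indices can be seen directly in base R. This is a generic illustration of the two resampling schemes, not code from the package.

```r
set.seed(314)
n <- 30

boot_idx <- sample(n, replace = TRUE)   # ordinary bootstrap: repeats allowed
perm_idx <- sample(n)                   # permutation: each row exactly once

anyDuplicated(boot_idx) > 0             # duplicates are expected here
identical(sort(perm_idx), seq_len(n))   # TRUE: a pure reordering
```

Because a permutation only reorders the response relative to the components, it breaks any association between them, which is what makes the resulting coefficients behave like draws under the null hypothesis of no effect.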
set.seed(314)
xran <- matrix(rnorm(150), 30, 5)
permcoefs.plsR.CSim(xran, sample(1:30))
A function passed to boot to perform bootstrap.
permcoefs.plsRglm.CSim( dataRepYtt, ind, nt, modele, family = NULL, maxcoefvalues, ifbootfail )
dataRepYtt |
Dataset with tt components to resample |
ind |
indices for resampling |
nt |
number of components to use |
modele |
type of model to use, see plsRglm. Not used, please specify the family instead. |
family |
glm family to use, see plsRglm |
maxcoefvalues |
maximum values allowed for the estimates of the coefficients to discard those coming from singular bootstrap samples |
ifbootfail |
value to return if the estimation fails on a bootstrap sample |
Estimates on a bootstrap sample, or the ifbootfail value if the bootstrap computation fails.
Numeric vector of the components computed using a permutation resampling.
set.seed(314)
library(plsRglm)
data(aze_compl, package="plsRglm")
Xaze_compl <- aze_compl[,2:34]
yaze_compl <- aze_compl$y
dataset <- cbind(y=yaze_compl, Xaze_compl)
modplsglm <- plsRglm::plsRglm(y~., data=dataset, 4, modele="pls-glm-logistic")
dataRepYtt <- cbind(y = modplsglm$RepY, modplsglm$tt)
permcoefs.plsRglm.CSim(dataRepYtt, sample(1:nrow(dataRepYtt)), 4, family = binomial, maxcoefvalues=10, ifbootfail=0)
Permutation Bootstrap (Y,T) function for plsRglm
permcoefs.sgpls.CSim( dataRepYtt, ind, nt, modele, family = binomial, maxcoefvalues, ifbootfail )
dataRepYtt |
Dataset with tt components to resample |
ind |
indices for resampling |
nt |
number of components to use |
modele |
type of model to use, see plsRglm. Not used, please specify the family instead. |
family |
glm family to use, see plsRglm |
maxcoefvalues |
maximum values allowed for the estimates of the coefficients to discard those coming from singular bootstrap samples |
ifbootfail |
value to return if the estimation fails on a bootstrap sample |
Numeric vector of the components computed using a bootstrap resampling, or the ifbootfail value if the bootstrap computation fails.
set.seed(4619)
xran=cbind(rbinom(30,1,.2),matrix(rnorm(150),30,5))
permcoefs.sgpls.CSim(xran, ind=sample(1:nrow(xran)), maxcoefvalues=1e5, ifbootfail=rep(NA,3))
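The argument descriptions above suggest the following shape for one permutation replicate: the response column is reordered by ind, the component scores stay in place, a GLM is refitted, and the estimates are replaced by ifbootfail when the fit fails or an estimate exceeds maxcoefvalues. The sketch below illustrates that idea in base R only. The name perm_coefs_sketch is hypothetical, the layout (response in column 1, scores in the remaining columns) is an assumption, and the internals of permcoefs.sgpls.CSim may differ.

```r
# Hypothetical helper (not part of bootPLS): one permutation replicate.
# Assumes column 1 of dataRepYtt is the response and the rest are scores.
perm_coefs_sketch <- function(dataRepYtt, ind, family = binomial,
                              maxcoefvalues, ifbootfail) {
  y_perm <- dataRepYtt[ind, 1]                     # permuted response
  tt <- as.matrix(dataRepYtt[, -1, drop = FALSE])  # scores kept in place
  fit <- try(suppressWarnings(glm(y_perm ~ tt, family = family)),
             silent = TRUE)
  if (inherits(fit, "try-error")) return(ifbootfail)
  est <- unname(coef(fit)[-1])                     # drop the intercept
  ok <- is.numeric(est) && !anyNA(est) && all(abs(est) < maxcoefvalues)
  if (ok) est else ifbootfail
}

set.seed(4619)
xran <- cbind(rbinom(30, 1, 0.2), matrix(rnorm(150), 30, 5))
res <- perm_coefs_sketch(xran, ind = sample(seq_len(nrow(xran))),
                         maxcoefvalues = 1e5, ifbootfail = rep(NA, 5))
length(res)  # one entry per score column
```

Giving ifbootfail the same length as the coefficient vector keeps every replicate the same size, which makes the replicates easy to stack when they are collected by a resampling routine.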
This function is based on the visweb function from the bipartite package.
signpred2( matbin, pred.lablength = max(sapply(rownames(matbin), nchar)), labsize = 1, plotsize = 12 )
matbin |
Matrix with 0 or 1 entries, with one row per predictor and one column per model. 0 means the predictor is not significant in the model; 1 means it is significant. |
pred.lablength |
Maximum length of the predictor labels. Defaults to full label length. |
labsize |
Size of the predictor labels. |
plotsize |
Global size of the graph. |
A plot window.
Bernd Gruber with minor modifications from Frédéric Bertrand
[email protected]
https://fbertran.github.io/homepage/
Vazquez, P.D., Chacoff, N.P. and Cagnolo, L. (2009) Evaluating multiple determinants of the structure of plant-animal mutualistic networks. Ecology, 90:2039-2046.
See Also visweb
set.seed(314)
simbin <- matrix(rbinom(200,3,.2),nrow=20,ncol=10)
signpred2(simbin)
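As a small illustration of the input layout described for matbin, the sketch below builds a labelled 0/1 matrix by hand and evaluates the expression used as the default for pred.lablength. The predictor and model names are invented for the example, and the call to signpred2 is left as a comment because it needs the package to be installed.

```r
# Illustrative input only: 4 predictors (rows) by 3 models (columns).
matbin <- matrix(c(1, 0, 1,
                   0, 0, 1,
                   1, 1, 1,
                   0, 1, 0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("age", "weight", "dose", "sex"),
                                 c("model1", "model2", "model3")))

# Same expression as the default of pred.lablength in the usage line.
lablength <- max(sapply(rownames(matbin), nchar))

# Number of models in which each predictor is flagged as significant.
n_signif <- rowSums(matbin)

# With bootPLS installed, the plot would be drawn with:
# bootPLS::signpred2(matbin, pred.lablength = lablength)
```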
This function generates a single univariate gamma response value and a vector of explanatory variables drawn from a model with a given number of latent components.
simul_data_UniYX_gamma(totdim, ncomp, jvar, lvar, link = "inverse", offset = 0)
totdim |
Number of columns of the X vector (from |
ncomp |
Number of latent components in the model (to use noise, select ncomp=3) |
jvar |
First variance parameter |
lvar |
Second variance parameter |
link |
Character specification of the link function in the mean model
(mu). Currently, " |
offset |
Offset on the linear scale |
This function should be combined with the replicate function to build a larger dataset. The algorithm is a modification of a port of the one described in the article by Li, which is a multivariate generalization of the algorithm of Naes and Martens.
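The replicate pattern mentioned above can be sketched as follows. Since each call produces one observation, replicate() collects the draws as columns and t() turns them into rows. The generator draw_one_row is a hypothetical stand-in written for this illustration; it is not the algorithm used by simul_data_UniYX_gamma, and its shape and linear predictor are arbitrary choices.

```r
# Hypothetical stand-in for a single-draw generator (not the package's
# algorithm): returns one gamma response followed by totdim predictors.
draw_one_row <- function(totdim) {
  x <- rnorm(totdim)
  eta <- 1 + 0.1 * abs(sum(x))              # positive linear predictor
  mu <- 1 / eta                             # inverse link: mu = 1 / eta
  y <- rgamma(1, shape = 2, rate = 2 / mu)  # gamma draw with mean mu
  c(Y = y, x)
}

set.seed(314)
# replicate() stacks the draws as columns, so t() turns them into rows.
datasim <- t(replicate(200, draw_one_row(totdim = 7)))
dim(datasim)  # 200 rows, 1 response column + 7 predictor columns
```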
vector |
|
Jeremy Magnanensi, Frédéric Bertrand
[email protected]
https://fbertran.github.io/homepage/
T. Naes, H. Martens, Comparison of prediction methods for multicollinear data, Commun. Stat., Simul. 14 (1985) 545-576.
Baibing Li, Julian Morris, Elaine B. Martin, Model selection for partial least squares regression, Chemometrics and Intelligent Laboratory Systems 64 (2002), 79-89, doi:10.1016/S0169-7439(02)00051-5.
A new bootstrap-based stopping criterion in PLS component construction,
J. Magnanensi, M. Maumy-Bertrand, N. Meyer and F. Bertrand (2016), in The Multiple Facets of Partial Least Squares and Related Methods,
doi:10.1007/978-3-319-40643-5_18
A new universal resample-stable bootstrap-based stopping criterion for PLS component construction,
J. Magnanensi, F. Bertrand, M. Maumy-Bertrand and N. Meyer, (2017), Statistics and Computing, 27, 757–774.
doi:10.1007/s11222-016-9651-4
New developments in Sparse PLS regression, J. Magnanensi, M. Maumy-Bertrand, N. Meyer and F. Bertrand (2021), Frontiers in Applied Mathematics and Statistics,
doi:10.3389/fams.2021.693126
set.seed(314)
ncomp=rep(3,100)
totdimpos=7:50
totdim=sample(totdimpos,100,replace=TRUE)
l=3.01
#for (l in seq(3.01,15.51,by=0.5)) {
j=3.01
#for (j in seq(3.01,9.51,by=0.5)) {
i=44
#for ( i in 1:100){
set.seed(i)
totdimi<-totdim[i]
ncompi<-ncomp[i]
datasim <- t(replicate(200,simul_data_UniYX_gamma(totdimi,ncompi,j,l)))
#}
#}
#}
pairs(datasim)
rm(i,j,l,totdimi,ncompi,datasim)