R-universe - fbertran (Frederic Bertrand)

extrafont - Tools for Using Fonts

Tools for using fonts other than the standard PostScript fonts. This package makes it easy to use system TrueType fonts with PDF or PostScript output files, and with bitmap output files in Windows. extrafont can also be used with fonts packaged specifically for it, such as the fontcm package, which has Computer Modern PostScript fonts with math symbols.

10.73 score 9 stars 11 dependents 19k scripts 19k downloads

plsRglm - Partial Least Squares Regression for Generalized Linear Models

Provides (weighted) partial least squares regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <doi:10.48550/arXiv.1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.

openblas, cpp

9.06 score 19 stars 7 dependents 145 scripts 2.0k downloads

ggWebGL - Browser-Native 'WebGL' Rendering for R Graphics

Provides browser-native 'WebGL' rendering for R graphics through 'htmlwidgets'. The package supports grammar-style graphics workflows and renderer-ready specifications for dense analytical and scientific scenes, including point, line, trajectory, raster, vector, mesh, and surface layers, shader-driven display modes, timeline controls, structured views, selection metadata, and publication-oriented static export helpers. Rendering stays in the browser, and the core package remains cross-platform without requiring 'CUDA', 'Metal', or 'OpenCL' toolchains.

7.74 score 6 stars 2 dependents 32 scripts 475 downloads

Cascade - Selection, Reverse-Engineering and Prediction in Cascade Networks

A modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.

7.56 score 1 stars 4 dependents 61 scripts 587 downloads

bigalgebra - 'BLAS' and 'LAPACK' Routines for Native R Matrices and 'big.matrix' Objects

Provides arithmetic functions for R matrix and 'big.matrix' objects as well as functions for QR factorization, Cholesky factorization, eigenvalue computation for general matrices, and singular value decomposition (SVD). A matrix multiplication method and an arithmetic method (for matrix addition and matrix difference) allow mixed-type operations, between a matrix object and a big.matrix object, and pure-type operations on two big.matrix objects.

openblas

7.51 score 4 stars 3 dependents 75 scripts 539 downloads

survAUC - Estimators of Prediction Accuracy for Time-to-Event Data

Provides a variety of functions to estimate time-dependent true/false positive rates and AUC curves from a set of censored survival data.

openblas

7.49 score 2 stars 13 dependents 245 scripts 3.2k downloads

tester - Tests and Checks Characteristics of R Objects

Allows users to test characteristics of common R objects.

7.33 score 2 stars 6 dependents 120 scripts 3.3k downloads

extrafontdb - Holding the Database for the 'extrafont' Package

It is meant to be used with the 'extrafont' package. The 'extrafont' package contains the code to install and use fonts, while the 'extrafontdb' package contains the font database.

7.31 score 1 stars 12 dependents 292 scripts 19k downloads

plsRcox - Partial Least Squares Regression for Cox Models and Related Techniques

Provides Partial least squares Regression and various regular, sparse or kernel, techniques for fitting Cox models in high dimensional settings <doi:10.1093/bioinformatics/btu660>, Bastien, P., Bertrand, F., Meyer N., Maumy-Bertrand, M. (2015), Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data, Bioinformatics, 31(3):397-404. Cross validation criteria were studied in <doi:10.48550/arXiv.1810.02962>, Bertrand, F., Bastien, Ph. and Maumy-Bertrand, M. (2018), Cross validating extensions of kernel, sparse or regular partial least squares regression models to censored data.

6.69 score 7 stars 3 dependents 124 scripts 1.6k downloads

Patterns - Deciphering Biological Networks with Patterned Heterogeneous Measurements

A modeling tool dedicated to biological network modeling (Bertrand and others 2020, <doi:10.1093/bioinformatics/btaa855>). It allows for single or joint modeling of, for instance, genes and proteins. It starts with the selection of the actors that will be used in the upcoming reverse-engineering step. An actor can be included in that selection based on its differential measurement (for instance gene expression or protein abundance) or on its time course profile. Wrappers for actor clustering functions and cluster analysis are provided. It also allows reverse engineering of biological networks taking into account the observed time course patterns of the actors. Many inference functions are provided, each dedicated to obtaining specific features of the inferred network, such as sparsity, robust links, high-confidence links, or links that are stable under resampling. Some simulation and prediction tools are also available for cascade networks (Jung and others 2014, <doi:10.1093/bioinformatics/btt705>). Examples of use with microarray or RNA-Seq data are provided.

6.66 score 4 stars 57 scripts 473 downloads

Rttf2pt1 - 'ttf2pt1' Program

Contains the program 'ttf2pt1', for use with the 'extrafont' package. This product includes software developed by the 'TTF2PT1' Project and its contributors.

6.35 score 1 stars 12 dependents 71 scripts 18k downloads

bigPLSR - Partial Least Squares Regression Models with Big Matrices

Fast partial least squares (PLS) for dense and out-of-core data. Provides SIMPLS (straightforward implementation of a statistically inspired modification of the PLS method) and NIPALS (non-linear iterative partial least-squares) solvers, plus kernel-style PLS variants ('kernelpls' and 'widekernelpls') with parity to 'pls'. Optimized for 'bigmemory'-backed matrices with streamed cross-products and chunked BLAS (Basic Linear Algebra Subprograms) (XtX/XtY and XXt/YX), optional file-backed score sinks, and deterministic testing helpers. Includes an auto-selection strategy that chooses between XtX SIMPLS, XXt (wide) SIMPLS, and NIPALS based on (n, p) and a configurable memory budget. Bertrand and Maumy (2023) <https://hal.science/hal-05352069> and <https://hal.science/hal-05352061> discuss fitting and cross-validating PLS regression models on big data. For more details about some of the techniques featured in the package, see Dayal and MacGregor (1997) <doi:10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23>, Rosipal & Trejo (2001) <https://www.jmlr.org/papers/v2/rosipal01a.html>, Tenenhaus, Viennet, and Saporta (2007) <doi:10.1016/j.csda.2007.01.004>, Rosipal (2004) <doi:10.1007/978-3-540-45167-9_17>, and Song, Wang, and Bai (2024) <doi:10.1016/j.chemolab.2024.105238>. Includes kernel logistic PLS with 'C++'-accelerated alternating iteratively reweighted least squares (IRLS) updates, streamed reproducing kernel Hilbert space (RKHS) solvers with reusable centering statistics, and bootstrap diagnostics with graphical summaries for coefficients, scores, and cross-validation workflows, alongside dedicated plotting utilities for individuals, variables, ellipses, and biplots. The streaming backend uses far less memory than the dense in-memory solvers and keeps memory bounded across data sizes. For PLS1, streaming is often fast enough while preserving a small memory footprint; for PLS2 it remains competitive with a bounded footprint. On small problems that fit comfortably in RAM (random-access memory), dense in-memory solvers are slightly faster; the crossover occurs as n or p grow and the Gram/cross-product cost dominates.
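The idea of streamed cross-products mentioned above can be illustrated with a small, language-neutral sketch. It is written in plain Python purely for illustration and is not the package's API: the function names `iter_row_chunks` and `streamed_cross_products` are hypothetical, and a real implementation would hand each block to BLAS rather than loop in the interpreter. The point it shows is that XtX and XtY can be accumulated block by block, so only one row block and the p x p accumulators need to be held in memory at a time.

```python
from typing import Iterator, List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def iter_row_chunks(rows: Rows, chunk_size: int) -> Iterator[Rows]:
    """Yield consecutive row blocks, standing in for reads from out-of-core storage."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def streamed_cross_products(
    x_rows: Rows, y: Sequence[float], chunk_size: int = 2
) -> Tuple[List[List[float]], List[float]]:
    """Accumulate X^T X and X^T y one row block at a time (hypothetical helper)."""
    p = len(x_rows[0])
    xtx = [[0.0] * p for _ in range(p)]  # p x p accumulator
    xty = [0.0] * p                      # length-p accumulator
    offset = 0                           # index of the first row of the current block
    for block in iter_row_chunks(x_rows, chunk_size):
        for i, row in enumerate(block):
            yi = y[offset + i]
            for a in range(p):
                xty[a] += row[a] * yi
                for b in range(p):
                    xtx[a][b] += row[a] * row[b]
        offset += len(block)
    return xtx, xty
```

Because the accumulators are sums over rows, the result does not depend on the block size, which is what allows the block size to be chosen from a memory budget.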

openblas, cpp

6.11 score 1 stars 31 scripts 280 downloads

SelectBoost.gamlss - Stability-Selection via Correlated Resampling for 'GAMLSS' Models

Extends the 'SelectBoost' approach to Generalized Additive Models for Location, Scale and Shape (GAMLSS). Implements bootstrap stability-selection across parameter-specific formulas (mu, sigma, nu, tau) via gamlss::stepGAIC(). Includes optional standardization of predictors and helper functions for corrected AIC calculation. More details can be found in Bertrand and Maumy (2024) <https://hal.science/hal-05352041> that highlights correlation-aware resampling to improve variable selection for GAMLSS and quantile regression when predictors are numerous and highly correlated.

openblas, cpp, openmp

6.10 score 2 stars 20 scripts 309 downloads

bigPCAcpp - Principal Component Analysis for 'bigmemory' Matrices

High performance principal component analysis routines that operate directly on bigmemory::big.matrix() objects. The package avoids materialising large matrices in memory by streaming data through 'BLAS' and 'LAPACK' kernels and provides helpers to derive scores, loadings, correlations, and contribution diagnostics, including utilities that stream results into 'bigmemory'-backed matrices for file-based workflows. Additional interfaces expose 'scalable' singular value decomposition, robust PCA, and robust SVD algorithms so that users can explore large matrices while tempering the influence of outliers. 'Scalable' principal component analysis is also implemented, Elgamal, Yabandeh, Aboulnaga, Mustafa, and Hefeeda (2015) <doi:10.1145/2723372.2751520>.

openblas, cpp

6.07 score 9 stars 1 dependents 11 scripts 475 downloads

SelectBoost.FDA - SelectBoost-Style Variable Selection for Functional Data Analysis

Implements 'SelectBoost'-style variable selection workflows for functional data analysis. The package provides FDA-native design and preprocessing objects for raw curves, spline-basis expansions, functional principal component analysis scores, and scalar covariates; grouped stability-selection routines based on repeated subject-level subsampling; multiple selector backends including lasso, group lasso, and sparse-group lasso; FDA-aware grouping functions and calibration helpers for 'SelectBoost'; method-comparison utilities; a formula interface; simulation, benchmarking, and validation helpers with mapped ground truth; targeted sensitivity-study utilities, two-parameter perturbation-grid workflows, renderer-neutral selection-surface and diagnostic extractors, monotonicity and precision-recall utilities, association diagnostics, report table helpers, and shipped benchmark summaries for mean 'F1' comparisons between FDA-aware and plain 'SelectBoost' workflows; small example datasets; and an optional adapter to the native stability-selection interface from the 'FDboost' package.

5.85 score 1 stars 35 scripts 291 downloads

turner - Turn Vectors and Lists of Vectors into Indexed Structures

Package designed for working with vectors and lists of vectors, mainly for turning them into other indexed data structures.

5.28 score 1 stars 2 dependents 39 scripts 1.6k downloads

peperr - Parallelised Estimation of Prediction Error

Designed for prediction error estimation through resampling techniques, possibly accelerated by parallel execution on a compute cluster. Newly developed model fitting routines can be easily incorporated. Methods used in the package are detailed in Porzelius Ch., Binder H. and Schumacher M. (2009) <doi:10.1093/bioinformatics/btp062> and were used, for instance, in Porzelius Ch., Schumacher M. and Binder H. (2011) <doi:10.1007/s00180-011-0236-6>.

5.19 score 3 stars 1 dependents 29 scripts 493 downloads

boids4R - Reynolds-Style Boids and Swarm Simulation

Provides deterministic two- and three-dimensional boids and swarm simulations for R. The package implements Reynolds-style separation, alignment, and cohesion rules with optional obstacles, attractors, predators, species parameters, and reproducible frame export. Simulation state is renderer-neutral; optional adapters can hand frame data to visualization packages such as 'ggWebGL'. The model follows Reynolds (1987) <doi:10.1145/37402.37406>.
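As a rough illustration of the three Reynolds-style rules named above, the following plain-Python sketch performs one synchronous update step in two dimensions. It is not the package's interface: the `Boid` class, the `step` function, and all weights and radii are assumptions chosen for the example, and obstacles, attractors, predators, and species parameters are left out.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def step(boids: List[Boid], radius: float = 5.0, min_dist: float = 1.0,
         w_sep: float = 0.05, w_ali: float = 0.05, w_coh: float = 0.01,
         dt: float = 1.0) -> List[Boid]:
    """Return the next frame; every boid is updated from the same old state."""
    new_boids = []
    for i, b in enumerate(boids):
        sep_x = sep_y = ali_x = ali_y = coh_x = coh_y = 0.0
        n = 0  # number of neighbours within the perception radius
        for j, o in enumerate(boids):
            if i == j:
                continue
            dx, dy = o.x - b.x, o.y - b.y
            d = math.hypot(dx, dy)
            if d > radius:
                continue
            n += 1
            ali_x += o.vx
            ali_y += o.vy
            coh_x += o.x
            coh_y += o.y
            if d < min_dist:
                # Separation: push directly away from neighbours that are too close.
                sep_x -= dx
                sep_y -= dy
        vx, vy = b.vx, b.vy
        if n:
            # Alignment: steer towards the mean neighbour velocity.
            vx += w_ali * (ali_x / n - b.vx)
            vy += w_ali * (ali_y / n - b.vy)
            # Cohesion: steer towards the mean neighbour position.
            vx += w_coh * (coh_x / n - b.x)
            vy += w_coh * (coh_y / n - b.y)
        vx += w_sep * sep_x
        vy += w_sep * sep_y
        new_boids.append(Boid(b.x + vx * dt, b.y + vy * dt, vx, vy))
    return new_boids
```

The step uses no random numbers and reads only the previous frame, so repeated runs from the same initial state give the same frames, which is one simple way to obtain the kind of determinism the description refers to.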

cpp

5.18 score 1 stars 20 scripts 306 downloads

SelectBoost - A General Algorithm to Enhance the Performance of Variable Selection Methods in Correlated Datasets

An implementation of the selectboost algorithm (Bertrand et al. 2020, 'Bioinformatics', <doi:10.1093/bioinformatics/btaa855>), which is a general algorithm that improves the precision of any existing variable selection method. This algorithm is based on highly intensive simulations and takes into account the correlation structure of the data. It can either produce a confidence index for variable selection or it can be used in an experimental design planning perspective.

confidence, correlation, correlation-structure, modelling, precision, recall, selection-algorithm

5.15 score 7 stars 3 dependents 15 scripts 404 downloads

mi4p - Multiple Imputation for Proteomics

A framework for multiple imputation in proteomics, proposed by Marie Chion, Christine Carapito and Frederic Bertrand (2021) <doi:10.1371/journal.pcbi.1010420>.

5.10 score 7 stars 36 scripts 308 downloads

SelectBoost.beta - Stability-Selection via Correlated Resampling for Beta-Regression Models

Adds variable-selection functions for Beta regression models (both mean and phi submodels) so they can be used within the 'SelectBoost' algorithm. Includes stepwise AIC, BIC, and corrected AIC on betareg() fits, 'gamlss'-based LASSO/Elastic-Net, a pure 'glmnet' iterative re-weighted least squares-based selector with an optional standardization speedup, and 'C++' helpers for iterative re-weighted least squares working steps and precision updates. Also provides a fastboost_interval() variant for interval responses, comparison helpers, and a flexible simulator simulation_DATA.beta() for interval-valued data. For more details see Bertrand and Maumy (2023) <doi:10.7490/f1000research.1119552.1>, <https://hal.science/hal-05352047>, and <https://hal.science/hal-05352056>.

cpp

5.00 score 1 stars 6 scripts 247 downloads

bigANNOY - Approximate k-Nearest Neighbour Search for 'bigmemory' Matrices with Annoy

Approximate Euclidean k-nearest neighbour search routines that operate on 'bigmemory::big.matrix' data through Annoy indexes created with 'RcppAnnoy'. The package builds persistent on-disk indexes plus sidecar metadata from streamed 'big.matrix' rows, supports Euclidean, angular, Manhattan, and dot-product Annoy metrics, and can either return in-memory results or stream neighbour indices and distances into destination 'bigmemory' matrices. Explicit index lifecycle helpers, stronger metadata validation, descriptor-aware file-backed workflows, and benchmark helpers are also included.

cpp

4.98 score 1 stars 27 scripts 467 downloads

plsRbeta - Partial Least Squares Regression for Beta Regression Models

Provides partial least squares regression for (weighted) beta regression models (Bertrand 2013, <https://ojs-test.apps.ocp.math.cnrs.fr/index.php/J-SFdS/article/view/215>) and k-fold cross-validation of such models using various criteria. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.

4.97 score 2 stars 31 scripts 511 downloads

c060 - Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

The c060 package provides additional functions to perform stability selection, model validation and parameter tuning for glmnet models.

4.94 score 3 stars 48 scripts 291 downloads

bigPLScox - Partial Least Squares for Cox Models with Big Matrices

Provides Partial least squares Regression and various regular, sparse or kernel, techniques for fitting Cox models for big data. Provides a Partial Least Squares (PLS) algorithm adapted to Cox proportional hazards models that works with 'bigmemory' matrices without loading the entire dataset in memory. Also implements a gradient-descent based solver for Cox proportional hazards models that works directly on 'bigmemory' matrices. Bertrand and Maumy (2023) <https://hal.science/hal-05352069>, and <https://hal.science/hal-05352061> highlighted fitting and cross-validating PLS-based Cox models to censored big data.

openblas, cpp

4.92 score 1 stars 11 scripts 243 downloads

robustfa - Object Oriented Solution for Robust Factor Analysis

Outliers exist in virtually any dataset in any application field. To limit their impact, we need to use robust estimators. The classical estimators of the multivariate mean and covariance matrix are the sample mean and the sample covariance matrix. Outliers affect both, and thus they affect classical factor analysis, which depends on these classical estimators (Pison, G., Rousseeuw, P.J., Filzmoser, P. and Croux, C. (2003) <doi:10.1016/S0047-259X(02)00007-6>). It is therefore necessary to use robust estimators of the mean and the covariance matrix. There are several robust estimators in the literature: Minimum Covariance Determinant estimator, Orthogonalized Gnanadesikan-Kettenring, Minimum Volume Ellipsoid, M, S, and Stahel-Donoho. The most direct way to make multivariate analysis more robust is to replace the classical sample mean and sample covariance matrix with robust estimators (Maronna, R.A., Martin, D. and Yohai, V. (2006) <doi:10.1002/0470010940>) (Todorov, V. and Filzmoser, P. (2009) <doi:10.18637/jss.v032.i03>), which is our choice for robust factor analysis. We created an object-oriented solution for robust factor analysis based on new S4 classes.

4.91 score 1 stars 1 dependents 27 scripts 414 downloads

Sobol4R - Sobol Indices for Models with Fixed and Stochastic Parameters

Tools to design experiments, compute Sobol sensitivity indices, and summarise stochastic responses inspired by the strategy described by Zhu and Sudret (2021) <doi:10.1016/j.ress.2021.107815>. Includes helpers to optimise toy models implemented in C++, visualise indices with uncertainty quantification, and derive reliability-oriented sensitivity measures based on failure probabilities. It is further detailed in Logosha, Maumy and Bertrand (2022) <doi:10.1063/5.0246026> and (2023) <doi:10.1063/5.0246024> or in Bertrand, Logosha and Maumy (2024) <https://hal.science/hal-05371803>, <https://hal.science/hal-05371795> and <https://hal.science/hal-05371798>.
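For readers unfamiliar with Sobol indices, the sketch below estimates first-order indices with a standard pick-freeze Monte Carlo estimator (the form given by Saltelli and co-authors) for a deterministic model with independent uniform inputs on [0, 1]. It is a plain-Python illustration under those assumptions, not the package's method: the function name `first_order_sobol` is hypothetical, and the stochastic-model strategy of Zhu and Sudret that the package draws on is not reproduced here.

```python
import random
from typing import Callable, List, Sequence


def first_order_sobol(model: Callable[[Sequence[float]], float],
                      dim: int, n: int, seed: int = 0) -> List[float]:
    """Estimate first-order Sobol indices S_i for inputs that are i.i.d. U(0, 1)."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B, each with n rows and dim columns.
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fa = [model(row) for row in a]
    fb = [model(row) for row in b]

    # Total output variance, estimated from all 2n evaluations.
    all_y = fa + fb
    mean = sum(all_y) / len(all_y)
    var = sum((v - mean) ** 2 for v in all_y) / (len(all_y) - 1)

    indices = []
    for i in range(dim):
        acc = 0.0
        for k in range(n):
            # Row of A with only column i replaced by the value from B.
            mixed = list(a[k])
            mixed[i] = b[k][i]
            acc += fb[k] * (model(mixed) - fa[k])
        indices.append(acc / n / var)
    return indices
```

For the toy model Y = X1 + 2 X2 the exact first-order indices are 0.2 and 0.8, so a call such as `first_order_sobol(lambda x: x[0] + 2 * x[1], dim=2, n=20000)` should return values near those, up to Monte Carlo error.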

cpp

4.88 score 1 stars 8 scripts 254 downloads

networkABC - Network Reverse Engineering with Approximate Bayesian Computation

We developed an inference tool based on approximate Bayesian computation to decipher network data and assess the strength of the inferred links between the network's actors. It is a new multi-level approximate Bayesian computation (ABC) approach. At the first level, the method captures the global properties of the network, such as a scale-free structure and clustering coefficients, whereas the second level is targeted to capture local properties, including the probability of each pair of genes being linked. Up to now, ABC algorithms have scarcely been used in that setting and, due to the computational overhead, their application was limited to a small number of genes. In contrast, our algorithm was designed to cope with that issue and has a low computational cost. It can be used, for instance, for elucidating gene regulatory networks, which is an important step towards understanding normal cell physiology and complex pathological phenotypes. Reverse engineering consists of using gene expression measured over time or over different experimental conditions to discover the structure of the gene network in a targeted cellular process. The fact that gene expression data are usually noisy, highly correlated, and high-dimensional explains the need for specific statistical methods to reverse engineer the underlying network.

4.86 score 4 stars 18 scripts 300 downloads

OneTwoSamples - Deal with One and Two (Normal) Samples

We introduce an R function one_two_sample() which can deal with one and two (normal) samples, Ying-Ying Zhang, Yi Wei (2012) <doi:10.2991/asshm-13.2013.29>. For one normal sample x, the function reports descriptive statistics, plots, interval estimates and hypothesis tests for x. For two normal samples x and y, the function reports descriptive statistics, plots, interval estimates and hypothesis tests for x and y, respectively. It also reports interval estimates and hypothesis tests for mu1-mu2 (the difference of the means of x and y) and sigma1^2 / sigma2^2 (the ratio of the variances of x and y), tests whether x and y are from the same population, and computes the correlation coefficient of x and y if they have the same length.

4.79 score 1 stars 31 scripts 582 downloads

bigKNN - Exact Search and Graph Construction for 'bigmemory' Matrices

Exact nearest-neighbour and radius-search routines that operate directly on 'bigmemory::big.matrix' objects. The package streams row blocks through 'BLAS' kernels, supports self-search and external-query search, exposes prepared references for repeated queries, and can build exact k-nearest-neighbour, radius, mutual k-nearest-neighbour, and shared-nearest-neighbour graphs. Version 0.3.0 adds execution plans, serializable prepared caches, resumable streamed graph jobs, coercion helpers, exact candidate reranking, and recall summaries for evaluating approximate neighbours.
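The block-streaming idea behind exact search can be sketched as follows. This is a plain-Python illustration rather than the package's interface: `knn_streamed` and its arguments are hypothetical names, squared Euclidean distance is assumed as the metric, and the inner loops stand in for what would be a BLAS call per block. The reference rows are visited one block at a time, and each query keeps only its current best k candidates in a bounded heap, so the result is exact while the working set stays small.

```python
import heapq
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance; it ranks neighbours the same way as the distance itself."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def knn_streamed(reference: Sequence[Sequence[float]],
                 query: Sequence[Sequence[float]],
                 k: int, block_size: int = 2) -> List[List[Tuple[int, float]]]:
    """Exact k nearest reference rows for each query row, scanning reference blocks."""
    # One bounded heap per query. Items are (-distance, -index), so the heap root
    # is always the worst kept candidate (largest distance, then largest index).
    heaps: List[List[Tuple[float, int]]] = [[] for _ in query]
    for start in range(0, len(reference), block_size):
        block = reference[start:start + block_size]  # only this block is "in memory"
        for q_idx, q in enumerate(query):
            heap = heaps[q_idx]
            for offset, r in enumerate(block):
                item = (-squared_distance(q, r), -(start + offset))
                if len(heap) < k:
                    heapq.heappush(heap, item)
                elif item > heap[0]:
                    heapq.heapreplace(heap, item)  # replace the worst candidate
    results = []
    for heap in heaps:
        ordered = sorted((-neg_d, -neg_i) for neg_d, neg_i in heap)
        results.append([(idx, dist) for dist, idx in ordered])
    return results
```

Each result list holds `(row index, squared distance)` pairs sorted from nearest to farthest, with ties broken in favour of the lower row index, and the output is the same for any block size.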

openblas, cpp

4.76 score 1 stars 19 scripts 460 downloads

elcf4R - Electricity Load Curves Forecasting at Individual Level

Implements forecasting methods for individual electricity load curves, including Kernel Wavelet Functional (KWF), clustered KWF, Generalized Additive Models (GAM), Multivariate Adaptive Regression Splines (MARS), and Long Short-Term Memory (LSTM) models. Provides normalized dataset adapters for iFlex, StoreNet, Low Carbon London, and REFIT; download and read support for IDEAL and GX; explicit Python backend selection for TensorFlow-based LSTM fits; helpers for daily segmentation and rolling-origin benchmarking; and compact shipped example panels and benchmark-result datasets.

cpp

4.70 score 1 stars 7 scripts 291 downloads

BioStatR - Initiation à La Statistique Avec R

Datasets and functions for the book "Initiation à la Statistique avec R", F. Bertrand and M. Maumy-Bertrand (2022, ISBN:978-2100782826 Dunod, fourth edition).

4.65 score 5 stars 90 scripts 764 downloads

SelectBoost.quantile - 'SelectBoost'-Style Variable Selection for Quantile Regression

A 'SelectBoost'-inspired workflow for sparse quantile regression. The package builds correlation neighborhoods, perturbs correlated predictors with a directional sampler inspired by the original 'SelectBoost' internals, refits penalized quantile regression models on the perturbed designs, and aggregates variable-selection frequencies across a path of correlation thresholds.

4.60 score 1 stars 7 scripts 342 downloads

XGeoRTR - Backend-Neutral Explainable Geometry State and Operators

Provides the platform layer for explanation geometry in R. The package standardizes generic explanation tables into a normalized backend state object, computes embeddings, diagnostics, and multiscale level-of-detail summaries, and serializes backend-neutral state for reproducible workflows. It also exposes selected long-table and regular-grid views for downstream use-case packages. Rendering and viewport orchestration are delegated to downstream frontends such as 'ggWebGL'.

4.59 score 1 stars 13 scripts 426 downloads

bootPLS - Bootstrap Hyperparameter Selection for PLS Models and Extensions

Several implementations of non-parametric stable bootstrap-based techniques to determine the number of components for Partial Least Squares linear or generalized linear regression models as well as sparse Partial Least Squares linear or generalized linear regression models. The package collects techniques that were published in a book chapter (Magnanensi et al. 2016, 'The Multiple Facets of Partial Least Squares and Related Methods', <doi:10.1007/978-3-319-40643-5_18>) and two articles (Magnanensi et al. 2017, 'Statistics and Computing', <doi:10.1007/s11222-016-9651-4>, and Magnanensi et al. 2021, 'Frontiers in Applied Mathematics and Statistics', <doi:10.3389/fams.2021.693126>).

4.48 score 3 stars 4 scripts 497 downloads

penalizedSVM - Feature Selection SVM using Penalty Functions

Support Vector Machine (SVM) classification with simultaneous feature selection using penalty functions is implemented. The smoothly clipped absolute deviation (SCAD), 'L1-norm', 'Elastic Net' ('L1-norm' and 'L2-norm') and 'Elastic SCAD' (SCAD and 'L2-norm') penalties are available. The tuning parameters can be found using either a fixed grid or an interval search.

4.45 score 1 stars 2 dependents 94 scripts 427 downloads

sageR - Applied Statistics for Economics and Management with R

Datasets and functions for the book "Statistiques pour l’économie et la gestion", "Théorie et applications en entreprise", F. Bertrand, Ch. Derquenne, G. Dufrénot, F. Jawadi and M. Maumy, C. Borsenberger editor, (2021, ISBN:9782807319448, De Boeck Supérieur, Louvain-la-Neuve). The first chapter of the book is dedicated to an introduction to statistics and their world. The second chapter deals with univariate exploratory statistics and graphics. The third chapter deals with bivariate and multivariate exploratory statistics and graphics. The fourth chapter is dedicated to data exploration with Principal Component Analysis. The fifth chapter is dedicated to data exploration with Correspondence Analysis. The sixth chapter is dedicated to data exploration with Multiple Correspondence Analysis. The seventh chapter is dedicated to data exploration with automatic clustering. The eighth chapter is dedicated to an introduction to probability theory and classical probability distributions. The ninth chapter is dedicated to estimation theory and to one-sample and two-sample tests. The tenth chapter is dedicated to the Gaussian linear model. The eleventh chapter is dedicated to an introduction to time series. The twelfth chapter is dedicated to an introduction to probit and logit models. Various example datasets are shipped with the package as well as some new functions.

4.28 score 2 stars 19 scripts 275 downloads

fontcm - Computer Modern Font for Use with Extrafont Package

Computer Modern font with Paul Murrell's symbol extensions. It is to be used with the 'extrafont' package. When this font package is installed, the CM fonts will be available for PDF or PostScript output files; however, this will (probably) not make the font available for screen or bitmap output files.

4.01 score 1 stars 60 scripts 3.4k downloads

plsdof - Degrees of Freedom and Statistical Inference for Partial Least Squares Regression

The plsdof package provides Degrees of Freedom estimates for Partial Least Squares (PLS) Regression. Model selection for PLS is based on various information criteria (aic, bic, gmdl) or on cross-validation. Estimates for the mean and covariance of the PLS regression coefficients are available. They allow the construction of approximate confidence intervals and the application of test procedures (Kramer and Sugiyama 2012 <doi:10.1198/jasa.2011.tm10107>). Further, cross-validation procedures for Ridge Regression and Principal Components Regression are available.

4.01 score 3 stars 34 scripts 523 downloads

mergeGridR - Grid-Based Number Merge Puzzle Simulation

Provides tools to simulate, analyse, visualise, and benchmark grid-based number merge puzzles. The package implements generic grid mechanics, tile-spawning rules, merge rules, scoring functions, reproducible simulation utilities, and local 'Shiny' and 'WebGL' interfaces for interactive use. It is intended for teaching, algorithmic experimentation, and game-theoretic examples. The autoplay helpers use standard heuristic search and Monte Carlo simulation ideas described in Russell and Norvig (2021, ISBN:9780134610993) and Robert and Casella (2004, ISBN:9780387212395).
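To make the notion of a merge rule with a scoring function concrete, here is a small plain-Python sketch of one possible rule. It assumes the familiar 2048-style convention (equal neighbours combine once per move, and the score gained is the sum of the merged tiles). That convention and the names `merge_row_left` and `move_left` are assumptions for illustration, not the package's actual rules or API.

```python
from typing import List, Tuple


def merge_row_left(row: List[int]) -> Tuple[List[int], int]:
    """Slide tiles to the left and merge equal neighbours once; 0 marks an empty cell."""
    tiles = [v for v in row if v != 0]  # drop empty cells, keep the tile order
    merged: List[int] = []
    gained = 0
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)  # the pair collapses into one doubled tile
            gained += tiles[i] * 2
            i += 2                       # a merged tile cannot merge again this move
        else:
            merged.append(tiles[i])
            i += 1
    merged.extend([0] * (len(row) - len(merged)))  # pad back to the row width
    return merged, gained


def move_left(grid: List[List[int]]) -> Tuple[List[List[int]], int]:
    """Apply the row rule to every row and return the new grid and the total score gained."""
    new_grid = []
    total = 0
    for row in grid:
        new_row, gained = merge_row_left(row)
        new_grid.append(new_row)
        total += gained
    return new_grid, total
```

Moves in the other three directions can be obtained by reversing or transposing the grid before and after calling `move_left`, which keeps the merge rule itself in one place.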

cpp

4.00 score 10 scripts

missPLS - Methods and Reproducible Workflows for Partial Least Squares with Missing Data

Methods-first tooling for reproducing and extending the partial least squares regression studies on incomplete data described in Nengsih et al. (2019) <doi:10.1515/sagmb-2018-0059>. The package provides simulation helpers, missingness generators, imputation wrappers, component-selection utilities, real-data diagnostics, and reproducible study orchestration for Nonlinear Iterative Partial Least Squares (NIPALS)-Partial Least Squares (PLS) workflows.

4.00 score 1 stars 8 scripts 406 downloads

ModStatR - Statistical Modelling in Action with R

Datasets and functions for the book "Modélisation statistique par la pratique avec R", F. Bertrand, E. Claeys and M. Maumy-Bertrand (2019, ISBN:9782100793525, Dunod, Paris). The first chapter of the book is dedicated to an introduction to the R statistical software. The second chapter deals with correlation analysis: Pearson, Spearman and Kendall simple, multiple and partial correlation coefficients. New wrapper functions for permutation tests or bootstrap of matrices of correlation are provided with the package. The third chapter is dedicated to data exploration with factorial analyses (PCA, CA, MCA, MDA) and clustering. The fourth chapter is dedicated to regression analysis: fitting and model diagnostics are detailed. The exercises focus on covariance analysis, logistic regression, Poisson regression, two-way analysis of variance for fixed or random factors. Various example datasets are shipped with the package: for instance on pokemon, world of warcraft, house tasks or food nutrition analyses.

3.88 score 5 stars 4 scripts 565 downloads

granova - Graphical Analysis of Variance

This small collection of functions provides what we call elemental graphics for display of analysis of variance results, David C. Hoaglin, Frederick Mosteller and John W. Tukey (1991, ISBN:978-0-471-52735-0), Paul R. Rosenbaum (1989) <doi:10.2307/2684513>, Robert M. Pruzek and James E. Helmreich <https://jse.amstat.org/v17n1/helmreich.html>. The term elemental derives from the fact that each function is aimed at construction of graphical displays that afford direct visualizations of data with respect to the fundamental questions that drive the particular analysis of variance methods. These functions can be particularly helpful for students and non-statistician analysts. But these methods should be quite generally helpful for work-a-day applications of all kinds, as they can help to identify outliers, clusters or patterns, as well as highlight the role of non-linear transformations of data.

3.68 score 1 stars 48 scripts 616 downloads

CascadeData - Experimental Data of Cascade Experiments in Genomics

These experimental expression data (5 leukemic 'CLL' B-lymphocytes of aggressive form from 'GSE39411', <doi:10.1073/pnas.1211130110>), measured after B-cell receptor stimulation, are used as examples by packages such as 'Cascade', a modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.

2.70 score 1 stars 8 scripts 335 downloads