New Developments in Sparse PLS Regression

Jérémy Magnanensi, Myriam Maumy-Bertrand, Nicolas Meyer, Frédéric Bertrand

doi:10.3389/fams.2021.693126

Abstract

Methods based on partial least squares (PLS) regression, which has recently gained much attention in the analysis of high-dimensional genomic datasets, have been developed since the early 2000s for performing variable selection. Most of these techniques rely on tuning parameters that are often determined by cross-validation (CV) based methods, which raises essential stability issues. To overcome this, we have developed a new dynamic bootstrap-based method for significant predictor selection, suitable for both PLS regression and its incorporation into generalized linear models (GPLS). It relies on establishing bootstrap confidence intervals, which allows testing the significance of predictors at a preset type I risk α, and avoids CV. We have also developed adapted versions of sparse PLS (SPLS) and sparse GPLS (SGPLS) regression, using a recently introduced non-parametric bootstrap-based technique to determine the number of components. We compare their variable selection reliability, their stability with respect to tuning parameter determination, and their predictive ability, using simulated data for PLS and real microarray gene expression data for PLS-logistic classification. We observe that, compared with the other methods, our new dynamic bootstrap-based method best separates random noise in y from the relevant information, leading to better accuracy and predictive ability, especially for non-negligible noise levels.

Highlights

Partial least squares (PLS) regression, introduced by [1], is a well-known dimension-reduction method, notably in chemometrics and spectrometric modeling [2].
We focus on the second type of adapted PLS regression, referred to from here on as GPLS.
To take these theoretical results into account, we have developed a new dynamic bootstrap-based approach for variable selection, relevant for both the PLS and GPLS frameworks.

Summary

Introduction

Partial least squares (PLS) regression, introduced by [1], is a well-known dimension-reduction method, notably in chemometrics and spectrometric modeling [2]. We focus on the PLS univariate response framework, better known as PLS1. Let $n$ be the number of observations and $p$ the number of covariates. Then $y = (y_1, \ldots, y_n)^T \in \mathbb{R}^n$ represents the response vector, with $(\cdot)^T$ denoting the transpose. The original underlying algorithm, developed to deal with continuous responses, consists of building latent variables $t_k$, $1 \leq k \leq K$, called components, as linear combinations of the original predictors $X = (x_1, \ldots, x_p) \in \mathcal{M}_{n,p}(\mathbb{R})$, where $\mathcal{M}_{n,p}(\mathbb{R})$ represents the set of matrices of $n$ rows and $p$ columns.
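
To make the construction of the components concrete, here is a minimal sketch of the classical PLS1 (NIPALS-style) iteration in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `pls1_components`, the list-of-rows data layout, and the stopping tolerance are all hypothetical choices, and neither the sparse variants nor the bootstrap-based selection discussed in the abstract is covered.

```python
import math


def pls1_components(X, y, n_components):
    """Return PLS1 score vectors t_1..t_K and unit-norm weight vectors w_1..w_K.

    X is a list of n rows with p values each; y is a list of n responses.
    (Hypothetical helper written for illustration only.)
    """
    n, p = len(X), len(X[0])
    # Center each column of X and the response y.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xk = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yk = [v - y_mean for v in y]

    scores, weights = [], []
    for _ in range(n_components):
        # Weight vector: proportional to X^T y on the current deflated data.
        w = [sum(Xk[i][j] * yk[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # assumed tolerance: no covariance left to extract
            break
        w = [v / norm for v in w]
        # Component t_k: linear combination of the deflated predictors.
        t = [sum(Xk[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Regress X and y on t_k, then remove the explained part (deflation).
        p_load = [sum(Xk[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        c = sum(yk[i] * t[i] for i in range(n)) / tt
        Xk = [[Xk[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        yk = [yk[i] - c * t[i] for i in range(n)]
        scores.append(t)
        weights.append(w)
    return scores, weights


# Small made-up dataset: n = 5 observations, p = 3 covariates.
X_demo = [[1, 2, 0], [2, 1, 1], [3, 4, 0], [4, 3, 2], [5, 6, 1]]
y_demo = [1.0, 2.1, 2.9, 4.2, 5.0]
scores_demo, weights_demo = pls1_components(X_demo, y_demo, n_components=2)
```

Because each new component is computed on deflated data, successive score vectors are mutually orthogonal, which is what lets a small number K of components stand in for the p original predictors.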


Journal: Frontiers in Applied Mathematics and Statistics
Publication Date: Jul 16, 2021
Citations: 2
License type: CC BY 4.0


