Abstract

Kernel partial least squares (KPLS) regression is used in several scientific areas because of its high predictive ability. This article proposes a methodology to estimate simultaneously the parameters of the kernel function and the number of components of the KPLS regression so as to maximize its predictive ability. An optimization problem was formulated, taking the cumulative cross-validation coefficient as the objective function to be maximized, and solved with nature-inspired metaheuristic algorithms: the genetic algorithm (GA), particle swarm optimization (PSO), grey wolf optimization (GWO) and the firefly algorithm (FFA). To validate the results and provide a reference measure of the efficiency of the metaheuristic algorithms, two derivative-free optimization algorithms, Hooke-Jeeves (HJ) and Nelder-Mead (NM), were also applied. The metaheuristic algorithms estimated optimal values of both the kernel function parameters and the number of components of the KPLS regression.
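As a rough illustration of how one of these metaheuristics searches the joint space of a continuous kernel parameter θ and an integer component count h, the following is a minimal particle swarm optimization sketch, not the article's implementation; the quadratic `toy_objective` is a hypothetical stand-in for the cross-validated predictive-ability criterion:

```python
import numpy as np

def pso_maximize(f, bounds, n_particles=20, n_iters=60, seed=0):
    """Minimal PSO: maximize f over box-constrained parameters."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], float)
    hi = np.array([b[1] for b in bounds], float)
    x = rng.uniform(lo, hi, size=(n_particles, len(bounds)))  # positions
    v = np.zeros_like(x)                                      # velocities
    pbest = x.copy()                                          # personal bests
    pval = np.array([f(p) for p in x])
    gbest = pbest[pval.argmax()].copy()                       # global best
    for _ in range(n_iters):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # inertia + cognitive + social terms (standard PSO update)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([f(p) for p in x])
        better = val > pval
        pbest[better], pval[better] = x[better], val[better]
        gbest = pbest[pval.argmax()].copy()
    return gbest, pval.max()

# Hypothetical stand-in for the cross-validated objective: peak at theta = 2, h = 3.
def toy_objective(p):
    theta, h = p[0], round(p[1])   # h is treated as an integer
    return -(theta - 2.0) ** 2 - (h - 3) ** 2

best, best_val = pso_maximize(toy_objective, bounds=[(0.01, 10.0), (1, 10)])
```

In the article's setting, `toy_objective` would be replaced by the cumulative cross-validation coefficient of a KPLS model fitted with the candidate (θ, h); GA, GWO and FFA fit the same black-box pattern with different update rules.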

Highlights

  • Partial least squares (PLS) regression is a linear method that seeks to predict a set of dependent variables (Y) from a set of predictors (X) by extracting orthogonal factors that maximize predictive ability, called components [1], [2]

  • Results of thirty runs of the parameter estimation algorithm for kernel partial least squares (KPLS) regression proposed in Algorithm 2 are reported for the metaheuristic algorithms (genetic algorithm (GA), firefly algorithm (FFA), particle swarm optimization (PSO) and grey wolf optimization (GWO)) and for the derivative-free algorithms Nelder-Mead (NM) and Hooke-Jeeves (HJ)

  • This article proposes a methodology to simultaneously estimate both the kernel function parameter θ and the number of components h that maximize the predictive capacity of the KPLS regression

Introduction

Partial least squares (PLS) regression is a linear method that seeks to predict a set of dependent variables (Y) from a set of predictors (X) by extracting orthogonal factors, called components, that maximize predictive ability [1], [2]. PLS is not appropriate for describing data structures that exhibit nonlinear variation [3], so Rosipal and Trejo [4] proposed the kernel partial least squares (KPLS) regression method, which maps the original data into a feature space of arbitrary dimensionality, where a linear model can be built [5]. A recurring difficulty when implementing KPLS regression is determining both the number of components and the kernel function parameters that maximize its predictive capacity [6], [7], which gives rise to an optimization problem.
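To make the role of these two quantities concrete, below is a minimal single-response KPLS sketch in the dual (Gram-matrix) form in the spirit of Rosipal and Trejo; kernel centering and multi-response handling are omitted for brevity, and an assumed RBF width `gamma` plays the role of the kernel parameter:

```python
import numpy as np

def rbf_kernel(X, Z, gamma):
    """Gaussian (RBF) Gram matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def kpls_fit(K, y, h):
    """Dual-form KPLS for a single response: returns alpha with y_hat = K_new @ alpha.

    Simplified NIPALS-style extraction of h components; kernel centering
    is omitted to keep the sketch short."""
    n = K.shape[0]
    Kd, yd = K.copy(), y.astype(float).copy()
    T = np.zeros((n, h))   # component score vectors
    U = np.zeros((n, h))
    for i in range(h):
        u = yd / np.linalg.norm(yd)      # single response: inner loop is one step
        t = Kd @ u
        t /= np.linalg.norm(t)
        T[:, i], U[:, i] = t, u
        P = np.eye(n) - np.outer(t, t)   # deflate K and y with respect to t
        Kd = P @ Kd @ P
        yd = yd - t * (t @ yd)
    # dual regression coefficients from the extracted components
    return U @ np.linalg.solve(T.T @ K @ U, T.T @ y)

# Tiny demo: fit a nonlinear curve and measure training fit for one (gamma, h) pair.
X = np.linspace(0.0, 2.0 * np.pi, 40).reshape(-1, 1)
y = np.sin(X).ravel()
gamma, h = 1.0, 5          # the two quantities the article optimizes jointly
K = rbf_kernel(X, X, gamma)
alpha = kpls_fit(K, y, h)
r2 = 1.0 - ((y - K @ alpha) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Replacing the training fit `r2` with a cross-validated criterion and searching over (`gamma`, `h`) recovers exactly the optimization problem described above; the article addresses it with metaheuristics rather than an exhaustive grid.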
