Abstract

Viticulturists traditionally have a keen interest in studying the relationship between the biochemistry of grapevines’ leaves/petioles and their associated spectral reflectance in order to understand the fruit ripening rate, water status, nutrient levels, and disease risk. In this paper, we implement imaging spectroscopy (hyperspectral) reflectance data, for the reflective 330 - 2510 nm wavelength region (986 total spectral bands), to assess vineyard nutrient status; this constitutes a high dimensional dataset with a covariance matrix that is ill-conditioned. The identification of the variables (wavelength bands) that contribute useful information for nutrient assessment and prediction, plays a pivotal role in multivariate statistical modeling. In recent years, researchers have successfully developed many continuous, nearly unbiased, sparse and accurate variable selection methods to overcome this problem. This paper compares four regularized and one functional regression methods: Elastic Net, Multi-Step Adaptive Elastic Net, Minimax Concave Penalty, iterative Sure Independence Screening, and Functional Data Analysis for wavelength variable selection. Thereafter, the predictive performance of these regularized sparse models is enhanced using the stepwise regression. This comparative study of regression methods using a high-dimensional and highly correlated grapevine hyperspectral dataset revealed that the performance of Elastic Net for variable selection yields the best predictive ability.

Highlights

  • The goal of that project was to develop vineyard nutrient models, with wavelengths in the reflective regime as predictor variables, toward rapid assessment vine nutrient status using remote sensors, such as cameras mounted on Unmanned Aerial Systems (UAS)

  • We compare four regularized and one functional regression methods: elastic net, multi-step adaptive elastic net, minimax concave penalty, sure independence screening, and functional data analysis based on their predictive performance

  • The presence of multicollinearity fails to reduce the number of basis functions significantly, based on minimum mean squared error (MSE), which negatively affects the predictive ability of grapevine dataset based on functional data analysis

Read more

Summary

Introduction

Variable selection in multivariate analysis is a critical step in regression, especially. To explore the various variable selection techniques, this paper is organized as follows: In Section 2, we explain the Elastic Net regularization approach with an emphasis on its clever combination of the traditional Ridge and Lasso methods; in Section 3 we introduce the Multi-Step Adaptive Elastic Net, which provides (at least in principle) an improvement over the basic elastic Net, via various modifications of the penalty function; Section 4 deals with the method known as Minimax Concave Penalty, which is yet another technique designed to address the estimation, inference and prediction inherent in complex datasets, like the grapevine data studied in this paper; in Section 5 we discuss various aspects of iterative Sure Independence Screening; Section 6 adopts a conceptually different approach, by using a functional approach to variable selection, including smoothing by basis representation and validation; Section 7 is dedicated to the comparative analysis of the performance of all of the above techniques with the goal of wavelength selection for the Riesling grapevine dataset toward nutrient estimation; and in Section 8, we discuss the advantages and limitations of the various models mentioned above before concluding

Elastic Net
Multi-Step Adaptive Elastic Net
Minimax Concave Penalty
Functional Data Analysis
Functional Regression Model
Smoothing by Basis Representation
Comparative Study for Wavelength Selection
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call