Wheat physiology predictor: predicting physiological traits in wheat from hyperspectral reflectance measurements using deep learning

Robert T Furbank, Viridiana Silva-Perez, John R Evans, Anthony G Condon, Gonzalo M Estavillo, Wennan He, Saul Newman, Richard Poiré, Ashley Hall, Zhen He

doi:10.1186/s13007-021-00806-6

Abstract

Background: The need for rapid in-field measurement of key traits contributing to yield over many thousands of genotypes is a major roadblock in crop breeding. Recently, leaf hyperspectral reflectance data has been used to train machine learning models using partial least squares regression (PLSR) to rapidly predict genetic variation in photosynthetic and leaf traits across wheat populations, among other species. However, the application of published PLSR spectral models is limited by a fixed spectral wavelength range as input and by the requirement of separate custom-built models for each trait and wavelength range. In addition, the use of reflectance spectra from the short-wave infrared region requires expensive multiple-detector spectrometers. The ability to train a model that can accommodate input from different spectral ranges would potentially make such models extensible to more affordable sensors. Here we compare the prediction accuracy of PLSR with that of various deep learning approaches and an ensemble model, each trained and tested using previously published data sets.

Results: We demonstrate that the accuracy of PLSR in predicting photosynthetic and related leaf traits in wheat can be improved upon with deep learning-based and ensemble models without overfitting. Additionally, these models can be flexibly applied across spectral ranges without significantly compromising accuracy.

Conclusion: The method reported provides an improved prediction of wheat leaf and photosynthetic traits from leaf hyperspectral reflectance and does not require a full-range, high-cost leaf spectrometer. We provide a web service for deploying these algorithms to predict physiological traits in wheat from a variety of spectral data sets, with important implications for wheat yield prediction and crop breeding.

Highlights

The global population is estimated to reach 9.7 billion by 2050 [1].
We developed a machine learning framework based on Partial Least Squares Regression (PLSR) and hyperspectral reflectance, which enables prediction of several physiological traits related to photosynthetic performance in wheat leaves with high accuracy and speed (30 s to 1 min per leaf; [4, 10]).
Dataset description: We used the large multi-site, multi-environment wheat data set, which includes two treatment regimes and was collected by Silva-Perez et al. [4, 12], to construct the models. This dataset consisted of the full hyperspectral reflectance spectra (400–2400 nm) from wheat leaves and the corresponding physiological traits measured concurrently on the same leaf section.

Summary

Introduction

The global population is estimated to reach 9.7 billion by 2050 [1]. As a result, the projected demand for cereal grain exceeds the agricultural forecast output [2]. We developed a machine learning framework based on Partial Least Squares Regression (PLSR) and hyperspectral reflectance, which enables prediction of several physiological traits related to photosynthetic performance in wheat leaves with high accuracy and speed (30 s to 1 min per leaf; [4, 10]). Measuring photosynthesis-related traits, such as nitrogen per unit leaf area (Narea) and leaf dry mass per area (LMA), requires laborious, destructive, and expensive laboratory-based methods that may take several days. Leaf hyperspectral reflectance data has been used to train machine learning models using PLSR to rapidly predict genetic variation in photosynthetic and leaf traits across wheat populations, among other species. Here we compare the prediction accuracy of PLSR with that of various deep learning approaches and an ensemble model, each trained and tested using previously published data sets.
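To make the PLSR baseline concrete, the following is a minimal sketch of a single-trait PLS regression (PLS1, fitted with the NIPALS algorithm) applied to reflectance spectra. It is not the published model. The spectra and trait values are synthetic, and the 10 nm sampling over 400–2400 nm, the number of components, the train/test split, and the helper name `fit_pls1` are all assumptions made for illustration.

```python
import numpy as np


def fit_pls1(X, y, n_components):
    """Fit single-response PLS regression (PLS1) with the NIPALS algorithm.

    Returns regression coefficients (one per spectral band) and an intercept.
    `fit_pls1` is a hypothetical helper written for this sketch.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centred working copies
    n_bands = X.shape[1]
    W = np.zeros((n_bands, n_components))      # weight vectors
    P = np.zeros((n_bands, n_components))      # X loadings
    q = np.zeros(n_components)                 # y loadings
    for k in range(n_components):
        w = Xr.T @ yr                          # direction of maximal covariance
        w /= np.linalg.norm(w)
        t = Xr @ w                             # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt
        q[k] = yr @ t / tt
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - q[k] * t                     # deflate y
        W[:, k], P[:, k] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Synthetic stand-in for leaf spectra (assumed, not the published data set).
rng = np.random.default_rng(0)
wavelengths = np.arange(400, 2401, 10)         # nm, assumed 10 nm sampling
n_leaves = 200
centres = np.array([550.0, 1450.0, 1940.0])    # arbitrary band centres
loadings = np.exp(-0.5 * ((wavelengths[None, :] - centres[:, None]) / 150.0) ** 2)
scores = rng.normal(size=(n_leaves, 3))        # hidden leaf properties
spectra = (0.4 + 0.05 * scores @ loadings
           + rng.normal(scale=0.005, size=(n_leaves, wavelengths.size)))
trait = (2.0 + scores @ np.array([0.5, -0.3, 0.2])
         + rng.normal(scale=0.05, size=n_leaves))

# Train on the first 150 leaves and evaluate on the remaining 50.
coef, intercept = fit_pls1(spectra[:150], trait[:150], n_components=4)
pred = spectra[150:] @ coef + intercept
ss_res = np.sum((trait[150:] - pred) ** 2)
ss_tot = np.sum((trait[150:] - trait[150:].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"held-out R^2 on synthetic data: {r2:.3f}")
```

Because the coefficient vector has one entry per input band, a model fitted this way only accepts spectra sampled on the same wavelength grid. This is the fixed-range limitation of PLSR models noted in the abstract.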

Methods

Results

Conclusion