Comparing Calibration Algorithms for the Rapid Characterization of Pretreated Corn Stover Using Near-Infrared Spectroscopy

Zofia Tillman, Edward J Wolfrum

doi:10.3389/fenrg.2022.878973

Open Access

Abstract

Rapid characterization of biomass composition is a key enabling technology for biorefineries: the ability to measure the chemical composition of biomass materials entering the biorefinery, as well as the composition of key process intermediate streams, would allow real-time process control and the development of robust models to predict process performance. Using near-infrared (NIR) spectroscopy for rapid characterization requires multivariate algorithms for building calibration models. The most prevalent algorithm for building calibration models from NIR spectra is the linear modeling algorithm partial least squares (PLS) regression. Nonlinear regression algorithms, which are typically more computationally intensive than linear modeling approaches, have gained popularity in recent years due to their ability to solve a wide variety of classification and regression problems and the dramatic increase in available computational resources. In this work, we demonstrate that a calibration model can predict the composition of corn stover process intermediate samples pretreated with three different treatments: hot water (HW), dilute acid (DA), and deacetylation followed by dilute acid (DDA). We quantitatively compare three algorithms for building prediction models based on NIR spectroscopy: PLS, support vector machines (SVM), and random forests (RF). We demonstrate the utility of accounting for instrument performance variability, using repeated measurements of standard materials (the “repeatability file” strategy), to improve model performance, and we investigate how this strategy performs with nonlinear regression techniques. Finally, we discuss methods for quantifying the uncertainties of specific predictions among the three methods.

Full Text