Abstract

In qualitative and quantitative terahertz (THz) spectroscopic analyses, dimension reduction and feature extraction of the original spectral data are important steps. Owing to the parameters, sample preparation, and experimental conditions used in THz time-domain spectroscopy (THz-TDS), the sample absorption lines exhibit varying degrees of oscillation and contain background noise; spectral data dimension reduction is therefore necessary. Because traditional algorithms such as principal component analysis (PCA) cannot extract useful information from such signals, an improved spectral feature extraction method is proposed based on geodesic-distance nonlinear dimension reduction and partial least squares regression (PLSR). Three kinds of transmission spectra of transgenic soybeans are obtained in this experiment. To extract the useful information from the spectral data, PCA, the locally linear embedding (LLE) algorithm, and Floyd's improved LLE algorithm (FLLE) are applied. Multiple linear regression (MLR) and PLSR are then performed on the dimension-reduced spectral data. The FLLE-PLSR algorithm achieves a root mean square error (RMSE) of 0.0079 and a determination coefficient (R) of 0.9966, both clearly better than those of the PCA-MLR, LLE-MLR, FLLE-MLR, PCA-PLSR, and LLE-PLSR algorithms. The proposed method can effectively extract characteristic quantities from the THz spectral data of transgenic soybeans and has broad application value in agricultural security and food supervision.
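The abstract does not spell out how FLLE computes its distances, so the following is only a minimal sketch under stated assumptions: that "Floyd's" refers to the Floyd-Warshall shortest-path algorithm, and that geodesic distance is approximated by the shortest path through a k-nearest-neighbour graph. The function names (`knn_graph`, `floyd_geodesic`, `rmse`, `r_squared`) and the toy semicircle data are hypothetical and are not taken from the paper. The last two helpers show the usual definitions of the two metrics the abstract reports.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with Euclidean edge weights.

    Missing edges are marked with math.inf. (Assumed construction; the
    paper's neighbourhood rule is not given in the abstract.)
    """
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Rank every other point by Euclidean distance and keep the k closest.
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in ranked[:k]:
            dist[i][j] = d
            dist[j][i] = d
    return dist


def floyd_geodesic(dist):
    """Floyd-Warshall all-pairs shortest paths over the neighbour graph."""
    n = len(dist)
    geo = [row[:] for row in dist]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = geo[i][m] + geo[m][j]
                if via < geo[i][j]:
                    geo[i][j] = via
    return geo


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (synthetic, not from the paper): five points on a unit semicircle.
points = [(math.cos(a), math.sin(a)) for a in
          (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4, math.pi)]
geo = floyd_geodesic(knn_graph(points, k=2))
straight = math.dist(points[0], points[-1])
print(f"Euclidean: {straight:.3f}, graph geodesic: {geo[0][-1]:.3f}")
```

On curved data like this, the graph geodesic between the two end points is longer than the straight-line distance, which is the property a geodesic-based neighbour selection would rely on. The embedding and regression steps themselves are not sketched here.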
