Redundant information is inevitable when acquiring high-dimensional spectra, and it contributes to overfitting. To effectively reduce the dimensionality of multi-dimensional spectral data and improve the accuracy of spectral detection of the components of complex solutions, this paper proposes a method based on the gradient distribution of the multi-dimensional spectral image that extracts the feature region highly correlated with the solution concentration. First, photon propagation in the wedge sample was simulated with the Monte Carlo method, yielding images of the multi-dimensional spectra under different optical parameters. Next, based on the distribution characteristics of these images, a method for locating the feature region from the gradient distribution is proposed. Subsequently, to assess the effectiveness of the feature region, visible-near-infrared spectra were measured for 39 groups of mixed solutions composed of India ink and Intralipid at sequentially varied concentrations. Regression models for predicting Intralipid concentration were built with partial least squares regression (PLSR). Compared with the traditional spectral analysis method, extracting the feature region reduced the dimensionality of the data to 1/9, while the average correlation coefficient of the prediction set (Rp) reached 98.87% and the average root mean square error of prediction (RMSEP) was 0.1502, an increase of 0.12% in Rp and a decrease of 6.12% in RMSEP. The results show that models incorporating the feature region improve the accuracy of component analysis for high-concentration complex solutions, as well as the analysis speed and robustness of the model.
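As a rough illustration of gradient-based region extraction, the sketch below computes a finite-difference gradient magnitude over a small synthetic spectral image and keeps the fixed-size window with the largest summed gradient. The selection rule (largest summed gradient), the function names, the 9 x 9 image and the 3 x 3 window are all assumptions made for illustration; the abstract does not specify how the feature region is actually located.

```python
import math


def gradient_magnitude(image):
    """Finite-difference gradient magnitude of a 2-D list (rows x cols)."""
    rows, cols = len(image), len(image[0])
    if rows < 2 or cols < 2:
        raise ValueError("image must be at least 2 x 2")
    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Central differences inside, one-sided differences at the borders.
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            gy = (image[r1][c] - image[r0][c]) / (r1 - r0)
            gx = (image[r][c1] - image[r][c0]) / (c1 - c0)
            mag[r][c] = math.hypot(gx, gy)
    return mag


def find_feature_region(image, height, width):
    """Return (top, left) of the window with the largest summed gradient.

    Assumed selection rule, not taken from the paper.
    """
    mag = gradient_magnitude(image)
    rows, cols = len(mag), len(mag[0])
    best_total, best_pos = -1.0, (0, 0)
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            total = sum(
                mag[r][c]
                for r in range(top, top + height)
                for c in range(left, left + width)
            )
            if total > best_total:
                best_total, best_pos = total, (top, left)
    return best_pos


def crop(image, top, left, height, width):
    """Extract the selected window from the image."""
    return [row[left:left + width] for row in image[top:top + height]]


# Synthetic 9 x 9 "spectral image" (hypothetical data): flat background
# with a single peak at the centre, so the gradient is concentrated there.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0

top, left = find_feature_region(image, 3, 3)
region = crop(image, top, left, 3, 3)
n_full = len(image) * len(image[0])
n_region = len(region) * len(region[0])
print(f"feature region at {(top, left)}, keeps {n_region} of {n_full} values")
```

With a 3 x 3 window on a 9 x 9 image, the retained data is 1/9 of the original, which matches the reduction ratio quoted in the abstract only by construction of this toy example. The cropped region would then be flattened and passed to a regression model such as PLSR.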
This paper proposes a rapid, non-invasive, low-cost method that uses visible-near-infrared spectroscopy with a wedge sample to analyze the components of complex solutions.