Abstract

Confusing low-molecular-weight hyaluronic acid (LMWHA) produced by acid degradation with LMWHA produced by enzymatic hydrolysis (named LMWHA-A and LMWHA-E, respectively) can lead to health hazards and commercial risks. The purpose of this work is to analyze the structural differences between LMWHA-A and LMWHA-E and then to classify the two quickly and accurately using near-infrared (NIR) spectroscopy and machine learning. First, we combined nuclear magnetic resonance (NMR), Fourier transform infrared (FTIR) spectroscopy, two-dimensional correlation NIR spectroscopy (2DCOS), and aquaphotomics to analyze the structural differences between LMWHA-A and LMWHA-E. Second, we compared three dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), and t-distributed stochastic neighbor embedding (t-SNE). Finally, we compared the classification performance of traditional machine learning methods, including partial least squares-discriminant analysis (PLS-DA), support vector classification (SVC), and random forest (RF), with that of deep learning methods, including a one-dimensional convolutional neural network (1D-CNN) and long short-term memory (LSTM). Among the traditional machine learning methods, genetic algorithm (GA)-SVC and RF performed best, but their highest accuracy on the test dataset was 90%, whereas the 1D-CNN and LSTM models reached 100% accuracy on both the training and test datasets. These results show that the deep learning models classified LMWHA-A and LMWHA-E better than the traditional machine learning methods did. Our research provides a new methodological reference for the rapid and accurate classification of biological macromolecules.
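
The comparison of traditional classifiers described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the spectra are synthetic stand-ins generated in the code, the helper names (`make_synthetic_spectra`, `compare_classifiers`) are hypothetical, and the preprocessing, number of components, and hyperparameters are assumptions chosen only for illustration. The GA-based tuning, KPCA, t-SNE, PLS-DA, 1D-CNN, and LSTM steps are omitted. The sketch assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=40, n_points=200, seed=0):
    """Hypothetical stand-in for NIR spectra: two classes differing in one band."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    shared = np.exp(-((axis - 0.5) ** 2) / 0.02)        # band common to both classes
    extra = 0.3 * np.exp(-((axis - 0.8) ** 2) / 0.005)  # band present only in class 1
    X, y = [], []
    for label in (0, 1):
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
        X.append(shared + label * extra + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


def compare_classifiers(X, y, seed=0):
    """Fit each model on a stratified training split and return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        # Assumed settings: PCA to 5 components, then an RBF-kernel SVC.
        "PCA+SVC": make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf")),
        # Random forest trained directly on the full spectra.
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


X, y = make_synthetic_spectra()
scores = compare_classifiers(X, y)
for name, accuracy in scores.items():
    print(f"{name}: test accuracy = {accuracy:.2f}")
```

Accuracies obtained on this synthetic data say nothing about the reported results; the sketch only shows how a dimensionality reduction step and two of the named classifiers can be evaluated on the same held-out split.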
