Abstract

This study aimed to establish a machine learning method for accurately predicting the commodity specifications of Fritillariae Cirrhosae Bulbus and to explore the application of data augmentation in drug analysis. The correlation optimized warping (COW) algorithm was used to align peaks in the UPLC-QDA multi-channel superimposed data of 30 batches of samples, and the data were then normalized. Unsupervised learning methods, including clustering analysis, principal component analysis (PCA), and correlation analysis, were used to explore the general characteristics of the data. The logistic regression algorithm was then used for supervised learning on the data, and a conditional tabular generative adversarial network (CTGAN) was used to generate a large amount of synthetic data. Logistic regression classification models were trained separately on the real data alone and on the real data combined with CTGAN-generated data, and both models were evaluated. The model trained with real data achieved cross-validation and test set accuracies of 0.95 and 1.00, respectively, while the model trained with both real and CTGAN-generated data achieved cross-validation and test set accuracies of 0.99 and 1.00, respectively. The results indicate that machine learning can accurately predict the classification of Songbei, Qingbei, and Lubei based on UPLC-QDA detection data. CTGAN-generated data can partially compensate for the lack of data in drug analysis, improving the accuracy and predictive ability of machine learning models.
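The comparison described above (a classifier trained on real data versus one trained on real plus generated data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the feature matrix is random placeholder data standing in for aligned, normalized chromatographic intensities; `make_placeholder_data`, `augment_with_jitter`, and `evaluate` are hypothetical helper names; the standard scaling, the 5-fold split, and the 80/20 hold-out are arbitrary choices; and a simple per-class Gaussian jitter stands in for CTGAN so that the example depends only on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_placeholder_data(n_per_class=10, n_features=50, seed=0):
    """Random placeholder for an aligned, normalized intensity matrix (NOT real data)."""
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [], []
    for name in ["Songbei", "Qingbei", "Lubei"]:
        center = rng.normal(size=n_features)  # arbitrary class profile
        X_parts.append(center + 0.3 * rng.normal(size=(n_per_class, n_features)))
        y_parts.extend([name] * n_per_class)
    return np.vstack(X_parts), np.array(y_parts)


def augment_with_jitter(X, y, n_new_per_class=100, scale=0.1, seed=1):
    """Stand-in for a generative model: resample rows per class and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    X_new, y_new = [], []
    for label in np.unique(y):
        rows = X[y == label]
        idx = rng.integers(0, len(rows), size=n_new_per_class)
        X_new.append(rows[idx] + scale * rng.normal(size=(n_new_per_class, X.shape[1])))
        y_new.extend([label] * n_new_per_class)
    return np.vstack(X_new), np.array(y_new)


def evaluate(X_train, y_train, X_test, y_test, seed=0):
    """Return (mean cross-validation accuracy, held-out test accuracy)."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    cv_acc = cross_val_score(model, X_train, y_train, cv=cv).mean()
    test_acc = model.fit(X_train, y_train).score(X_test, y_test)
    return cv_acc, test_acc


if __name__ == "__main__":
    X, y = make_placeholder_data()
    # Hold out real samples first so that generated rows never leak into the test set.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0
    )
    X_syn, y_syn = augment_with_jitter(X_tr, y_tr)
    X_mix, y_mix = np.vstack([X_tr, X_syn]), np.concatenate([y_tr, y_syn])

    print("real only        :", evaluate(X_tr, y_tr, X_te, y_te))
    print("real + generated :", evaluate(X_mix, y_mix, X_te, y_te))
```

In this sketch the synthetic rows are generated only from the training split, so the test accuracy is always measured on held-out real samples. Cross-validation accuracy on the mixed set may be optimistic, because generated rows derived from a real sample can land in a different fold than that sample. To try an actual CTGAN, the `augment_with_jitter` call would be the place to substitute a trained generator.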
