Abstract

Ginseng is a well-known traditional herbal medicine, but ginseng sold on the market may not actually come from the place of origin it claims. Traditional methods of identifying the geographical origin of ginseng are subjective, time-consuming, or destructive, so a more efficient approach is desirable. This study explored the feasibility of combining near-infrared (NIR) spectroscopy with ensemble learning to discriminate the producing area of ginseng. A total of 270 samples were collected and evenly partitioned into training and test sets. A random subspace ensemble (RSE) with linear discriminant analysis (LDA) classifiers as weak learners (abbreviated RSE-LDA) was used to construct predictive models. Two parameters, the subspace size and the number of learners in the ensemble, were optimized. The classic partial least squares (PLS) algorithm was applied to build a reference model. The sensitivity, specificity, and total accuracy were 97.8 %, 100 %, and 99.3 % for the final RSE-LDA model and 93.3 %, 96.7 %, and 95.6 % for the PLS model. To study the impact of training set composition, the samples were randomly divided 200 times and the algorithm was rerun on each split so that the sensitivity and specificity on the test set could be analyzed statistically; similar results were obtained. The effect of training set size was also investigated. The results indicate that NIR spectroscopy combined with the RSE algorithm is a potential tool for discriminating the origin of ginseng.
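The random subspace idea and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the weak learner here is a nearest-class-mean classifier, which stands in for LDA and matches it only under the assumption of equal priors and an identity shared covariance. The toy "spectra", all function names, and the values chosen for `subspace_size` and `n_learners` are hypothetical.

```python
import random
from collections import Counter

def fit_nearest_mean(X, y, features):
    """Weak learner (stand-in for LDA): class means over a feature subset."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    return features, means

def predict_nearest_mean(model, x):
    """Assign x to the class whose mean is closest within the subspace."""
    features, means = model
    def sq_dist(mean):
        return sum((x[j] - m) ** 2 for j, m in zip(features, mean))
    return min(means, key=lambda label: sq_dist(means[label]))

def fit_rse(X, y, subspace_size, n_learners, rng):
    """Train each learner on its own random subset of the features."""
    n_features = len(X[0])
    return [
        fit_nearest_mean(X, y, rng.sample(range(n_features), subspace_size))
        for _ in range(n_learners)
    ]

def predict_rse(ensemble, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(predict_nearest_mean(model, x) for model in ensemble)
    return votes.most_common(1)[0][0]

def sensitivity_specificity_accuracy(y_true, y_pred, positive=1):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = correct/all."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)

def make_toy_spectra(n_per_class, n_features, rng):
    """Hypothetical, well-separated two-class data (not real NIR spectra)."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([label + rng.uniform(-0.1, 0.1) for _ in range(n_features)])
            y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_toy_spectra(30, 20, rng)
X_test, y_test = make_toy_spectra(30, 20, rng)

# The two tuned parameters from the abstract: subspace size and ensemble size.
ensemble = fit_rse(X_train, y_train, subspace_size=5, n_learners=11, rng=rng)
y_pred = [predict_rse(ensemble, x) for x in X_test]
sens, spec, acc = sensitivity_specificity_accuracy(y_test, y_pred)
print(sens, spec, acc)
```

An odd `n_learners` avoids tied votes in the two-class case. As a purely hypothetical illustration of the metric arithmetic, a 135-sample test set with 44 of 45 positives and 90 of 90 negatives classified correctly would give 97.8 %, 100 %, and 99.3 %; the abstract does not report the actual class counts.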
