Aquilaria Lam. is a remarkable genus that produces agarwood, which is widely used in herbal medicine, perfumes, and incense sticks, as well as in interior decoration. However, illegal harvesting poses a threat to endangered Aquilaria species, and the adulteration of agarwood products raises potential medical safety concerns. In this study, we compiled 306 authentic sequences of five candidate barcodes (ITS, matK, psbA-trnH, rbcL, and trnL-trnF) downloaded from GenBank and 330 sequences obtained from our own samples to establish a DNA barcode reference dataset for 11 species of Aquilaria listed in the IUCN Red List of Threatened Species. Machine learning approaches, including BLOG, Naïve Bayes, SMO, JRip, and J48, were compared with distance-based (TaxonDNA) and tree-based (neighbor-joining [NJ] tree) methods in terms of discrimination accuracy across the five single barcodes and their combinations. The results indicated that BLOG and SMO could successfully identify the six species of Aquilaria with ITS + matK, ITS + rbcL, and ITS + trnL-trnF. In contrast, when TaxonDNA and the NJ tree were used, only ITS + matK could identify these six species. Additionally, machine learning approaches achieved greater species resolution (16.67%–66.67%) than classical methods (0–50%) on low-variation barcodes, such as psbA-trnH, and on chloroplast-locus combinations. Compared with traditional analytical methods, machine learning approaches are reliable and efficient tools for DNA barcoding analysis of Aquilaria, and they can be used to combat the adulteration of agarwood products and to facilitate the conservation of Aquilaria species.