Abstract

Background: DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has a wide range of applications. To date, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants, and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of the PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle significant insertions and deletions in PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server that allows biologists to carry out PTIGS-based DNA barcoding analyses.

Results: First, we compared several alignment-based methods (BLAST and methods that calculate P distance and Edit distance), the alignment-free Di-Nucleotide Frequency Profile (DNFP) method, and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed better overall success rate and performance. Last, we developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity from PTIGS sequences using eight methods, and (4) examining the identification efficiency and performance of the eight methods for various taxonomic groups.

Conclusions: The Edit distance and DNFP methods have the highest discrimination power. Hybrid methods can be used to achieve significant improvement in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode, ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences.
The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org.
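
To make the two best-performing approaches named above more concrete, the following is a minimal sketch of an edit-distance comparison and a dinucleotide frequency profile comparison, each used for nearest-reference species assignment. It is an illustration under stated assumptions, not the implementation behind the web server: the function names, the use of Euclidean distance between profiles, and the toy sequences and species labels are all hypothetical, since the abstract does not specify these details.

```python
from collections import Counter
from itertools import product
import math


def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-base
    substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_profile(seq):
    """Relative frequency of each overlapping dinucleotide.
    Dinucleotides containing other symbols (e.g. N) are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def profile_distance(seq_a, seq_b):
    """Assumption: profiles are compared with Euclidean distance."""
    p, q = dinucleotide_profile(seq_a), dinucleotide_profile(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def identify(query, references, distance):
    """Return the reference label whose sequence is closest to the query."""
    return min(references, key=lambda name: distance(query, references[name]))


# Hypothetical toy reference set; real PTIGS sequences are much longer.
references = {"species_a": "ACGTACGT", "species_b": "TTTTGGGG"}
best_by_edit = identify("ACGTACG", references, edit_distance)
best_by_profile = identify("ACGTACG", references, profile_distance)
```

Neither measure depends on a fixed multiple-sequence alignment: the edit distance counts insertions and deletions directly, and the profile comparison uses no alignment at all. This is one reason such measures are attractive for a region with frequent indels.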

Highlights

  • DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has a wide range of applications

  • Hybrid methods can be used to achieve significant improvement in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode ITS2

  • The web server developed here is the only one that allows species determination based on psbA-trnH intergenic spacer (PTIGS) sequences


Summary

Introduction

DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has a wide range of applications. The rbcL and matK regions have been proposed as the “core barcode” for plants, and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of the PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle significant insertions and deletions in PTIGS sequences. As one of the supplemental barcodes, PTIGS has several favorable characteristics. It can be amplified across a broad range of land plants. It also has the highest percentage of nucleotide differences and micro-inversions, and it is the most variable plastid region in some groups of plants [3,9,11].

