Abstract

DNA barcodes are widely used in taxonomy, systematics, species identification, food safety, and forensic science. Most conventional DNA barcode sequences contain the complete sequence of a given barcoding gene, yet most of that sequence does not vary and is uninformative for a given group of taxa within a monophylum. We suggest a method that reduces the number of noninformative nucleotides in the barcoding sequence of a major taxon, such as the prokaryotes or eukaryotic animals, plants, or fungi. The actual differences between sequences, single nucleotide polymorphisms (SNPs), provide a tool for developing a rapid, reliable, and high-throughput genotyping assay for discriminating between known species. Here, we investigated SNPs as robust markers of genetic variation for identifying different pigeon species based on available cytochrome c oxidase I (COI) data. We propose a decision tree-based SNP barcoding (DTSB) algorithm in which SNP patterns are selected from the DNA barcoding sequences of several evolutionarily related species in order to identify a single species, using pigeons as an example. This approach can make use of any established barcoding system. As a first example, we applied DTSB to mitochondrial COI sequences of 17 pigeon species (Columbidae, Aves) after sequence trimming and alignment. SNPs were chosen according to the decision tree rule to form species-specific SNP barcodes. The shortest barcode, of about 11 bp, was then generated for discriminating the 17 pigeon species. The method provides a sequence alignment and decision tree approach to parsimoniously assign a unique, shortest SNP barcode to any known species of a chosen monophyletic taxon for which a barcoding sequence is available.

Highlights

  • The original idea of DNA barcoding was to use a short DNA sequence as a species-specific marker for species identification and authentication (Hebert, Cywinska, Ball, & deWaard, 2003)

  • The decision tree-based SNP barcoding (DTSB) method, applied here to a group of 17 pigeon species, generates the shortest possible DNA barcode for species identification. Such single nucleotide polymorphism (SNP) barcode sequences are obtained after sequence alignment; for example, an SNP barcode of M bp may be identified from the cytochrome c oxidase I (COI) sequences of N species

  • Each SNP barcode sequence reliably identifies one species of a given taxon. The SNP barcodes are generated by a decision tree algorithm that searches for P nodes, where P lies in [pmin, pmax] with pmin = N/4 and pmax = N − 1; P nodes can be selected repeatedly if needed
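
The idea of selecting SNP positions until every species has a unique pattern can be illustrated with a minimal sketch. This is not the published DTSB algorithm: it is a simple greedy partition refinement, and the function names (`greedy_snp_barcode`, `barcodes`) and the four toy sequences are hypothetical. It assumes the input sequences are already trimmed and aligned to equal length and that no two of them are identical.

```python
from typing import Dict, List


def greedy_snp_barcode(aligned: Dict[str, str]) -> List[int]:
    """Greedily pick alignment columns until every species pattern is unique.

    Illustrative stand-in only, not the authors' DTSB procedure.
    """
    names = list(aligned)
    length = len(next(iter(aligned.values())))
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to the same length")

    chosen: List[int] = []
    # Every species starts with an empty pattern, so all share one group.
    patterns = {name: "" for name in names}

    while len(set(patterns.values())) < len(names):
        best_col, best_groups = None, len(set(patterns.values()))
        for col in range(length):
            if col in chosen:
                continue
            # Number of distinct groups if this column were appended.
            groups = len({patterns[n] + aligned[n][col] for n in names})
            if groups > best_groups:
                best_col, best_groups = col, groups
        if best_col is None:
            raise ValueError("some sequences are identical; no barcode exists")
        chosen.append(best_col)
        for n in names:
            patterns[n] += aligned[n][best_col]
    return sorted(chosen)


def barcodes(aligned: Dict[str, str], columns: List[int]) -> Dict[str, str]:
    """Read the selected columns out of each aligned sequence."""
    return {name: "".join(seq[c] for c in columns) for name, seq in aligned.items()}


# Hypothetical toy alignment (not real COI data).
toy = {
    "species_A": "ACGTACGT",
    "species_B": "ACGTACGA",
    "species_C": "ACCTACGT",
    "species_D": "ACCTACGA",
}
cols = greedy_snp_barcode(toy)
codes = barcodes(toy, cols)
print(cols, codes)
```

Each selected column splits at least one group of still-indistinguishable species, so this greedy variant never needs more than N − 1 columns for N distinct sequences, which matches the upper bound pmax = N − 1 given above. A greedy choice is not guaranteed to yield the shortest possible barcode.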

Summary

| INTRODUCTION

The original idea of DNA barcoding was to use a short DNA sequence as a species-specific marker for species identification and authentication (Hebert, Cywinska, Ball, & deWaard, 2003). The COI sequence is conventionally used as a non-arbitrary barcode for discriminating between eukaryotic and animal species. Its major shortcoming is that computational comparisons take substantial memory and processing time when dealing with large data sets. Such large data sets are increasingly available through metagenomic approaches to species diversity, even when only a single promising barcoding gene, such as COI, is used (Gao, Jia, & Kong, 2016). We propose a decision tree-based SNP barcoding (DTSB) algorithm that automatically generates barcodes for species identification through a decision tree approach. This machine learning technique facilitates the discrimination of biota at the species level, and we apply it here to COI sequences from 17 pigeon species. We hypothesize that SNPs from aligned COI sequences of different known species offer a new and straightforward way to strip nonvariable, noninformative positions from barcoding sequences, leaving the shortest variable sequence that still discriminates species and thus allowing speedy computational comparisons.
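
The stripping step in this hypothesis can be sketched in a few lines: keep only the alignment columns where at least two sequences differ. The helper names (`variable_sites`, `strip_invariant`) and the toy sequences below are hypothetical, and the sketch assumes equal-length, already aligned sequences without special handling of gap characters.

```python
from typing import Dict, List


def variable_sites(aligned: Dict[str, str]) -> List[int]:
    """Return the column indices at which the aligned sequences differ."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def strip_invariant(aligned: Dict[str, str]) -> Dict[str, str]:
    """Drop every invariant column, keeping only the SNP positions."""
    sites = variable_sites(aligned)
    return {name: "".join(seq[i] for i in sites) for name, seq in aligned.items()}


# Hypothetical toy alignment (not real COI data).
toy_alignment = {
    "taxon_1": "ATGGCA",
    "taxon_2": "ATGACA",
    "taxon_3": "ATGGCT",
}
sites = variable_sites(toy_alignment)
reduced = strip_invariant(toy_alignment)
print(sites, reduced)
```

In this toy case, six aligned positions shrink to two while all three sequences remain distinguishable, which is the kind of reduction the hypothesis aims for on real barcoding data.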

| MATERIAL AND METHODS
| DISCUSSION
| CONCLUSION