Abstract

For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of genetic polymorphism and for delimiting species. We used a simulation approach to examine the effects of sample size on four estimators of genetic polymorphism related to DNA barcoding: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Our results showed that mismatch distributions derived from subsamples of ≥20 individuals usually bore a close resemblance to that of the full dataset. Estimates of nucleotide diversity from subsamples of ≥20 individuals tended to be bell-shaped around that of the full dataset, whereas estimates from smaller subsamples were not. As expected, greater sampling generally led to an increase in the number of haplotypes. We also found that subsamples of ≥20 individuals allowed a good estimate of the maximum pairwise distance of the full dataset, while smaller ones were associated with a high probability of underestimation. Overall, our study confirms the expectation that larger samples are beneficial for the efficacy of DNA barcoding and suggests that a minimum sample size of 20 individuals is needed in practice for each population.
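The four estimators named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the study simulated sequences under a coalescent model, whereas this sketch takes any list of aligned, equal-length sequences. The function names, the use of the uncorrected p-distance (the abstract does not state a distance model), and the resampling helper `underestimation_rate` are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def pairwise_differences(seqs):
    """Count differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]


def mismatch_distribution(seqs):
    """Map each pairwise difference count to the fraction of pairs showing it."""
    diffs = pairwise_differences(seqs)
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site."""
    diffs = pairwise_differences(seqs)
    return sum(diffs) / len(diffs) / len(seqs[0])


def haplotype_count(seqs):
    """Number of distinct sequences in the sample."""
    return len(set(seqs))


def max_pairwise_distance(seqs):
    """Largest uncorrected p-distance (assumed distance measure)."""
    return max(pairwise_differences(seqs)) / len(seqs[0])


def underestimation_rate(seqs, n, replicates=1000, seed=0):
    """Fraction of random subsamples of size n (n >= 2) whose maximum
    pairwise distance falls below that of the full dataset."""
    rng = random.Random(seed)
    full = max_pairwise_distance(seqs)
    low = sum(
        max_pairwise_distance(rng.sample(seqs, n)) < full
        for _ in range(replicates)
    )
    return low / replicates


# Hypothetical toy alignment, for illustration only.
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
print(mismatch_distribution(toy))
print(nucleotide_diversity(toy), haplotype_count(toy), max_pairwise_distance(toy))
print(underestimation_rate(toy, 2))
```

Calling `underestimation_rate` over a range of `n` on a larger dataset would give a rough picture of how often small subsamples miss the most divergent pair, which is the kind of behaviour the abstract reports for subsamples below 20 individuals.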

Highlights

  • Over the past decade, DNA barcoding has proven to be a useful tool in studies of taxonomy, ecology, biodiversity assessment, and various other fields (Waugh 2007; Valentini et al 2009; Scheffers et al 2012)

  • Ecology and Evolution published by John Wiley & Sons Ltd

  • Using DNA sequence data generated via simulation under a coalescent model, we examined the behaviour of four estimators of genetic polymorphism: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance


Summary

Introduction

DNA barcoding has proven to be a useful tool in studies of taxonomy, ecology, biodiversity assessment, and various other fields (Waugh 2007; Valentini et al 2009; Scheffers et al 2012). Its concept has become the basis of DNA mini-barcoding (Meusnier et al 2008) and of DNA metabarcoding, which uses high-throughput sequences from environmental samples (Yu et al 2012). [...] noff et al 2006a,b), variability in the success of the method [...] The impact of sample size has long been an important issue in DNA barcoding (Austerlitz et al 2009; Zhang et al 2010; Bergsten et al 2012; Jin et al 2012).

