Abstract
Genetic variation is the driving force of evolution and as such is of central interest for biologists. However, inadequate discrimination of errors from true genetic variation could lead to incorrect estimates of gene copy number, population genetic parameters and phylogenetic relationships, and to the deposition in databases of gene and protein sequences that are not actually present in any organism. Misincorporation errors in multi-template PCR cloning methods, still commonly used for obtaining novel gene sequences in non-model species, are difficult to detect, as no previous information may be available about the number of expected copies of genes belonging to multi-gene families. Nevertheless, studies employing these techniques rarely describe in any great detail how errors arising in the amplification process were detected and accounted for. Here, we estimated the rate of base misincorporation of a widely used PCR-cloning method, using a single-copy mitochondrial gene from a single individual to minimise variation in the template DNA, as 1.62 × 10⁻³ errors per site, or 9.26 × 10⁻⁵ per site per duplication. The distribution of errors among sequences closely matched that predicted by a binomial distribution function. The empirically estimated error rate was applied to data, obtained using the same methods, from the Phospholipase A2 toxin family from the pitviper Ovophis monticola. The distribution of differences detected closely matched the expected distribution of errors, and we conclude that, when undertaking gene discovery or assessment of genetic diversity using this error-prone method, it will be informative to empirically determine the rate of base misincorporation.
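The binomial expectation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes every clone has the same length (the 715 bp average quoted in the highlights, whereas the real clones ranged from 253 to 761 bp) and that errors occur independently at each site. The function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k misincorporated bases among n sites,
    assuming each site errs independently with probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def expected_clone_counts(n_clones: int, seq_len: int, p: float,
                          max_errors: int) -> list[float]:
    """Expected number of clones carrying 0, 1, ..., max_errors errors."""
    return [n_clones * binomial_pmf(k, seq_len, p)
            for k in range(max_errors + 1)]


if __name__ == "__main__":
    PER_SITE_ERROR_RATE = 1.62e-3  # errors per site, from the abstract
    MEAN_LENGTH_BP = 715           # average clone length, from the highlights
    N_CLONES = 82                  # number of clones, from the highlights

    # Assumption: all clones are treated as having the mean length.
    counts = expected_clone_counts(N_CLONES, MEAN_LENGTH_BP,
                                   PER_SITE_ERROR_RATE, max_errors=5)
    for k, expected in enumerate(counts):
        print(f"{k} errors: {expected:.1f} clones expected")
```

Comparing these expected counts with the observed number of clones carrying each number of differences is one way to judge whether the observed variation is consistent with amplification error alone. A fuller treatment would use each clone's actual length rather than the mean.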
Highlights
The study of naturally occurring genetic variation, whether between species [1,2], populations [3] or individuals [4], is of vital importance in biology.
We obtained partial mitochondrial cytochrome b (MT-CYB) sequence (253 to 761 base pairs, averaging 715 bp, total 58592 bases) from 82 clones of PCR product amplified from a single individual.
The pre-cloning cycles are relevant to calculating error rates as the second PCR is based on a large number of copies of the target sequence from the bacterial colony, and errors occurring after this point are unlikely to be seen in the final sequence data.
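The figures quoted in the abstract and highlights can be cross-checked with simple arithmetic. The sketch below is a back-calculation, not the authors' procedure: it assumes that the per-duplication rate is the per-site rate divided by an effective number of duplications, and that the error count is the per-site rate multiplied by the total bases sequenced. Neither the duplication count nor the raw error count appears in this excerpt, so both outputs are implied values only, and the function names are hypothetical.

```python
def implied_error_count(per_site_rate: float, total_bases: int) -> float:
    """Number of errors implied by a per-site rate over all sequenced bases."""
    return per_site_rate * total_bases


def implied_duplications(per_site_rate: float,
                         per_site_per_duplication: float) -> float:
    """Effective number of duplications implied by the two reported rates.

    Assumption: errors accumulate linearly, so the per-site rate equals the
    per-duplication rate times the number of duplications.
    """
    if per_site_per_duplication <= 0:
        raise ValueError("per-duplication rate must be positive")
    return per_site_rate / per_site_per_duplication


if __name__ == "__main__":
    PER_SITE_RATE = 1.62e-3             # errors per site (abstract)
    PER_SITE_PER_DUPLICATION = 9.26e-5  # errors per site per duplication (abstract)
    TOTAL_BASES = 58592                 # total bases sequenced (highlights)

    errors = implied_error_count(PER_SITE_RATE, TOTAL_BASES)
    duplications = implied_duplications(PER_SITE_RATE, PER_SITE_PER_DUPLICATION)
    print(f"Implied errors observed: {errors:.1f}")
    print(f"Implied effective duplications: {duplications:.1f}")
```

Under these assumptions the two reported rates differ by a factor of roughly 17.5, and the per-site rate corresponds to roughly 95 misincorporated bases across all clones.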
Summary
The study of naturally occurring genetic variation, whether between species [1,2], populations [3] or individuals [4], is of vital importance in biology. Multigene families frequently evolve rapidly through birth and death processes [14,15,3], and in non-model species without sequenced genomes, the exact number of different gene copies carried by a particular individual may not be known in advance. Inadequate discrimination of errors from true genetic variation could lead to incorrect estimates of gene copy number, population genetic parameters and phylogenetic relationships, and to the occurrence in databases of gene and protein sequences that are not present in any organism. Studies employing multi-template PCR cloning techniques rarely describe explicitly how errors arising in the amplification process were detected and accounted for.