Abstract

To the Editor: The good news is that a very large number of human mtDNA sequences from diverse populations and ethnic groups are becoming available for analysis. The bad news is that many of these sequences contain errors (Dennis 2003; Forster 2003). In at least one instance, that of the Icelandic population, it appears that mtDNA sequence errors were a contributing factor (although not the only one) to an erroneous conclusion about the genetic diversity of these people (Arnason 2003). Forster (2003) cites other examples where mtDNA sequence errors have compromised analyses of population genetics and human evolution. By contrast, in a reanalysis of mtDNA sequences in the Ladin population of the Alps, the original conclusions on population diversity were not overturned after the use of more accurate sequences (Vernesi et al. 2002). At this point, we do not know the extent of the damage, so to speak, caused by mtDNA sequence errors. Nevertheless, it is clear that such errors must be corrected as quickly as possible.

As a result of our reduced median network analyses (Herrnstadt et al. 2002), we released a database of 560 human mtDNA coding region sequences. A small number of errors in these sequences were detected by Dr. Hans-Jürgen Bandelt, and we were able to correct these, as noted in an erratum that was published soon after our original report (Herrnstadt et al. 2002). Subsequently, a systematic approach to the detection of phantom sequence errors was published in this Journal (Bandelt et al. 2002). As defined by these investigators, phantom errors are those that arise during the sequencing process itself. Dr. Bandelt contacted us again and suggested that there were phantom mutations in our mtDNA database. Specifically, the likely errors involved G→C transversions at nt 7927 and nt 7985. Such a result was surprising to us, because we believed that our sequencing approach and quality control measures had avoided such errors. Therefore, we used Dr. Bandelt’s information as a starting point for a comprehensive reanalysis of our database.

After reanalysis, which included inspection of the electropherograms for all G→C and C→G transversions, we found that 41 of these mtDNA sequences contained at least one such phantom error. In fact, there were more such phantom errors than those suggested by Dr. Bandelt. In addition to the phantom transversions at positions 7927 and 7985, we detected other such errors, including ones at nucleotide positions 500, 14160, 14460, 14974, and 16239. However, these errors did not occur randomly throughout the database. Instead, we could “isolate” the errors to a short time period that was relatively early during our large-scale mtDNA sequencing program. With the benefit of hindsight, it appears that the high frequency of these errors resulted from two technical factors (see also Bandelt et al. 2002). The first was that one particular capillary array of the ABI 3700 DNA Analyzer produced suboptimal base separations; the second was that the sequencing chemistry at that time used an early version of the reagents, which was subsequently optimized. In addition to these 41 sequences, we found that a further 26 mtDNA sequences contained errors that arose during data entry or editing. As a result of this reanalysis, we have corrected the database of 560 sequences, which is available through the MitoKor Web site (the URL is given below).

Have these errors invalidated our network analyses? Not to a substantial degree. Many of the sequence errors generated private polymorphisms, which were not included in our analyses. Furthermore, a substantial proportion of the branches in these networks were established by multiple substitutions (see figs. 1–4 in Herrnstadt et al. 2002), and, so far, we have no evidence from additional network analysis that the original results need major revision.

Can we now guarantee that our mtDNA database is error free? No. Although an error-free database is our goal, guaranteeing one is not practical and is probably not technically feasible. It is now clear that many mtDNA databases or sequence sets contain errors (Forster 2003). The solution to this problem is further effort, both at the front end (the sequencing process itself) and at the back end (increased quality control) of mtDNA database construction.
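
As an illustration of the back-end screening step described above, the following is a minimal sketch, not the procedure actually used for the database, of how G→C and C→G transversions might be pulled out of a set of variant calls and tallied by position, so that the corresponding electropherograms can be inspected by hand. The input layout (sample identifier mapped to a list of position, reference base, observed base), the function names, and the sample data are all assumptions made for this example; only the positions 7927 and 7985 come from the text.

```python
from collections import Counter

# The two substitution classes singled out for electropherogram inspection.
GC_TRANSVERSIONS = {("G", "C"), ("C", "G")}


def gc_transversions(variants):
    """Return the variants that are G->C or C->G transversions.

    Each variant is a (position, reference_base, observed_base) tuple.
    """
    return [v for v in variants if (v[1].upper(), v[2].upper()) in GC_TRANSVERSIONS]


def flag_for_inspection(samples):
    """Map each sample id to its G<->C transversions, omitting samples with none."""
    flagged = {}
    for sample_id, variants in samples.items():
        hits = gc_transversions(variants)
        if hits:
            flagged[sample_id] = hits
    return flagged


def position_counts(flagged):
    """Count how many flagged calls fall at each nucleotide position."""
    counts = Counter()
    for hits in flagged.values():
        for position, _ref, _obs in hits:
            counts[position] += 1
    return counts


# Hypothetical variant calls, invented purely for illustration.
samples = {
    "S1": [(7927, "G", "C"), (1438, "A", "G")],
    "S2": [(7985, "G", "C"), (7927, "G", "C")],
    "S3": [(73, "A", "G")],
}

flagged = flag_for_inspection(samples)
counts = position_counts(flagged)

for sample_id, hits in sorted(flagged.items()):
    print(sample_id, hits)
print(counts.most_common())
```

A screen of this kind only nominates candidates. Whether a flagged call is a phantom error or a genuine polymorphism still has to be decided from the raw trace data.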
