Abstract
Recent reports have shown that many identically named genetic lines used in research around the world actually contain large amounts of uncharacterized genetic variation as a result of cross contamination of stocks, unintentional crossing, residual heterozygosity within original stocks, or de novo mutation. Here, 27 public, large-scale RNA-seq datasets from 20 independent research groups around the world were used to assess variation within the maize (Zea mays ssp. mays) inbred B73, a four-decade-old variety which served as the reference genotype for the original maize genome sequencing project and is widely used in genetic, genomic, and phenotypic research. Several clearly distinct clades were identified among putatively B73 samples. A number of these clades were defined by the presence of discrete genomic blocks containing a haplotype which did not match the published B73 reference genome. The overall proportion of the maize genome where multiple distinct haplotypes were observed across different research groups was approximately 2.3%. In some cases the relationships among B73 samples generated by different research groups recapitulated mentor/mentee relationships within the maize genetics community.
Highlights
A great deal of biological research depends on reference genotypes that allow researchers around the world to work with material that is genetically identical or nearly identical.
Recent reports have shown that many identically named genetic lines used in research around the world contain large amounts of uncharacterized genetic variation as a result of cross contamination of stocks, unintentional crossing, residual heterozygosity within original stocks, or de novo mutation. Here, 27 public, large-scale RNA-seq datasets from 20 independent research groups around the world were used to assess variation within the maize (Zea mays ssp. mays) inbred B73, a four-decade-old variety which served as the reference genotype for the original maize genome sequencing project and is widely used in genetic, genomic, and phenotypic research.
Following SNP calling and filtering, a total of 13,360 high-confidence segregating SNPs were identified among the 27 RNA-seq samples labeled as B73 employed in this study, substantially fewer than the ~64,000 high-quality SNPs identified by RNA-seq in a population segregating for a single non-B73 haplotype [46].
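To make the notion of a "segregating" SNP concrete, the sketch below flags sites at which both the reference and an alternate allele are observed among samples that nominally share one genotype. It is only an illustration under stated assumptions, not the pipeline used in the study: the genotype encoding (0, 1, None), the function names, the thresholds `min_called` and `min_per_allele`, and the toy data are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical encoding: 0 = reference (B73) allele, 1 = alternate allele,
# None = no confident call (for example, too little RNA-seq coverage at the site).
Genotype = Optional[int]


def is_segregating(calls: List[Genotype], min_called: int = 2,
                   min_per_allele: int = 1) -> bool:
    """Return True if both alleles are seen among confidently called samples."""
    called = [c for c in calls if c is not None]
    if len(called) < min_called:
        return False  # too few calls to judge this site
    n_alt = sum(called)
    n_ref = len(called) - n_alt
    return n_ref >= min_per_allele and n_alt >= min_per_allele


def segregating_sites(matrix: Dict[str, List[Genotype]], **kwargs) -> List[str]:
    """Return the IDs of sites that segregate among samples, in input order."""
    return [site for site, calls in matrix.items()
            if is_segregating(calls, **kwargs)]


# Toy data: three sites genotyped in four samples (values invented for illustration).
toy = {
    "chr1:1000": [0, 0, 0, 0],          # all reference: not segregating
    "chr1:2000": [0, 1, 1, None],       # both alleles seen: segregating
    "chr2:500": [None, None, 1, None],  # only one call: cannot be judged
}
print(segregating_sites(toy))
```

Raising `min_per_allele` requires each allele to be supported by more than one sample, which is one simple way to guard against isolated genotyping errors being counted as real variation.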
Summary
A great deal of biological research depends on reference genotypes that allow researchers around the world to work with material that is genetically identical or nearly identical. For many decades, assessing whether samples labeled as coming from genetically identical sources truly were identical was a costly, time-consuming, and often inconclusive process [1] [2]. One study of human cell cultures found that 18% of cell lines were either contaminated or entirely different from what their labels indicated [3], with the widely used HeLa cell line being one of the most frequent offenders [4]. A recent resequencing study of Arabidopsis demonstrated that a line believed to carry a mutation in the ABP1 gene in an otherwise Col-0 background contained a wide range of other nonsense and missense mutations as well as a large region on chromosome 3 which came from a different Arabidopsis accession [5]. In soybean (Glycine max), segregating variation covering ~3.1% of the soybean genome