Abstract
Despite decreasing sequencing costs, whole-genome sequencing for population-based genome scans for selection is still prohibitively expensive for organisms with large genomes. Moreover, the repetitive nature of large genomes often represents a challenge in bioinformatic and downstream analyses. Here, we use in-depth transcriptome sequencing to design probes for exome capture in Swiss stone pine (Pinus cembra), a conifer with an estimated genome size of 29.3 Gbp and no reference genome available. We successfully applied around 55,000 self-designed probes, targeting 25,000 contigs, to DNA pools of seven populations from the Swiss Alps and identified >160,000 SNPs in around 15,000 contigs. The probes performed equally well in pools of the closely related species Pinus sibirica; in both species, more than 70% of the targeted contigs were sequenced at a depth ≥40× (number of haplotypes in the pool). However, a thorough analysis of individually sequenced P. cembra samples indicated that a majority of the contigs (63%) represented multi-copy genes. We therefore removed paralogous contigs based on heterozygote excess and deviation from allele balance. Without putatively paralogous contigs, allele frequencies of population pools represented accurate estimates of individually determined allele frequencies. We show that inferences of neutral and adaptive genetic variation may be biased when not accounting for such multi-copy genes. Without individual genotype data, it would have been nearly impossible to recognize and deal with the problem of multi-copy contigs. We advocate placing more emphasis on identifying paralogous loci, which will be facilitated by the establishment of additional high-quality reference genomes.
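The two paralog criteria named above, heterozygote excess and deviation from allele balance, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the thresholds (`max_excess`, `min_p`), and the choice to pool read counts across heterozygous individuals are all assumptions made for illustration.

```python
from math import comb


def het_excess(n_hom_ref, n_het, n_hom_alt):
    """Observed minus Hardy-Weinberg expected heterozygote proportion at one SNP."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    expected_het = 2 * p * (1 - p)
    return n_het / n - expected_het


def allele_balance_pvalue(ref_reads, alt_reads):
    """Two-sided exact binomial test of the reference read count against 0.5."""
    n = ref_reads + alt_reads
    probs = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum the probabilities of all outcomes at most as likely as the observed one.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def is_putative_paralog(genotype_counts, het_read_counts,
                        max_excess=0.2, min_p=0.01):
    """Flag a SNP when either criterion is violated (thresholds are hypothetical)."""
    excess = het_excess(*genotype_counts)
    p_value = allele_balance_pvalue(*het_read_counts)
    return excess > max_excess or p_value < min_p


# A SNP where all 20 individuals are heterozygous is flagged;
# one in Hardy-Weinberg proportions with balanced reads is not.
print(is_putative_paralog((0, 20, 0), (50, 50)))
print(is_putative_paralog((5, 10, 5), (50, 50)))
```

Both criteria need individual genotypes: pooled data provide neither per-individual genotype counts nor read counts restricted to heterozygotes, which is why individually sequenced samples were required to detect multi-copy contigs.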