Abstract
Background: Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses. Here we use next generation sequence (NGS) data to detect copy-number variable regions (CNVRs) within the Malus x domestica genome, as well as to examine their distribution and impact.

Methods: CNVRs were detected using NGS data derived from 30 accessions of M. x domestica analyzed using the read-depth method, as implemented in the CNVrd2 software. To improve the reliability of our results, we developed a quality control and analysis procedure that involved checking for organelle DNA, not repeat masking, and the determination of CNVR identity using a permutation testing procedure.

Results: Overall, we identified 876 CNVRs, which spanned 3.5 % of the apple genome. To verify that detected CNVRs were not artifacts, we analyzed the B-allele frequencies (BAF) within a single nucleotide polymorphism (SNP) array dataset derived from a screening of 185 individual apple accessions and found the CNVRs were enriched for SNPs having aberrant BAFs (P < 1e-13, Fisher's exact test). Putative CNVRs overlapped 845 gene models and were enriched for resistance (R) gene models (P < 1e-22, Fisher's exact test). Of note was a cluster of resistance gene models on chromosome 2 near a region containing multiple major gene loci conferring resistance to apple scab.

Conclusion: We present the first analysis and catalogue of CNVRs in the M. x domestica genome. The enrichment of the CNVRs with R gene models and their overlap with gene loci of agricultural significance draw attention to a form of unexplored genetic variation in apple. This research will underpin further investigation of the role that CNV plays within the apple genome.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2096-x) contains supplementary material, which is available to authorized users.
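The enrichment tests reported in the Results (for example, R gene models among CNVR-overlapping gene models) are Fisher's exact tests on a 2x2 contingency table. The sketch below shows how a one-sided test of that kind can be computed from the hypergeometric distribution. The function name, the table layout, and the toy counts are assumptions made for illustration; they are not the counts or the code used in the study.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment.

    Hypothetical table layout (an assumption, not taken from the paper):
                     R gene   other gene
        in CNVR         a         b
        not in CNVR     c         d

    Returns P(overlap >= a) under the hypergeometric null.
    """
    row1 = a + b          # gene models inside CNVRs
    col1 = a + c          # R gene models overall
    n = a + b + c + d     # all gene models
    denom = comb(n, row1)
    max_a = min(row1, col1)
    # Sum the probabilities of every table at least as extreme as the observed one.
    numer = sum(comb(col1, k) * comb(n - col1, row1 - k)
                for k in range(a, max_a + 1))
    return numer / denom

# Toy counts, invented purely to exercise the function.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(p_value)
```

A small p-value from such a test indicates that the overlap is larger than expected if CNVR membership and gene class were independent.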
Highlights
Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses
The read-depth CNV detection method is based on an assumption that the number of reads originating from a region of a genome after removing technical bias is indicative of the copy number for that region
Although integer copy number assignment for an individual sample can be performed using the read-depth method, given the low-coverage sequencing data used in our study and the incomplete apple genome assembly, we chose instead to focus on copy-number variable regions (CNVRs) that displayed significant variation in segmentation scores, and did not attempt integer copy number assignment
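The read-depth idea in the highlights above can be illustrated with a minimal sketch: read counts per genomic window are normalised within each sample, and windows whose normalised depth varies strongly between samples are flagged as candidate variable regions, without assigning integer copy numbers. This is not the CNVrd2 procedure (which works on segmentation scores); the median normalisation, the 0.25 cut-off, the accession names, and the counts are all assumptions made for illustration.

```python
from statistics import median, pstdev

def normalise(counts):
    """Scale one sample's window counts by its median, so ~1.0 is typical depth."""
    med = median(counts)
    if med == 0:
        raise ValueError("median read count is zero; cannot normalise")
    return [c / med for c in counts]

def window_spread(samples):
    """Per-window standard deviation of normalised depth across samples.

    `samples` maps a sample name to a list of read counts, one per window;
    all lists are assumed to have the same length.
    """
    normed = [normalise(counts) for counts in samples.values()]
    return [pstdev(column) for column in zip(*normed)]

# Toy read counts for five windows in three hypothetical accessions.
samples = {
    "acc_A": [100, 102, 98, 101, 99],
    "acc_B": [200, 204, 196, 410, 198],   # extra depth in window 3
    "acc_C": [50, 51, 49, 12, 50],        # reduced depth in window 3
}

spread = window_spread(samples)
# Hypothetical cut-off: windows whose spread exceeds 0.25 are called variable.
variable = [i for i, s in enumerate(spread) if s > 0.25]
print(variable)
```

Normalising within each sample first removes differences in overall sequencing depth, so only windows that deviate relative to the rest of the same genome contribute to the between-sample spread.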
Summary
Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses. We use next generation sequence (NGS) data to detect copy-number variable regions (CNVRs) within the Malus x domestica genome, as well as to examine their distribution and impact. Other forms of genomic variation have begun to receive attention. One such form is copy number variation (CNV), defined as a deletion, duplication or insertion of DNA sequence fragments longer than 50 base pairs [1]. Studies of CNV in eukaryotic organisms, such as dog [2], barley [3], and human [4], have revealed that 4 to 15 % of a eukaryotic genome consists of regions that exhibit variation in copy number between individuals. Segmental duplications (SDs), which are sections of DNA with near-identical sequence, are considered hotspots for CNV formation [9].

Boocock et al. BMC Genomics (2015) 16:848