Abstract

Copy number variants are duplications and deletions of the genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested Simulator of Exome Copy Number Variants (SECNVs), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.
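The core idea of producing a rearranged genome from a reference can be illustrated with a toy example. The sketch below is not SECNVs code and does not use its command-line interface; the function name `apply_cnvs` and the `(start, end, copy_number)` tuple layout are hypothetical choices made only for illustration. It assumes 0-based, half-open, non-overlapping regions, treats a copy number of 0 as a deletion, and treats a copy number of 2 or more as a tandem duplication.

```python
from typing import List, Tuple

# Hypothetical CNV record: (start, end, copy_number), 0-based half-open interval.
# copy_number 0 = deletion, 1 = unchanged, >= 2 = tandem duplication.
CNV = Tuple[int, int, int]


def apply_cnvs(reference: str, cnvs: List[CNV]) -> str:
    """Return a rearranged sequence in which each CNV region
    appears copy_number times in place of the original region."""
    pieces = []
    cursor = 0  # position in the reference up to which sequence has been copied
    for start, end, copies in sorted(cnvs):
        if copies < 0 or start >= end or end > len(reference):
            raise ValueError(f"invalid CNV: {(start, end, copies)}")
        if start < cursor:
            raise ValueError(f"overlapping CNV: {(start, end, copies)}")
        pieces.append(reference[cursor:start])          # unchanged flank
        pieces.append(reference[start:end] * copies)    # deleted or duplicated region
        cursor = end
    pieces.append(reference[cursor:])                   # remaining tail
    return "".join(pieces)


if __name__ == "__main__":
    ref = "AAAACCCCGGGGTTTT"
    # Triplicate the C block and delete the T block.
    print(apply_cnvs(ref, [(4, 8, 3), (12, 16, 0)]))
```

A real simulator would additionally restrict or bias CNV placement relative to the exome target regions and then generate short reads from the rearranged sequence; those steps are outside the scope of this sketch.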

Highlights

  • Copy number variants (CNVs) represent DNA duplications and deletions ranging from a few dozen base pairs to several million bases that have been associated with phenotypic changes and human disease (Feuk et al, 2006)

  • We present a fast, reliable and highly customizable software application, Simulator of Exome Copy Number Variants (SECNVs), which takes a reference genome and target regions as input, simulates SNPs, indels and CNVs in one or more test genomes as well as a control, and outputs FASTA-formatted genome files with target regions, short-read files, BAM files and indexes in a single command

  • CNVs represent an important source of genetic variation and have been associated with disease and other important phenotypic traits in humans, domesticated animals and crops (Zhang et al, 2009)

  • FIGURE 3 | Exemplar simulated output BAM files visualized in IGV. (A) A 10-copy duplication at mouse chr1:65272798-65339955, which partially overlaps with exons of the Pikfyve gene

Summary

Introduction

Copy number variants (CNVs) represent DNA duplications and deletions ranging from a few dozen base pairs to several million bases that have been associated with phenotypic changes and human disease (Feuk et al, 2006). Many software applications have been developed to detect CNVs using either whole-genome sequencing (WGS) (Bartenhagen and Dugas, 2013; Pattnaik et al, 2014; Qin et al, 2015; Faust, 2017; Xia et al, 2017) or whole-exome sequencing (WES) (Sathirapongsasuti et al, 2011; Fromer et al, 2012; Klambauer et al, 2012; Koboldt et al, 2012a; Koboldt et al, 2012b; Krumm et al, 2012; Plagnol et al, 2012; Magi et al, 2013) data. WES is based on the capture and sequencing of the transcribed regions (exons) of protein-coding sequences, which together represent approximately 1% of the human genome. In species with very large genomes and limited opportunities for WGS experiments, WES data are expected to be a critical source of information for detecting CNVs (Hirsch et al, 2014).
