Abstract

The development of high-throughput sequencing technologies has enabled novel methods for detecting structural variants (SVs). Current methods are typically based on depth of coverage or pair-end mapping clusters. However, most of these only report an approximate location for each SV, rather than exact breakpoints. We have developed pair-read informed split mapping (PRISM), a method that identifies SVs and their precise breakpoints from whole-genome resequencing data. PRISM uses a split-alignment approach informed by the mapping of paired-end reads, hence enabling breakpoint identification of multiple SV types, including arbitrary-sized inversions, deletions and tandem duplications. Comparisons to previous datasets and simulation experiments illustrate PRISM's high sensitivity, while PCR validations of PRISM results, including previously uncharacterized variants, indicate an overall precision of ~90%. PRISM is freely available at http://compbio.cs.toronto.edu/prism.

Highlights

  • We ran pair-read informed split mapping (PRISM), Pindel, SVseq, Splitread, BreakDancer (Chen et al, 2009), CNVnator (Abyzov et al, 2011) and CREST on our simulation dataset, with the results summarized in Supplementary Table S1

  • CREST is designed for somatic rather than germline variants, so we decided to exclude it from the comparisons on real genomes

Introduction

The development of high-throughput sequencing (HTS) technologies has enabled novel methods for detecting structural variants (SVs). Current methods, which are usually based on depth of coverage (Abyzov et al, 2011), pair-end mapping clusters (Chen et al, 2009) or a combination of these (Medvedev et al, 2010), have been successful in quantifying structural variation in individual genomes and populations (Mills et al, 2011). However, most of these only report an approximate location for each SV, rather than exact breakpoints. Split-read based methods, such as Pindel (Ye et al, 2009), Splitread (Karakoc et al, 2011) and SVseq (Zhang et al, 2011), can identify these breakpoints but have been limited in their ability to identify large-scale ‘structural’ variants. These tools align the split read only in the immediate vicinity of the read’s pair and thereby limit the maximum discoverable variant size. The recent CREST method (Wang et al, 2011) takes an alternative approach, assembling the unaligned clipped ends of reads and mapping these to the genome.
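To make the window limitation of split-read methods concrete, the following is a minimal toy sketch, not the algorithm used by PRISM, Pindel, Splitread or SVseq. It assumes exact string matching, a single deletion, and a known mapping position for the read's mate. The function name `find_deletion_split`, its parameters (`anchor_pos`, `window`, `min_part`) and the synthetic sequences are all hypothetical and chosen only for illustration.

```python
import random


def find_deletion_split(reference, read, anchor_pos, window, min_part=8):
    """Toy split-read search restricted to a window around the mate's position.

    Tries every cut of `read` into a prefix and a suffix (each at least
    `min_part` bases) and looks for exact matches of both parts inside
    reference[anchor_pos - window : anchor_pos + window], with the suffix
    downstream of the prefix and a gap between them.

    Returns the deleted reference interval as (start, end), 0-based and
    half-open, or None when no such split exists inside the window.
    """
    lo = max(0, anchor_pos - window)
    hi = min(len(reference), anchor_pos + window)
    region = reference[lo:hi]
    for cut in range(min_part, len(read) - min_part + 1):
        prefix, suffix = read[:cut], read[cut:]
        p = region.find(prefix)  # first exact hit only (a simplification)
        if p == -1:
            continue
        s = region.find(suffix, p + cut + 1)  # require at least a 1 bp gap
        if s != -1:
            return lo + p + cut, lo + s
    return None


# Synthetic data (not from the paper): a random reference with a 60 bp deletion.
rng = random.Random(42)
reference = "".join(rng.choice("ACGT") for _ in range(300))
donor = reference[:120] + reference[180:]
read = donor[110:130]  # 20 bp read spanning the deletion junction
mate_pos = 60          # assumed mapping position of the read's mate

narrow = find_deletion_split(reference, read, mate_pos, window=80)
wide = find_deletion_split(reference, read, mate_pos, window=200)
print("window=80: ", narrow)
print("window=200:", wide)
```

In this sketch the narrow window contains the first half of the read but not the sequence on the far side of the deletion, so no split is reported. Widening the window recovers the breakpoints, at the cost of a larger search space. Real tools use inexact alignment and handle other SV types, which this example does not attempt.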
