Abstract

Background: Next-generation sequencing produces high-throughput data, albeit with greater error and shorter reads than traditional Sanger sequencing methods. This complicates the detection of genomic variations, especially small insertions and deletions.

Findings: Here we describe ParMap, a statistical algorithm for the identification of complex genetic variants, such as small insertions and deletions, using partially mapped reads in nextgen sequencing data.

Conclusions: We report ParMap's successful application to the mutation analysis of chromosome X exome-captured leukemia DNA samples.

Highlights

  • Next-generation sequencing produces high-throughput data, albeit with greater error and shorter reads than traditional Sanger sequencing methods

  • We aimed to develop a procedure for identifying small genomic insertions and deletions with high confidence and built an algorithm (ParMap) capable of producing a list of candidates, through statistical analysis of partially mapped reads (Figure 1)

  • The chromosome X exome-captured DNA samples were sequenced on the Applied Biosystems SOLiD 3 platform, using one eighth of a sequencing slide per sample, to produce a total of 105,302,787 fragment reads, each 50 bases long


Summary

Introduction

Next-generation sequencing produces high-throughput data, albeit with greater error and shorter reads than traditional Sanger sequencing methods. One of the major technological advances in biology in the last few years has been the development of high-throughput nextgen sequencing systems that produce gigabases of data in a single run and allow an unbiased view of the whole genome without relying on prior knowledge about the disease-causing alterations. These ultradeep sequencing technologies produce large amounts of sequence data, which increase the sequencing depth and allow for better statistics in calling various genomic variations. Efficient statistical and computational methods for calling genomic variants with high confidence are needed to analyze these high-throughput datasets. At this point, the detection of single mutations and large copy number variations using deep sequencing data is fairly straightforward [1,2], whereas the identification of small (less than 10 nucleotides) insertions and deletions is more challenging. To identify the possible locations of genomic insertions or deletions, ParMap calculates a measure based on the number of reads that cover only the positions adjacent to a gap, without covering their neighboring positions in the direction of the gap (Figure 2 and Methods).
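The counting idea behind this measure can be illustrated with a minimal sketch. This is not ParMap's implementation: the interval representation of partially mapped reads, the function names `boundary_counts` and `candidate_gaps`, and the `min_support` and `max_gap` thresholds are all assumptions made for illustration, and the raw counts below stand in for the statistical analysis described in the Methods.

```python
from collections import Counter


def boundary_counts(alignments):
    """Count partial alignments that stop or start at each reference position.

    `alignments` is an iterable of (start, end) pairs, 0-based and inclusive,
    giving the reference span covered by the mapped part of a read.
    (Hypothetical input format, assumed for this sketch.)
    """
    ends_at = Counter()    # reads that cover p but not p + 1
    starts_at = Counter()  # reads that cover p but not p - 1
    for start, end in alignments:
        starts_at[start] += 1
        ends_at[end] += 1
    return ends_at, starts_at


def candidate_gaps(alignments, min_support=2, max_gap=9):
    """List (left, right, n_left, n_right) tuples for possible gap locations.

    `left` is the last covered position before the gap and `right` the first
    covered position after it, so the gap spans right - left - 1 reference
    bases. Both thresholds are illustrative assumptions, not published values.
    """
    ends_at, starts_at = boundary_counts(alignments)
    candidates = []
    for left, n_left in ends_at.items():
        if n_left < min_support:
            continue
        # Look for a matching cluster of read starts just past the gap.
        for gap in range(max_gap + 1):
            right = left + gap + 1
            n_right = starts_at.get(right, 0)
            if n_right >= min_support:
                candidates.append((left, right, n_left, n_right))
    return sorted(candidates)


# Toy example: three reads stop at position 19 and three resume at 23.
toy = [(0, 19), (2, 19), (5, 19), (23, 50), (23, 48), (23, 60)]
gaps = candidate_gaps(toy)
```

In the toy data, positions 20 to 22 are flanked by reads on both sides but covered by none, so the pair (19, 23) is reported as a candidate. On real data, reads start and end throughout a covered region, which is why a statistical test on these counts, rather than a fixed threshold, is needed to separate true gaps from background.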

