Abstract

A primary component of next-generation sequencing analysis is to align short reads to a reference genome, with each read aligned independently. However, reads that observe the same non-reference DNA sequence are highly correlated and can be used to better model the true variation in the target genome. A novel short-read micro re-aligner, SRMA, that leverages this correlation to better resolve a consensus of the underlying DNA sequence of the targeted genome is described here.

Highlights

  • Whole-genome human re-sequencing is feasible using next-generation sequencing technology

  • Local re-alignment of simulated data: To assess the performance of local re-alignment on a dataset with a known diploid sequence, two whole-genome human re-sequencing experiments were simulated to generate 1 billion 50-base paired-end reads, for a total of 100 Gb of genomic sequence representing a mean haploid coverage of 15× for either Illumina or Applied Biosystems Inc (ABI) SOLiD data

  • The data were initially aligned with the Burrows-Wheeler Alignment tool (BWA) [9] and then locally re-aligned with the Short-Read Micro re-Aligner (SRMA)
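The figures quoted for the simulated dataset can be checked with a few lines of arithmetic. The sketch below is only an illustration and rests on two assumptions not stated in the text: that "1 billion" counts read pairs (so each pair contributes two 50-base reads), and that the haploid human genome is roughly 3.1 Gb. The function names are hypothetical.

```python
# Assumption: "1 billion" counts read pairs, each yielding two 50-base reads.
READ_PAIRS = 1_000_000_000
READ_LENGTH = 50
# Assumption: haploid human genome size of about 3.1 Gb (not given in the text).
HAPLOID_GENOME_BASES = 3_100_000_000


def total_bases(pairs: int, read_length: int) -> int:
    """Total sequenced bases for paired-end reads (two reads per pair)."""
    return pairs * 2 * read_length


def mean_haploid_coverage(bases: int, haploid_genome_bases: int) -> float:
    """Mean coverage per haplotype of a diploid sample.

    The sequenced bases are spread over two copies of the genome,
    so the per-haplotype coverage is half the total depth.
    """
    return bases / (2 * haploid_genome_bases)


bases = total_bases(READ_PAIRS, READ_LENGTH)
coverage = mean_haploid_coverage(bases, HAPLOID_GENOME_BASES)
print(f"{bases / 1e9:.0f} Gb sequenced, roughly {coverage:.0f}x per haplotype")
```

Under these assumptions the total comes to 100 Gb, and the per-haplotype coverage lands close to the quoted 15×. The exact value depends on the genome size assumed.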


Summary

Introduction

Whole-genome human re-sequencing is feasible using next-generation sequencing technology. Technologies such as those produced by Illumina, Life, and Roche 454 produce millions to billions of short DNA sequences that can be used to reconstruct the diploid sequence of a human genome. Such data alone could be used to de novo assemble the genome in question [1,2,3,4,5,6]. Observing multiple reads that differ from the reference sequence in their respective alignments identifies variants. Alignment algorithms have made it possible to accurately and efficiently catalogue many types of variation between human individuals and those causative for specific diseases.

