Abstract

Background: Discordant read pairs [1,2], those whose mates deviate from the expected insert size range or from the correct relative orientation, have served as vital clues for identifying structural variants (SVs) in genomes. Collecting discordant read pairs is the first step in SV detection and is often done by sequence alignment. In the presence of repetitive elements such as insertion sequences (IS), a class of transposable elements in bacterial genomes, discordant read pairs can map to multiple loci, which makes them harder to place and interpret. Rather than resolving such tangled mapping results, many tools simply ignore these read pairs and may therefore miss SVs that involve repetitive elements.
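The conventional, alignment-based notion of discordance described above can be made concrete with a minimal sketch. It is not the A-Bruijn graph approach of this work. The names `ReadPair` and `classify_pair`, the forward/reverse orientation convention, and the insert size bounds in the usage lines are all illustrative assumptions, not values taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical container for the mapped coordinates of one read pair."""
    chrom1: str
    pos1: int        # leftmost mapped position of mate 1 (0-based)
    reverse1: bool   # True if mate 1 maps to the reverse strand
    chrom2: str
    pos2: int        # leftmost mapped position of mate 2 (0-based)
    reverse2: bool   # True if mate 2 maps to the reverse strand
    read_len: int = 90  # assumed read length, matching 90 bp reads


def classify_pair(pair: ReadPair, min_insert: int, max_insert: int) -> str:
    """Return 'concordant' or the reason the pair is discordant.

    Assumes the usual paired-end layout in which the leftmost mate maps
    to the forward strand and the rightmost mate to the reverse strand.
    """
    if pair.chrom1 != pair.chrom2:
        return "different_sequence"

    # Order the two mates by their leftmost coordinate.
    (left_pos, left_rev), (right_pos, right_rev) = sorted(
        [(pair.pos1, pair.reverse1), (pair.pos2, pair.reverse2)]
    )

    # Orientation check: expect forward on the left, reverse on the right.
    if left_rev or not right_rev:
        return "orientation"

    # Insert size: from the start of the left mate to the end of the right mate.
    insert = right_pos + pair.read_len - left_pos
    if insert < min_insert or insert > max_insert:
        return "insert_size"

    return "concordant"


# Usage with assumed insert size bounds of 300 to 500 bp.
ok = ReadPair("chr", 1000, False, "chr", 1300, True)
far = ReadPair("chr", 1000, False, "chr", 5000, True)
same_strand = ReadPair("chr", 1000, False, "chr", 1300, False)
print(classify_pair(ok, 300, 500))
print(classify_pair(far, 300, 500))
print(classify_pair(same_strand, 300, 500))
```

A pair whose mates map to multiple loci would yield several candidate classifications under this scheme, which is the ambiguity that arises around repetitive elements such as IS.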

Highlights

  • We propose using approximate de Bruijn graphs (A-Bruijn graphs) [3] to identify discordant read pairs and thereby discover SVs

  • We applied this approach to whole-genome sequencing data [4] (~100× per sample, 2 × 90 bp paired-end Illumina sequencing) obtained from 38 lines of Escherichia coli PFM2, a derivative strain of E. coli K-12 MG1655, and 34 lines of a mismatch-repair-deficient derivative that were propagated for ~3,080 and ~375
