Abstract

Background: Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including Alu, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn’s disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy.

Results: Here we report Tangram, a computationally efficient MEI detection program that integrates read-pair (RP) and split-read (SR) mapping signals to detect MEI events. By utilizing SR mapping in its primary detection module, a feature unique to this software, Tangram is able to pinpoint MEI breakpoints with single-nucleotide precision. To understand the role of MEI events in disease, it is essential to produce accurate individual genotypes in clinical samples. Tangram is able to determine sample genotypes with very high accuracy. Using simulations and experimental datasets, we demonstrate that Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy when compared to other recently developed MEI detection methods.

Conclusions: Tangram serves as the primary MEI detection tool in the 1000 Genomes Project, and is implemented as a highly portable, memory-efficient, easy-to-use C++ computer program, built under an open-source development model.
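
To illustrate why a split-read signal can localize a breakpoint to a single nucleotide, here is a minimal sketch that reads the breakpoint off a soft-clipped alignment. It is not Tangram's algorithm, which this abstract does not describe in detail. The function name `soft_clip_breakpoint` and the `min_clip` threshold are hypothetical, chosen only for this example. The only outside conventions assumed are those of the SAM format: a 1-based leftmost mapping position and a CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases, per the SAM specification.
REF_CONSUMING = {"M", "D", "N", "=", "X"}


def soft_clip_breakpoint(pos, cigar, min_clip=10):
    """Return (reference_position, clipped_side) for a soft-clipped read.

    pos      -- 1-based leftmost mapped reference position (SAM POS field)
    cigar    -- CIGAR string of the alignment
    min_clip -- hypothetical minimum clip length to count as evidence
    Returns None when the read carries no sufficiently long soft clip.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no sequence, so ignore them at the read ends.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return None
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)
    first, last = core[0], core[-1]
    if last[1] == "S" and last[0] >= min_clip:
        # The aligned part ends at the last reference base before the clip.
        return (pos + ref_len - 1, "right")
    if first[1] == "S" and first[0] >= min_clip:
        # The aligned part starts exactly at POS, right after the clip.
        return (pos, "left")
    return None


if __name__ == "__main__":
    # A read whose last 40 bases do not match the reference.
    print(soft_clip_breakpoint(1000, "60M40S"))
    # A read whose first 35 bases do not match the reference.
    print(soft_clip_breakpoint(2000, "35S65M"))
```

In this toy setting, the position where the aligned portion of the read stops is the candidate breakpoint; a read-pair signal alone would only bound the insertion to a window set by the fragment length.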

Highlights

  • Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution

  • Results are shown for the Tangram and RetroSeq programs applied to simulated data (1,000 AluY insertions introduced at random positions on human chromosome 20)

  • The Stewart et al. call set consisted of 1,208 Alu and 180 long interspersed element 1 (L1) calls, including 486 Alu and 48 L1 insertions that were experimentally confirmed with a polymerase chain reaction (PCR)-based validation technique
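
The simulated dataset mentioned in the highlights (insertions introduced at random positions on a chromosome) can be pictured with a small sketch. This is an assumption-laden toy, not the authors' simulation pipeline: the function name `insert_elements`, the fixed random seed, and the short stand-in sequences are all hypothetical, and the sketch only splices an element sequence into a reference string at distinct random positions.

```python
import random


def insert_elements(reference, element, n_insertions, seed=0):
    """Splice `element` into `reference` at distinct random positions.

    Returns the modified sequence and the sorted 0-based positions in the
    ORIGINAL reference before which each copy was inserted.
    """
    rng = random.Random(seed)  # fixed seed so the toy run is reproducible
    # Distinct interior positions, so no two insertions are adjacent.
    positions = sorted(rng.sample(range(1, len(reference)), n_insertions))
    pieces = []
    prev = 0
    for p in positions:
        pieces.append(reference[prev:p])  # untouched reference segment
        pieces.append(element)            # inserted element copy
        prev = p
    pieces.append(reference[prev:])       # remainder of the reference
    return "".join(pieces), positions


if __name__ == "__main__":
    # Stand-in sequences; a real run would use chromosome and element FASTA.
    toy_reference = "ACGT" * 250
    toy_element = "ttt"
    mutated, truth = insert_elements(toy_reference, toy_element, 10, seed=1)
    print(len(toy_reference), len(mutated), truth)
```

Keeping the list of true insertion positions is what allows sensitivity and breakpoint accuracy to be scored afterwards, since each call can be compared against a known location.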


Summary

Introduction

Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. With the advent of next-generation sequencing (NGS) technologies providing vast throughput for individual resequencing, a number of new algorithms have been developed for various structural variation (SV) types, including copy number variations (CNVs) [4,5,6,7,8] and large deletion events [9]. These algorithms take advantage of various signals provided by NGS mapping algorithms, primarily read-depth (RD) and read-pair (RP). While DNA transposons are thought to have largely ceased activity in primates about 37 million years ago [16], retrotransposons have continued to propagate throughout primate evolution, including the lineage leading to humans [13,17].

