Abstract

Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.

Highlights

  • Transposable elements (TEs) help shape the structure and function of the human genome

  • This improves the detection of events that occur in regions close to other structural variations (SVs), especially for those located within the insert size (Fig. S2)

  • With a high-quality benchmark dataset and a large pedigree dataset, we demonstrated that xTea outperforms MELT in identifying and genotyping germline insertions. In identifying somatic L1 insertions, xTea has much higher sensitivity than TraFiC-mem with comparable precision.

Summary

Introduction

Transposable elements (TEs) help shape the structure and function of the human genome. We present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. With the availability of whole-genome sequencing (WGS) data, we have reported frequent somatic L1 insertions in some cancer types, especially in epithelial cancers, suggesting a role of TEs in tumorigenesis[7]. An SVA insertion causing exon trapping was identified in a child with Batten disease, which led to the development of a personalized antisense-oligonucleotide drug to fix the splicing defect[13]. These studies highlight the importance of accurate TE detection for genomic medicine. PALMER[21] is the only tool designed for TE insertion detection from long reads.

