Abstract

Transposable elements (TEs) are non-static genomic units capable of moving from one chromosomal location to another. In eukaryotes, their insertion polymorphisms may cause beneficial mutations, such as the creation of new gene functions, or deleterious ones, such as those associated with different types of cancer in humans. A particular type of TE, the LTR retrotransposons, comprises almost 8% of the human genome. Among LTR retrotransposons, human endogenous retroviruses (HERVs) bear structural and functional similarities to retroviruses. Several tools allow the detection of transposon insertion polymorphisms (TIPs) but fail to efficiently analyze large genomes or large datasets. Here, we developed a computational tool, named TIP_finder, that detects mobile element insertions in very large genomes through high-performance computing (HPC) and parallel programming, based on discordant read pair analysis. TIP_finder inputs are (i) short paired-end reads such as those obtained by Illumina, (ii) a chromosome-level reference genome sequence, and (iii) a database of consensus TE sequences. The HPC strategy we propose adds scalability and provides a useful tool to analyze huge genomic datasets in a reasonable running time. TIP_finder accelerates the detection of TIPs by up to 55 times in breast cancer datasets and 46 times in cancer-free datasets compared to the fastest available algorithms. TIP_finder applies a validated strategy to find TIPs, accelerates the process through HPC, and addresses the issue of runtime for large-scale analyses in the post-genomic era.

Highlights

  • Transposable elements (TEs) are non-static genomic units capable of moving from one chromosomal location to another [1,2,3]

  • TIP_finder follows the discordant read pair strategy proposed by several algorithms, such as that of [30], to detect transposon insertion polymorphisms (TIPs) using (i) short paired-end reads such as those obtained by Illumina, (ii) a reference genome sequence assembled at the chromosome level, and a database of consensus

  • Problems Encountered with Large Genomes and Testing TIP_finder
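
The discordant read pair strategy mentioned above can be illustrated with a minimal Python sketch. It is not TIP_finder's implementation: the `ReadPair` record, the 10 kb window size, the minimum support of two pairs, and the function names are all assumptions made for illustration. The sketch treats a pair as discordant when one mate maps to a TE consensus sequence and the other maps to the reference genome, bins the genomic mates into fixed windows, and reports windows with enough supporting pairs. Chromosomes are processed as independent tasks; a thread pool is used here only to keep the example portable, whereas an HPC deployment would more plausibly distribute these tasks across processes or nodes.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import List, NamedTuple, Optional, Tuple


class ReadPair(NamedTuple):
    """Simplified, hypothetical record of one mapped read pair."""
    te_family: Optional[str]  # TE consensus hit by one mate, or None
    chrom: Optional[str]      # chromosome hit by the other mate, or None
    pos: int                  # 0-based genomic position of that mate


WINDOW = 10_000   # assumed window size in bp
MIN_SUPPORT = 2   # assumed minimum number of supporting pairs

Hit = Tuple[str, int, int, int]  # (chrom, window start, window end, pairs)


def is_discordant(pair: ReadPair) -> bool:
    """One mate on a TE consensus, the other on the reference genome."""
    return pair.te_family is not None and pair.chrom is not None


def candidate_windows(chrom: str, positions: List[int]) -> List[Hit]:
    """Count supporting pairs per window on a single chromosome."""
    counts = Counter(p // WINDOW for p in positions)
    return [
        (chrom, w * WINDOW, (w + 1) * WINDOW, n)
        for w, n in sorted(counts.items())
        if n >= MIN_SUPPORT
    ]


def find_candidate_tips(pairs: List[ReadPair], workers: int = 4) -> List[Hit]:
    """Group discordant pairs by chromosome and scan chromosomes in parallel."""
    by_chrom = defaultdict(list)
    for pair in pairs:
        if is_discordant(pair):
            by_chrom[pair.chrom].append(pair.pos)

    # Each chromosome is an independent task, so the work splits cleanly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(candidate_windows, chrom, positions)
            for chrom, positions in sorted(by_chrom.items())
        ]
        per_chrom = [f.result() for f in futures]

    return [hit for hits in per_chrom for hit in hits]


if __name__ == "__main__":
    demo = [
        ReadPair("HERV-K", "chr1", 12_300),
        ReadPair("HERV-K", "chr1", 12_950),
        ReadPair("HERV-K", "chr1", 57_000),  # lone pair, below threshold
        ReadPair(None, "chr1", 12_500),      # no TE mate, ignored
    ]
    for hit in find_candidate_tips(demo):
        print(hit)
```

Because each chromosome is scanned independently, the number of workers can grow with the number of chromosomes or be refined further by splitting chromosomes into segments, which is the kind of scalability an HPC approach aims for.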

Introduction

Transposable elements (TEs) are non-static genomic units capable of moving from one chromosomal location to another [1,2,3]. Several studies have indicated that TEs play crucial genomic roles in chromosome structuring, structural variation, the alteration of gene expression [5,7], evolution, the variation of genome size, and environmental adaptation [9,10,11,12,13]. These elements can also be associated with human diseases, such as different types of cancer [14,15,16]. Class I elements, or retrotransposons, use an RNA molecule as an intermediate, while Class II elements, or DNA transposons, use a DNA intermediate.
