Abstract

Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Because of their repetitive nature and high structural diversity, identifying and classifying TEs in sequenced genomes remains a challenge. Although TEs were initially regarded as “junk DNA”, it has been demonstrated that they play key roles in chromosome structure, gene expression and regulation, as well as adaptation and evolution. A highly reliable annotation of these elements is therefore crucial to better understand genome functions and their evolution. To date, many bioinformatics tools have been developed to address TE detection and classification, but problems remain with the reliability, precision, and speed of the analyses. Machine learning and deep learning are approaches that can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, for TE classification, with encouraging results. In this review, we discuss important aspects of TEs, such as their structure, their importance in the evolution and architecture of the host genome, and their current classifications and nomenclatures. We also address current methods for identifying and classifying TEs, and their limitations.

Highlights

  • Transposable elements (TEs) are genomic units able to move within and among the genomes of virtually all organisms [1]

  • Machine learning (ML) and deep learning (DL) may represent the new generation of bioinformatics approaches, especially for TEs [214]

  • Both techniques have been tested in many areas of genomics with very high levels of success, yet their application to TEs remains limited


Summary

Introduction

Transposable elements (TEs) are genomic units able to move within and among the genomes of virtually all organisms [1]. TEs represent the most repetitive sequences [5]. They are able to move within genomes, generate mutations, and amplify their copy number [6]. TEs that move via an RNA molecule, called retrotransposons, fall into Class I, while elements that move via a DNA molecule, called transposons, are classified into Class II [8]. Retrotransposons represent the vast majority of TEs found in plant genomes due to their mobility mechanisms. Several methods have been developed to identify and annotate TEs in sequenced genomes. These are classified into four categories: de novo, structure-based, comparative genomics, and homology-based [17]. For the reasons mentioned above, we focus on retrotransposons in this review.

Retrotransposons Structure
LTR Retrotransposons
How are Retrotransposons Activated
How Are Retrotransposons Silenced
Horizontal Transfer of TEs
Function of Retrotransposons in a Chromosome’s Structure
Chromosomal Distribution of Retrotransposons
Sex-Specific Chromosomes
Interaction of Retrotransposons with Genes
Current Classifications
How to Identify and Classify Retrotransposons
Current Problems for Retrotransposon Identification and Classification
Current Strategies and Methodologies
Structure-Based Methods
Homology-Based Methods
De Novo
Comparative Genomics
Most Popular Bioinformatics Resources
Current Machine Learning Techniques for Genomics and Transposable Elements
Findings
Conclusions