Abstract

Phenotypes are driven by regulated gene expression, which in turn are mediated by complex interactions between diverse biological molecules. Protein–DNA interactions such as histone and transcription factor binding are well studied, along with RNA–RNA interactions in short RNA silencing of genes. In contrast, lncRNA-protein interaction (LPI) mechanisms are comparatively unknown, likely directed by the difficulties in studying LPI. However, LPI are emerging as key interactions in epigenetic mechanisms, playing a role in development and disease. Their importance is further highlighted by their conservation across kingdoms. Hence, interest in LPI research is increasing. We therefore review the current state of the art in lncRNA-protein interactions. We specifically surveyed recent computational methods and databases which researchers can exploit for LPI investigation. We discovered that algorithm development is heavily reliant on a few generic databases containing curated LPI information. Additionally, these databases house information at gene-level as opposed to transcript-level annotations. We show that early methods predict LPI using molecular docking, have limited scope and are slow, creating a data processing bottleneck. Recently, machine learning has become the strategy of choice in LPI prediction, likely due to the rapid growth in machine learning infrastructure and expertise. While many of these methods have notable limitations, machine learning is expected to be the basis of modern LPI prediction algorithms.

Highlights

  • Transcriptomics is the study of a complete set of RNA transcripts in a cell, measuring variable expression levels of the genome under different conditions

  • We discovered that Starbase is an exclusive database which provides MALAT1–protein interactions with the CLIP-seq evidence, whereas POSTAR2 provides RNA- and RBP-centric interactome information for the well-examined long non-coding RNA (lncRNA)

  • Most modern lncRNA-protein interaction (LPI) prediction algorithms use machine learning, where large datasets with attributes of interest are passed to an algorithm (Table 2)

Read more

Summary

Introduction

Transcriptomics is the study of a complete set of RNA transcripts in a cell, measuring variable expression levels of the genome under different conditions. LncRNA can act as gene regulators, and like other epigenetic mechanisms are involved in numerous biological processes They achieve their regulatory function with their ability to interact with a wide range of biological molecules, such as other nucleic acids and proteins [5], as well as with small molecules [4]. Among their more direct modes of action are sequestering and releasing transcripts to modulate gene expression, stabilising transcripts and binding to DNA to sterically hinder transcription initiation [6]. LPI, with an emphasis on software and databases, together with their advantages as well as limitations

LPI Laboratory Assays
LPI Prediction Algorithms
Molecular Docking Approaches
Machine Learning Approaches
Future Directions
Conclusions
Methods

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.