ATTfold: RNA Secondary Structure Prediction With Pseudoknots Based on Attention Mechanism.

Yili Wang, Yubing Gao, Liyan Dong, Zhen Liu, Yuanning Liu, Hao Zhang, Shuo Wang

doi:10.3389/fgene.2020.612086

Abstract

Accurate RNA secondary structure information is the cornerstone of gene function research and RNA tertiary structure prediction. However, most traditional RNA secondary structure prediction algorithms rely on dynamic programming (DP) under the minimum free energy theory, with both hard and soft constraints. Their accuracy depends heavily on the accuracy of the soft constraints, which come from experimental data such as chemical and enzymatic detection. As the RNA sequence grows longer, the time complexity of DP-based algorithms rises steeply, so they cope poorly with relatively long sequences. Furthermore, because pseudoknot structures are complex, methods based on traditional algorithms predict secondary structures containing pseudoknots poorly, and few algorithms for pseudoknot prediction have been available in the past. The ATTfold algorithm proposed in this article is a deep learning algorithm based on an attention mechanism. It uses the attention mechanism to analyze the global information of the RNA sequence, focuses on the correlation between paired bases, and thereby addresses the problem of long sequence prediction. The algorithm also extracts effective multi-dimensional features from a large number of RNA sequences and their structure information, and combines them with the hard constraints specific to RNA secondary structure. In this way it determines the pairing position of each base and obtains a valid RNA secondary structure, including pseudoknots. Finally, after the ATTfold model was trained on tens of thousands of RNA sequences and their real secondary structures, it was compared with four classic RNA secondary structure prediction algorithms. The results show that ATTfold significantly outperforms the others and reproduces the secondary structure of RNA more accurately. As the data in RNA sequence databases increase, the performance of this deep learning-based algorithm is expected to improve further, and algorithms of this kind will become increasingly indispensable.
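The abstract refers to hard constraints on RNA secondary structure and to pseudoknots without spelling them out. The sketch below is one way to make both notions concrete. It assumes the commonly used constraint set (only A-U, G-C, and G-U pairs; at least three unpaired bases inside a hairpin loop; at most one partner per base), which is an assumption here and not taken from the abstract. The function names `satisfies_hard_constraints` and `has_pseudoknot` are hypothetical, and the code is not the ATTfold implementation.

```python
# Assumed constraint set (not stated in the abstract): canonical and wobble pairs only.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def satisfies_hard_constraints(seq, pairs):
    """Return True if every (i, j) pair (0-based indices) obeys the assumed constraints."""
    used = set()
    for i, j in pairs:
        if i > j:
            i, j = j, i
        if (seq[i], seq[j]) not in CANONICAL:
            return False  # non-canonical pair
        if j - i <= MIN_LOOP:
            return False  # loop too short (sharp turn)
        if i in used or j in used:
            return False  # a base may pair with at most one partner
        used.update((i, j))
    return True


def has_pseudoknot(pairs):
    """Return True if two pairs cross, i.e. i < k < j < l for pairs (i, j) and (k, l)."""
    norm = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(norm):
        for k, l in norm[a + 1:]:
            if i < k < j < l:
                return True
    return False
```

For example, the nested hairpin pairs `[(0, 8), (1, 7), (2, 6)]` on the sequence `GGGAAACCC` pass the constraint check and contain no pseudoknot, whereas the pair list `[(0, 5), (3, 8)]` crosses and is therefore reported as a pseudoknot.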

Highlights

RNA is an indispensable biopolymer that plays diverse biological roles in regulating translation (Kapranov et al., 2007), gene expression (Storz and Gottesman, 2006), and RNA splicing (Sharp, 2009).
Over the past 40 years, RNA secondary structure prediction with traditional algorithms has gradually reached a bottleneck.
With the rapid development of deep learning and machine learning, the ...

Summary

Introduction

RNA is an indispensable biopolymer that plays diverse biological roles in regulating translation (Kapranov et al., 2007), gene expression (Storz and Gottesman, 2006), and RNA splicing (Sharp, 2009). To obtain the RNA secondary structure accurately, different prediction algorithms have been developed over the past 40 years. The most mainstream calculation method is the Nearest Neighbor Thermodynamic Model (NNTM) based on a single RNA sequence (Turner and Mathews, 2010). This method computes the RNA secondary structure with minimum free energy (MFE) through a dynamic programming algorithm. Classic algorithms focus only on the number of paired bases in the sequence and ignore which exact bases pair. Although such algorithms perform well in terms of prediction accuracy, they describe the true RNA secondary structure poorly. The thermodynamic matcher is still a very general framework used to solve the hard constraints of RNA secondary structure (Reeder and Giegerich, 2004).
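To illustrate what a dynamic programming method that counts paired bases looks like, here is a minimal Nussinov-style sketch. It is a deliberately simpler stand-in: it maximizes the number of nested base pairs and does not minimize free energy, so it is neither the NNTM nor the ATTfold method. The function name `nussinov_max_pairs`, the allowed pair set, and the `min_loop` default are assumptions made for this example.

```python
# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    The triple loop gives O(n^3) time, which is why DP approaches slow
    down quickly on long sequences. Crossing pairs (pseudoknots) cannot
    be represented by this recurrence.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in CANONICAL:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Calling `nussinov_max_pairs("GGGAAACCC")` finds the three stacked G-C pairs of a simple hairpin. Because the recurrence only combines nested or side-by-side substructures, it has no way to score crossing pairs, which is one reason pseudoknots are hard for classic DP methods.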

Journal: Frontiers in Genetics
Publication Date: Dec 15, 2020
Citations: 12
License type: CC BY 4.0

Similar Papers

A New Method of RNA Secondary Structure Prediction Based on Convolutional Neural Network and Dynamic Programming.
Hao Zhang ... Yuanning Liu
Frontiers in Genetics | VOL. 10
22 May 2019

RNA independent fragment partition method based on deep learning for RNA secondary structure prediction
Qi Zhao ... Yudong Yao
Scientific Reports | VOL. 13
17 Feb 2023

Efficient Generation of RNA Secondary Structure Prediction Algorithm Under PAR Framework.
Haihe Shi ... Xiaoqian Jing
Frontiers in Plant Science | VOL. 12
21 Jan 2022

Predicting RNA secondary structure based on machine learning and genetic algorithm
Duy Binh Doan ... Duc Long Dang
26 Nov 2020
