Protein Structure Prediction by Recurrent and Convolutional Deep Neural Network Architectures

Jack Hanson

doi:10.25904/1912/3830

Abstract

This thesis explores the application of convolutional and recurrent machine learning techniques to the prediction of several key structural properties of proteins. Chapter 2 presents the first application of a long short-term memory bidirectional recurrent neural network (LSTM-BRNN) in structural bioinformatics. The method, called SPOT-Disorder, predicts the per-residue probability of a protein being intrinsically disordered (i.e. unstructured, or flexible). Using this methodology, SPOT-Disorder achieved the highest accuracy in the literature without separating short and long disordered regions during training, as was required in previous models, and was additionally shown to be capable of indirectly discerning functional sites located in disordered regions. Chapter 3 extends the application of an LSTM-BRNN to a two-dimensional problem, the prediction of protein contact maps. A protein contact map records, for each pair of residues in a sequence, whether the two lie within a distance cutoff of each other, providing key restraints on the possible conformations of a protein. This work, entitled SPOT-Contact, introduced the coupling of two-dimensional LSTM-BRNNs with residual networks (ResNets) to maximise dependency propagation, achieving the highest reported accuracies for contact map precision. Several models of varying architectures were trained and combined as an ensemble predictor in order to minimise incorrect generalisations. Chapter 4 discusses the use of an ensemble of LSTM-BRNNs and ResNets to predict local one-dimensional structural properties of proteins. The method, called SPOT-1D, predicts a wide range of local structural descriptors, including several solvent exposure metrics, secondary structure, and real-valued backbone angles. SPOT-1D was significantly improved by the inclusion of the outputs of SPOT-Contact in the input features. Using this topology led to the best reported accuracy metrics for all predicted properties. The protein structures constructed from the backbone angles predicted by SPOT-1D achieved the lowest average error from their native structures in the literature. Chapter 5 presents an update to SPOT-Disorder, which employs inputs from SPOT-1D in conjunction with an ensemble of LSTM-BRNNs and Inception Residual Squeeze and Excitation networks to predict protein intrinsic disorder. This model confirmed the enhancement provided by the coupled architectures over the LSTM-BRNN alone, whilst also introducing a new convolutional format to the bioinformatics field. The work in Chapter 6 uses the same topology as SPOT-1D for single-sequence prediction of protein intrinsic disorder in SPOT-Disorder-Single. Single-sequence prediction is the prediction of a protein's properties without the use of evolutionary information. While evolutionary information generally improves the performance of a computational model, it comes at the expense of greatly increased computational cost and run time. Removing it from the model allows for genome-scale protein analysis at a minor drop in accuracy. However, models trained without evolutionary profiles can be more accurate for proteins with limited, and therefore unreliable, evolutionary information.
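To make the contact map described for Chapter 3 concrete, the following is a minimal sketch of how a binary contact map could be derived from known residue coordinates. It is not the SPOT-Contact method, which predicts the map from sequence-derived features; it only illustrates the target being predicted. The function name `contact_map`, the toy coordinates, and the 8.0 angstrom cutoff are assumptions made for illustration and are not taken from the abstract.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Return a symmetric binary contact map for a list of (x, y, z) points.

    coords: one representative atom position per residue (hypothetical input).
    cutoff: distance threshold in angstroms; 8.0 is an assumed value,
            not one stated in the abstract.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Mark residues i and j as in contact if they lie within the cutoff.
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Toy coordinates for four residues spaced along a line (made-up values).
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(toy)
```

In this toy chain only the first and last residues, 11.4 units apart, fall outside the assumed cutoff. The diagonal is left at zero because a residue's contact with itself carries no structural information.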

