Prediction of protein function using a deep convolutional neural network ensemble

Evangelia I Zacharaki

doi:10.7717/peerj-cs.124

Abstract

Background. The availability of large databases containing high resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction.

Methods. In this work, novel shape features are extracted representing protein structure in the form of local (per amino acid) distribution of angles and amino acid distances, respectively. Each of the multi-channel feature maps is introduced into a deep convolutional neural network (CNN) for function prediction and the outputs are fused through support vector machines or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated employing either one CNN per multi-channel feature set, or one CNN per image channel.

Results. Cross validation experiments on single-functional enzymes (n = 44,661) from the PDB database achieved 90.1% correct classification, demonstrating an improvement over previous results on the same dataset when sequence similarity was not considered.

Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets opening the path for relevant applications, such as pharmacological target identification. The proposed method shows promise for structure-based protein function prediction, but sufficient data may not yet be available to properly assess the method’s performance on non-homologous proteins and thus reduce the confounding factor of evolutionary relationships.
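To make the idea of a local (per amino acid) distribution of distances more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one 3D coordinate per residue and builds, for each amino acid type, a normalized histogram of distances to all other residues. The function name `distance_histograms`, the bin count, and the distance cutoff are hypothetical choices made only for illustration; the abstract does not specify them.

```python
import math
from collections import defaultdict


def distance_histograms(residues, n_bins=8, max_dist=40.0):
    """Per-amino-acid histograms of residue-to-residue distances.

    residues: list of (amino_acid_code, (x, y, z)) tuples, one per residue.
    n_bins, max_dist: hypothetical binning parameters (assumptions, not
    values taken from the paper).
    Returns a dict mapping amino acid code -> normalized histogram (list).
    """
    width = max_dist / n_bins
    counts = defaultdict(lambda: [0] * n_bins)
    for i, (aa_i, p_i) in enumerate(residues):
        for j, (_, p_j) in enumerate(residues):
            if i == j:
                continue
            d = math.dist(p_i, p_j)
            if d >= max_dist:
                continue  # distances beyond the assumed cutoff are ignored
            bin_index = min(int(d / width), n_bins - 1)
            counts[aa_i][bin_index] += 1

    histograms = {}
    for aa, c in counts.items():
        total = sum(c)
        # Normalize so each histogram sums to 1 (a discrete distribution).
        histograms[aa] = [v / total for v in c] if total else [0.0] * n_bins
    return histograms
```

Stacking one such histogram per amino acid type gives a multi-channel array of the general kind that could be passed to a CNN, with one channel per amino acid type. This layout is itself an assumption about how the channels might be organized.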

Highlights

Metagenomics has led to a huge increase of protein databases and the discovery of new protein families (Godzik, 2011)
Each of the multi-channel feature maps is introduced into a deep convolutional neural network (CNN) for function prediction and the outputs are fused through support vector machines or a correlation-based k-nearest neighbor classifier
The training samples were used to learn the parameters of the network, as well as the parameters of the subsequent classifiers used during fusion (SVM or k-nearest neighbor (kNN) model)
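The fusion step mentioned in the highlights can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method as published: it assumes that each CNN emits a vector of class scores, that fusion operates on the concatenation of those vectors, and that "correlation-based" means Pearson correlation is used as the similarity measure for the k-nearest-neighbor vote. The names `fuse`, `pearson`, and `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def pearson(u, v):
    """Pearson correlation between two equal-length score vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # constant vector: correlation undefined, treat as 0
    return cov / (norm_u * norm_v)


def fuse(cnn_outputs):
    """Concatenate the class-score vectors of several CNNs for one sample.

    Concatenation is an assumed fusion input, chosen for illustration.
    """
    return [score for output in cnn_outputs for score in output]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors most correlated with query."""
    ranked = sorted(
        range(len(train_vectors)),
        key=lambda i: pearson(train_vectors[i], query),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this reading, the training samples play two roles: they are used to fit the CNNs, and their fused CNN outputs then form the labeled reference set that `knn_predict` searches at test time.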

Summary

Introduction

Metagenomics has led to a huge increase of protein databases and the discovery of new protein families (Godzik, 2011). In this work, enzymatic structures from the Protein Data Bank (PDB) are considered and the enzyme commission (EC) number is used as a fairly complete framework for annotation. The availability of large databases containing high resolution three-dimensional (3D) models of proteins in conjunction with functional annotation allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. The proposed method shows promise for structure-based protein function prediction, but sufficient data may not yet be available to properly assess the method’s performance on non-homologous proteins and reduce the confounding factor of evolutionary relationships.

Journal: PeerJ Computer Science	Publication Date: Jul 17, 2017
Citations: 28	License type: CC BY 4.0
