Combining classifiers for improved classification of proteins from sequence or structure.

Iain Melvin, William S Noble, Christina S Leslie, Jason Weston

doi:10.1186/1471-2105-9-389

Abstract

Background: Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology. Recently, there has been considerable interest in using discriminative learning algorithms, in particular support vector machines (SVMs), for classification of proteins. However, because sufficiently many positive examples are required to train such classifiers, all SVM-based methods are hampered by limited coverage.

Results: In this study, we develop a hybrid machine learning approach for classifying proteins, and we apply the method to the problem of assigning proteins to structural categories based on their sequences or their 3D structures. The method combines a full-coverage but lower accuracy nearest neighbor method with higher accuracy but reduced coverage multiclass SVMs to produce a full coverage classifier with overall improved accuracy. The hybrid approach is based on the simple idea of "punting" from one method to another using a learned threshold.

Conclusion: In cross-validated experiments on the SCOP hierarchy, the hybrid methods consistently outperform the individual component methods at all levels of coverage.

Code and data sets are available at
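
The "punting" idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the reduced-coverage classifier (for example, a multiclass SVM) acts as the primary method and returns a label with a confidence score, or no label when the class is not covered, and that the full-coverage nearest neighbor method acts as the fallback. The function names `punt` and `learn_threshold`, and the choice of picking the threshold that maximizes accuracy on held-out data, are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Optional, Sequence, Tuple

# (label, confidence) from the primary classifier; label is None when the
# primary classifier does not cover the example (assumed convention).
Prediction = Tuple[Optional[Hashable], float]


def punt(primary: Prediction, fallback_label: Hashable, threshold: float) -> Hashable:
    """Keep the primary prediction if it is confident enough, else punt."""
    label, score = primary
    if label is not None and score >= threshold:
        return label
    return fallback_label


def learn_threshold(
    primary_preds: Sequence[Prediction],
    fallback_labels: Sequence[Hashable],
    true_labels: Sequence[Hashable],
) -> float:
    """Pick the threshold that maximizes hybrid accuracy on held-out data.

    Candidates are the observed primary scores plus +inf (always punt).
    Ties are broken in favor of the smallest threshold.
    """
    candidates: List[float] = sorted({s for _, s in primary_preds}) + [float("inf")]
    best_t, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum(
            punt(p, f, t) == y
            for p, f, y in zip(primary_preds, fallback_labels, true_labels)
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t
```

Under these assumptions the hybrid always returns a label, because every example the primary method declines or scores below the threshold is handed to the fallback. Reversing the roles (nearest neighbor first, SVM as fallback) only requires swapping the inputs.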

Highlights

Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology.
In 1999, Jaakkola et al. [8] first applied the support vector machine (SVM) classifier [9] to the problem of predicting a protein's structural class from its amino acid sequence. They focused on a particular protein structural hierarchy called the Structural Classification of Proteins (SCOP) [10], and they trained SVMs to recognize novel families.
We aim to address a fundamental limitation of any SVM-based method, namely, that an SVM can only be trained when a sufficient number of training examples are available.

Summary

Introduction

Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology. In 1999, Jaakkola et al. [8] first applied the support vector machine (SVM) classifier [9] to the problem of predicting a protein's structural class from its amino acid sequence. They focused on a particular protein structural hierarchy called the Structural Classification of Proteins (SCOP) [10], and they trained SVMs to recognize novel families.


Journal: BMC bioinformatics	Publication Date: Sep 22, 2008
Citations: 42	License type: cc-by


