Abstract

Background: Because a priori knowledge about the function of G protein-coupled receptors (GPCRs) can provide useful information for pharmaceutical research, determining their function is an important topic in protein science. However, with the rapid increase of GPCR sequences entering databanks, the gap between the number of known sequences and the number of known functions is widening rapidly, and determining function by experimental techniques alone is both time-consuming and expensive. It is therefore important to develop a computational method for quick and accurate classification of GPCRs.

Results: In this study, a novel three-layer predictor based on support vector machine (SVM) and feature selection is developed for predicting and classifying GPCRs directly from amino acid sequence data. Maximum relevance minimum redundancy (mRMR) is applied to pre-evaluate features with discriminative information, while a genetic algorithm (GA) is used to find the optimized feature subsets. SVM is used to construct the classification models. The overall accuracies of the three-layer predictor at the superfamily, family and subfamily levels are obtained by cross-validation tests on two non-redundant datasets. The results are about 0.5% to 16% higher than those of GPCR-CA and GPCRPred.

Conclusion: The high success rates indicate that the proposed predictor is a useful automated tool for predicting GPCRs. GPCR-SVMFS, a corresponding executable program for GPCR prediction and classification, can be acquired freely on request from the authors.
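
The three-layer idea described above (superfamily, then family, then subfamily) can be pictured as a cascade in which each layer only runs if the previous one succeeded. The following is a minimal sketch of that control flow, not the GPCR-SVMFS implementation: the function name `three_layer_predict`, the classifier callables, the per-family dictionary of subfamily models, and all labels are hypothetical stand-ins for the trained SVM models.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical type aliases: each "model" is just a function of the sequence.
BinaryModel = Callable[[str], bool]
LabelModel = Callable[[str], str]


def three_layer_predict(
    seq: str,
    is_gpcr: BinaryModel,
    family_of: LabelModel,
    subfamily_models: Dict[str, LabelModel],
) -> Tuple[str, Optional[str], Optional[str]]:
    """Run a sequence through a three-layer cascade (illustrative only).

    Layer 1 decides GPCR vs non-GPCR, layer 2 assigns a family, and
    layer 3 assigns a subfamily using a model chosen by the family.
    Keeping one subfamily model per family is an assumption of this sketch.
    """
    # Layer 1: stop early if the sequence is not predicted to be a GPCR.
    if not is_gpcr(seq):
        return ("non-GPCR", None, None)

    # Layer 2: family-level prediction.
    family = family_of(seq)

    # Layer 3: subfamily-level prediction, if a model exists for this family.
    sub_model = subfamily_models.get(family)
    subfamily = sub_model(seq) if sub_model is not None else None
    return ("GPCR", family, subfamily)


if __name__ == "__main__":
    # Toy rules standing in for trained classifiers; they carry no biology.
    result = three_layer_predict(
        "MNGTEGPNFYVPFSNKTG",
        is_gpcr=lambda s: len(s) > 10,
        family_of=lambda s: "family-1",
        subfamily_models={"family-1": lambda s: "subfamily-a"},
    )
    print(result)
```

In practice each callable would wrap a trained classifier applied to the selected feature subset for that layer; the cascade itself stays the same.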

Highlights

  • Because a priori knowledge about the function of G protein-coupled receptors (GPCRs) can provide useful information for pharmaceutical research, determining their function is an important topic in protein science

  • The results suggest that the order in which these feature groups contributed to discriminating GPCRs from non-GPCRs is: Moran autocorrelation (MAuto) > DipC and D > normalized Moreau-Broto autocorrelation (NMBAuto) > T, Geary autocorrelation (GAuto) and DHP > amino acid composition (AAC) and C

  • The results reveal that the order in which these feature groups contributed to classifying GPCRs into 6 families is: MAuto > dipeptide composition (DipC) > D > NMBAuto > DHP > AAC and GAuto
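
Two of the feature groups named in the highlights, amino acid composition (AAC) and dipeptide composition (DipC), are simple frequency vectors. The sketch below shows one common way to compute them, assuming the 20 standard amino acids and skipping non-standard residues; the function names and the handling of non-standard residues are assumptions of this example and may differ from the authors' program.

```python
from itertools import product

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq: str) -> dict:
    """Return the fraction of each standard amino acid (20 features)."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in AMINO_ACIDS}


def dipeptide_composition(seq: str) -> dict:
    """Return the fraction of each adjacent residue pair (400 features).

    Pairs containing a non-standard residue are skipped (an assumption
    of this sketch), so fractions are taken over the valid pairs only.
    """
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    valid = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            valid += 1
    if valid == 0:
        raise ValueError("sequence contains no valid dipeptides")
    return {pair: n / valid for pair, n in counts.items()}


if __name__ == "__main__":
    example = "MNGTEGPNFYVPFSNKTG"  # arbitrary toy sequence
    aac = amino_acid_composition(example)
    dipc = dipeptide_composition(example)
    print(len(aac), len(dipc))
```

Concatenating such vectors with the autocorrelation descriptors would give the kind of fixed-length feature vector that a feature selection step and an SVM can operate on.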

Summary

Introduction

Because a priori knowledge about the function of G protein-coupled receptors (GPCRs) can provide useful information for pharmaceutical research, determining their function is an important topic in protein science. Given the extreme imbalance between the number of known sequences and the number of known functions, it is vitally important to develop a computational method that can quickly and accurately predict the structure and function of GPCRs from sequence information. The second one is based on similarity searches using primary database search tools (e.g. BLAST, FASTA), and on such database searches coupled with searches of pattern databases (PRINTS) [12]. These do not seem to be sufficiently successful for comprehensive functional identification of GPCRs, since GPCRs make up a highly divergent family, and even when they are grouped according to similarity of function, their sequences share strikingly little homology or similarity with each other [13]. The third one is based on statistical and machine learning methods, including support vector machines (SVM) [8,14,15,16,17], hidden Markov models (HMMs) [1,3,6,18], covariant discriminant (CD) [7,11,19,20], nearest neighbor (NN) [2,21] and other techniques [13,22,23,24].

