Abstract

Nuclear receptors are key transcription factors that regulate crucial gene networks responsible for cell growth, differentiation, and homeostasis. They form a superfamily of phylogenetically related proteins and control functions associated with major diseases (e.g. diabetes, osteoporosis, and cancer). In this study, a novel method has been developed for classifying the subfamilies of nuclear receptors. The classification was achieved on the basis of the amino acid and dipeptide composition of receptor sequences using support vector machines. Training and testing were done on a non-redundant data set of 282 proteins obtained from the NucleaRDB database (1). The performance of all classifiers was evaluated using a 5-fold cross-validation test, in which the data set was randomly partitioned into five equal sets and each set was used once for testing while the remaining four sets were used for training. It was found that different subfamilies of nuclear receptors were quite closely correlated in terms of amino acid composition as well as dipeptide composition. The overall accuracies of the amino acid composition-based and dipeptide composition-based classifiers were 82.6 and 97.5%, respectively. Therefore, our results prove that different subfamilies of nuclear receptors are predictable with considerable accuracy using amino acid or dipeptide composition. Furthermore, based on the above approach, an online web service, NRpred, was developed, which is available at www.imtech.res.in/raghava/nrpred.
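The 5-fold cross-validation protocol described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `five_fold_splits`, the fixed random seed, and the round-robin assignment of shuffled indices to folds are assumptions made for illustration. Because 282 is not divisible by five, the folds here hold 56 or 57 proteins rather than exactly equal numbers.

```python
import random

def five_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper: the seed and the round-robin fold assignment
    are illustrative assumptions, not details taken from the paper.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # random partition of the data set
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]  # one fold is held out for testing
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 282 is the data set size reported in the abstract.
for train, test in five_fold_splits(282):
    print(len(train), len(test))
```

Each protein appears in exactly one test fold, so every sequence is predicted once by a classifier that did not see it during training.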

Highlights

  • In this report, we have made an attempt to develop a method for classifying nuclear receptor subfamilies, which are predictable with considerable accuracy using amino acid or dipeptide composition

  • Amino acid composition and dipeptide composition were used to transform proteins of variable length into fixed-length patterns

  • The classifiers were developed using support vector machines (SVMs), because SVMs were shown in the past to classify biological data better than artificial neural networks [25,26]
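The transformation from variable-length sequences to fixed-length patterns can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the skipping of non-standard residue symbols, and the normalisation by the number of counted residues or pairs are all choices made here. The output is a 20-value vector for amino acid composition and a 400-value vector for dipeptide composition, whatever the sequence length.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def amino_acid_composition(seq):
    """Return the fraction of each standard amino acid (20 values).

    Assumption: symbols outside the 20 standard residues are ignored.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]

def dipeptide_composition(seq):
    """Return the fraction of each adjacent residue pair (400 values).

    Assumption: pairs containing a non-standard symbol are skipped,
    and fractions are taken over the pairs that were counted.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]

# Two sequences of different length give vectors of the same length.
short_vec = dipeptide_composition("ACAC")
long_vec = dipeptide_composition("MKTAYIAKQRQISFVKSHFSRQ")
print(len(short_vec), len(long_vec))
```

Vectors of this kind could then be passed to any classifier that expects fixed-length numeric input, such as an SVM.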


Summary

Introduction

In this report, we have made an attempt to develop a method for classifying the subfamilies of nuclear receptors, which are predictable with considerable accuracy using amino acid or dipeptide composition. Amino acid and dipeptide compositions are simple approaches for producing fixed-length patterns from protein sequences of varying length [8]. Amino acid composition has been used to predict the structural class of domains and the subcellular localization of proteins (9–11). The Matthews correlation coefficient (MCC) is a better parameter for evaluating the performance of a method, as it accounts for both over- and under-predictions. The performance of both classifiers has been estimated through a 5-fold cross-validation test. It was found that various subfamilies of nuclear receptors are correlated with amino acid or dipeptide composition, implying that the subfamilies of nuclear receptors are predictable with high accuracy if good training data can be established. The method is available via the World Wide Web at www.imtech.res.in/raghava/nrpred
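To make the MCC statement concrete, the sketch below computes the coefficient from the four confusion-matrix counts for a single subfamily treated as the positive class. It uses the standard definition, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name `mcc` and the choice to return 0.0 when the denominator is zero are assumptions made here, not details from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn: correctly predicted positives and negatives
    fp: over-predictions (negatives predicted as positive)
    fn: under-predictions (positives predicted as negative)
    Assumption: returns 0.0 when the denominator is zero.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

print(mcc(50, 50, 0, 0))    # perfect prediction
print(mcc(25, 25, 25, 25))  # no better than chance
print(mcc(0, 0, 50, 50))    # every prediction wrong
```

Both `fp` and `fn` enter the numerator and the denominator, which is why the coefficient penalises over-prediction and under-prediction alike. Its value ranges from -1 to 1.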
