Abstract

Because the gap between sequenced proteins and proteins with experimentally determined structures keeps widening, predicting protein structural classes has become an important problem. Sequence and structure are closely related, so class predictors depend on informative sequence-derived parameters. In this study, a multinomial logistic regression model was used for the first time to evaluate the contribution of sequence parameters to determining the protein structural class. An in-house program generated the candidate parameters, which comprised the composition frequencies of single amino acids and of all dipeptides. The most effective parameters were then selected by multinomial logistic regression. The selected variables were valine among the single amino acid composition frequencies and Ala–Gly, Cys–Arg, Asp–Cys, Glu–Tyr, Gly–Glu, His–Tyr, Lys–Lys, Leu–Asp, Leu–Arg, Pro–Cys, Gln–Met, Gln–Thr, Ser–Trp, Val–Asn and Trp–Asn among the dipeptide composition frequencies. A neural network model was also constructed and fed with the parameters selected by multinomial logistic regression, yielding a hybrid predictor. Self-consistency and jackknife tests on a database of 498 proteins constructed by Zhou [1998. An intriguing controversy over protein structural class prediction. J. Protein Chem. 17(8), 729–738] were used to verify the performance of this hybrid method, and the results were compared with those of several prior works. The results show that the two-stage hybrid approach is promising and may play a complementary role to existing powerful approaches.
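
The composition parameters described above can be illustrated with a minimal sketch. This is not the authors' in-house program, which the abstract does not describe in detail. The function name `composition_features`, the use of overlapping adjacent pairs for dipeptides, and the choice to skip non-standard residue codes are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return (single, dipeptide) composition frequencies for a sequence.

    Assumptions (not taken from the paper): dipeptides are overlapping
    adjacent residue pairs, and residues or pairs containing non-standard
    codes (e.g. 'X') are skipped before normalizing.
    """
    seq = sequence.upper()
    single = {aa: 0.0 for aa in AMINO_ACIDS}
    dipep = {a + b: 0.0 for a, b in product(AMINO_ACIDS, repeat=2)}

    # Count standard residues.
    residues = [c for c in seq if c in single]
    for c in residues:
        single[c] += 1.0

    # Count overlapping adjacent pairs made of standard residues only.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1) if seq[i:i + 2] in dipep]
    for p in pairs:
        dipep[p] += 1.0

    # Normalize counts to frequencies; leave zeros if nothing was counted.
    if residues:
        single = {k: v / len(residues) for k, v in single.items()}
    if pairs:
        dipep = {k: v / len(pairs) for k, v in dipep.items()}
    return single, dipep


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    single, dipep = composition_features("AGAG")
    print(single["A"], dipep["AG"])
```

Under these assumptions each protein yields 20 single-residue and 400 dipeptide frequencies, i.e. 420 candidate parameters from which a selection step such as multinomial logistic regression would keep a small subset. For the toy sequence `AGAG`, the frequency of `A` is 0.5 and the frequency of the dipeptide `AG` is 2/3.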
