Protein secondary structure prediction (PSSP) is considered a challenging task in bioinformatics. Many approaches have been proposed over the last few decades to solve this problem. Despite the improvements achieved, prediction accuracy remains limited. Accurate prediction of protein secondary structure is a critical step in deducing the tertiary structure of proteins and their functions. Among the proposed approaches, artificial neural networks (ANNs) are considered one of the most successful methods and are widely used in the field of PSSP. Recently, many efforts have been devoted to modifying and improving this technology and combining it with other machine learning methods to obtain better results. In this paper, we propose an ensemble method that combines the outputs of four feedforward neural networks. Each network applies one machine learning approach to address the class imbalance problem of protein secondary structure classification. Experimental results on the RS126 data set show that our ensemble system performs better than the best individual classifier. The results also reveal that the proposed system yields a significant improvement in prediction accuracy for beta-sheet structure and a more balanced classification of the three secondary structure classes.
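To illustrate what combining the outputs of four networks can look like, here is a minimal sketch. The abstract does not state the combination rule, so averaging per-class probabilities (soft voting) is an assumption made here, not the paper's confirmed method. The function names `soft_vote` and `ensemble_predict`, the H/E/C class labels, and the probability values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Assumed three-state labels: helix (H), beta-sheet strand (E), coil (C).
CLASSES = ("H", "E", "C")


def soft_vote(prob_rows: Sequence[Sequence[float]]) -> str:
    """Average per-class probabilities from several classifiers, return the top class."""
    if not prob_rows:
        raise ValueError("need at least one classifier output")
    n = len(prob_rows)
    averaged = [sum(row[k] for row in prob_rows) / n for k in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda k: averaged[k])
    return CLASSES[best]


def ensemble_predict(per_network: Sequence[Sequence[Sequence[float]]]) -> str:
    """per_network[m][i] is network m's probability vector for residue i."""
    length = len(per_network[0])
    return "".join(
        soft_vote([net[i] for net in per_network]) for i in range(length)
    )


# Made-up outputs from four hypothetical networks for two residues.
outputs = [
    [[0.6, 0.1, 0.3], [0.2, 0.3, 0.5]],
    [[0.5, 0.2, 0.3], [0.1, 0.6, 0.3]],
    [[0.7, 0.1, 0.2], [0.2, 0.5, 0.3]],
    [[0.4, 0.3, 0.3], [0.3, 0.5, 0.2]],
]
print(ensemble_predict(outputs))  # prints "HE"
```

In this toy input the first network alone would label the second residue as coil, while the averaged vote labels it as beta-sheet. This is the kind of effect an ensemble can have on an under-represented class, although the example says nothing about the accuracy figures reported in the paper.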