Protein secondary structure prediction (PSSP) is pivotal for predicting tertiary structure, which is in growing demand for drug design and development. It can also be used to study protein function. Although many computational methods for protein structure prediction exist, none has fully solved the problem. This paper proposes a deep learning model, the Cascaded Feature Learning Model (CFLM), which uses a multi-stage transfer learning approach based on the Residual Dense Network (RDN) to predict protein secondary structure. The model is trained on different protein datasets at various levels of transfer learning and accepts selected protein features such as solvent accessibility, physicochemical properties, the position-specific scoring matrix (PSSM), and the position-specific frequency matrix (PSFM) as input. On the CASP 13 and CASP 14 benchmark datasets, the proposed approach achieves Q3 accuracies of 91.23% and 91.45% and Q8 accuracies of 76.83% and 78.04%, respectively. Compared with several recent PSSP approaches, CFLM improves prediction accuracy on the CASP 14 dataset by 2.8% in terms of Q3 and 1.81% in terms of Q8.
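For readers unfamiliar with the metrics, Q3 and Q8 are per-residue accuracies: the percentage of residues whose predicted state matches the reference state, over three or eight secondary structure classes. The following is a minimal sketch of how they are typically computed, not the evaluation code used for CFLM. It assumes labels are given as strings of DSSP eight-state codes and uses one common eight-to-three reduction (H, G, I to helix; E, B to strand; everything else to coil); other reductions exist and the abstract does not say which one was used. All function names are illustrative.

```python
# Assumed reduction from DSSP 8-state codes to 3 states (H, E, C).
# This is one common convention; the paper's abstract does not specify its own.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "C": "C",   # turns, bends, coil
}


def per_residue_accuracy(true_states: str, pred_states: str) -> float:
    """Percentage of positions where the predicted state equals the true state."""
    if len(true_states) != len(pred_states):
        raise ValueError("sequences must have the same length")
    if not true_states:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_states, pred_states))
    return 100.0 * correct / len(true_states)


def to_three_state(states: str) -> str:
    """Map a string of 8-state codes to 3-state codes using the assumed table."""
    return "".join(EIGHT_TO_THREE[s] for s in states)


def q8_accuracy(true_8: str, pred_8: str) -> float:
    """Q8: accuracy over the eight original states."""
    return per_residue_accuracy(true_8, pred_8)


def q3_accuracy(true_8: str, pred_8: str) -> float:
    """Q3: accuracy after reducing both sequences to three states."""
    return per_residue_accuracy(to_three_state(true_8), to_three_state(pred_8))


if __name__ == "__main__":
    # Toy labels made up for illustration only; not data from the paper.
    true_labels = "HHGGEEBTSC"
    pred_labels = "HHHHEEETTC"
    print(f"Q8 = {q8_accuracy(true_labels, pred_labels):.2f}%")
    print(f"Q3 = {q3_accuracy(true_labels, pred_labels):.2f}%")
```

Because several eight-state classes collapse into one three-state class, a prediction that confuses, say, G with H counts as an error for Q8 but not for Q3, which is why Q3 is consistently higher than Q8 on the same predictions.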