Abstract

Protein secondary structure (PSS) prediction is an active research topic in bioinformatics. Knowing the secondary structure helps to predict the tertiary structure of a protein and to understand its structure, which in turn aids drug design. Existing PSS prediction techniques achieve a Q3 accuracy of nearly 80%, with little improvement beyond that level so far. In this paper, we propose a novel technique that uses amino acid sequences alone as the input feature; the corresponding feature vector matrix is fed to a deep learning model (DLM) for PSS prediction. We combine one-hot encoding (OneHotEncoding) with a Long Short-Term Memory (LSTM) network to predict PSS with higher accuracy. The one-hot encoder is used to extract the local context of amino acid sequences, and the LSTM captures long-distance interdependencies among amino acids. The model is implemented in MATLAB 2020a. Its performance is evaluated in terms of precision, recall, F1-score, and the accuracy of both three-state (Q3) and eight-state (Q8) secondary structure predictions. The proposed scheme achieves Q3 accuracies of 86.54%, 85.2%, and 85.7% and Q8 accuracies of 77.8%, 72.5%, and 74.9% on the benchmark datasets CullPDB, CASP10, and CASP11, respectively. Compared with other baseline models for PSS prediction, the proposed scheme reduces computation time and achieves better accuracy.
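
To make the input representation and the accuracy metric concrete, the following is a minimal Python sketch of one-hot encoding an amino acid sequence and computing per-residue accuracy (Q3 or Q8, depending on the label alphabet). It is an illustration only and not the authors' MATLAB implementation: the residue ordering in `AMINO_ACIDS`, the all-zero row for unknown residues, and the function names `one_hot_encode` and `per_residue_accuracy` are assumptions made for this example.

```python
# Illustrative sketch only; the paper's model is implemented in MATLAB 2020a.
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return a len(sequence) x 20 matrix of 0/1 values.

    Assumption: residues outside the 20-letter alphabet (e.g. 'X')
    are encoded as an all-zero row.
    """
    matrix = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        matrix.append(row)
    return matrix


def per_residue_accuracy(predicted, observed):
    """Percentage of positions where predicted and observed labels agree.

    With a 3-letter label alphabet this is Q3; with 8 labels it is Q8.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Example: a 3-residue sequence becomes a 3 x 20 feature matrix.
features = one_hot_encode("MKV")

# Example: 3 of 4 residues predicted correctly.
q3 = per_residue_accuracy("HHEC", "HHCC")
```

In a pipeline of the kind described above, a matrix like `features` would be the per-residue input to the sequence model, and `per_residue_accuracy` would be applied to its predicted labels against the reference annotation.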
