Abstract

<span>Predicting a protein’s secondary structure is crucial for understanding how proteins function. Despite advances over the years, the top predictors have achieved only 80% Q8 accuracy when sequence profile information is the sole input. An ensemble approach is proposed that combines a convolutional neural network (CNN) with a support vector machine (SVM) classifier, applied to both the partial and the whole CullPDB datasets. Protein secondary structure (PSS) has a complex hierarchical structure, and the approach takes into account the dependence between neighbouring labels. A detailed experiment yields Q8 accuracies of 97.91%, 85.13%, and 78.02% using 20%, 80%, and 100% of the protein residues, respectively, on the CullPDB6133 dataset, which is better than the accuracies achieved by similar models. The proposed methodology highlights the use of a CNN as a general framework for efficiently predicting eight-state (Q8) protein secondary structure with low time and space complexity.</span>
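
For readers unfamiliar with the metric, Q8 accuracy is the fraction of residues whose predicted eight-state label matches the observed label. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name `q8_accuracy` and the single-letter alphabet `HGIEBTSL` (with `L` for loop/coil) are assumptions made for illustration.

```python
# Illustrative sketch only: the function name and label alphabet are assumptions,
# not taken from the paper.
Q8_LABELS = set("HGIEBTSL")  # assumed eight-state alphabet, 'L' standing for loop/coil


def q8_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted label equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    unknown = (set(predicted) | set(observed)) - Q8_LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Count per-residue agreements and normalise by sequence length.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example: 6 of 8 residues agree.
print(q8_accuracy("HHHGEELT", "HHHHEELL"))  # 0.75
```

A reported score such as 78.02% corresponds to this ratio computed over all evaluated residues and expressed as a percentage.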
