Abstract
Protein secondary structure prediction is an important topic in bioinformatics. This paper proposes a novel model named WS-BiLSTM, which combines the wavelet scattering convolutional network and the long-short-term memory network for the first time to predict protein secondary structure. The model captures nonlocal interactions between amino acid sequences and remembers long-range interactions between amino acids. In WS-BiLSTM, the wavelet scattering convolutional network extracts protein features from the PSSM sliding window; the extracted features are then combined with the original PSSM data as the input features of the long-short-term memory network, which predicts the protein secondary structure. It is worth noting that the wavelet scattering convolutional network, as a member of the continuous wavelet family, is asymmetric. The Q3 accuracy on the test sets CASP9, CASP10, CASP11, CASP12, CB513, and PDB25 reached 85.26%, 85.84%, 84.91%, 85.13%, 86.10%, and 85.52%, respectively, which is 2.15%, 2.16%, 3.5%, 3.19%, 4.22%, and 2.75% higher than using the long-short-term memory network alone. A comparison with state-of-the-art methods shows that the proposed model achieved better results on the CB513 and CASP12 data sets. The experimental results show that the features extracted by the wavelet scattering convolutional network can effectively improve the accuracy of protein secondary structure prediction.
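As a rough illustration of the input construction described in the abstract, the sketch below builds one sliding window per residue from a PSSM and concatenates window-derived features with the original PSSM rows. The window size of 13, the zero padding at the sequence ends, and the `placeholder_features` function are assumptions made for illustration. In particular, `placeholder_features` is a simple mean and standard-deviation summary standing in for the wavelet scattering step; it is not the transform used in the paper.

```python
import numpy as np


def sliding_windows(pssm, window=13):
    """Return one (window x 20) block per residue, zero-padding the sequence ends.

    The window size is an assumed value, not taken from the paper.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a residue")
    half = window // 2
    padded = np.pad(pssm, ((half, half), (0, 0)), mode="constant")
    return np.stack([padded[i:i + window] for i in range(pssm.shape[0])])


def placeholder_features(windows):
    """Hypothetical stand-in for the wavelet scattering step.

    Summarises each window by the per-column mean and standard deviation.
    """
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)


def build_inputs(pssm, window=13):
    """Concatenate the original PSSM rows with the window-derived features."""
    windows = sliding_windows(pssm, window)
    extracted = placeholder_features(windows)
    return np.concatenate([pssm, extracted], axis=1)


# Synthetic PSSM for a 50-residue protein (random values, for shape checking only).
rng = np.random.default_rng(0)
pssm = rng.normal(size=(50, 20))
features = build_inputs(pssm)
print(features.shape)  # (50, 60)
```

Each row of `features` would then serve as the per-residue input to a sequence model such as a bidirectional LSTM; the first 20 columns are the unchanged PSSM scores.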
Highlights
Protein is an essential component of organisms and carries out immunity, cellular signal transmission, and other functions
This paper proposes a protein secondary structure prediction method based on the wavelet scattering convolutional network and long-short-term memory network
In order to evaluate the accuracy of the proposed model and verify the effectiveness of the wavelet scattering convolutional network for feature extraction, two separate experiments were set up to predict the protein secondary structure
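For reference, Q3 accuracy is the percentage of residues whose three-state label (helix, strand, or coil) is predicted correctly. A minimal sketch of the metric follows; the single-letter labels `H`, `E`, and `C` and the example strings are illustrative assumptions, not data from the paper.

```python
def q3_accuracy(true_ss, pred_ss):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted sequences must have the same length")
    if len(true_ss) == 0:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


# Made-up example: 8 of the 10 residues are predicted correctly.
true_ss = "HHHHEEEECC"
pred_ss = "HHHCEEEECH"
print(q3_accuracy(true_ss, pred_ss))  # 80.0
```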
Summary
Protein is an essential component of organisms and carries out immunity, cellular signal transmission, and other functions. Protein structure can be divided into primary, secondary, tertiary, and quaternary structures. Inspired by their great success in computer vision [1], speech recognition [2], and emotion classification [3], deep learning methods have been widely used in many biological research fields [4,5]. Examples include protein contact maps [6], drug-target binding affinity [7,8], chromatin accessibility [9], and protein function [10,11]; Support Vector Machines (SVM) have also been used to solve the problem of protein structure prediction [12]. The main advantage of deep learning is that it can automatically represent the original sequence and learn hidden patterns through nonlinear transformation [13].