Abstract
In recent years, deep learning (DL) techniques have been applied to the structural and functional analysis of proteins in bioinformatics, and in particular to 8-state (Q8) protein secondary structure prediction (PSSP). In this paper, we explore the performance of DL architectures for Q8 PSSP by developing six models built from convolutional neural networks (CNNs), recurrent neural networks (RNNs), and combinations of the two. The architectures are: CNN-SW (CNNs with a sliding window); CNN-WP (CNNs with the whole protein as input); LSTM+ (Long Short-Term Memory (LSTM) and bidirectional LSTM (BLSTM)); GRU+ (Gated Recurrent Unit (GRU) and bidirectional GRU (BGRU)); CNN-BGRU (CNNs and BGRUs); and CNN-BLSTM (CNNs and BLSTMs). All of them include batch normalization, dropout, and fully-connected layers. We used the CB6133 dataset for training and the CB513 dataset for testing. The experiments showed that combining CNNs with BLSTMs or BGRUs overcame overfitting and achieved better Q8 accuracy, precision, recall, and F-score. On CB513, CNN-SW, CNN-BGRU, and CNN-BLSTM achieved Q8 accuracy comparable with that of some state-of-the-art models.
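The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: it assumes that true and predicted structures are given as equal-length strings over an eight-letter DSSP-style alphabet (`HGIEBTSL`, with `L` standing for loop/other), and the names `Q8_STATES`, `q8_accuracy`, and `per_class_scores` are hypothetical. Q8 accuracy is taken here as the fraction of residues whose predicted state matches the true state, pooled over all proteins.

```python
from collections import Counter

# Assumed label alphabet: the eight DSSP-style states, with L for loop/other.
Q8_STATES = "HGIEBTSL"


def q8_accuracy(true_seqs, pred_seqs):
    """Fraction of residues whose predicted state equals the true state."""
    correct = 0
    total = 0
    for true, pred in zip(true_seqs, pred_seqs):
        if len(true) != len(pred):
            raise ValueError("true and predicted sequences differ in length")
        correct += sum(t == p for t, p in zip(true, pred))
        total += len(true)
    return correct / total if total else 0.0


def per_class_scores(true_seqs, pred_seqs):
    """Return {state: (precision, recall, f_score)} for each of the 8 states."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(true_seqs, pred_seqs):
        for t, p in zip(true, pred):
            if t == p:
                tp[t] += 1
            else:
                fn[t] += 1  # the true state was missed
                fp[p] += 1  # the predicted state was wrong
    scores = {}
    for s in Q8_STATES:
        predicted = tp[s] + fp[s]
        actual = tp[s] + fn[s]
        precision = tp[s] / predicted if predicted else 0.0
        recall = tp[s] / actual if actual else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        scores[s] = (precision, recall, f_score)
    return scores
```

Calling `q8_accuracy(["HHHEEL"], ["HHEEEL"])` compares the two strings residue by residue. In practice, padded positions in fixed-length inputs would need to be masked out before counting, which this sketch leaves to the caller.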