Protein secondary structure prediction (PSSP) plays a crucial role in elucidating protein functions and properties. Current prediction methods rely mainly on deep learning, with particular attention to multi-factor features. Unlike existing approaches, this study places special emphasis on the effects of amino acid properties and protein secondary structure propensity scores (SSPs) on secondary structure when selecting multi-factor features. This feature-selection strategy yields a distinctive and effective combination of sequence and property features. To exploit these multi-factor features, we introduce a hybrid deep feature extraction model. The model first applies dilated convolution (D-Conv) and a channel attention network (SENet) for local feature extraction and targeted channel enhancement. It then combines two recurrent neural network variants (BiGRU and BiLSTM) with a transformer module to capture global bidirectional context and enhance the features. This combination of multi-factor feature input and multi-level feature processing enables a comprehensive exploration of the intricate associations among amino acid residues in protein sequences, yielding a Q3 accuracy of 84.9% and a segment overlap (Sov) score of 85.1%. The overall performance surpasses that of comparable methods. This study introduces a novel and efficient method for PSSP, which is poised to deepen our understanding of the practical applications of protein molecular structures.
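For readers unfamiliar with the Q3 metric reported above, the following is a minimal sketch of how three-state accuracy is conventionally computed: the fraction of residues whose predicted state (helix H, strand E, or coil C) matches the observed state. The function name `q3_accuracy` and the example sequences are hypothetical illustrations, not taken from this study's code or data.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted three-state label
    (H, E, or C) equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count position-wise agreements between the two label strings.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 10-residue example (not data from the study).
observed = "CHHHHEEECC"
predicted = "CHHHCEEECC"
print(f"Q3 = {q3_accuracy(predicted, observed):.1%}")
```

In this sketch Q3 is computed per sequence; averaging over a test set can be done per residue or per protein, and the abstract does not state which convention applies.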