This study focused on predicting linear B-cell epitopes (BCEs) using a dataset of 50,000 experimentally validated linear epitopes from IEDB. The analysis showed that sequence-derived features were crucial for BCE prediction, and a deep learning framework was developed to improve feature representation and efficiency. Deep neural networks, despite their power, require regularization techniques to perform well; however, applying dropout to the crucial recurrent connections of networks such as LSTMs can disrupt information flow. To address this, we introduce a novel method based on selective gradient dropout for deep fully connected layers. This method selectively freezes specific connections during training, improving network sparsity and efficiency while maintaining performance. Experimental results confirm its effectiveness in mitigating overfitting and enhancing computational efficiency, and show that the resulting sparse network outperforms traditional methods in classification tasks, achieving an average accuracy of 86.3%, a significant improvement over existing tools.
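The core idea of selective gradient dropout, as described above, is to freeze a subset of connections during each training step by zeroing their gradients. A minimal NumPy sketch of one possible interpretation follows; the function name, the random selection criterion, and the `freeze_rate` parameter are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def selective_gradient_dropout(grad, freeze_rate, rng):
    """Zero out a random subset of gradient entries, effectively freezing
    those weights for the current update step.

    Illustrative sketch only: the actual selection rule in the paper may
    differ (e.g. magnitude-based rather than random).
    """
    keep_mask = rng.random(grad.shape) >= freeze_rate  # keep ~(1 - freeze_rate) of entries
    return grad * keep_mask

# Toy usage on a 4x3 fully connected layer's gradient.
rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 3))
grad = rng.standard_normal((4, 3))

sparse_grad = selective_gradient_dropout(grad, freeze_rate=0.5, rng=rng)
weights -= 0.01 * sparse_grad  # frozen entries receive no update this step
```

Frozen entries contribute nothing to the update, so repeated application yields the sparser effective network the abstract refers to, without removing weights outright.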