Abstract
The solubility of proteins stands as a pivotal factor in the realm of pharmaceutical research and production. Addressing the imperative to enhance production efficiency and curtail experimental costs, the demand arises for computational models adept at accurately predicting solubility based on provided datasets. Prior investigations have leveraged deep learning models and feature engineering techniques to distill features from raw protein sequences for solubility prediction. However, these methodologies have not thoroughly delved into the interdependencies among features or their respective magnitudes of significance. This study introduces HybridGCN, a pioneering Hybrid Graph Convolutional Network that elevates solubility prediction accuracy through the combination of diverse features, encompassing sophisticated deep-learning features and classical biophysical features. An exploration into the intricate interplay between deep-learning features and biophysical features revealed that specific biophysical attributes, notably evolutionary features, complement features extracted by advanced deep-learning models. Augmenting the model’s capability for feature representation, we employed ESM, a substantial protein language model, to derive a zero-shot learning feature capturing comprehensive and pertinent information concerning protein functions and structures. Furthermore, we proposed a novel feature fusion module termed Adaptive Feature Re-weighting (AFR) to integrate multiple features, thereby enabling the fine-tuning of feature importance. Ablation experiments and comparative analyses attest to the efficacy of the HybridGCN approach, culminating in state-of-the-art performances on the public eSOL and S. cerevisiae datasets.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.