Abstract
Machine learning helps accelerate drug discovery, and the design of effective machine learning models is crucial for accurately predicting molecular properties. Molecules are typically characterized by molecular fingerprints and molecular graphs, which are fed into a multilayer perceptron (MLP) and variants of graph neural networks, such as graph attention networks (GATs), respectively. Because fingerprints are diverse in type and high-dimensional, models may contain many irrelevant or redundant features. Meanwhile, although the GAT excels at heterogeneous graph tasks, it cannot extract collaborative information from neighboring nodes, so it fails to capture the joint influence of adjacent groups on an atom. To overcome these challenges, we introduce a hybrid model that combines an improved GAT with an MLP. In the improved GAT, a recurrent neural network is employed to capture collaborative information. To address the dimensionality issue, we propose a feature selection algorithm based on the principle of maximizing relevance while minimizing redundancy. In experiments on 13 public data sets and 14 breast cell lines, our model outperforms state-of-the-art deep learning and traditional machine learning algorithms. Additionally, a series of ablation experiments demonstrates the advantages of the improved version, as well as its robustness to noise and its interpretability. These results indicate that our model holds promising prospects for practical applications.
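The principle of maximizing relevance while minimizing redundancy can be illustrated with a short greedy selection sketch. This is not the algorithm proposed in the paper: the function name `mrmr_select`, the use of absolute Pearson correlation as the relevance and redundancy measure, and the difference-based score are all assumptions made for illustration only.

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedily pick up to k column indices of X.

    Assumption: relevance and redundancy are both measured with absolute
    Pearson correlation; the paper's own measure may differ.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape

    # Standardize columns; constant columns keep a unit divisor to avoid 0/0.
    Xc = X - X.mean(axis=0)
    x_std = Xc.std(axis=0)
    x_std[x_std == 0] = 1.0
    Xs = Xc / x_std

    yc = y - y.mean()
    y_std = yc.std()
    ys = yc / (y_std if y_std > 0 else 1.0)

    relevance = np.abs(Xs.T @ ys) / n_samples   # |corr(feature, target)|
    pairwise = np.abs(Xs.T @ Xs) / n_samples    # |corr(feature_i, feature_j)|

    # Start from the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        # Mean correlation of each candidate with the features already chosen.
        redundancy = pairwise[:, selected].mean(axis=1)
        score = relevance - redundancy
        score[selected] = -np.inf               # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected

# Toy data: column 1 nearly duplicates column 0, column 2 carries
# independent signal, and column 3 is pure noise.
rng = np.random.default_rng(0)
a, b, n1, n2 = rng.normal(size=(4, 500))
X = np.column_stack([a, a + 0.1 * n1, b, n2])
y = a + 0.5 * b
print(mrmr_select(X, y, k=2))
```

In this toy setting the near-duplicate column is penalized for its redundancy with the first pick, so the second pick is the independent signal column rather than the second-most-relevant one. That is the behavior the relevance-versus-redundancy trade-off is meant to produce on high-dimensional fingerprints.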