Abstract

Molecular representation and property exploration with neural networks have recently come to play a critical role in drug design and discovery. However, previous research on molecular representation relies heavily on features extracted manually from biological experiments, which can introduce noise into the molecular information and is costly in both time and money. In this paper, we propose a novel method, the Substructural Hierarchical Attention Network (SuHAN), to discover inherent characteristics of molecules for representation learning. Specifically, SuHAN is composed of two cascaded layers: an atom-level layer and a substructure-level layer. A molecule in SMILES format is divided into several substructural fragments by predefined partition rules, and the fragments are then fed through the atom-level layer and the substructure-level layer in succession to obtain feature representations from two perspectives: an atomic view and a substructural view. In this way, prominent structural features that may be missed by global extraction are captured at a fine-grained level and fused to reconstruct a representative pattern of the whole molecule. Experiments on biophysics and physiology datasets demonstrate that our model is competitive, with a significant improvement in both accuracy and stability. The results confirm that substructural segmentation and progressive hierarchical networks lead to an effective molecular representation for downstream tasks, and they offer a novel perspective on reconstructing the overall pattern from prominent local structures.
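
The two-stage flow described above (fragments, then atom-level pooling, then substructure-level pooling) can be illustrated with a minimal sketch. It is not the SuHAN implementation: the abstract does not specify the partition rules, the embeddings, or the form of attention, so everything here is an assumption. `split_fragments` uses a hypothetical fixed-size chunking rule, `embed_token` is a toy stand-in for a learned embedding that treats each SMILES character as a token, and `attention_pool` is plain dot-product attention against a context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Pool vectors into one vector, weighting each by softmax(v . context)."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(context)
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def embed_token(ch, dim=4):
    """Toy deterministic embedding (assumption); a real model would learn this."""
    code = ord(ch)
    return [math.sin(code * (d + 1)) for d in range(dim)]


def split_fragments(smiles, size=3):
    """Hypothetical partition rule: fixed-size character chunks of the SMILES string."""
    return [smiles[i:i + size] for i in range(0, len(smiles), size)]


def encode_molecule(smiles, atom_ctx, sub_ctx):
    """Atom-level pooling inside each fragment, then substructure-level pooling."""
    fragments = split_fragments(smiles)
    frag_vecs = []
    for frag in fragments:
        vec, _ = attention_pool([embed_token(ch) for ch in frag], atom_ctx)
        frag_vecs.append(vec)
    mol_vec, frag_weights = attention_pool(frag_vecs, sub_ctx)
    return mol_vec, fragments, frag_weights


if __name__ == "__main__":
    # Context vectors are arbitrary here; in a trained model they would be learned.
    mol_vec, fragments, weights = encode_molecule(
        "CC(=O)O", [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]
    )
    print(fragments)
    print(weights)
```

The fragment-level weights returned by the second pooling step sum to one, so they can be read as the relative contribution of each fragment to the molecule-level vector under these toy assumptions.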
