Molecular property prediction (MPP) is a fundamental task in chemistry, materials science, biology, and medicine, where traditional computational chemistry methods based on quantum mechanics often consume substantial time and computing power. In recent years, machine learning has been increasingly used in computational chemistry, and graph neural networks in particular have performed well on MPP tasks, but they remain limited in generalizability, interpretability, and certainty. To address these challenges, this paper proposes a Multiscale Molecular Structural Neural Network (MMSNet). First, MMSNet obtains rich multiscale molecular representations by fusing information from bonded and non-bonded "message passing" structures at the atomic scale with spatial feature information from "encoder-decoder" structures at the molecular scale. Second, a multi-level attention mechanism, grounded in a theoretical analysis of molecular mechanics, is introduced to enhance the model's interpretability. Third, the prediction results of MMSNet are used as label values and clustered in the molecular library by the K-nearest neighbors (K-NN) algorithm to reverse-match the spatial structures of molecules, and the certainty of the model is quantified by comparing virtual screening results across different values of K. Experimental results on the authoritative small-molecule dataset QM9 and the macromolecular protein database PDBbind show that MMSNet outperforms more than ten existing state-of-the-art (SOTA) models in prediction accuracy, model complexity, and generalizability across a variety of prediction tasks, indicating great potential for downstream tasks such as chemical research, drug discovery, and material design.
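The K-NN reverse-matching step can be illustrated with a minimal sketch. It is not the MMSNet implementation: it assumes a one-dimensional property value per molecule, uses absolute difference as the distance, and summarizes each K by the mean gap between the prediction and its K matches. The function names `knn_match` and `mean_gap_by_k`, the toy library, and the gap statistic itself are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_match(predicted_value, library_values, k):
    """Indices of the k library molecules whose property value is closest to the prediction."""
    gaps = np.abs(np.asarray(library_values, dtype=float) - predicted_value)
    # Stable sort keeps library order among ties, so results are reproducible.
    return np.argsort(gaps, kind="stable")[:k]


def mean_gap_by_k(predicted_value, library_values, k_values):
    """For each K, the mean absolute gap between the prediction and its K matches."""
    values = np.asarray(library_values, dtype=float)
    result = {}
    for k in k_values:
        matches = knn_match(predicted_value, values, k)
        result[k] = float(np.mean(np.abs(values[matches] - predicted_value)))
    return result


# Hypothetical library of property values and one hypothetical model prediction.
library = [0.0, 1.0, 2.0, 3.0, 10.0]
print(knn_match(1.2, library, k=2))
print(mean_gap_by_k(1.2, library, k_values=[1, 2, 3]))
```

Under these assumptions, a gap that grows slowly as K increases suggests the prediction sits in a well-populated region of the library, whereas a sharp rise suggests the matches are sparse; how the paper actually compares screening results across K-values is not specified in the abstract.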