Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations

Wan Xiang Shen, Yu Zong Chen, Feng Zhu, Chu Qin, Xian Zeng, Ya Li Wang, Ying Tan, Yu Yang Jiang

doi:10.1038/s42256-021-00301-6

Abstract

Successful deep learning critically depends on the representation of the learned objects. Recent state-of-the-art pharmaceutical deep learning models successfully exploit graph-based de novo learning of molecular representations. Nonetheless, the combined potential of human expert knowledge of molecular representations and convolutional neural networks has not been adequately explored for enhanced learning of pharmaceutical properties. Here we show that broader exploration of human-knowledge-based molecular representations enhances deep learning of pharmaceutical properties. By broadly learning 1,456 molecular descriptors and 16,204 fingerprint features of 8,506,205 molecules, we developed MolMap, a new feature-generation method that maps these molecular descriptors and fingerprint features into robust two-dimensional feature maps. Convolutional-neural-network-based MolMapNet models were constructed for out-of-the-box deep learning of pharmaceutical properties; they outperformed the graph-based and other established models on most of the 26 pharmaceutically relevant benchmark datasets and a novel dataset. MolMapNet learned important features that are consistent with the molecular features reported in the literature.

While deep learning models have allowed the extraction of fingerprints from the structural description of molecules, they can miss information that is present in the molecular descriptors that chemists use. Shen and colleagues present a method to combine both sources of information into two-dimensional fingerprint maps, which can be used in a wide variety of pharmaceutical tasks to predict the properties of drugs.
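The abstract does not spell out how features are arranged on the two-dimensional map, so the following is only a minimal sketch of the general idea, under stated assumptions: features that behave similarly across a set of molecules are placed near each other on a grid, and each molecule is then rendered as an image-like array that a convolutional network could consume. The correlation distance, the classical multidimensional scaling step, the rank-based grid assignment, and all function names (`feature_distance`, `embed_2d`, `assign_grid`, `to_feature_map`) are illustrative choices made here, not the procedure reported for MolMap. The random matrix stands in for real descriptor values.

```python
import numpy as np


def feature_distance(X):
    """Correlation distance between the feature columns of X (molecules x features)."""
    corr = np.nan_to_num(np.corrcoef(X, rowvar=False))
    D = 1.0 - corr
    np.fill_diagonal(D, 0.0)  # a feature is at distance 0 from itself
    return D


def embed_2d(D):
    """Classical MDS: embed a distance matrix into 2D (assumed stand-in embedding)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:2]           # two largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def assign_grid(coords, n_rows, n_cols):
    """Rank-based grid placement: sort by the first coordinate, cut into rows,
    then sort each row by the second coordinate. Empty cells are marked -1."""
    n = coords.shape[0]
    if n > n_rows * n_cols:
        raise ValueError("grid is too small for the number of features")
    order = np.argsort(coords[:, 0], kind="stable")
    grid = -np.ones((n_rows, n_cols), dtype=int)
    for r in range(n_rows):
        chunk = order[r * n_cols:(r + 1) * n_cols]
        chunk = chunk[np.argsort(coords[chunk, 1], kind="stable")]
        grid[r, :len(chunk)] = chunk
    return grid


def to_feature_map(x, grid):
    """Render one molecule's feature vector x as a 2D map using the grid layout."""
    fmap = np.zeros(grid.shape, dtype=float)
    mask = grid >= 0
    fmap[mask] = x[grid[mask]]
    return fmap


# Hypothetical data: 200 molecules with 10 descriptor values each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

grid = assign_grid(embed_2d(feature_distance(X)), n_rows=3, n_cols=4)
fmap = to_feature_map(X[0], grid)  # 3 x 4 map for the first molecule
```

The layout in `grid` is computed once from the whole dataset and then reused for every molecule, so the same descriptor always lands in the same cell. That fixed placement is what would let a convolutional network learn local patterns over related features.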

Full Text