Predicting olfactory perceptions from odorant molecules is challenging due to the complex and potentially discontinuous nature of the perceptual space for smells. In this study, we introduce a deep learning model, Mol-PECO (Molecular Representation by Positional Encoding of Coulomb Matrix), designed to predict olfactory perceptions from molecular structures and electrostatics. Mol-PECO learns an efficient embedding of molecules by using the Coulomb matrix, which encodes atomic coordinates and charges, as an alternative to the adjacency matrix, and by using the Laplacian eigenfunctions of the Coulomb matrix as positional encodings of atoms. On a comprehensive dataset of odor molecules and descriptors, Mol-PECO outperforms both traditional machine learning methods that use molecular fingerprints and graph neural networks based on adjacency matrices. The embeddings learned by Mol-PECO effectively capture the odor space, enabling global clustering of descriptors and local retrieval of similar odorants. This work contributes to a deeper understanding of the olfactory sense and its mechanisms.
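To make the two ingredients named above concrete, the following is a minimal sketch, not the Mol-PECO implementation. It assumes the standard Coulomb matrix definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j / |R_i − R_j|) and assumes that positional encodings are taken as eigenvectors of a symmetric normalized Laplacian built from the off-diagonal entries. The function names, the choice of normalization, and the water geometry are illustrative assumptions.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Standard Coulomb matrix from nuclear charges and 3D coordinates."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between atoms.
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder to avoid division by zero
    cm = np.outer(z, z) / dist
    np.fill_diagonal(cm, 0.5 * z ** 2.4)  # self-interaction term
    return cm

def laplacian_positional_encoding(cm, k):
    """First k eigenpairs of a normalized Laplacian of the Coulomb matrix.

    Assumption: off-diagonal entries act as edge weights of a fully
    connected graph; requires at least two atoms.
    """
    w = cm.copy()
    np.fill_diagonal(w, 0.0)
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

# Illustrative water-like geometry (O, H, H); coordinates are approximate.
charges = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]]
cm = coulomb_matrix(charges, coords)
vals, vecs = laplacian_positional_encoding(cm, k=2)
```

Each row of `vecs` would serve as a per-atom positional feature. Unlike a binary adjacency matrix, the weights here depend on 3D geometry and charge, which is the motivation the abstract gives for the substitution.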