A knowledge graph (KG) is a structured representation that models entities and the relations between them. Knowledge graph embedding (KGE) maps these entities and relations into a continuous vector space, yielding dense and efficient representations. In chemistry, KG and KGE techniques integrate heterogeneous chemical information into a coherent and user-friendly framework, enhance the representation of chemical data features, and benefit downstream tasks such as chemical property prediction. This paper begins with a comprehensive review of classical and contemporary KGE methodologies, including distance-based models, semantic matching models, and neural network-based approaches. We then catalogue the primary databases in chemistry and biochemistry that supply KGs with essential chemical data. Subsequently, we explore the latest applications of KG and KGE in chemistry, focusing on risk assessment, property prediction, and drug discovery. Finally, we discuss the current challenges facing KG and KGE techniques and provide a perspective on their potential future developments.