Specifics of developing fully connected neural networks for estimating the lipophilicity of organic compounds

Boris I Piakillia, Valery I Goncharov

doi:10.21293/1818-0442-2024-27-1-86-94

Abstract

Assessing the lipophilicity of small organic compounds plays a crucial role in the development and optimization of new drugs. Experimental methods, however, require significant time and resources, including laboratory equipment and reagents, and manual verification and adjustment of the data often make the process more labor-intensive still. Computational methods such as machine learning offer a faster and less resource-intensive alternative: they can process large volumes of data efficiently and adapt to complex relationships between molecular structure and lipophilicity. Developing neural network models for lipophilicity assessment is nevertheless challenging, because experimental data are scarce and graph neural network models are computationally expensive. This work analyzes popular methods of describing chemical structures for use in fully connected neural network models, which are less demanding in the volume of training data. Based on this analysis, the features that best describe the organic compounds in an open lipophilicity dataset collected from the ChEMBL database are selected. A search is then conducted for the optimal neural network architecture for the chosen features.
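To make the idea of a fully connected model over molecular features concrete, the following is a minimal sketch in plain Python and is not the authors' implementation. It assumes each compound has already been converted into a fixed-length numeric feature vector; the feature count, the candidate layer sizes, and the helper names `init_mlp`, `forward`, and `count_parameters` are hypothetical and chosen only for illustration. The network is untrained, so its output is not a meaningful lipophilicity estimate.

```python
import random


def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for a fully connected network.

    layer_sizes is e.g. [n_features, 16, 1]; the single output unit stands
    for the predicted lipophilicity value (illustrative assumption).
    """
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # He-style scaling for ReLU layers
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output for regression."""
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x[0]


def count_parameters(layer_sizes):
    """Number of trainable weights and biases for a given architecture."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


# Hypothetical setup: 8 features per compound, two candidate architectures.
n_features = 8
candidates = [[n_features, 16, 1], [n_features, 32, 16, 1]]
feature_vector = [0.5] * n_features  # placeholder values, not real descriptors

for sizes in candidates:
    model = init_mlp(sizes)
    print(sizes, count_parameters(sizes), forward(model, feature_vector))
```

In an actual architecture search, each candidate would be trained on the dataset and compared by a validation error; here the candidates are compared only by parameter count to keep the sketch short.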

Full Text