Abstract
Molecular structure is one of the basic ingredients used to model the relationship between chemical compounds and their properties. If neural networks are used to build this model, the representation of the molecular structure has to fulfil certain constraints. This study first draws up an inventory of commonly used representations and discusses their applicability in combination with neural networks. Many representations are not easy to obtain or lack information on the atom types. Therefore an alternative molecular structure representation, type distance counting (TDC), is introduced, which is based on the distances between atom pairs and the respective atom types. The potential of TDC is demonstrated by means of a simple experiment in which neural networks were used to predict the boiling points of a small set of alkanes and alkenes represented by TDC. Next, a number of molecular structure representations are compared in a case study: modelling the relationship between chemical compounds and the high performance liquid chromatography retention index with neural networks. TDC outperformed all other representations used. Then, for this particular problem, the neural network performance was determined for a number of possible TDC representations. Finally, these results were compared with the outcome of experiments using the same data sets of chemical compounds and retention indices but other technique/representation combinations: neural networks/fragment coding and expert systems/fragment coding. Neural networks combined with the TDC representation appeared to give the best overall performance.
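The abstract describes TDC only as being based on distances between atom pairs and their atom types, so the following Python sketch is one possible reading, not the authors' definition. It assumes a hydrogen-suppressed molecular graph, takes "distance" to mean the number of bonds on the shortest path, and uses hypothetical atom-type labels such as `C_sp2` and `C_sp3`; the function names and the `max_distance` cut-off are likewise illustrative.

```python
from collections import Counter, deque


def bond_distances(n_atoms, bonds, source):
    """Shortest path length (in bonds) from `source` to every reachable atom."""
    adjacency = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search over the molecular graph
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def tdc_counts(atom_types, bonds, max_distance):
    """Count atom pairs keyed by (type_a, type_b, distance).

    Assumption: the two types are sorted so that the key does not depend
    on atom ordering, and pairs farther apart than `max_distance` are ignored.
    """
    counts = Counter()
    n_atoms = len(atom_types)
    for i in range(n_atoms):
        dist = bond_distances(n_atoms, bonds, i)
        for j in range(i + 1, n_atoms):
            d = dist.get(j)
            if d is not None and d <= max_distance:
                type_a, type_b = sorted((atom_types[i], atom_types[j]))
                counts[(type_a, type_b, d)] += 1
    return counts


def tdc_vector(counts, keys):
    """Fixed-length input vector for a neural network, one entry per key."""
    return [counts.get(key, 0) for key in keys]


# Propene, C=C-C, with hypothetical hybridisation-based atom types.
propene = tdc_counts(["C_sp2", "C_sp2", "C_sp3"], [(0, 1), (1, 2)], max_distance=2)
keys = [
    ("C_sp2", "C_sp2", 1),
    ("C_sp2", "C_sp3", 1),
    ("C_sp2", "C_sp3", 2),
    ("C_sp3", "C_sp3", 1),
]
propene_vector = tdc_vector(propene, keys)
```

Fixing the list of keys in advance gives every compound a vector of the same length, which is the kind of constraint a neural network input has to satisfy; how the original study chose its atom types and distance range is not stated in the abstract.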