Atom typing is the first step in simulating molecules with a force field. Automatic atom typing for an arbitrary molecule is usually implemented with rule-based algorithms, which require rules to be encoded manually for every type defined in the force field. Such algorithms are time-consuming to build and specific to a single force field. In this study, a method for automatic atom typing that is independent of any specific force field is established, based on graph representation learning. The topology adaptive graph convolution network (TAGCN) is found to be an optimal model. The model needs no manual enumeration of rules; it learns the rules through training on the typed molecules prepared during the development of a force field. A test on the CHARMM General Force Field gives a typing accuracy of 91%. A systematic typing error of TAGCN is its inability to distinguish atom types in rings from those in acyclic chains. This error originates from the fundamental structure of graph neural networks and can be fixed in a trivial way. More importantly, analysis of the rationalization processes of these models using layer-wise relevance propagation reveals how TAGCN encodes the rules learned during training. The model is found to assign types from local chemical environments, in a way that agrees closely with chemists' intuition.
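The ring-versus-chain limitation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single topology adaptive convolution of the common form sum over k of A_norm^k X W_k with K = 2, a carbon-only graph with identical input features, and random untrained weights. The helper names `normalized_adjacency` and `tag_conv` are hypothetical. The ring-membership flag at the end is one possible fix assumed for illustration; the abstract does not say which fix the authors use.

```python
import numpy as np

def normalized_adjacency(edges, n):
    """Symmetrically normalized adjacency D^-1/2 A D^-1/2 of an undirected graph."""
    a = np.zeros((n, n))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    inv_sqrt_deg = 1.0 / np.sqrt(a.sum(axis=1))
    return a * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

def tag_conv(a_norm, x, weights):
    """One topology adaptive graph convolution: sum_k (A_norm^k X) W_k."""
    out = np.zeros((x.shape[0], weights[0].shape[1]))
    h = x
    for w in weights:          # k = 0 .. K
        out += h @ w
        h = a_norm @ h         # move one hop further out
    return out

rng = np.random.default_rng(0)
K = 2

# Six-membered ring and a 13-atom chain; atom 6 is the middle of the chain.
ring_edges = [(i, (i + 1) % 6) for i in range(6)]
chain_edges = [(i, i + 1) for i in range(12)]
a_ring = normalized_adjacency(ring_edges, 6)
a_chain = normalized_adjacency(chain_edges, 13)

# Case 1: every atom carries the same feature (assumption: carbon only).
w1 = [rng.normal(size=(1, 4)) for _ in range(K + 1)]
ring_out = tag_conv(a_ring, np.ones((6, 1)), w1)
chain_out = tag_conv(a_chain, np.ones((13, 1)), w1)
same_embedding = np.allclose(ring_out[0], chain_out[6])

# Case 2: append a ring-membership flag as a second input feature
# (illustrative fix, assumed here; not taken from the paper).
w2 = [rng.normal(size=(2, 4)) for _ in range(K + 1)]
x_ring = np.hstack([np.ones((6, 1)), np.ones((6, 1))])     # flag = 1 in ring
x_chain = np.hstack([np.ones((13, 1)), np.zeros((13, 1))])  # flag = 0 in chain
flagged_differs = not np.allclose(
    tag_conv(a_ring, x_ring, w2)[0], tag_conv(a_chain, x_chain, w2)[6]
)

print(same_embedding, flagged_differs)
```

In the first case the ring atom and the mid-chain atom see identical K-hop neighborhoods, so the layer cannot separate them whatever the weights are. Once ring membership is supplied as an input feature, the two embeddings can differ.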
