Abstract

Machine-learning-inspired potentials continue to improve our ability to predict the structures of materials. Many challenges remain, however, particularly when calculating structures of disordered systems. These challenges arise primarily because, in most machine-learning algorithms, the dimensionality of the feature-vector space grows rapidly with the size of the structure. In this article, we present a feature-engineered approach that establishes a set of principles for representing potentials of physical structures (crystals, molecules, and clusters) in a feature space rather than a physically motivated space. Our goal in this work is to define guiding principles that optimize the storage of physical information within the feature representations. In particular, we keep the dimensionality of the feature space independent of the number of atoms in the structure. Our Structural Information Filtered Features (SIFF) potential represents structures by a feature vector of low-correlated descriptors, which correspondingly maximizes the information carried by each descriptor. We present results of our SIFF potential on datasets composed of disordered (carbon and carbon–oxygen) clusters, molecules with C7O2H2 stoichiometry in the GDB9-14B dataset, and crystal structures of the form (AlxGayInz)2O3 as proposed in the NOMAD Kaggle competition. Our potential's performance is at least comparable to, sometimes significantly more accurate than, and often more efficient than that of other well-known machine-learning potentials for structure prediction. Above all, however, we offer a different perspective on how researchers can maximize the information stored in features.
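To make the size-independence principle concrete, the following minimal sketch uses a pairwise-distance-histogram featurization and a whitening step as illustrative stand-ins (these are assumptions for demonstration, not the actual SIFF construction): structures with different atom counts all map to a feature vector of the same fixed length, whose components are then decorrelated so each retained descriptor carries non-redundant information.

```python
import numpy as np

def fixed_size_features(positions, n_bins=32, r_max=6.0):
    """Map a structure with any number of atoms to a fixed-length
    descriptor built from its pairwise-distance distribution.
    (Illustrative featurization only, not the SIFF descriptors.)"""
    n = len(positions)
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    pairs = dists[np.triu_indices(n, k=1)]          # unique interatomic distances
    hist, _ = np.histogram(pairs, bins=n_bins, range=(0.0, r_max), density=True)
    return hist  # length n_bins, independent of the atom count n

def decorrelate(X, tol=1e-10):
    """Whiten a feature matrix so its columns are mutually uncorrelated,
    maximizing the non-redundant information per retained component."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    keep = eigval > tol                              # drop near-degenerate directions
    return Xc @ (eigvec[:, keep] / np.sqrt(eigval[keep]))

# Structures of very different sizes all map to the same feature dimension.
rng = np.random.default_rng(0)
structures = [rng.uniform(0.0, 5.0, size=(n, 3)) for n in (10, 37, 64, 120)]
X = np.stack([fixed_size_features(p) for p in structures])  # shape (4, 32)
Z = decorrelate(X)                                  # uncorrelated descriptor columns
```

Any permutation- and size-invariant summary would serve the same role here; the point is only that the descriptor length is set by the featurization, not by the number of atoms in the structure.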
