Abstract

We have developed an actor-critic, policy-based reinforcement learning (RL) method to find low-energy nanoparticle structures and compared its effectiveness to classical basin-hopping. We took a molecule-building approach in which nanoalloy particles are regarded as metallic molecules, albeit with much greater structural flexibility. We explore the strengths of our approach by tasking an agent with the construction of stable mono- and bimetallic clusters. Guided by the underlying physics, an appropriate reward function and an equivariant molecular graph representation framework are used to learn the policy. The agent succeeds in finding well-known stable configurations for small clusters in both single- and multi-cluster experiments. However, in certain use cases the agent fails to generalize and overfits. We relate this to the pitfalls of actor-critic methods for molecular design and discuss which learning properties an agent will require to achieve universality.
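For context, the classical basin-hopping baseline mentioned above alternates random atomic displacements with local relaxations and a Metropolis accept/reject step. The following is a minimal, illustrative sketch of such a baseline, using ASE's BasinHopping driver and an EMT potential on a small Cu cluster; these implementation choices (cluster size, potential, hopping parameters) are assumptions for illustration, not the authors' actual setup.

from ase.cluster import Icosahedron
from ase.calculators.emt import EMT
from ase.optimize.basin import BasinHopping
from ase.units import kB

# Hypothetical starting point: a perturbed 13-atom Cu icosahedron,
# so the global search has nontrivial work to do.
cluster = Icosahedron('Cu', noshells=2)
cluster.rattle(stdev=0.3, seed=42)
cluster.calc = EMT()  # cheap effective-medium potential standing in for the paper's energy model

# Basin-hopping: random displacement of size dr, local relaxation to fmax,
# Metropolis acceptance at the given temperature.
bh = BasinHopping(cluster, temperature=100 * kB, dr=0.5, fmax=0.05)
bh.run(steps=50)

emin, best = bh.get_minimum()  # lowest energy and corresponding structure found
print(f'Lowest energy found: {emin:.3f} eV')

An RL agent of the kind described in the abstract would be benchmarked against exactly this sort of stochastic global search over cluster geometries.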
