Abstract

We have developed an actor-critic, policy-based reinforcement learning (RL) method to find low-energy nanoparticle structures and compared its effectiveness to classical basin-hopping. We took a molecule-building approach in which nanoalloy particles are regarded as metallic molecules, albeit with much greater structural flexibility. We explore the strengths of our approach by tasking an agent with the construction of stable mono- and bimetallic clusters. Guided by the underlying physics, an appropriate reward function and an equivariant molecular graph representation framework are used to learn the policy. The agent succeeds in finding well-known stable configurations for small clusters in both single- and multi-cluster experiments. However, in certain use cases the agent fails to generalize and overfits. We relate this to the pitfalls of actor-critic methods for molecular design and discuss which learning properties an agent will require to achieve universality.
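For context, the classical basin-hopping baseline mentioned above alternates random atomic displacements with local relaxations and a Metropolis accept/reject step. The following is a minimal, illustrative sketch of such a baseline, using ASE's BasinHopping driver and an EMT potential on a small Cu cluster; these implementation choices (cluster size, potential, hopping parameters) are assumptions for illustration, not the authors' actual setup.

from ase.cluster import Icosahedron
from ase.calculators.emt import EMT
from ase.optimize.basin import BasinHopping
from ase.units import kB

# Hypothetical starting point: a perturbed 13-atom Cu icosahedron,
# so the global search has nontrivial work to do.
cluster = Icosahedron('Cu', noshells=2)
cluster.rattle(stdev=0.3, seed=42)
cluster.calc = EMT()  # cheap effective-medium potential standing in for the paper's energy model

# Basin-hopping: random displacement of size dr, local relaxation to fmax,
# Metropolis acceptance at the given temperature.
bh = BasinHopping(cluster, temperature=100 * kB, dr=0.5, fmax=0.05)
bh.run(steps=50)

emin, best = bh.get_minimum()  # lowest energy and corresponding structure found
print(f'Lowest energy found: {emin:.3f} eV')

An RL agent of the kind described in the abstract would be benchmarked against exactly this sort of stochastic global search over cluster geometries.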
