Existing machine learning models attempt to predict the energies of large molecules by training small molecules, but eventually fail to retain high accuracy as the errors increase with system size. Through an orbital pairwise decomposition of the correlation energy, a pretrained neural network model on hundred-scale data containing small molecules is demonstrated to be sufficiently transferable for accurately predicting large systems, including molecules and crystals. Our model introduces a residual connection to explicitly learn the pairwise energy corrections, and employs various low-rank retraining techniques to modestly adjust the learned network parameters. We demonstrate that with as few as only one larger molecule retraining the base model originally trained on only small molecules of (H2O)6, the MP2 correlation energy of the large liquid water (H2O)64 in a periodic supercell can be predicted at chemical accuracy. Similar performance is observed for large protonated clusters and periodic poly glycine chains. A demonstrative application is presented to predict the energy ordering of symmetrically inequivalent sublattices for distinct hydrogen orientations in the ice XV phase. Our work represents an important step forward in the quest for cost-effective, highly accurate and transferable neural network models in quantum chemistry, bridging the electronic structure patterns between small and large systems.
Read full abstract