The development of strong CDCL-based propositional (SAT) solvers has greatly advanced several areas of automated reasoning (AR). One of the directions in AR is therefore to make use of SAT solvers in expressive formalisms such as first-order logic, for which large corpora of general mathematical problems exist today. This is possible due to Herbrand's theorem, which allows reduction of first-order problems to propositional problems by instantiation. The core challenge is synthesizing the appropriate instances from the typically infinite Herbrand universe.In this work, we develop a machine learning system targeting this task, addressing its combinatorial and invariance properties. In particular, we develop a GNN2RNN architecture based on a graph neural network (GNN) that learns from problems and their solutions independently of many symmetries and symbol names (addressing the abundance of Skolems), combined with a recurrent neural network (RNN) that proposes for each clause its instantiations. The architecture is then combined with an efficient ground solver and, starting with zero knowledge, iteratively trained on a large corpus of mathematical problems. We show that the system is capable of solving many problems by such educated guessing, finding proofs for 32.12% of the training set. The final trained system solves 19.74% of the unseen test data on its own. We also observe that the trained system finds solutions that the iProver and CVC5 systems did not find.
Read full abstract