Symbolic Math Reasoning with Language Models

Vedant Gaur, Nikunj Saunshi

doi:10.1109/urtc56832.2022.10002218

Abstract

The emergence of large language models (LLMs) such as OpenAI’s GPT-3, Google’s LaMDA, and Meta’s OPT [2, 3, 7, 10] has revolutionized the field of natural language processing (NLP). These models, with upwards of hundreds of billions of parameters, are trained on large unlabeled text corpora and can subsequently solve downstream tasks with little to no labeled data. While these models are increasingly versatile in their abilities, e.g., solving math word problems, the larger question of their ability to reason remains. Using and modifying the SVAMP dataset, we find that GPT-3’s davinci-002 model, in addition to performing well on numerical math word problems, also performs well on the potentially harder symbolic version of the same problems. Furthermore, adopting a two-step approach (solve symbolically and then substitute numerical values) leads to better accuracy on the numerical test set in the zero-shot regime. Additionally, we find that specific prompting techniques push the model, in many cases, to actively describe its thought process, which aids the final answer on complex, multi-step problems, aligning with recent observations.
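To make the two-step approach concrete, the following is a minimal sketch of its second step only: substituting numerical values into a symbolic answer. It assumes the model has already returned an arithmetic expression over named variables; the helper name `substitute`, the example problem, and the expression "w - x" are hypothetical illustrations, not taken from SVAMP or from the authors' code.

```python
import ast
import operator

# Arithmetic operators allowed in a symbolic answer (assumption: + - * / only).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def substitute(expression, values):
    """Evaluate a symbolic arithmetic expression using the given variable values.

    `expression` stands in for a model's symbolic answer (hypothetical here);
    `values` maps each variable name to the number from the original problem.
    """
    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        if isinstance(node, ast.Name):
            return values[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported syntax in symbolic answer")

    return evaluate(ast.parse(expression, mode="eval"))


# Hypothetical symbolic problem: "Jack has w pencils and gives x to Jill.
# How many pencils does Jack have left?"
symbolic_answer = "w - x"  # step 1: what a model might return (assumed)
numeric_answer = substitute(symbolic_answer, {"w": 8, "x": 3})  # step 2
print(numeric_answer)
```

Parsing the expression with `ast` rather than calling `eval` keeps the substitution restricted to plain arithmetic, which matters when the expression comes from model output.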

Full Text