Are NLP Models really able to Solve Simple Math Word Problems?

Arkil Patel,Satwik Bhattamishra,Navin Goyal

doi:10.18653/v1/2021.naacl-main.168

Abstract

The problem of designing NLP solvers for math word problems (MWP) has seen sustained research activity and steady gains in the test accuracy. Since existing solvers achieve high performance on the benchmark datasets for elementary level MWPs containing one-unknown arithmetic word problems, such problems are often considered “solved” with the bulk of research attention moving to more complex MWPs. In this paper, we restrict our attention to English MWPs taught in grades four and lower. We provide strong evidence that the existing MWP solvers rely on shallow heuristics to achieve high performance on the benchmark datasets. To this end, we show that MWP solvers that do not have access to the question asked in the MWP can still solve a large fraction of MWPs. Similarly, models that treat MWPs as bag-of-words can also achieve surprisingly high accuracy. Further, we introduce a challenge dataset, SVAMP, created by applying carefully chosen variations over examples sampled from existing datasets. The best accuracy achieved by state-of-the-art models is substantially lower on SVAMP, thus showing that much remains to be done even for the simplest of the MWPs.
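The two probes mentioned in the abstract (hiding the question from the solver, and treating the problem as a bag of words) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `split_sentences`, `remove_question`, and `bag_of_words`, the naive regex sentence splitter, the assumption that the question is always the final sentence, and the example problem are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '?' or '!' followed by whitespace.
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def remove_question(problem):
    # Assumption: the question is the final sentence, so dropping it
    # leaves only the narrative part of the problem.
    return " ".join(split_sentences(problem)[:-1])


def bag_of_words(text):
    # Order-free view of the text: lowercase word and number tokens with counts.
    return Counter(re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower()))


# Hypothetical example problem, not taken from any benchmark dataset.
problem = (
    "Jack had 8 pens and Mary had 5 pens. "
    "Jack gave 3 pens to Mary. "
    "How many pens does Jack have now?"
)

narrative_only = remove_question(problem)

# Two narratives with opposite meanings share one bag-of-words representation.
same_bag = bag_of_words("Jack gave 3 pens to Mary.") == bag_of_words("Mary gave 3 pens to Jack.")
```

A solver trained on `narrative_only` inputs has no way to know what is being asked, and a solver fed `bag_of_words` output cannot tell who gave pens to whom, so high accuracy under either condition suggests the benchmark can be solved from surface cues.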

Highlights

We provide strong evidence that the existing Math Word Problem (MWP) solvers rely on shallow heuristics to achieve high performance on the benchmark datasets
A Math Word Problem (MWP) consists of a short natural language narrative that describes a state of the world and poses a question about some unknown quantity. The problems considered are those where the output is a mathematical expression involving numbers and one or more arithmetic operators (+, −, ∗, /)
Solvers without access to the question still solve a large fraction of MWPs, which indicates that the models can rely on superficial patterns present in the narrative of the MWP and achieve high accuracy without even looking at the question
Recently, ASDiv (Miao et al., 2020) has been proposed to provide more diverse problems with annotations for equation, problem type and grade
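To make the output format concrete: a solver for these problems emits an expression over the numbers in the text using +, −, ∗ and /, and the answer is the value of that expression. The sketch below is a small, restricted evaluator for such expressions. The function name `evaluate` and the example expressions are assumptions made for illustration and are not part of the paper or its datasets.

```python
import ast
import operator

# Only the four arithmetic operators mentioned in the text are allowed.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression):
    """Evaluate an expression built from numbers, parentheses and + - * /."""

    def _eval(node):
        # Plain numeric literal (booleans are rejected explicitly).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        # Binary operation using one of the allowed operators.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical problem: "Jack had 8 pens and gave 3 to Mary. How many are left?"
# A correct solver would output the expression "8 - 3".
answer = evaluate("8 - 3")
```

Anything outside the four operators, such as exponentiation or a function call, raises `ValueError`, which keeps the evaluator aligned with the restricted expression language described above.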

Summary

Introduction

Solvers that never see the question still solve a large fraction of MWPs. This indicates that the models can rely on superficial patterns present in the narrative of the MWP and achieve high accuracy without even looking at the question. The presence of these issues in existing benchmarks makes them unreliable for measuring the performance of models. Recently, ASDiv (Miao et al., 2020) has been proposed to provide more diverse problems with annotations for equation, problem type and grade.

Identifying artifacts in datasets has been done for the Natural Language Inference (NLI) task, and challenge sets for NLP tasks have been proposed most notably for NLI and machine translation. We create a challenge set called SVAMP for more robust evaluation of methods developed to solve elementary level math word problems. Evaluating SOTA models on SVAMP, we find that they are not even able to solve half the problems in the dataset.

Related Work

Datasets and Methods

Analyzing the attention weights

SVAMP it appears to be of higher quality and harder than the MAWPS dataset

Protocol for creating variations

B Creation Protocol

Findings

E Ethical Considerations


Publication Date: Jan 1, 2021
Citations: 41	License type: cc-by

