Abstract

We present an automated approach to verify the correctness of programming assignments, such as the ones that arise in a functional programming course. Our approach takes as input student submissions and reference solutions, and uses equivalence checking to automatically prove or disprove correctness of each submission. To be effective in the context of a real-world programming course, an automated grading system must be both robust, to support programs written in a variety of style, and scalable, to treat hundreds of submissions at once. We achieve robustness by handling recursion using functional induction and by handling auxiliary functions using function call matching. We achieve scalability using a clustering algorithm that leverages the transitivity of equivalence to discover intermediate reference solutions among student submissions. We implement our approach on top of the Stainless verification system, to support equivalence checking of Scala programs. We evaluate our system and its components on over 4000 programs drawn from a functional programming course and from the program equivalence checking literature; this is the largest such evaluation to date. We show that our system is capable of proving program correctness by generating inductive equivalence proofs, and providing counterexamples for incorrect programs, with a high success rate.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call