On testing machine learning programs

Houssem Ben Braiek, Foutse Khomh

doi:10.1016/j.jss.2020.110542

Abstract

Machine learning (ML) models are now widely adopted in many software systems. Thanks to recent breakthroughs in deep learning and reinforcement learning, they are even being tested in safety-critical systems. Many people now interact with ML-based systems every day, e.g., the voice recognition systems used by virtual personal assistants like Amazon Alexa or Google Home. As the field of ML continues to grow, we are likely to witness transformative advances in a wide range of areas, from finance and energy to health and transportation. Given the growing importance of ML-based systems in our daily lives, it is becoming critically important to ensure their reliability. Recently, software researchers have started adapting concepts from the software testing domain (e.g., code coverage, mutation testing, or property-based testing) to help ML engineers detect and correct faults in ML programs. This paper reviews existing testing practices for ML programs. First, we identify and explain the challenges that should be addressed when testing ML programs. Next, we report existing solutions found in the literature for testing ML programs. Finally, we identify gaps in the literature related to the testing of ML programs and recommend future research directions for the scientific community. We hope that this comprehensive review of software testing practices will help ML engineers identify the right approach to improve the reliability of their ML-based systems. We also hope that the research community will act on our proposed research directions to advance the state of the art of testing for ML programs.

Journal: Journal of Systems and Software
Publication Date: Feb 7, 2020
Citations: 113

Similar Papers

Petuum
Eric P Xing ... Jinliang Wei
10 Aug 2015

Use of a Machine Learning Program to Correctly Triage Incoming Text Messaging Replies From a Cardiovascular Text-Based Secondary Prevention Program: Feasibility Study.
Nicole Lowres ... Clara K Chow
JMIR mHealth and uHealth | VOL. 8
16 Jun 2020

P571 Accuracy of a machine learning program to correctly triage incoming SMS text replies from a successful cardiovascular SMS-based secondary prevention program
N Lowres ... C K Chow
European Heart Journal | VOL. 40
01 Oct 2019

SynEva: Evaluating ML Programs by Mirror Program Synthesis
Yi Qin ... Chang Xu
01 Jul 2018
