Abstract

Thanks to the availability of various open-source Machine Learning (ML) tools and libraries, developers can now build ML-powered functionality simply by invoking machine learning APIs, without knowing the details of the underlying algorithms. However, the maintainers of ML tools and libraries usually pay more attention to the correctness and functionality of their algorithms, while spending much less effort on maintaining their code and keeping it at a high quality level. Given the popularity of machine learning today, low-quality ML tools and libraries can have a huge impact on the software products that use ML algorithms. In this paper, we therefore conduct an empirical study of real machine learning bugs to examine their patterns and how they evolve over time. We collected three popular machine learning projects from GitHub and manually analyzed 329 closed bugs from the perspectives of bug category, fix pattern, fix scale, fix duration, and type of software maintenance. The results show that (1) bugs in machine learning programs fall into seven categories; (2) twelve different fix patterns are commonly used to fix them; (3) 63.83% of the patches are micro-scale or small-scale fixes, and 68.39% of the bugs are fixed within one month; and (4) 47.77% of the bug fixes are corrective activities from the perspective of software maintenance.
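As a brief illustration of the development style the abstract describes (not an example from the paper itself), the following minimal sketch assumes scikit-learn and shows a classifier trained and evaluated entirely through library API calls, with the algorithm's internals hidden from the developer:

```python
# Minimal sketch (assuming scikit-learn is installed) of using an ML
# algorithm purely through its API, without knowing its internals.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy dataset bundled with the library.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The algorithm's details are hidden behind the fit/predict interface.
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data
```

Because developers rely on such opaque interfaces, defects inside the library code, the kind of bugs this study analyzes, can silently propagate into the applications built on top of it.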
