Real world projects, real faults: evaluating spectrum based fault localization techniques on Python projects

Ratnadira Widyasari, David Lo, Gede Artha Azriadi Prana, Shaowei Wang, Stefanus Agus Haryono

doi:10.1007/s10664-022-10189-4

Open Access

https://doi.org/10.1007/s10664-022-10189-4

Abstract

Spectrum Based Fault Localization (SBFL) is a statistical approach to identifying faulty code within a program given program spectra (i.e., records of the program elements executed by passing and failing test cases). Several SBFL techniques have been proposed over the years, but most evaluations of those techniques were done only on Java and C programs and frequently involved artificial faults. Considering the current popularity of Python, indicated by the results of the 2020 Stack Overflow developer survey, it is increasingly important to understand how SBFL techniques perform on Python projects. However, this remains an understudied topic. In this work, our objective is to analyze the effectiveness of popular SBFL techniques on real-world Python projects. We also aim to compare the performance we observe on Python with previously reported performance on Java. Using the recently built bug benchmark BugsInPy as our fault dataset, we apply five popular SBFL techniques (Tarantula, Ochiai, OP, Barinel, and DStar) and analyze their performance. We subsequently compare our results with results from Java and C projects reported in earlier related work. We find that 1) the real faults in BugsInPy are harder to identify using SBFL techniques than the real faults in Defects4J, as indicated by the lower performance of the evaluated SBFL techniques on BugsInPy; 2) older techniques such as Tarantula, Barinel, and Ochiai consistently outperform newer techniques (i.e., OP and DStar) across a variety of metrics and debugging scenarios; 3) claims in preceding studies done on artificial faults in C and Java (such as “OP outperforms Tarantula”) do not hold on real Python faults; 4) lower-performing techniques can outperform higher-performing techniques in some cases, emphasizing the potential benefit of combining SBFL techniques. Our results yield insight into how popular SBFL techniques perform on real Python faults and emphasize the importance of conducting SBFL evaluations on real faults.
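
The abstract names the five techniques but does not give their formulas. As a rough illustration, the sketch below computes the suspiciousness score of a program element using the definitions of Tarantula, Ochiai, OP, Barinel, and DStar that are commonly cited in the SBFL literature. It is not the authors' implementation: the function names `sbfl_scores` and `rank`, the DStar exponent of 2, the handling of zero denominators, and the toy spectrum are all assumptions made for this example.

```python
import math


def sbfl_scores(ef, ep, total_failed, total_passed, star=2):
    """Suspiciousness of one program element under five SBFL formulas.

    ef / ep: number of failing / passing tests that execute the element.
    total_failed / total_passed: number of failing / passing tests overall.
    """
    nf = total_failed - ef  # failing tests that do NOT execute the element

    def ratio(num, den):
        # Assumed convention: a zero denominator yields a score of 0.
        return num / den if den else 0.0

    fail_rate = ratio(ef, total_failed)
    pass_rate = ratio(ep, total_passed)

    # DStar: an element hit by every failing test and no passing test
    # has a zero denominator and is treated as maximally suspicious.
    dstar_den = ep + nf
    if dstar_den:
        dstar = ef ** star / dstar_den
    else:
        dstar = math.inf if ef else 0.0

    return {
        "tarantula": ratio(fail_rate, fail_rate + pass_rate),
        "ochiai": ratio(ef, math.sqrt(total_failed * (ef + ep))),
        "op": ef - ep / (total_passed + 1),
        "barinel": 1 - ep / (ep + ef) if (ep + ef) else 0.0,
        "dstar": dstar,
    }


def rank(coverage, passed, formula):
    """Rank elements from most to least suspicious.

    coverage: test name -> set of elements the test executes.
    passed: test name -> True if the test passed.
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    total_passed = len(passed) - total_failed
    elements = set().union(*coverage.values())
    scores = {}
    for element in elements:
        ef = sum(1 for t, cov in coverage.items() if element in cov and not passed[t])
        ep = sum(1 for t, cov in coverage.items() if element in cov and passed[t])
        scores[element] = sbfl_scores(ef, ep, total_failed, total_passed)[formula]
    # Ties are broken by element identifier to keep the output deterministic.
    return sorted(scores, key=lambda e: (-scores[e], e))


# Toy spectrum (hypothetical): line 4 is executed only by the failing test.
coverage = {"t1": {1, 2}, "t2": {1, 3}, "t3": {1, 3, 4}}
passed = {"t1": True, "t2": True, "t3": False}
print(rank(coverage, passed, "ochiai"))  # [4, 3, 1, 2]
```

On this toy spectrum all five formulas place line 4 first. On real spectra the rankings can diverge, which is consistent with the abstract's observation that lower-performing techniques sometimes outperform higher-performing ones.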

Journal: Empirical Software Engineering
Publication Date: Aug 6, 2022
Citations: 6
License type: cc-by-nc-nd

Similar Papers

MCFL: Improving Fault Localization by Differentiating Missing Code and Other Faults
Zijie Li ... Zhenyu Zhang
01 Jul 2020

An Empirical Study of Boosting Spectrum-Based Fault Localization via PageRank
Mengshi Zhang ... Sarfraz Khurshid
IEEE Transactions on Software Engineering | VOL. 47
01 Jun 2021

VFL: Variable-based fault localization
Jeongho Kim ... Jindae Kim
Information and Software Technology | VOL. 107
30 Nov 2018

Boosting spectrum-based fault localization using PageRank
Mengshi Zhang ... Sarfraz Khurshid
10 Jul 2017
