Abstract

Software testing plays a critical role in the development and assurance of software quality. However, the quality of the test code itself may be affected by poor design choices, known as test smells. In the literature, test smells are interpreted differently by developers, which in turn leads to different detection tools and different detection results. In this work, we selected the most commonly used detection tools and investigated their overall agreement across different projects and different test smells. We evaluated the results according to the average level of agreement and observed a clear disagreement between the tools. To bridge this gap of interpretation, we propose a multi-label classification approach that detects test smells from a deep representation of the test code. We conducted our experiments using four problem-transformation techniques and four ensemble techniques. To evaluate the experimental results, we built a benchmark using a tool-based labelling approach and made it publicly available. Binary Relevance and RAkEL proved to be the best-performing multi-label techniques.
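For readers unfamiliar with the problem-transformation idea, the sketch below illustrates Binary Relevance for multi-label test-smell detection: each smell gets its own independent binary classifier. The feature matrix (standing in for a deep representation of test code), the smell labels, and the choice of LogisticRegression are placeholder assumptions, not the paper's actual pipeline or dataset; scikit-learn's OneVsRestClassifier is used here because it implements the Binary Relevance decomposition.

# Minimal Binary Relevance sketch on synthetic data (assumptions noted above).
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

SMELLS = ["AssertionRoulette", "EagerTest", "MysteryGuest", "SleepyTest"]

# Toy data: 200 test methods, each with a 64-dimensional code embedding,
# where a method may exhibit any subset of the four smells.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
Y = rng.integers(0, 2, size=(200, len(SMELLS)))

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

# Binary Relevance trains one independent binary classifier per label;
# OneVsRestClassifier performs exactly this per-label decomposition.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_tr, Y_tr)

pred = clf.predict(X_te)
print("micro-F1:", f1_score(Y_te, pred, average="micro"))

RAkEL differs from Binary Relevance in that it trains classifiers on random small subsets of labels treated jointly, which lets it capture correlations between smells that co-occur in the same test method.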
