Abstract

Reinforcement learning (RL) has been applied to prioritizing test cases in Continuous Integration (CI) testing, where the reward function plays a crucial role. It has been demonstrated that a reward function based on historical information can improve the effectiveness of test case prioritization (TCP). However, the frequent iterations inherent in CI produce a considerable accumulation of historical information, which may decrease TCP efficiency and result in slow feedback. In this paper, only partial historical information is considered in the reward computation, and sliding window techniques are adopted to capture the potentially useful information. First, a fixed-size sliding window is introduced to set a fixed length of recent historical information for each CI test. Then, dynamic sliding window techniques are proposed, in which the window size continuously adapts to each CI cycle. Two methods are proposed: a test suite-based dynamic sliding window and an individual test case-based dynamic sliding window. Empirical studies are conducted on fourteen industrial-level programs, and the results reveal that, under limited time, the sliding window-based reward function can effectively improve TCP, with the NAPFD (Normalized Average Percentage of Faults Detected) and Recall of the dynamic sliding windows better than those of the fixed-size sliding window. In particular, the individual test case-based dynamic sliding window approach ranks 74.18% of failed test cases in the top 50% of the prioritized sequence, with a 1.35% improvement in NAPFD and an increase of 6.66 positions in TTF (Test to Fail).
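
To make the fixed-size sliding window idea concrete, the following is a minimal Python sketch; the function names, the window size of 5, and the failure-rate reward are illustrative assumptions, not the paper's exact formulation. It computes a reward for one test case from only its most recent execution verdicts, so older results no longer influence the value.

from collections import deque

WINDOW_SIZE = 5  # assumed fixed window of the last 5 CI cycles

def sliding_window_reward(history, window_size=WINDOW_SIZE):
    """Reward based only on the last `window_size` verdicts.

    `history` is a sequence of 0/1 flags, 1 meaning the test case failed
    (detected a fault) in that CI cycle, most recent last.
    """
    window = list(history)[-window_size:]
    if not window:
        return 0.0
    # Failure rate inside the window; verdicts that have slid out of the
    # window no longer contribute to the reward.
    return sum(window) / len(window)

# Example: a test case that failed in 2 of its last 5 cycles.
verdicts = deque([0, 1, 0, 0, 1, 0, 1], maxlen=50)
print(sliding_window_reward(verdicts))  # 0.4 with the default window of 5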

Highlights

  • Continuous Integration (CI) is a software development practice where developers periodically merge code changes into a central repository [1]

  • Reinforcement Learning (RL) is an unsupervised model, and sequential decision problems can be solved through its continuous exploration [7]

  • Spieker et al. [8] first applied reinforcement learning (RL) to CI testing, where test cases are rewarded based on their latest execution results and continuously prioritized through feedback from the latest cycle to the agent


Summary

INTRODUCTION

Continuous Integration (CI) is a software development practice where developers periodically merge code changes into a central repository [1]. The accumulated historical information includes the long-term execution results of test cases [13], which can negatively affect the reward values. Because the length of the window strongly influences the extraction of historical features, Wu et al. [16] and Marijan and Liaaen [4] conducted empirical studies in different CI environments and found that 5 and 6 historical cycles, respectively, were the most suitable fixed-size sliding windows. The failure execution information of test cases is usually used to compute the reward function in RL-based CI testing, and the scope of historical execution information relevant to the reward computation may differ for each test case in a cycle. The dynamic sliding window is therefore proposed: based on the historical failure distribution, it determines a real-time scope of historical information for the reward computation.
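
As a rough illustration of an individual test case-based dynamic sliding window, the sketch below chooses a per-test-case window size from that test case's failure distribution. The heuristic (extending the window back to the most recent failure, clamped to assumed bounds) and all names are hypothetical simplifications of the idea, not the paper's actual algorithm.

def dynamic_window_size(history, min_size=3, max_size=20):
    """Pick a per-test-case window size from its failure distribution.

    Heuristic sketch: reach back to the most recent failure so that at
    least one informative verdict is inside the window, clamped to
    [min_size, max_size].
    """
    for distance, verdict in enumerate(reversed(history), start=1):
        if verdict == 1:
            return max(min_size, min(distance, max_size))
    return min_size  # no recorded failure: fall back to the smallest window

def dynamic_window_reward(history):
    size = dynamic_window_size(history)
    window = list(history)[-size:]
    return sum(window) / len(window) if window else 0.0

# Example: the last failure was 5 cycles ago, so the window covers 5 cycles.
print(dynamic_window_reward([0, 0, 1, 0, 0, 0, 0]))  # 0.2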

AND RELATED WORK
EXPERIMENTS
RESEARCH QUESTIONS
THREATS TO VALIDITY
Internal Validity
Findings
CONCLUSION AND FUTURE WORK
