Abstract
Code review is an essential practice in software engineering for spotting code defects in the early stages of software development. Modern code reviews (e.g., acceptance or rejection of pull requests with Git) have become less formal and more lightweight than classic Fagan inspections, and more reliant on individuals (i.e., reviewers). However, reviewers may encounter mentally demanding challenges during a code review, such as code comprehension difficulties or distractions, that might affect the quality of the review. This work proposes a novel approach that evaluates the quality of code reviews in terms of bug-finding effectiveness and gives reviewers a clear message on whether the review should be repeated, indicating the code regions that may not have been well reviewed. The proposed approach uses biometric information collected from the reviewer during the review process with non-intrusive biofeedback devices (e.g., smartwatches) and inexpensive desktop eye-trackers compatible with software development settings. Biometric measures such as Heart Rate Variability (HRV) and task-evoked pupillary response are captured as a surrogate of the cognitive state of the reviewer (e.g., mental workload). This work uses Artificial Intelligence techniques to predict the cognitive load from the extracted biomarkers and to classify each code region according to a set of features. The final evaluation considers factors such as code complexity, the time of the code review, and the experience level of the reviewer, among others. Our experimental results show that the approach could predict the review quality with 87.77% ± 4.65 accuracy and a Spearman correlation coefficient of 0.85 (p-value < 0.001) between the predicted and the actual review performance. The evaluation validates the cognitive load measurement by using electroencephalography (EEG) signals as ground truth for the HRV and pupil signals.
Highlights
Software development is an intensive intellectual task
We propose an innovative approach that monitors the reviewer’s performance at the level of individual code lines and evaluates the overall quality of the code review by providing three outcomes: a) an overall evaluation with a clear indication of whether the review should be repeated, b) pointers to code regions that may not have been well reviewed, and c) an explanation of why the review of those code regions was considered unsatisfactory
The proposed approach uses biometric information to assess the cognitive state of the reviewer during the code review process
Summary
Software development is an intensive intellectual task. It consists of knowledge activities related to understanding the problem and designing an adequate solution. Despite the modern tools available to developers and the intensive research on software reliability and quality, general statistics for developed software (most of them related to software for critical applications) show high bug densities, ranging from 1 to 5 bugs per 1000 lines of delivered code [2][3][4]. This problem is amplified by the constant pressure to minimize the time-to-market and by the dramatic increase in code size witnessed in modern software. Reviewers could use the outcome of the proposed review-quality evaluation to promptly improve the code review through a second pass over specific parts of the code under review, or project managers could ask for a second independent review.
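The abstract reports agreement between predicted and actual review performance as a Spearman correlation coefficient. As a reminder of what that statistic measures, the following is a minimal sketch in Python that ranks two score lists and computes the Pearson correlation of the ranks. The function names and the sample scores are hypothetical and purely illustrative; they are not the data or the implementation used in this work.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical scores for five reviews (illustrative values only).
predicted = [0.9, 0.4, 0.7, 0.2, 0.6]
actual = [0.8, 0.5, 0.9, 0.1, 0.4]
print(round(spearman(predicted, actual), 2))  # prints 0.8
```

Because the statistic depends only on ranks, it rewards a predictor that orders reviews correctly even when its scores are not on the same scale as the measured performance.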