ACoRA – A Platform for Automating Code Review Tasks
Background: Modern Code Reviews (MCR) are frequently adopted to assure code and design quality in continuous integration and deployment projects. Although tiresome, they serve a secondary purpose of learning about the software product. Aim: Our objective is to design and evaluate a support tool that helps software developers focus on the most important code fragments to review and provides them with suggestions on what should be reviewed in this code. Method: We used design science research to develop and evaluate a tool for automating code reviews by providing recommendations to code reviewers. The tool is based on Transformer-based machine learning models for natural language processing, applied both to programming language code (patch content) and to review comments. We evaluate both the ability of the language model to match similar lines and its ability to correctly indicate the nature of potential problems, encoded as a set of categories. We evaluated the tool on two open-source projects and one industry project. Results: The proposed tool correctly annotated (only true positives) 35%–41% and partially correctly annotated 76%–84% of the code fragments to be reviewed with labels corresponding to the aspects of the code the reviewer should focus on. Conclusion: Comparing our study to similar solutions, we conclude that indicating lines to be reviewed and suggesting the nature of the potential problems in the code achieves higher accuracy than suggesting entire code changes, as considered in other studies. We also found that the differences depend more on the consistency of commenting than on the ability of the model to find similar lines.
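A minimal sketch of the line-matching idea described above, assuming a generic sentence-embedding model from the sentence-transformers library and an invented history of labeled review comments (neither is ACoRA's actual model or data):

```python
# Sketch: retrieve similar, previously reviewed lines and transfer their
# review-aspect labels to new patch lines. All data here are illustrative.
from sentence_transformers import SentenceTransformer, util

# Historical corpus of (reviewed line, aspect label) pairs.
history = [
    ("if (ptr != NULL) free(ptr);", "memory-management"),
    ("for (int i = 0; i <= n; i++)", "off-by-one"),
    ("catch (Exception e) {}", "error-handling"),
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # a code-aware encoder could be swapped in
hist_emb = model.encode([line for line, _ in history], convert_to_tensor=True)

def annotate(patch_lines, threshold=0.6):
    """Label each new line with the aspect of its most similar reviewed line."""
    new_emb = model.encode(patch_lines, convert_to_tensor=True)
    sims = util.cos_sim(new_emb, hist_emb)  # (new x history) similarity matrix
    for line, row in zip(patch_lines, sims):
        score, idx = row.max().item(), int(row.argmax())
        if score >= threshold:  # flag only confident matches for the reviewer
            print(f"{line!r} -> {history[idx][1]} (sim={score:.2f})")

annotate(["while (i <= len)", "delete obj;"])
```

The threshold trades recall for precision: lower values flag more lines to review but risk suggesting the wrong aspect.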
- Conference Article
40
- 10.1109/saner.2019.8667996
- Feb 1, 2019
Modern code review (MCR) is nowadays well-adopted in industrial and open source projects. Recent studies have investigated how developers perceive its ability to foster code quality, developers' code ownership, and team building. MCR is often used alongside automated quality checks through static analysis tools, testing or, ultimately, automated builds on a Continuous Integration (CI) infrastructure. With the aim of understanding how developers use the outcome of CI builds during code review and, more specifically, during the discussion of pull requests, this paper empirically investigates the interplay between pull request discussion and the use of CI by means of 64,865 pull request discussions belonging to 69 open source projects. After analyzing to what extent a build outcome influences the pull request merge, we qualitatively analyze the content of 857 pull request discussions and complement this analysis with a survey of 13 developers. While pull requests with a passed build have a higher chance of being merged than failed ones, and while survey participants confirmed this quantitative finding, other process-related factors play a more important role in the merge decision. The survey participants also point out cases where a pull request can be merged in the presence of a CI failure, e.g., when a new pull request is opened to cope with the failure, or when the failure is due to minor static analysis warnings. The study also indicates that CI introduces extra complexity, as in many pull requests developers have to solve non-trivial CI configuration issues.
- Conference Article
5
- 10.1145/3377929.3390057
- Jul 8, 2020
Modern code review is a common practice used by software developers to ensure high software quality in open source and industrial projects. During code review, developers submit their code changes, which should be reviewed, via tool-based code review platforms before being integrated into the codebase. Reviewers then provide their feedback to developers and may request further modifications before finally accepting or rejecting the submitted code changes. However, identifying appropriate reviewers is still a tedious task, as the number of code reviews to be performed is inflated by the increasing number of code changes and the growing size of software development teams in today's large and active software projects. To help developers with the review process, we introduce a multi-objective search-based approach to find the appropriate set of reviewers. We use the Non-dominated Sorting Genetic Algorithm (NSGA-II) to optimize two conflicting objectives: (i) maximize reviewers' expertise with the changed files, and (ii) minimize reviewers' workload in terms of their current open code reviews. We conduct a preliminary evaluation on two open source projects. Results indicate that our approach is efficient compared with state-of-the-art approaches.
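A minimal sketch of the two-objective trade-off the approach optimizes, using invented reviewer names and scores; the paper evolves whole reviewer sets with full NSGA-II, whereas this shows only the Pareto-dominance core of that algorithm:

```python
# Objectives: maximize expertise with the changed files, minimize workload
# (open reviews). Candidate scores below are illustrative.
candidates = {
    "alice": (0.9, 7),  # (expertise, open reviews)
    "bob":   (0.6, 2),
    "carol": (0.8, 3),
    "dave":  (0.5, 6),
}

def dominates(a, b):
    """a dominates b: no worse on both objectives, strictly better on one."""
    (ea, wa), (eb, wb) = candidates[a], candidates[b]
    return ea >= eb and wa <= wb and (ea > eb or wa < wb)

pareto_front = [
    r for r in candidates
    if not any(dominates(o, r) for o in candidates if o != r)
]
print(pareto_front)  # ['alice', 'bob', 'carol'] -- dave is dominated by carol
```

NSGA-II layers non-dominated sorting and crowding-distance selection on top of this dominance test to evolve a diverse front of reviewer assignments.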
- Research Article
47
- 10.1016/j.asoc.2020.106908
- Nov 30, 2020
- Applied Soft Computing
WhoReview: A multi-objective search-based approach for code reviewers recommendation in modern code review
- Research Article
1
- 10.1016/j.infsof.2024.107596
- Oct 5, 2024
- Information and Software Technology
Deciphering refactoring branch dynamics in modern code review: An empirical study on Qt
- Conference Article
177
- 10.1109/saner.2015.7081824
- Mar 1, 2015
Software code review is an inspection of a code change by an independent third-party developer in order to identify and fix defects before integration. Effectively performing code review can improve overall software quality. In recent years, Modern Code Review (MCR), a lightweight and tool-based code inspection, has been widely adopted in both proprietary and open-source software systems. Finding appropriate code-reviewers in MCR is a necessary step of reviewing a code change. However, little research has examined the difficulty of finding code-reviewers in distributed software development and its impact on reviewing time. In this paper, we investigate the impact that the code-reviewer assignment problem has on reviewing time. We find that reviews with a code-reviewer assignment problem take 12 days longer to approve a code change. To help developers find appropriate code-reviewers, we propose RevFinder, a file location-based code-reviewer recommendation approach. We leverage the similarity of previously reviewed file paths to recommend an appropriate code-reviewer. The intuition is that files located in similar file paths would be managed and reviewed by similarly experienced code-reviewers. Through an empirical evaluation on a case study of 42,045 reviews from the Android Open Source Project (AOSP), OpenStack, Qt and LibreOffice projects, we find that RevFinder accurately recommended 79% of reviews with a top-10 recommendation. RevFinder also correctly recommended the code-reviewers with a median rank of 4. The overall ranking of RevFinder is 3 times better than that of a baseline approach. We believe that RevFinder could be applied to MCR in order to help developers find appropriate code-reviewers and speed up the overall code review process.
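A simplified sketch of the file-path-similarity intuition behind RevFinder, using only the longest common path prefix (the actual approach combines several string-comparison techniques, and the review data here are invented):

```python
# Rank past reviewers by how similar their previously reviewed file paths
# are to the changed file's path. Data are illustrative.
past_reviews = [
    ("net/http/http_parser.cc", "alice"),
    ("net/http/http_cache.cc",  "bob"),
    ("ui/views/button.cc",      "carol"),
]

def common_prefix_len(p1, p2):
    """Number of leading path components two file paths share."""
    n = 0
    for x, y in zip(p1.split("/"), p2.split("/")):
        if x != y:
            break
        n += 1
    return n

def recommend(changed_path):
    scores = {}
    for reviewed_path, reviewer in past_reviews:
        scores[reviewer] = scores.get(reviewer, 0) + common_prefix_len(changed_path, reviewed_path)
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("net/http/http_stream.cc"))  # alice and bob tie on net/http; carol ranks last
```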
- Book Chapter
7
- 10.1007/978-3-031-04115-0_3
- Jan 1, 2022
Modern Code Reviews (MCRs) are a widely-used quality assurance mechanism in continuous integration and deployment. Unfortunately, in medium and large projects, the number of changes that need to be integrated, and consequently the number of comments triggered during MCRs, can be overwhelming. Therefore, there is a need to quickly recognize which comments concern issues that need prompt attention, to guide the focus of code authors, reviewers, and quality managers. The goal of this study is to design a method for automated classification of review comments to identify the needed change faster and with higher accuracy. We conduct a Design Science Research study on three open-source systems. We designed a method (CommentBERT) for automated classification of code-review comments based on the BERT (Bidirectional Encoder Representations from Transformers) language model and a new taxonomy of comments. When applied to 2,672 comments from the Wireshark, Mono Framework, and Open Network Automation Platform (ONAP) projects, the method achieved accuracy, measured using the Matthews Correlation Coefficient, of 0.46–0.82 (Wireshark), 0.12–0.8 (ONAP), and 0.48–0.85 (Mono). Based on the results, we conclude that the proposed method seems promising and could potentially be used to build machine-learning-based tools to support MCRs, as long as there is a sufficient number of historical code-review comments to train the model. Keywords: Modern Code Reviews, Machine Learning, BERT.
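A hedged sketch of BERT-based comment classification in the spirit of CommentBERT; the checkpoint, label taxonomy, and example comment are placeholders, and the classification head below is untrained (the study fine-tunes on labeled historical review comments first):

```python
import torch
from sklearn.metrics import matthews_corrcoef
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["naming", "logic", "documentation", "style"]  # illustrative taxonomy
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)  # head is untrained: fine-tune before real use
)

inputs = tok("Please rename this variable to something meaningful",
             return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[int(logits.argmax())])  # predicted comment category

# The paper reports Matthews Correlation Coefficient, computable e.g. as:
print(matthews_corrcoef([1, 0, 1, 1], [1, 0, 0, 1]))
```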
- Conference Article
33
- 10.1145/3180155.3180217
- May 27, 2018
Modern code reviews improve the quality of software products. Although modern code reviews rely heavily on human interactions, little is known about whether they are performed fairly. Fairness plays a role in any process where decisions that affect others are made. When a system is perceived to be unfair, it negatively affects the productivity and motivation of its participants. In this paper, using fairness theory, we create a framework that describes how fairness affects modern code reviews. To demonstrate its applicability, and the importance of fairness in code reviews, we conducted an empirical study asking developers of a large industrial open source ecosystem (OpenStack) about their perceptions of fairness in their code reviewing process. Our study shows that, in general, the code review process in OpenStack is perceived as fair; however, a significant portion of respondents perceive it as unfair. We also show that the variability in the way they prioritize code reviews signals a lack of consistency and the existence of bias (potentially increasing the perception of unfairness). The contributions of this paper are: (1) we propose a framework, based on fairness theory, for studying and managing social behaviour in modern code reviews, (2) we provide support for the framework through the results of a case study on a large industrially-backed open source project, (3) we present evidence that fairness is an issue in the code review process of a large open source ecosystem, and (4) we present a set of guidelines for practitioners to address unfairness in modern code reviews.
- Conference Article
70
- 10.1109/icsme.2016.65
- Oct 1, 2016
Code review is of primary importance in modern software development. It is widely recognized that peer review is an efficient and effective practice for improving software quality and reducing defect proneness. For a successful review process, peer reviewers should have deep experience with and knowledge of the code being reviewed, and be accustomed to working and collaborating together. However, one of the main challenges in modern code review is to find the most appropriate reviewers for submitted code changes. So far, reviewer assignment remains a manual, costly and time-consuming task. In this paper, we introduce a search-based approach, namely RevRec, to provide decision-making support for code change submitters and/or reviewer assigners in identifying the most appropriate peer reviewers for their code changes. RevRec aims at finding reviewers to be assigned to a code change based on their expertise and collaboration in past reviews, using a genetic algorithm (GA). We evaluated our approach on a benchmark of three open-source software systems: Android, OpenStack, and Qt. Results indicate that RevRec accurately recommends code reviewers with up to 59% precision and 74% recall. Our experiments provide evidence that leveraging reviewers' expertise from their prior reviews and the socio-technical aspects of teamwork and collaboration is relevant to improving the performance of peer reviewer recommendation in modern code review.
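An illustrative sketch of the kind of fitness such a search-based recommender might optimize, combining expertise from prior reviews with past collaboration; all names, scores, and weights are invented, and the paper explores the space with a genetic algorithm rather than the exhaustive enumeration used here:

```python
from itertools import combinations

expertise = {"alice": 0.9, "bob": 0.5, "carol": 0.7}        # from prior reviews
collaboration = {("alice", "bob"): 0.8, ("alice", "carol"): 0.6,
                 ("bob", "carol"): 0.4}                      # past co-reviews

def fitness(team, w_exp=0.7, w_col=0.3):
    """Weighted sum of average expertise and average pairwise collaboration."""
    exp = sum(expertise[r] for r in team) / len(team)
    pairs = list(combinations(sorted(team), 2))
    col = sum(collaboration.get(p, 0.0) for p in pairs) / len(pairs) if pairs else 0.0
    return w_exp * exp + w_col * col

best = max(combinations(expertise, 2), key=fitness)
print(best, round(fitness(best), 3))  # ('alice', 'carol') 0.74
```

In a GA, candidate teams would be encoded as chromosomes and improved over generations via selection, crossover, and mutation against this fitness.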
- Conference Article
1
- 10.1109/icetas48360.2019.9117541
- Dec 1, 2019
Code review is an essential practice for evaluating the quality of source code. The modern form of code review, known as Modern Code Review (MCR), an informal, modified version of Fagan's inspection, is now widely used. In the MCR process, the MCR workforce, that is, the author and the reviewer, work together to improve code quality and software quality. It is a peer-review process: the source code written by the author is evaluated by the reviewer. For the MCR process to deliver effective outcomes, it is necessary to focus on the sustainability of the MCR workforce so that its members can participate over the long term and keep producing effective outcomes. However, the sustainability of the MCR workforce is affected by unidentified project-related situational factors, and the existing MCR literature lacks an identification of the project-related situational factors that impact this sustainability. Therefore, this study performs a Systematic Literature Review (SLR) to identify the project-related situational factors that can impact the sustainability of the MCR workforce. The situational factors are grouped into two key classifications: project release management and project attributes. A grounded theory approach was applied to obtain a final, unique, and categorized list of project-related situational factors. This list was additionally double-checked by an expert panel regarding its categorization, naming conventions, and suggestions of additional project-related situational factors. The investigation identified 18 project-related situational factors. The study will benefit experts involved in situational software engineering research, as well as the MCR workforce, who can sustain their participation longer by overcoming the challenge of unidentified situations.
- Conference Article
82
- 10.1109/msr.2015.23
- May 1, 2015
Software code review is a well-established software quality practice. Recently, Modern Code Review (MCR) has been widely adopted in both open source and proprietary projects. To evaluate the impact that characteristics of MCR practices have on software quality, this paper comparatively studies MCR practices in defective and clean source code files. We investigate defective files along two perspectives: 1) files that will eventually have defects (i.e., future-defective files) and 2) files that have historically been defective (i.e., risky files). Through an empirical study of 11,736 reviews of changes to 24,486 files from the Qt open source project, we find that both future-defective files and risky files tend to be reviewed less rigorously than their clean counterparts. We also find that the concerns addressed during the code reviews of both defective and clean files tend to enhance evolvability, i.e., ease future maintenance (like documentation), rather than focus on functional issues (like incorrect program logic). Our findings suggest that although functionality concerns are rarely addressed during code review, the rigor of the reviewing process that is applied to a source code file throughout a development cycle shares a link with its defect proneness.
- Conference Article
82
- 10.1109/icsm.2015.7332472
- Sep 1, 2015
Software code review is a process in which developers inspect new code changes made by others to evaluate their quality and to identify and fix defects before integrating them into the main branch of a version control system. Modern Code Review (MCR), a lightweight and tool-based variant of conventional code review, is widely adopted in both open source and proprietary software projects. One challenge that impacts MCR is the assignment of appropriate developers to review a code change. Considering that there could be hundreds of potential code reviewers in a software project, picking suitable reviewers is not a straightforward task. A prior study by Thongtanunam et al. showed that the difficulty of selecting suitable reviewers may delay the review process by an average of 12 days. In this paper, to address the challenge of assigning suitable reviewers to changes, we propose Tie, a hybrid and incremental approach that utilizes the advantages of both Text mIning and a filE location-based approach. To do this, Tie integrates an incremental text mining model, which analyzes the textual contents of a review request, and a similarity model, which measures the similarity of changed file paths and reviewed file paths. We perform a large-scale experiment on four open source projects, namely Android, OpenStack, Qt, and LibreOffice, containing a total of 42,045 reviews. The experimental results show that on average Tie achieves top-1, top-5, and top-10 accuracies and a Mean Reciprocal Rank (MRR) of 0.52, 0.79, 0.85, and 0.64 for the four projects, improving on the state-of-the-art approach RevFinder, proposed by Thongtanunam et al., by 61%, 23%, 8%, and 37%, respectively.
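A short sketch of the evaluation metrics quoted above, top-k accuracy and Mean Reciprocal Rank, computed over illustrative ranked recommendation lists:

```python
def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases where the actual reviewer appears in the top k."""
    hits = sum(t in ranked[:k] for ranked, t in zip(ranked_lists, truths))
    return hits / len(truths)

def mrr(ranked_lists, truths):
    """Mean of 1/rank of the actual reviewer (0 if absent)."""
    total = 0.0
    for ranked, t in zip(ranked_lists, truths):
        if t in ranked:
            total += 1.0 / (ranked.index(t) + 1)  # 1-based rank
    return total / len(truths)

recs = [["alice", "bob", "carol"], ["carol", "alice", "bob"]]
actual = ["bob", "carol"]
print(top_k_accuracy(recs, actual, k=1))  # 0.5
print(mrr(recs, actual))                  # (1/2 + 1/1) / 2 = 0.75
```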
- Research Article
8
- 10.1007/s10664-022-10178-7
- Jul 4, 2022
- Empirical Software Engineering
Code review plays an important role in software quality control. A typical review process involves a careful check of a piece of code in an attempt to detect and locate defects and other quality issues/violations. One type of issue that may impact the quality of software is code smells, i.e., bad coding practices that may lead to defects or maintenance issues. Yet, little is known about the extent to which code smells are identified during modern code review. To investigate the concept behind code smells identified in modern code review, and what actions reviewers suggest and developers take in response to the identified smells, we conducted an empirical study of code smells in code reviews by analysing reviews from four large open-source projects from the OpenStack (Nova and Neutron) and Qt (Qt Base and Qt Creator) communities. We manually checked a total of 25,415 code review comments obtained by keyword search and random selection; this resulted in the identification of 1,539 smell-related reviews, which then allowed the study of the causes of code smells, actions taken against identified smells, time taken to fix identified smells, and reasons why developers ignored fixing identified smells. Our analysis found that 1) code smells were not commonly identified in code reviews, 2) smells were usually caused by violation of coding conventions, 3) reviewers usually provided constructive feedback, including fixing (refactoring) recommendations to help developers remove smells, 4) developers generally followed those recommendations and actioned the changes, 5) once identified by reviewers, it usually took developers less than one week to fix the smells, and 6) the main reason why developers chose to ignore an identified smell was that it was not considered worth fixing. Our results suggest the following: 1) developers should closely follow coding conventions in their projects to avoid introducing code smells, 2) review-based detection of code smells is perceived to be a trustworthy approach by developers, mainly because reviews are context-sensitive (as reviewers are more aware of the context of the code, given that they are part of the project's development team), and 3) program context needs to be fully considered in order to decide whether to fix an identified code smell immediately.
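A minimal sketch of the keyword-search step used to surface candidate smell-related comments; the keyword list and comments are invented, and the study combined such a search with random sampling and manual checking:

```python
import re

SMELL_KEYWORDS = ["duplicate", "dead code", "long method", "god class",
                  "magic number", "refactor", "smell"]
pattern = re.compile("|".join(map(re.escape, SMELL_KEYWORDS)), re.IGNORECASE)

comments = [
    "This method is way too long, please refactor it.",
    "LGTM, thanks!",
    "Avoid the magic number 42 here; use a named constant.",
]

# Keep only comments matching at least one smell keyword for manual review.
candidates = [c for c in comments if pattern.search(c)]
print(candidates)  # flags the first and third comments
```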
- Conference Article
219
- 10.1145/2597073.2597082
- May 31, 2014
Code review is the manual assessment of source code by humans, mainly intended to identify defects and quality problems. Modern Code Review (MCR), a lightweight variant of the code inspections investigated since the 1970s, prevails today both in industry and in open-source software (OSS) systems. The objective of this paper is to increase our understanding of the practical benefits that the MCR process produces on reviewed source code. To that end, we empirically explore the problems fixed through MCR in OSS systems. We manually classified over 1,400 changes taking place in reviewed code from two OSS projects into a validated categorization scheme. Surprisingly, results show that the types of changes due to the MCR process in OSS are strikingly similar to those in the industry and academic systems from the literature, featuring a similar 75:25 ratio of maintainability-related to functional problems. We also reveal that 7–35% of review comments are discarded and that 10–22% of the changes are not triggered by an explicit review comment. Patterns emerged in the review data; we investigated them, revealing the technical factors that influence the number of changes due to the MCR process. We found that bug-fixing tasks lead to fewer changes, and that tasks with more altered files and a higher code churn have more changes. Contrary to intuition, the identity of the reviewer had no impact on the number of changes.
- Conference Article
31
- 10.1109/scam.2016.30
- Oct 1, 2016
Modern Code Review (MCR) is an established software development process that aims to improve software quality. Although evidence has shown that higher levels of review coverage relate to fewer post-release bugs, the effectiveness of MCR at specifically finding security issues remains unknown. We present work we conducted to fill that gap by exploring the MCR process in the Chromium open source project. We manually analyzed large sets of registered (114 cases) and missed (71 cases) security issues by backtracking through the project's issue, review, and code histories. This enabled us to qualify MCR in Chromium from the security perspective from several angles: Are security issues discussed frequently? What categories of security issues are often missed or found? What characteristics of code reviews appear relevant to the discovery rate? Within the cases we analyzed, MCR in Chromium addresses security issues at a rate of 1% of reviewers' comments. Chromium code reviews mostly tend to miss language-specific issues (e.g., C++ issues and buffer overflows) and domain-specific ones (e.g., Cross-Site Scripting); when code reviews do address issues, they mostly address those of the latter type. Initial evidence points to reviews conducted by more than two reviewers being more successful at finding security issues.