Abstract

Large-scale software development today relies heavily on version control systems facilitating distributed development of software projects. For the purpose of merging diverging versions of the code base, version control systems employ line-based merge algorithms, which are applicable to all text files. Structured merge algorithms have been proposed as an alternative to unstructured, line-based merging, with the goal of reducing the number of merge conflicts that have to be manually resolved by the developer. By leveraging the structure inherent in source code (i.e., by representing source code files in terms of abstract syntax trees instead of sequences of text lines), these algorithms are able to merge revisions in various situations (e.g., reordering of methods) that would cause conflicts when merged using an unstructured approach. However, merging abstract syntax trees is inherently more complex than merging sequences of text lines, which makes structured merge algorithms computationally more expensive than an unstructured merge. To reduce the runtime cost of structured merge algorithms, semistructured merge as well as combinations of different merge strategies were proposed. As such, we observe a range of increasingly structured merge algorithms, which feature different characteristics in terms of conflict resolution and runtime. The progressively increasing use of structure to avoid merge conflicts or to automate conflict resolution raises a number of questions: How is the correctness of the code resulting from a merge affected when employing structured merge algorithms? Which algorithm strikes the best balance between runtime, conflict resolution potential, and correctness of the merge result? For the first time, we evaluate a whole range of merge algorithms (from unstructured over semistructured to structured as well as combinations) by replaying merge commits in a controlled setting. We employ the test suite of the projects in question as an oracle for the correctness of the resulting code, triangulated by a thorough manual analysis. Using 7727 merge commits from 10 open-source projects, we find that combined strategies appear to be the best of both worlds: They resolve as many conflicts as structured merge at a significantly lower runtime per merge commit. Notably, structured merge strategies do cause more test failures, however, the increase is small.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.