Abstract

In collaborative software development, program merging is <i>the</i> mechanism for integrating changes from multiple programmers. Merge algorithms in modern version control systems report a conflict when changes interfere textually. Merge conflicts require manual intervention and frequently stall modern continuous integration pipelines. Prior work found that, although resolving conflicts is costly, a large majority of resolutions involve rearranging text without writing any new code. Inspired by this observation, we propose the <i>first data-driven approach</i> to resolve merge conflicts with a machine learning model. We realize our approach in a tool, DEEPMERGE, that uses a novel combination of (i) an edit-aware embedding of merge inputs and (ii) a variation of pointer networks to construct resolutions from input segments. We also propose an algorithm to localize manual resolutions in a resolved file and employ it to curate a ground-truth dataset comprising 8,719 non-trivial resolutions in JavaScript programs. Our evaluation shows that, on a held-out test set, DEEPMERGE can predict correct resolutions for 37% of non-trivial merges, compared to only 4% by a state-of-the-art semistructured merge technique. Furthermore, on the subset of merges with up to 3 lines (comprising 24% of the total dataset), DEEPMERGE can predict correct resolutions with 78% accuracy.
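
To make the motivating observation concrete, the following minimal sketch checks whether a resolution consists only of lines that already appear in the merge inputs. It is an illustration under stated assumptions, not DEEPMERGE's model or its localization algorithm: the function name `is_rearrangement`, the line-level granularity, and the whitespace-insensitive comparison are all hypothetical choices made for this example.

```python
def is_rearrangement(resolution, ours, base, theirs):
    """Return True if every non-blank resolution line occurs in some merge input.

    Hypothetical helper: compares whole lines, ignoring surrounding whitespace.
    """
    # Pool of lines available from the three merge inputs.
    available = {line.strip() for side in (ours, base, theirs) for line in side}
    # A resolution is a rearrangement if it introduces no line outside that pool.
    return all(line.strip() in available for line in resolution if line.strip())


# Toy conflict region (made-up JavaScript lines, for illustration only).
ours = ["let x = 1;", "log(x);"]
base = ["let x = 0;"]
theirs = ["let x = 0;", "let y = 2;"]

# Built only from input lines: counts as a rearrangement.
print(is_rearrangement(["let x = 1;", "let y = 2;", "log(x);"], ours, base, theirs))
# Introduces a line found in no input: requires new code.
print(is_rearrangement(["let x = 3;"], ours, base, theirs))
```

Resolutions that pass a check of this kind are the ones that could, in principle, be assembled by selecting and ordering segments of the inputs, which is the setting the pointer-network formulation targets.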

