Abstract

Many open-source software projects are self-organized and do not maintain official lists with information on developer roles. So, knowing which developers take core and maintainer roles is, despite being relevant, often tacit knowledge. We propose a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues and pull requests. In an empirical study on 25/GitHub projects, (1) we validate the set of automatically identified core developers with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which we extracted from privileged issue events, is sound and the accuracy of state-of-the-art unsupervised classification methods depends mainly on the data source (commit data versus issue data) rather than the network-construction method (directed versus undirected, etc.). In perspective, our results shall guide research and practice to choose appropriate unsupervised classification methods, and our method can help create reliable ground-truth data for training supervised classification methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call