Abstract

• Extract links from issues, pull requests, and commits using three patterns: URL, Num, and SHA.
• Identify changes in project names and discard references whose source and target are the same project.
• Compared with previous work, our method IREL identifies many more references.

On open source software platforms, projects rarely develop in isolation; they depend on each other and evolve together. Identifying references between projects is important in software development, as it may help projects find cross-project bugs or attract new contributors from related projects. In this paper, we propose a method, IREL, to Identify References between projects by Extracting Links. We first extract links from descriptions and comments on issues, pull requests, and commits using three matching patterns. We then identify changes in project names and replace the original project names with their new names. Finally, we identify references between projects by selecting links whose source project and target project differ. We evaluate IREL on datasets covering 20,347,228 projects. IREL obtains 934,322 references, 26.461 times as many as the Reference Coupling method and 16.483 times as many as the Issue Units method. Project PageRank scores computed from the references identified by IREL are more strongly correlated with the projects' star counts. Our method helps researchers better identify references between projects.
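The three steps described above (extract links, resolve renamed projects, keep only cross-project links) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the regular expressions are assumed readings of the URL, Num, and SHA patterns, and the names `extract_links`, `resolve`, `identify_references`, and the `renames` mapping are hypothetical.

```python
import re

# Assumed patterns; the abstract names them (URL, Num, SHA) but does not define them.
# URL: a full link to an issue, pull request, or commit.
URL_RE = re.compile(
    r"https?://github\.com/([\w.-]+/[\w.-]+)/(?:issues|pull|commit)/(\w+)"
)
# Num: owner/repo#123 style shorthand.
NUM_RE = re.compile(r"(?<![\w/])([\w.-]+/[\w.-]+)#(\d+)")
# SHA: owner/repo@<commit hash> style shorthand.
SHA_RE = re.compile(r"(?<![\w/])([\w.-]+/[\w.-]+)@([0-9a-f]{7,40})\b")


def extract_links(source, text):
    """Return (source project, target project, identifier) for each match in text."""
    links = []
    for pattern in (URL_RE, NUM_RE, SHA_RE):
        for target, ident in pattern.findall(text):
            links.append((source, target, ident))
    return links


def resolve(name, renames):
    """Follow a chain of renames to the latest project name (guards against cycles)."""
    seen = set()
    while name in renames and name not in seen:
        seen.add(name)
        name = renames[name]
    return name


def identify_references(links, renames):
    """Keep links whose source and target differ after rename resolution."""
    refs = set()
    for source, target, _ in links:
        s, t = resolve(source, renames), resolve(target, renames)
        if s != t:
            refs.add((s, t))
    return refs


# Hypothetical comment text posted in project "octo/app".
comment = (
    "See https://github.com/octo/lib/issues/12, "
    "octo/tool#7 and octo/app@1a2b3c4d."
)
links = extract_links("octo/app", comment)
refs = identify_references(links, {"octo/tool": "octo/toolkit"})
```

In this sketch the link to `octo/app` is dropped because it points back to its own project, and `octo/tool` is reported under its assumed new name `octo/toolkit`. Resolving renames before comparing source and target matters: otherwise a link from a project's old name to its new name would be counted as a cross-project reference.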
