Automatically Labelled Software Topic Model

Youcef Bouziane, Mustapha Kamel Abdi, Salah Sadou

doi:10.4018/ijossp.2020010104

Abstract

Public software repositories (SR) maintain a massive amount of valuable data, offering opportunities to support software engineering (SE) tasks. Researchers have applied information retrieval techniques to mine software repositories, and topic models are one of these techniques. However, topic models provide neither an interpretation of the extracted topics nor labels for them, so identifying the topics requires manual analysis. Some approaches have been proposed to label topics automatically using tags in SR, but they do not account for spam tags and they scale poorly to large tag spaces. This article introduces a novel approach called the automatically labelled software topic model (AL-STM), which labels topics based on the tags observed in SR. It mitigates the shortcomings of both manual and automatic topic labelling in SE. AL-STM is implemented using 22K GitHub projects and evaluated on an SE task (tag recommendation) against currently used techniques. The empirical results suggest that AL-STM is more robust in terms of MAP and nDCG, and scales better to large tag spaces.
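The two ranking metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. It assumes binary relevance (a recommended tag is either among a project's true tags or not), no duplicate tags in a ranking, and average precision normalised by the number of true tags, which is one common convention among several. The function names and the sample tags are hypothetical.

```python
import math


def average_precision(ranked_tags, relevant_tags):
    """Average precision for one project, assuming binary relevance."""
    relevant = set(relevant_tags)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, tag in enumerate(ranked_tags, start=1):
        if tag in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    # Assumption: normalise by the number of true tags.
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """MAP: mean of average precision over all projects."""
    if not rankings:
        return 0.0
    scores = [average_precision(r, g) for r, g in zip(rankings, ground_truth)]
    return sum(scores) / len(scores)


def ndcg(ranked_tags, relevant_tags):
    """nDCG for one project with binary gains and a log2 rank discount."""
    relevant = set(relevant_tags)
    dcg = sum(
        1.0 / math.log2(k + 1)
        for k, tag in enumerate(ranked_tags, start=1)
        if tag in relevant
    )
    # Ideal ranking: every true tag that fits in the list comes first.
    ideal_hits = min(len(relevant), len(ranked_tags))
    idcg = sum(1.0 / math.log2(k + 1) for k in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: four recommended tags, two of them correct.
ranked = ["python", "web", "cli", "django"]
truth = {"python", "django"}
ap_score = average_precision(ranked, truth)
ndcg_score = ndcg(ranked, truth)
```

Both scores lie between 0 and 1 and reward rankings that place correct tags near the top. nDCG discounts each hit logarithmically by its rank, while average precision averages the precision measured at each hit.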

Journal: International Journal of Open Source Software and Processes
Publication Date: Jan 1, 2020
Citations: 1
