How to effectively use topic models for software engineering tasks? An approach based on Genetic Algorithms

Annibale Panichella, Denys Poshynanyk, Bogdan Dit, M Di Penta, Rocco Oliveto, Andrea De Lucia

doi:10.5555/2486788.2486857

https://doi.org/10.5555/2486788.2486857

Abstract

Information Retrieval (IR) methods, and in particular topic models, have recently been used to support essential software engineering (SE) tasks by enabling textual retrieval and analysis of software. In all these approaches, topic models have been applied to software artifacts in the same way as to natural language documents (e.g., using the same settings and parameters), on the underlying assumption that source code and natural language documents are similar. However, applying topic models to software data with the settings used for natural language text did not always produce the expected results. Recent research investigated this assumption and showed that source code is much more repetitive and predictable than natural language text. Our paper builds on this fundamental finding and proposes a novel solution to adapt, configure, and effectively use a topic modeling technique, namely Latent Dirichlet Allocation (LDA), to achieve better (acceptable) performance across various SE tasks. The solution, called LDA-GA, uses Genetic Algorithms (GA) to determine a near-optimal configuration for LDA in the context of three different SE tasks: (1) traceability link recovery, (2) feature location, and (3) software artifact labeling. The results of our empirical studies demonstrate that LDA-GA is able to identify robust LDA configurations, which lead to higher accuracy on all the datasets for these SE tasks as compared to previously published results, heuristics, and the results of a combinatorial search.
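To make the idea of searching for an LDA configuration with a genetic algorithm more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The gene names (`num_topics`, `alpha`, `beta`, `iterations`), their ranges, the GA operators, and the `toy_fitness` function are all assumptions made for illustration; the abstract does not say which parameters or which fitness measure LDA-GA uses. In a real setting, the fitness function would train an LDA model with the given configuration and score it on the SE task at hand.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]

# Hypothetical search ranges for an LDA configuration (assumed, not from the paper).
BOUNDS = {
    "num_topics": (2, 100),
    "alpha": (0.01, 1.0),
    "beta": (0.01, 1.0),
    "iterations": (50, 500),
}
INTEGER_GENES = {"num_topics", "iterations"}


def clamp_gene(name: str, value: float) -> float:
    """Keep a gene inside its range; round genes that must be integers."""
    low, high = BOUNDS[name]
    value = min(max(value, low), high)
    return int(round(value)) if name in INTEGER_GENES else value


def random_config(rng: random.Random) -> Config:
    return {name: clamp_gene(name, rng.uniform(*BOUNDS[name])) for name in BOUNDS}


def crossover(a: Config, b: Config, rng: random.Random) -> Config:
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in BOUNDS}


def mutate(config: Config, rng: random.Random, rate: float = 0.2) -> Config:
    """Perturb each gene with probability `rate` using Gaussian noise."""
    child = dict(config)
    for name, (low, high) in BOUNDS.items():
        if rng.random() < rate:
            noise = rng.gauss(0, 0.1 * (high - low))
            child[name] = clamp_gene(name, child[name] + noise)
    return child


def tournament(scored: List[Tuple[float, Config]], rng: random.Random, size: int = 3) -> Config:
    """Pick the fittest of `size` randomly sampled individuals."""
    return max(rng.sample(scored, size), key=lambda pair: pair[0])[1]


def search(fitness: Callable[[Config], float], generations: int = 30,
           pop_size: int = 20, seed: int = 0) -> Tuple[Config, float]:
    """Return the best configuration evaluated during the run and its score."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    best_score, best = float("-inf"), population[0]
    for _ in range(generations):
        scored = [(fitness(c), c) for c in population]
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score > best_score:
            best_score, best = top_score, top
        # Elitism: carry the best configuration so far into the next generation.
        population = [best] + [
            mutate(crossover(tournament(scored, rng), tournament(scored, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
    return best, best_score


def toy_fitness(config: Config) -> float:
    """Stand-in objective (assumption): replace with a real LDA quality measure."""
    return -abs(config["num_topics"] - 20) - abs(config["alpha"] - 0.1)


if __name__ == "__main__":
    best_config, best_score = search(toy_fitness)
    print(best_config, best_score)
```

Keeping the fitness function as a plain callable means the same search loop could be reused for each task, with only the scoring step swapped out.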

Similar Papers

How to effectively use topic models for software engineering tasks? An approach based on Genetic Algorithms
Annibale Panichella ... Denys Poshynanyk
01 May 2013

Configuring topic models for software engineering tasks in TraceLab
Bogdan Dit ... Andrea De Lucia
01 May 2013

Configuring and Assembling Information Retrieval Based Solutions for Software Engineering Tasks
Bogdan Dit
01 Oct 2016

Topic modeling in software engineering research
Camila Costa Silva ... Matthias Galster
Empirical Software Engineering | VOL. 26
06 Sep 2021
