Unsupervised Learning Approach for Clustering Source Code based on Functionalities

Mohamed Ifham, B.T.G.S Kumara, Banujan Kuhaneswaran

doi:10.1109/dasa53625.2021.9682355

Publication Date: Dec 7, 2021

Citations: 1

Affiliation: Sabaragamuwa University of Sri Lanka

Abstract

Source code is one of the most valuable resources in software development. Developers spend more time and money when they start coding from scratch for every project that needs similar functionality. Topic modelling is crucial for software reuse, since it can be difficult for developers to remove obsolete source code from large software systems. Topic modelling techniques have been used to evaluate and model source code in several ways, and several studies in this domain have extracted topics from source code using various statistical approaches and methodologies. These topic extraction methods are interconnected, and if software best practices were not followed in older systems, the results may be unreliable. In response to these findings, the authors conducted a study in which they extracted source code using a Java parser library and predicted the source code functionality name using unsupervised learning approaches, namely the K-means algorithm and Expectation Maximization (EM) clustering. This is the first effort to develop a clustering model that predicts the semantic function name of source code using an unsupervised learning approach and to compare the algorithms in order to obtain the optimal model.
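
To make the clustering idea concrete, the following is a minimal sketch of grouping Java methods by functionality with K-means. It is not the authors' pipeline: the abstract does not state which features were used, so this sketch assumes term-frequency vectors built from identifier words, substitutes a simple regex for the Java parser library, uses a deterministic initialization (the first k vectors), and omits the EM comparison. The four method strings and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split every identifier into lowercase words (camelCase and snake_case)."""
    words = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        for part in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", ident):
            words.append(part.lower())
    return words


def vectorize(docs):
    """Turn token lists into L2-normalized term-frequency vectors."""
    vocab = sorted({word for doc in docs for word in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [float(counts[word]) for word in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means; assumption: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical Java methods: two handle file I/O, two do arithmetic.
methods = [
    "String readFile(String path) { return Files.readString(Path.of(path)); }",
    "double sumValues(double[] values) { double total = 0; "
    "for (double v : values) total += v; return total; }",
    "void writeFile(String path, String text) { Files.writeString(Path.of(path), text); }",
    "double meanValues(double[] values) { return sumValues(values) / values.length; }",
]

vocab, vectors = vectorize([tokenize(m) for m in methods])
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy input the file-handling methods share words such as "file", "path", and "string", while the arithmetic methods share "double" and "values", so the two groups separate cleanly. Real code bases would need a proper parser and a more careful choice of features and of k.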