Exploiting tree structures for classifying programs by functionalities

Viet Anh Phan, Ngoc Phuong Chau, Minh Le Nguyen

doi:10.1109/kse.2016.7758034

https://doi.org/10.1109/kse.2016.7758034

Abstract

Analyzing source code to solve software engineering problems such as fault prediction and cost and effort estimation has long attracted the attention of both researchers and companies. Traditional approaches are based on machine learning and on software metrics obtained by computing standard measures of software projects. However, these methods face many challenges because software metrics are not sufficient to capture the complexity of programs. The aim of this paper is to apply several natural language processing techniques that address software engineering problems by exploiting the information in programs' abstract syntax trees (ASTs) instead of software metrics. To reduce computation time, we propose a tree pruning technique that eliminates redundant branches of ASTs. In addition, the k-Nearest Neighbor (kNN) algorithm is adopted for comparison with other methods, with the distance between programs measured by the tree edit distance (TED) and the Levenshtein distance. These algorithms are evaluated on a 104-label program classification problem. The experiments show that, although kNN is a simple machine learning algorithm, the classifiers achieve promising results owing to the use of appropriate data structures.
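
The kNN setup mentioned in the abstract can be illustrated with a minimal sketch. It assumes each program has already been flattened into a sequence of AST node-type names; that flattening, the toy data, and the function names `levenshtein` and `knn_predict` are hypothetical and not taken from the paper. A query is classified by majority vote among its k nearest neighbors under the Levenshtein distance. A tree edit distance variant would swap in a different distance function.

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance between two sequences; insert, delete, substitute cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def knn_predict(query, training, k=3):
    """Return the majority label among the k training items closest to query.

    training is a list of (sequence, label) pairs.
    """
    nearest = sorted(training, key=lambda item: levenshtein(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each program is a sequence of AST node types.
training = [
    (["FuncDef", "For", "Assign", "Return"], "sum"),
    (["FuncDef", "For", "Assign", "Assign", "Return"], "sum"),
    (["FuncDef", "If", "Return", "Return"], "max"),
    (["FuncDef", "If", "If", "Return", "Return"], "max"),
]
query = ["FuncDef", "For", "Assign", "Return", "Return"]
predicted = knn_predict(query, training, k=3)
```

Because the distance is passed through a single function, the same voting code could be reused with any other program-to-program distance.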

Similar Papers

Automatically classifying source code using tree-based approaches
Anh Viet Phan ... Lam Thu Bui
Data & Knowledge Engineering | VOL. 114
27 Jul 2017

Seml: A Semantic LSTM Model for Software Defect Prediction
Hongliang Liang ... Zhuosi Xie
IEEE Access | VOL. 7
01 Jan 2019

An Approach to Software Defect Prediction Combining Semantic Features and Code Changes
Chuanqi Tao ... Jingxuan Zhang
International Journal of Software Engineering and Knowledge Engineering | VOL. 32
26 Aug 2022

Choosing software metrics for defect prediction: an investigation on feature selection techniques
Kehan Gao ... Taghi M Khoshgoftaar
Software: Practice and Experience | VOL. 41
18 Mar 2011
