Heterogeneous tree structure classification to label Java programmers according to their expertise level

Francisco Ortin, Oscar Rodriguez-Prieto, Nicolas Pascual, Miguel Garcia

doi:10.1016/j.future.2019.12.016

Open Access

https://doi.org/10.1016/j.future.2019.12.016

Journal: Future Generation Computer Systems	Publication Date: Dec 16, 2019
Citations: 14	License type: cc-by-nc-nd

Affiliation: University of Oviedo

Abstract

Open-source code repositories are a valuable asset for creating different kinds of tools and services that use machine learning and probabilistic reasoning. Syntactic models process Abstract Syntax Trees (ASTs) of source code to build systems capable of predicting different software properties. The main difficulty in building such models comes from the heterogeneous and compound structure of ASTs, and from the fact that traditional machine learning algorithms require instances to be represented as n-dimensional vectors rather than trees. In this article, we propose a new approach to classify ASTs using traditional supervised-learning algorithms, where a feature learning process selects the most representative syntax patterns for the child subtrees of different syntax constructs. Those syntax patterns are used to enrich the context information of each AST, allowing the classification of compound heterogeneous tree structures. The proposed approach is applied to the problem of labeling the expertise level of Java programmers. The system is able to label expert and novice programs with an average accuracy of 99.6%. Moreover, other code fragments such as types, fields, methods, statements and expressions can also be classified, with average accuracies of 99.5%, 91.4%, 95.2%, 88.3% and 78.1%, respectively.
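The representation problem mentioned in the abstract (trees versus n-dimensional vectors) can be illustrated with a minimal sketch. It is not the feature learning process proposed in the paper: it only counts node kinds in a toy tree, and the `Node` class, the node-kind names and the `VOCABULARY` list are all hypothetical names introduced for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy AST node (hypothetical; not a real Java parser's node type)."""
    kind: str
    children: list = field(default_factory=list)


def count_kinds(node, counts=None):
    """Recursively count how often each node kind occurs in the tree."""
    if counts is None:
        counts = Counter()
    counts[node.kind] += 1
    for child in node.children:
        count_kinds(child, counts)
    return counts


def to_vector(tree, vocabulary):
    """Map a tree of any shape to a fixed-length vector of kind counts."""
    counts = count_kinds(tree)
    return [counts[kind] for kind in vocabulary]


# Assumed vocabulary of node kinds; every tree maps to a vector of this length.
VOCABULARY = ["MethodDecl", "IfStmt", "Lambda", "MethodCall", "Literal"]

# Two toy trees with different shapes and sizes.
tree_a = Node("MethodDecl", [
    Node("IfStmt", [Node("MethodCall", [Node("Literal")])]),
    Node("IfStmt", [Node("Literal")]),
])
tree_b = Node("MethodDecl", [
    Node("MethodCall", [Node("Lambda", [Node("MethodCall")])]),
])

vector_a = to_vector(tree_a, VOCABULARY)
vector_b = to_vector(tree_b, VOCABULARY)
print(vector_a)  # [1, 2, 0, 1, 2]
print(vector_b)  # [1, 0, 1, 2, 0]
```

Both trees end up as vectors of the same length, which is what a traditional supervised-learning algorithm expects as input. Plain counting discards the structural context of each node, which is the kind of information the syntax patterns described in the abstract are meant to retain.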

Similar Papers

Concept–based Analysis of Java Programming Errors among Low, Average and High Achieving Novice Programmers
Philip Olu Jegede ... Emmanuel A Olajubu
Journal of Information Technology Education: Innovations in Practice | VOL. 18
01 Jan 2019

Self-efficacy of Freshmen Students in Java Programming
14 Apr 2015

Mining Common Syntactic Patterns used by Java Programmers
Alvaro Losada ... Miguel Garcia
IEEE Latin America Transactions | VOL. 20
01 May 2022
