Abstract

Programming is a vital skill in computer science and engineering-related disciplines. However, developing source code is an error-prone task. Logical errors in code are particularly hard to identify for both students and professionals, and end-users do not expect even a single error. At present, conventional compilers have difficulty identifying many of the errors (especially logical errors) that can occur in code. To mitigate this problem, we propose a language model for evaluating source code using a bidirectional long short-term memory (BiLSTM) neural network. We trained the BiLSTM model on a large number of source codes while tuning various hyperparameters. We then used the model to evaluate incorrect code and assessed its performance in three principal areas: source code error detection, suggestions for incorrect code repair, and erroneous code classification. Experimental results showed that the proposed BiLSTM model achieved 50.88% correctness in identifying errors and providing suggestions. Moreover, the model achieved an F-score of approximately 97%, outperforming other state-of-the-art models (recurrent neural networks (RNNs) and long short-term memory (LSTM)).
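For illustration, the sketch below shows one way a BiLSTM next-token language model over source-code tokens could be set up. It is a minimal sketch, not the paper's released implementation: the framework (Keras), vocabulary size, sequence length, embedding dimension, and hidden-unit count are assumed values chosen for exposition.

# Minimal sketch of a BiLSTM next-token language model over code tokens.
# All hyperparameter values below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 5000   # assumed size of the source-code token vocabulary
SEQ_LEN = 50        # assumed context window, in tokens
EMBED_DIM = 128     # assumed embedding dimension
HIDDEN_UNITS = 256  # assumed LSTM units per direction

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    layers.Bidirectional(layers.LSTM(HIDDEN_UNITS)),
    layers.Dense(VOCAB_SIZE, activation="softmax"),   # next-token distribution
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # cross-entropy over tokens
              metrics=["accuracy"])

# x_train: (num_samples, SEQ_LEN) arrays of token-id contexts
# y_train: (num_samples,) ids of the token that follows each context
# model.fit(x_train, y_train, epochs=30, batch_size=64)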

Highlights

  • Programming is among the most critical skills in the field of computing and software engineering

  • A series of experiments based on the source codes collected from Aizu Online Judge (AOJ) was conducted using greatest common divisor (GCD) and insertion sort (IS) problem codes

  • It is generally recognized that conventional compilers and other code evaluation systems are unable to reliably detect logic errors and provide proper suggestions for code repair


Summary

A Bidirectional LSTM Language Model for Code Evaluation and Repair

Developing source code is an error-prone task. Conventional compilers have difficulty identifying many of the errors (especially logical errors) that can occur in code. To mitigate this problem, we propose a language model for evaluating source code using a bidirectional long short-term memory (BiLSTM) neural network. We trained the BiLSTM model on a large number of source codes while tuning various hyperparameters. We used the model to evaluate incorrect code and assessed its performance in three principal areas: source code error detection, suggestions for incorrect code repair, and erroneous code classification. The model achieved an F-score of approximately 97%, outperforming other state-of-the-art models (recurrent neural networks (RNNs) and long short-term memory (LSTM)).

Introduction
Related Works
Proposed Approach
Data Collection and Preprocessing
Data and Experimental Setup
Evaluation Metrics
Cross-Entropy
Determining the Number of Epochs and Hidden Units
Incorrect Source Code Evaluation
Source
Suggestions
Error Detection Performance
Limitations
Conclusions