A Bidirectional LSTM Language Model for Code Evaluation and Repair

Md Mostafizer Rahman,Yutaka Watanobe,Keita Nakamura

doi:10.3390/sym13020247


Open Access

https://doi.org/10.3390/sym13020247


Journal: Symmetry
Publication Date: Feb 1, 2021
Citations: 63
License type: CC BY 4.0

Affiliation: University of Aizu

Abstract

Programming is a vital skill in computer science and engineering-related disciplines. However, developing source code is an error-prone task. Logical errors in code are particularly hard to identify for both students and professionals, and even a single error is something end-users do not expect. At present, conventional compilers have difficulty identifying many of the errors (especially logical errors) that can occur in code. To mitigate this problem, we propose a language model for evaluating source code using a bidirectional long short-term memory (BiLSTM) neural network. We trained the BiLSTM model on a large number of source codes while tuning various hyperparameters. We then used the model to evaluate incorrect code and assessed the model’s performance in three principal areas: source code error detection, suggestions for incorrect code repair, and erroneous code classification. Experimental results showed that the proposed BiLSTM model achieved 50.88% correctness in identifying errors and providing suggestions. Moreover, the model achieved an F-score of approximately 97%, outperforming other state-of-the-art models (recurrent neural networks (RNNs) and long short-term memory (LSTM)).
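The abstract does not spell out how error detection and repair suggestions are derived from the language model. One common way to use a language model for this purpose is to flag tokens to which the model assigns a low probability and to propose the most probable token in their place. The sketch below illustrates that idea only; the function name, the threshold of 0.1, and the hand-written probability tables (standing in for real model output) are all assumptions and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def flag_suspicious_tokens(
    tokens: List[str],
    predicted: List[Dict[str, float]],
    threshold: float = 0.1,  # assumed cut-off, not a value from the paper
) -> List[Tuple[int, str, str]]:
    """Return (position, written_token, suggested_token) for unlikely tokens.

    `predicted[i]` is a hypothetical probability table that a trained
    language model would produce for position i.
    """
    findings = []
    for i, (tok, dist) in enumerate(zip(tokens, predicted)):
        prob = dist.get(tok, 0.0)
        if dist and prob < threshold:
            # Suggest the candidate the model considers most likely here.
            suggestion = max(dist, key=dist.get)
            findings.append((i, tok, suggestion))
    return findings


# Hand-written stand-in for model output on `while ( b = 0 )`,
# where `=` was probably meant to be a comparison.
tokens = ["while", "(", "b", "=", "0", ")"]
predicted = [
    {"while": 0.90, "if": 0.10},
    {"(": 0.99, "{": 0.01},
    {"b": 0.60, "a": 0.40},
    {"!=": 0.85, "==": 0.10, "=": 0.05},
    {"0": 0.95, "1": 0.05},
    {")": 0.99, ";": 0.01},
]
findings = flag_suspicious_tokens(tokens, predicted)
print(findings)
```

With these made-up tables, only the `=` at position 3 falls below the threshold, and `!=` is offered as the replacement. In a real system the tables would come from the trained model rather than being typed in by hand.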

Highlights

Programming is among the most critical skills in the field of computing and software engineering
A series of experiments was conducted on source codes collected from Aizu Online Judge (AOJ), using solutions to the greatest common divisor (GCD) and insertion sort (IS) problems
It is generally recognized that conventional compilers and other code evaluation systems are unable to reliably detect logic errors and provide proper suggestions for code repair

Summary


Developing source code is an error-prone task. Conventional compilers have difficulty identifying many of the errors (especially logical errors) that can occur in code. To mitigate this problem, we propose a language model for evaluating source code using a bidirectional long short-term memory (BiLSTM) neural network. We trained the BiLSTM model on a large number of source codes while tuning various hyperparameters. We used the model to evaluate incorrect code and assessed the model’s performance in three principal areas: source code error detection, suggestions for incorrect code repair, and erroneous code classification. The model achieved an F-score of approximately 97%, outperforming other state-of-the-art models (recurrent neural networks (RNNs) and long short-term memory (LSTM)).
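For readers unfamiliar with the metric, the F-score is the harmonic mean of precision and recall. The sketch below computes it from confusion counts for a binary "erroneous vs. correct" classification. The counts are invented for illustration and are not the paper's data, and the function name is an assumption.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 score from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Invented counts: 8 erroneous programs correctly flagged,
# 2 correct programs wrongly flagged, 2 erroneous programs missed.
score = f_score(tp=8, fp=2, fn=2)
print(round(score, 2))
```

Here precision and recall are both 0.8, so the F-score is also 0.8. An F-score near 97%, as reported above, requires both precision and recall to be high, since the harmonic mean is pulled toward the smaller of the two.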

Introduction

Related Works

Proposed Approach

Data Collection and Preprocessing

Data and Experimental Setup

Evaluation Metrics

Cross-Entropy

Determining the Number of Epochs and Hidden Units

Incorrect Source Code Evaluation

Source

Suggestions

Error Detection Performance

Limitations

Conclusions
