Abstract

Context: Code readability, which correlates strongly with software quality, plays a critical role in software maintenance and evolution. Although existing deep learning-based code readability models achieve fairly high classification accuracy, they use only structural features, which limits their performance. Objective: To address this problem, we propose extracting readability-related features from the visual, semantic, and structural aspects of source code to further improve code readability classification. Method: First, we convert a code snippet into an RGB matrix (for visual feature extraction), a token sequence (for semantic feature extraction), and a character matrix (for structural feature extraction). Then, we feed these representations into a hybrid neural network composed of BERT, CNN, and BiLSTM for feature extraction. Finally, the extracted features are concatenated and passed to a classifier that predicts code readability. Result: We conduct a series of experiments to evaluate our method. The results show that it reaches an average accuracy of 85.3%, outperforming all existing models. Conclusion: As an innovative approach that automatically extracts readability-related features from visual, semantic, and structural aspects, our method proves effective for code readability classification.
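The three input representations named in the Method can be illustrated with a small sketch. The abstract does not specify the encodings, so everything below is an assumption made for illustration: the function names, the fixed matrix size, the regex tokenizer (a stand-in for whatever tokenizer feeds BERT), and the four-color palette are all hypothetical and are not the authors' implementation.

```python
import re

# Hypothetical palette: one RGB color per coarse character category.
BACKGROUND = (255, 255, 255)  # whitespace and padding
IDENTIFIER = (0, 0, 255)      # letters and underscores
LITERAL = (0, 128, 0)         # digits
SYMBOL = (255, 0, 0)          # operators and punctuation

# Assumed tokenizer: identifiers, integer runs, or single non-space symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def to_char_matrix(code, height, width, pad=0):
    """Structural view: one row per source line, one ASCII code per
    character, truncated or padded to a fixed height x width grid."""
    rows = []
    for line in code.splitlines()[:height]:
        codes = [ord(c) if ord(c) < 128 else pad for c in line[:width]]
        rows.append(codes + [pad] * (width - len(codes)))
    while len(rows) < height:
        rows.append([pad] * width)
    return rows


def to_token_sequence(code):
    """Semantic view: a flat list of lexical tokens."""
    return TOKEN_RE.findall(code)


def _color_of(char_code):
    """Map one ASCII code to a palette color (assumed categories)."""
    ch = chr(char_code)
    if char_code == 0 or ch.isspace():
        return BACKGROUND
    if ch.isalpha() or ch == "_":
        return IDENTIFIER
    if ch.isdigit():
        return LITERAL
    return SYMBOL


def to_rgb_matrix(code, height, width):
    """Visual view: the character grid rendered as one RGB pixel per cell."""
    grid = to_char_matrix(code, height, width)
    return [[_color_of(c) for c in row] for row in grid]


snippet = "int x = 1;\nreturn x;"
char_matrix = to_char_matrix(snippet, height=3, width=12)
tokens = to_token_sequence(snippet)
rgb_matrix = to_rgb_matrix(snippet, height=3, width=12)
```

In a pipeline like the one described, each of these three structures would then go to its own feature extractor before the features are concatenated; how the views are paired with BERT, CNN, and BiLSTM is not stated in the abstract and is left out of this sketch.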
