Abstract

With the advent of artificial intelligence, the research paradigm in natural language processing has shifted from statistical methods to machine learning-based approaches. One application is a deep learning-based language model that helps software engineers write code faster. Although several research groups have already attempted to develop code auto-completion functionality, a need for an in-house tool has been identified for the following reasons: (1) a security-sensitive company (e.g., Samsung Electronics) may not want to use commercial tools because of the risk of source code leakage, and (2) commercial tools may not be applicable to a specific domain (e.g., SSD firmware development), especially when one needs to predict unique code patterns and style. This research proposes a hybrid approach that harnesses the synergy between machine learning techniques and advanced design methods to develop a code auto-completion framework that helps firmware developers write code more efficiently. The sensitivity analysis results show that the deterministic design reduces prediction accuracy because it generates output in some unexpected ways, whereas the probabilistic design provides a list of reasonable next code elements from which a developer can select manually to increase prediction accuracy.

Highlights

  • In this paper, we propose a hybrid approach that harnesses the synergy between machine learning (ML) techniques and advanced design methods [19] to enhance the level of understanding of the relationship between the generative pre-trained transformer (GPT)-2 model diversity parameters and code auto-completion functionality in the SSD firmware development domain

  • Sensitivity analysis with respect to the GPT-2 diversity parameters is performed to enhance the level of understanding of the relationship between prediction accuracy and the diversity parameters

  • The GPT-2 model has three different diversity parameters implemented in the sampling process
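
To make the role of the diversity parameters concrete, the following is a minimal sketch of a sampling step and not the authors' implementation. The highlights do not name the three parameters; the sketch assumes they are temperature, top-k, and top-p, which are the settings commonly exposed for GPT-2 sampling. The function names, the token names, and the scores in `EXAMPLE_LOGITS` are hypothetical and chosen only for illustration. Reading the deterministic design as "always take the single most likely token" and the probabilistic design as "offer a ranked list of candidates" is likewise one possible interpretation of the abstract.

```python
import math
import random

# Hypothetical next-token scores for a firmware-like context (illustrative only).
EXAMPLE_LOGITS = {"nand_read": 2.0, "nand_write": 1.5, "ftl_map": 0.5, "printf": -1.0}


def next_token_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability}, most probable first, after filtering.

    Assumed parameters: temperature (> 0), top_k (0 disables it),
    top_p (1.0 disables nucleus filtering).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Top-k: keep only the k most probable tokens.
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept
    # Renormalize over the surviving tokens.
    norm = sum(prob for _, prob in ranked)
    return {tok: prob / norm for tok, prob in ranked}


def greedy_choice(logits):
    """Deterministic selection: always the single highest-scoring token."""
    return max(logits, key=logits.get)


def suggest(logits, n=3, **params):
    """Probabilistic-style output: a ranked list of up to n candidates."""
    return list(next_token_distribution(logits, **params))[:n]


def sample(logits, rng=None, **params):
    """Draw one token at random from the filtered distribution."""
    rng = rng or random.Random()
    dist = next_token_distribution(logits, **params)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
```

A sensitivity study in this spirit would call `suggest(EXAMPLE_LOGITS, n=3, temperature=t, top_k=k, top_p=p)` over a grid of settings and record how often the correct next token appears in the returned list, compared with how often `greedy_choice` alone returns it.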

Summary

Introduction

One potential barrier to productivity is the considerable time spent writing code for repetitive tasks. Another potential problem is that firmware developers may be writing similar code simultaneously because they are separately involved in developing different hardware products. This situation could prevent them from working efficiently when they need to handle a large volume of source code, resulting in decreased productivity. OpenAI released the generative pre-trained transformer (GPT) models [3,4], such as the GPT-2 models.
