In recent years, the proliferation of code generation models built on large language models, such as GitHub Copilot and ChatGPT, has enabled automated source code generation that meets developers' needs and improves coding efficiency. However, a recent study revealed security concerns in generated code, showing that it can be vulnerable to attacks. This research introduces a framework aimed at mitigating the risk that code generation models produce vulnerable code, with a focus on data leakage vulnerabilities. A ranker is developed that uses VUDENC, a deep learning model for vulnerability detection, together with CodeQL and Bandit, two static analyzers for Python code, to evaluate and rank generated code against security metrics. By generating multiple code candidates and using the ranker to select the most secure one, the framework yields more secure code. The framework is evaluated on security-relevant scenarios drawn from an aggregation of the SecurityEval and LLMSecEval datasets, where it outperforms the baseline gpt-3.5-turbo model. Given its demonstrated effectiveness, the framework could be extended beyond data leakage issues to mitigate a broader range of vulnerabilities.
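A minimal sketch of the generate-then-rank step described above, assuming a Python target: `bandit_issue_count`, `security_score`, and `pick_most_secure` are hypothetical names, and only the Bandit signal is wired up here; how the framework actually combines the VUDENC, CodeQL, and Bandit signals is not specified in this abstract.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def bandit_issue_count(code: str) -> int:
    """Run Bandit on a candidate snippet and count its findings."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        # Bandit's JSON formatter puts one entry per finding in "results".
        result = subprocess.run(
            ["bandit", "-f", "json", "-q", path],
            capture_output=True, text=True,
        )
        report = json.loads(result.stdout)
        return len(report.get("results", []))
    finally:
        Path(path).unlink()


def security_score(code: str) -> float:
    """Aggregate security score for one candidate; lower is more secure.

    Hypothetical aggregation: the framework also factors in CodeQL query
    results and VUDENC's predicted vulnerability signal, which would add
    further terms here.
    """
    return float(bandit_issue_count(code))


def pick_most_secure(candidates: list[str]) -> str:
    """Rank the generated candidates and keep the lowest-scoring one."""
    return min(candidates, key=security_score)
```

In the full framework, CodeQL and VUDENC would contribute additional terms to the per-candidate score before the minimum is taken.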