Abstract

Exploit code is widely used for detecting vulnerabilities and implementing defensive measures. However, automatic generation of exploit code for security assessment is a challenging task. In this paper, we propose a novel template-augmented exploit code generation approach ExploitGen based on CodeBERT. Specifically, we first propose a rule-based Template Parser to generate template-augmented natural language descriptions (NL). Both the raw and template-augmented NL sequences are encoded to context vectors by the respective encoders. For better learning semantic information, ExploitGen incorporates a semantic attention layer, which uses the attention mechanism to extract and calculate each layer’s representational information. In addition, ExploitGen computes the interaction information between the template information and the semantics of the raw NL and designs a residual connection to append the template information into the semantics of the raw NL. Comprehensive experiments on two datasets show the effectiveness of ExploitGen after comparison with six state-of-the-art baselines. Apart from the automatic evaluation, we conduct a human study to evaluate the quality of generated code in terms of syntactic and semantic correctness. The results also confirm the effectiveness of ExploitGen.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call