Abstract

Finding potential security weaknesses in a complex IT system is an important and often challenging task that is best started in the early stages of the development process. We present a method that transforms this task for FPGA designs into a reinforcement learning (RL) problem. The method generates a Markov Decision Process (MDP) based RL model from two inputs: a formal, high-level description of the system under review, formulated in a domain-specific language, and a set of quantified assumptions about the system’s security. Probabilistic transitions and the reward function model the varying resilience of different elements against attacks and the capabilities of an attacker. This information is then used to determine a plausible data exfiltration strategy. An example with multiple scenarios illustrates the workflow. The paper concludes with a discussion of supplementary techniques such as hierarchical learning and deep neural networks.

Highlights

  • Securing any computer system against malicious actors is a challenging task

  • Attack trees [1] are a common method to model an attacker’s options but their creation is often tedious and they provide little information about the most efficient way to breach a system. We address this problem for FPGA [2] based designs through a combination of a text-based Domain Specific Language (DSL) for the design description and a quantified assessment of the security properties

Summary

Introduction

Securing any (non-trivial) computer system against malicious actors is a challenging task. Attack trees [1] are a common method to model an attacker’s options, but their creation is often tedious and they provide little information about the most efficient way to breach a system. We address this problem for FPGA [2] based designs through a combination of a text-based Domain Specific Language (DSL) for the design description and a quantified assessment of the security properties. From these two inputs, a Markov Decision Process (MDP) is generated. An agent is trained on this MDP to exfiltrate all data stored within an FPGA design using the most efficient sequence of predefined actions. This task is performed by well-established reinforcement learning (RL) [3] [4] algorithms. We illustrate the feasibility of this approach through a series of experiments, assess its constraints, and conclude with a discussion of our model’s limits and possible methods to lift them.
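To make the idea of an attack MDP more concrete, the following is a minimal sketch and not the model from the paper. Every state name, action name, success probability, cost and reward is a hypothetical value chosen for illustration. For brevity and determinism it solves the toy MDP with value iteration, a dynamic-programming method, rather than with the RL algorithms the paper refers to. Success probabilities stand in for the resilience of a component, and the per-step cost stands in for the attacker's effort.

```python
# Toy attack MDP. All names and numbers below are hypothetical.
# TRANSITIONS[state][action] = (success_probability, next_state, step_cost)
# On failure the attacker stays in the current state and still pays the cost.
TRANSITIONS = {
    "outside": {
        "attack_io": (0.9, "io_block", 1.0),
        "attack_bus": (0.2, "bus", 1.0),
    },
    "io_block": {"attack_bus": (0.5, "bus", 1.0)},
    "bus": {"attack_memory": (0.8, "memory", 1.0)},
    "memory": {"exfiltrate": (1.0, "done", 0.0)},
    "done": {},  # terminal state: all data has been exfiltrated
}
EXFILTRATION_REWARD = 20.0  # assumed payoff for reaching "done"
GAMMA = 0.95                # assumed discount factor


def q_value(transitions, state, action, values):
    """Expected return of taking `action` in `state` under `values`."""
    p, nxt, cost = transitions[state][action]
    bonus = EXFILTRATION_REWARD if nxt == "done" else 0.0
    on_success = bonus + GAMMA * values[nxt]
    on_failure = GAMMA * values[state]
    return -cost + p * on_success + (1.0 - p) * on_failure


def value_iteration(transitions, tol=1e-9, max_sweeps=10_000):
    """Sweep over all states until the largest value change is below tol."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_sweeps):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            best = max(q_value(transitions, s, a, values) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def greedy_policy(transitions, values):
    """Pick the action with the highest expected return in each state."""
    return {
        s: max(actions, key=lambda a: q_value(transitions, s, a, values))
        for s, actions in transitions.items()
        if actions
    }


if __name__ == "__main__":
    values = value_iteration(TRANSITIONS)
    print(greedy_policy(TRANSITIONS, values))
```

With these made-up numbers the greedy policy enters through `attack_io`. Lowering the success probability of `attack_io` to 0.3, which mimics hardening that component, makes the direct but unreliable `attack_bus` the preferred first step. This is the kind of scenario comparison that quantified security assumptions make possible.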

Related Work
The descriptive model of the FPGA design
The MDP-Representation of the FPGA design
Scope and constraints of the model
Generic components of Markov Decision Processes
The domain-specific Markov Decision Process
Example
Scenario 1
Scenario 2
Scenario 3
Scenario 4
Results
Limits and restrictions of the MDP based approach and possible solutions
Conclusion
