Abstract
Finding potential security weaknesses in any complex IT system is an important and often challenging task, best started in the early stages of the development process. We present a method that transforms this task for FPGA designs into a reinforcement learning (RL) problem: a Markov Decision Process (MDP) based RL model is generated from a formal, high-level system description (formulated in a domain-specific language) of the system under review and from different, quantified assumptions about the system's security. Probabilistic transitions and the reward function can model the varying resilience of individual elements against attacks as well as the capabilities of an attacker. This information is then used to determine a plausible data exfiltration strategy. An example with multiple scenarios illustrates the workflow. A discussion of supplementary techniques such as hierarchical learning and deep neural networks concludes the paper.
Highlights
- Securing any non-trivial computer system against malicious actors is a challenging task.
- Attack trees [1] are a common way to model an attacker's options, but their creation is often tedious and they provide little information about the most efficient way to breach a system.
- For FPGA [2] based designs, a text-based Domain Specific Language (DSL) describing the design is combined with a quantified assessment of its security properties.
Summary
Securing any (non-trivial) computer system against malicious actors is a challenging task. Attack trees [1] are a common method to model an attacker's options, but their creation is often tedious and they provide little information about the most efficient way to breach a system. We address this problem for FPGA [2] based designs through a combination of a text-based Domain Specific Language (DSL) for the design description and a quantified assessment of the security properties, from which a Markov Decision Process (MDP) is generated. An agent is trained on this MDP to exfiltrate all data stored within an FPGA design using the most efficient sequence of predefined actions; this task is performed by well-established reinforcement learning (RL) [3], [4] algorithms. We illustrate the feasibility of this approach through a series of experiments, assess its constraints, and conclude with a discussion of our model's limits and possible methods to lift them.
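To make the idea concrete, the following is a minimal, illustrative sketch of the kind of MDP the summary describes: states stand for elements of a toy FPGA design, transition success probabilities encode each element's resilience against an attack step, and the reward function encodes the attacker's goal of reaching an external interface. All state names, actions, probabilities, and rewards here are invented for illustration and are not taken from the paper; the learning algorithm is plain tabular Q-learning, one of the well-established RL methods the summary alludes to.

```python
import random

# Hypothetical toy model (all names and numbers are illustrative).
# States are design elements the data can reside in; "exfiltrated" is terminal.
STATES = ["logic_block", "bram", "config_port", "exfiltrated"]

# action -> (source state, target state, success probability, reward on success)
# Success probabilities model the resilience of each element against that attack.
ACTIONS = {
    "probe_bram":     ("logic_block", "bram",        0.9, -1.0),
    "read_config":    ("bram",        "config_port", 0.6, -2.0),
    "dump_bitstream": ("config_port", "exfiltrated", 0.8, 10.0),
    "side_channel":   ("logic_block", "exfiltrated", 0.1,  5.0),
}

def step(state, action, rng):
    """One probabilistic transition of the toy MDP."""
    src, dst, p, reward = ACTIONS[action]
    if state != src:
        return state, -5.0      # action not applicable here: penalty, no move
    if rng.random() < p:
        return dst, reward      # attack step succeeds
    return state, -1.0          # attack step fails; the attacker may retry

def q_learning(episodes=5000, alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = "logic_block"
        for _ in range(20):                     # cap episode length
            if s == "exfiltrated":
                break
            if rng.random() < eps:
                a = rng.choice(list(ACTIONS))   # explore
            else:
                a = max(ACTIONS, key=lambda a: q[(s, a)])  # exploit
            s2, r = step(s, a, rng)
            best_next = max(q[(s2, a2)] for a2 in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q

q = q_learning()
# The greedy policy per state is the learned exfiltration strategy.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)])
          for s in STATES if s != "exfiltrated"}
print(policy)
```

In this toy instance the agent learns to prefer the reliable multi-step route (probe the BRAM, read the configuration port, dump the bitstream) over the low-probability side-channel shortcut, which is exactly the kind of "most efficient sequence of predefined actions" the paper's agent is trained to find on the generated MDP.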