Symbolic execution for refuting ∀∃ hyperproperties
Abstract: Many important hyperliveness properties, such as refinement and generalized non-interference, fall into the class of ∀∃ hyperproperties, and require, for each execution trace of a system, the existence of another execution trace relating to the first one in a certain way. The alternation of quantifiers in the specification renders these hyperproperties extremely difficult to verify, or even just to test. Indeed, contrary to trace properties, where it suffices to find a single counterexample trace, refuting a ∀∃ hyperproperty requires not only to find a trace, but also a proof that no second trace exists that satisfies the specified relation with the first trace. As a consequence, automated testing of ∀∃ hyperproperties falls out of the scope of existing automated testing tools. In this paper, we present a fully automated approach to detect violations of ∀∃ hyperproperties in synchronous and asynchronous infinite-state systems. Our approach extends bug-finding techniques based on symbolic execution with support for trace quantification. We provide a prototype implementation of our approach, and demonstrate its effectiveness on a set of challenging examples.
- Research Article
- 10.1145/3689761
- Oct 8, 2024
- Proceedings of the ACM on Programming Languages
Many important hyperproperties, such as refinement and generalized non-interference, fall into the class of ∀∃ hyperproperties and require, for each execution trace of a system, the existence of another trace relating to the first one in a certain way. The alternation of quantifiers renders ∀∃ hyperproperties extremely difficult to verify, or even just to test. Indeed, contrary to trace properties, where it suffices to find a single counterexample trace, refuting a ∀∃ hyperproperty requires not only to find a trace, but also a proof that no second trace satisfies the specified relation with the first trace. As a consequence, automated testing of ∀∃ hyperproperties falls out of the scope of existing automated testing tools. In this paper, we present a fully automated approach to detect violations of ∀∃ hyperproperties in software systems. Our approach extends bug-finding techniques based on symbolic execution with support for trace quantification. We provide a prototype implementation of our approach, and demonstrate its effectiveness on a set of challenging examples.
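To make the refutation obligation concrete, the following is a minimal sketch that checks a refinement-style ∀∃ property by exhaustive enumeration on a tiny finite system. It is not the paper's symbolic-execution approach, which targets systems where enumeration is impossible; the toy specification, the two implementations, and all function names are assumptions made for illustration.

```python
from itertools import product

def spec_traces(inputs):
    """Hypothetical specification: on input x, output x or x + 1."""
    return [(x, x + d) for x, d in product(inputs, (0, 1))]

def impl_ok_traces(inputs):
    """Hypothetical implementation that always outputs x + 1."""
    return [(x, x + 1) for x in inputs]

def impl_bad_traces(inputs):
    """Hypothetical faulty implementation: outputs x or x + 2."""
    return [(x, x + d) for x, d in product(inputs, (0, 2))]

def same_io(t, s):
    """Relation between traces: identical input and output."""
    return t == s

def refute_forall_exists(first, second, related):
    """Return a trace t in `first` with no witness s in `second`
    such that related(t, s) holds, or None if every trace has one."""
    for t in first:
        # Refutation needs more than finding t: every candidate
        # witness must be ruled out, here by trying them all.
        if not any(related(t, s) for s in second):
            return t
    return None

INPUTS = range(3)
GOOD = refute_forall_exists(impl_ok_traces(INPUTS), spec_traces(INPUTS), same_io)
BAD = refute_forall_exists(impl_bad_traces(INPUTS), spec_traces(INPUTS), same_io)
```

Here `GOOD` is `None` because every implementation trace has a matching specification trace, while `BAD` is the trace `(0, 2)`, for which no specification trace exists. The inner `any(...)` is the step that has no counterpart when refuting an ordinary trace property.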
- Conference Article
25
- 10.1109/ares.2013.59
- Sep 1, 2013
Software vulnerabilities have long been considered an important threat to the safety of software systems. When source code is accessible, the information it contains helps considerably in detecting vulnerabilities. Static analysis is frequently used to scan available source code for errors that cause security problems, but it often generates many false positives. Symbolic execution has also been proposed to detect vulnerabilities and has shown good performance in some studies. However, existing approaches are either ineffective in path exploration or do not scale well to large programs. In practice, since most paths are not related to security problems and software vulnerabilities are usually caused by the improper use of security-sensitive functions, the number of paths can be reduced by tracing sensitive data backward from security-sensitive functions so as to consider only paths related to vulnerabilities. Moreover, to avoid having to generate bug-triggering test inputs, formal reasoning can be used by solving certain program conditions. In this research, we propose backward trace analysis and symbolic execution to detect vulnerabilities from source code. We first find all the hot spots in a source code file. Based on each hot spot, we construct a data flow tree so that we can obtain the possible execution traces. Afterwards, we perform symbolic execution to generate a program constraint (PC) and derive a security constraint (SC) from our predefined security requirements along each execution trace. A program constraint is a constraint imposed by program logic on program variables. A security constraint is a constraint on program variables that must be satisfied to ensure system security. Finally, a hot spot is reported as a vulnerability if there is an assignment of values to program inputs that satisfies PC but violates SC, in other words, satisfies PC ∧ ¬SC.
We have implemented our approach and conducted experiments on test cases randomly chosen from the Juliet Test Suite provided by the US National Security Agency (NSA). The results show that our approach achieves a precision of 83.33%, a recall of 90.90%, and an F1 value of 86.95%, the best performance among competing tools. Moreover, our approach efficiently mitigates the path explosion problem of traditional symbolic execution.
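The reporting rule in this abstract (flag a hot spot when some input satisfies the program constraint but violates the security constraint) can be sketched as follows. This is an illustrative assumption rather than the authors' tool: a bounded brute-force search stands in for a constraint solver, and the buffer-write hot spot, `BUF_SIZE`, and all function names are hypothetical.

```python
from itertools import product

def find_violation(pc, sc, domains):
    """Search the given finite domains for an input assignment that
    satisfies the program constraint and violates the security constraint."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        env = dict(zip(names, values))
        if pc(env) and not sc(env):
            return env  # witness: the hot spot would be reported
    return None  # no violating input found within the searched domains

BUF_SIZE = 8  # hypothetical buffer length at the hot spot buf[n] = 0

def sc_in_bounds(env):
    """Security constraint: the index must stay inside the buffer."""
    return 0 <= env["n"] < BUF_SIZE

def pc_unguarded(env):
    """Path condition of `if n >= 0: buf[n] = 0`."""
    return env["n"] >= 0

def pc_guarded(env):
    """Path condition of `if 0 <= n < BUF_SIZE: buf[n] = 0`."""
    return 0 <= env["n"] < BUF_SIZE

DOMAINS = {"n": range(-4, 16)}
UNGUARDED = find_violation(pc_unguarded, sc_in_bounds, DOMAINS)
GUARDED = find_violation(pc_guarded, sc_in_bounds, DOMAINS)
```

On the unguarded path the search returns `{"n": 8}`, the first index that reaches the write while falling outside the buffer; on the guarded path it returns `None`. A real solver would decide the same question over unbounded domains instead of a small range.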
- Conference Article
25
- 10.1109/ase.2004.71
- Sep 20, 2004
Automatic test generators (ATGs) are an important support tool for large-scale software development. Contemporary ATGs include JTest, which does white box testing down to the method level only and black box testing if a specification exists, and AETG, which tests pairwise interactions among input variables. The first automatic test generation approaches were static, based on symbolic execution (Clarke, 1976). Korel suggested a dynamic approach to automatic test data generation using function minimization and directed search (Korel, 1990). A dynamic approach can handle array, pointer, function and other dynamic constructs more accurately than a static approach, but it may also be more expensive since the program under test is executed repeatedly. Subsequent ATGs explored the use of genetic algorithms (Jones et al., 1996; Michael et al., 2001; Pargas et al., 1999) and simulated annealing (Tracey et al., 1998). These ATGs address the problem of producing test data for low-level code coverage such as statement, branch and condition/decision coverage, and depend on branch function (Korel, 1990) style instrumentation (Jones et al., 1996; Michael et al., 2001) and/or the program graph (Jones et al., 1996; Pargas et al., 1999). Unlike previous work, our ATG, called genet, produces test data for branch coverage with simpler instrumentation than branch functions, does not use program graphs, and is programming language independent. genet uses a genetic algorithm (GA) (Holland, 1975) to search for tests and formal concept analysis (FCA) (Ganter and Wille, 1999) to organize the relationships between tests and their execution traces. The combination of GA with FCA is novel. Further, genet extends the opportunistic approach of GADGET (Michael et al., 2001) by targeting several uncovered branches simultaneously. The relationships that genet learns provide useful insights for test selection, test maintenance and debugging.
- Research Article
2
- 10.1016/j.cose.2024.104193
- Nov 13, 2024
- Computers & Security
Beyond the sandbox: Leveraging symbolic execution for evasive malware classification
- Conference Article
- 10.1109/compsac51774.2021.00158
- Jul 1, 2021
A real-world, complex software system can contain a number of code snippets. Many snippets are deep, surrounded by complicated triggering conditions and/or hidden in functions less frequently invoked. Fuzzing and symbolic execution are two mainstream techniques for exploring input spaces and increasing code coverage of complicated software systems. Meanwhile, it remains a challenge to determine whether a deep code snippet is reachable, and if it is reachable, which test(s) can reach it. This paper presents ApproxiFuzzer, an effective, demand-driven approach to fuzzing towards deep code snippets in Java programs. Given a program P, a target deep code snippet tcs, and a set of seeding test inputs, the key idea behind ApproxiFuzzer is to selectively mutate the test inputs and collect their execution traces such that the execution traces gradually approximate tcs; several measures are designed for measuring the distances between execution traces and the code snippet and directing the fuzzing process towards generating test inputs reaching tcs. We have implemented ApproxiFuzzer and evaluated it against Kelinci (an AFL-based fuzzer) and JDart (a concolic execution tool) on a set of real-world benchmarks. The evaluation clearly demonstrates the strengths of ApproxiFuzzer: it outperforms Kelinci by 36× in efficiently generating test inputs, obtaining up to 18.2% higher code coverage, and it outperforms JDart by 46.2∼96.2% in hitting deep code snippets.
- Research Article
3
- 10.1109/tdsc.2021.3123159
- Nov 1, 2022
- IEEE Transactions on Dependable and Secure Computing
Advanced reverse analysis tools have significantly improved the ability of attackers to crack software via dynamic analysis techniques, such as symbolic execution and taint analysis. These techniques are widely used for malicious purposes such as vulnerability exploitation or theft of intellectual property. In this paper, we present an obfuscation strategy called “execution trace obfuscation,” wherein the program execution trace repeatedly switches between multiple threads. Our technique realizes equivalent code transformation by abstracting the obfuscation problems into pruning, cloning, and coloring problems in graph theory. Based on this, we further propose the cascade encryption of a function that depends on execution trace information with a key derived from the function address calculation process, followed by removing this key from the program. We have implemented a compiler-level system that takes a source program as input and automatically generates an obfuscated file. Finally, random tests demonstrate the universality of the obfuscation algorithm and verify the system’s performance. Results show that our system can effectively interfere with advanced reverse analysis tools.
- Book Chapter
- 10.1007/978-3-642-11145-7_16
- Jan 1, 2009
The dynamic test generation approach consists of executing a program while gathering symbolic constraints on inputs from predicates encountered in branch statements, and of using a constraint solver to infer new program inputs from previous constraints in order to steer subsequent executions towards new program paths. Variants of this technique have recently been adopted for finding security vulnerabilities in binary-level software. However, such existing approaches and tools are not retargetable: on the one hand, they can only find vulnerabilities in the binaries for a specific ISA; on the other hand, they can only find vulnerabilities over a specific OS because the execution trace is recorded in an entirely OS-dependent way in these tools. This paper presents a new dynamic test generation technique and a tool, ReTBLDTG, short for ReTargetable Binary-Level Dynamic Test Generation, that implements this technique. Unlike other such techniques, ReTBLDTG can deal with binaries for any ISA over any OS. ReTBLDTG is based on a whole-system virtual machine that provides OS-independent and fast concrete execution of the target program. The thread to which the executing instruction belongs is identified OS-independently by analyzing the register values and hardware events over the virtual machine. Thus, the execution trace is recorded without knowing the internal structure of the guest OS. At the same time, ReTBLDTG defines a Meta Instruction Set Architecture (MetaISA); ReTBLDTG maps the execution information, which is collected during the binary source code execution, to MetaISA; and symbolic execution, constraint collection and constraint solving operate on MetaISA, thus making these tasks ISA-independent. We have implemented ReTBLDTG, retargeted it to 32-bit x86, PowerPC and Sparc ISAs, and used it to automatically find the six known bugs in the six benchmarks over Linux and Windows. Our results indicate that ReTBLDTG can be easily retargeted to any ISA with little overhead, and that ReTBLDTG can effectively find bugs located deep within large applications over any OS.
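The generic dynamic test generation loop described at the start of this abstract (run concretely, record branch predicates, negate one, solve for a new input) can be sketched as below. This is a toy under stated assumptions, not ReTBLDTG: it works on a Python function rather than binaries, a bounded brute-force search replaces the constraint solver, and `program`, `solve`, and `explore` are hypothetical names.

```python
def program(x, trace):
    """Toy program under test; each branch records (predicate, outcome)."""
    def branch(pred):
        taken = pred(x)
        trace.append((pred, taken))
        return taken

    if branch(lambda v: v > 10):
        if branch(lambda v: v % 7 == 3):
            return "bug"
        return "deep"
    return "shallow"

def solve(constraints, domain):
    """Stand-in for a constraint solver: first value in the domain on
    which every predicate evaluates to its required outcome."""
    for v in domain:
        if all(pred(v) == want for pred, want in constraints):
            return v
    return None

def explore(seed, domain):
    """Run, collect path constraints, flip one branch at a time,
    and queue any input the solver finds for the flipped prefix."""
    worklist, seen, results = [seed], set(), {}
    while worklist:
        x = worklist.pop()
        trace = []
        outcome = program(x, trace)
        path = tuple(taken for _, taken in trace)
        if path in seen:
            continue  # this path was already explored
        seen.add(path)
        results[path] = (x, outcome)
        for i, (pred, taken) in enumerate(trace):
            # Keep the prefix, negate branch i, drop everything after it.
            flipped = trace[:i] + [(pred, not taken)]
            candidate = solve(flipped, domain)
            if candidate is not None:
                worklist.append(candidate)
    return results

RESULTS = explore(seed=0, domain=range(50))
```

Starting from the seed `0`, the loop discovers all three paths of the toy program, including the nested branch that a random input would hit only about one time in seven among values above 10.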
- Conference Article
6
- 10.1145/3460120.3484748
- Nov 12, 2021
In this cloud computing era, the security of hypervisors is critical to the overall security of the cloud. In particular, the security of CPU virtualization in hypervisors is paramount because it is implemented in the most privileged CPU mode. Blackbox and graybox fuzzing are limited to finding shallow virtual CPU bugs due to its huge search space. Whitebox fuzzing can be used for systematic analysis of CPU virtualization, but existing implementations rely on slow hardware emulators to enable dynamic symbolic execution. In this paper, we present HyperFuzzer, the first efficient hybrid fuzzer for virtual CPUs. Our key observation is that a virtual CPU's execution is determined by the VM state. Based on this observation, we design a new fuzzing setup that uses complete VM states as fuzzing inputs, and a new fuzzing technique we call Nimble Symbolic Execution to enable dynamic symbolic execution for CPU virtualization running on bare metal. Specifically, it uses the hardware to log the control flow efficiently, and then reconstructs an approximate execution trace from only the control flow and the fuzzing input. The reconstructed execution trace is surprisingly sufficient for precise dynamic symbolic execution of virtual CPUs. We have built a prototype of HyperFuzzer based on Intel Processor Trace for Microsoft Hyper-V. Our experimental results show that HyperFuzzer can run thousands of tests per second, which is 3 orders of magnitude faster than using a hardware emulator. When compared with a baseline using full (control+data) execution traces, HyperFuzzer can still generate 96.8% of the test inputs generated by the baseline. HyperFuzzer has found 11 previously unknown virtual CPU bugs in the Hyper-V hypervisor, and all of them were confirmed and fixed.
- Conference Article
2
- 10.1109/glocom.2005.1577914
- Jan 1, 2005
User capacity in synchronous CDMA systems has been well characterized and is fixed by the processing gain and target SIR required by each user. Asynchronous systems have not been as well studied. This paper examines the user capacity of asynchronous CDMA systems. It is shown that the user capacity can be increased over synchronous systems by allowing the users to be chip asynchronous and adapting their spreading codes to decrease interference. The degree to which the system experiences a gain by asynchrony is characterized by the effective dimension of the chip waveform. Simulation results are provided to verify that the user capacity is improved by allowing asynchrony when interference avoidance algorithms are employed.
- Research Article
102
- 10.1109/18.850680
- Jul 1, 2000
- IEEE Transactions on Information Theory
The performance of linear multiuser receivers in terms of the signal-to-interference ratio (SIR) achieved by the users has been analyzed in a synchronous CDMA system under random spreading sequences. In this paper, we extend these results to a symbol-asynchronous but chip-synchronous system and characterize the SIR for three linear receivers: the matched-filter receiver, the minimum mean-square error (MMSE) receiver, and the decorrelator. For each of the receivers, we characterize the limiting SIR achieved when the processing gain is large and also derive lower bounds on the SIR using the notion of effective interference. Applying the results to a power-controlled system, we derive effective bandwidths of the users for these linear receivers and characterize the user capacity region: a set of users is supportable by a system if the sum of the effective bandwidths is less than the processing gain of the system. We show that while the effective bandwidth of the decorrelator and the MMSE receiver is higher in an asynchronous system than in a synchronous system, it progressively decreases as the length of the observation window increases and is asymptotic to that of the synchronous system when the observation window extends infinitely on both sides of the symbol of interest. Moreover, the performance gap between the MMSE receiver and the decorrelator is significantly wider in the asynchronous setting than in the synchronous case.
- Research Article
1
- 10.1134/s0361768820080046
- Dec 1, 2020
- Programming and Computer Software
Automatic detection of bugs in programs is an extremely important direction of current research and development in the field of program reliability and security assurance. Earlier studies covered methods for program analysis that combine dynamic symbolic execution, randomized testing, and static analysis. In this paper, a formal model for detecting bugs using the symbolic execution of programs and its implementation for detecting buffer bounds violations is presented. A formal model of symbolic program execution is described, and a theorem on detecting a bug on the basis of the violation of the operation domain is formulated and proved. An implementation of the buffer bounds violation analyzer in the process of symbolic program execution is described, and the application of the implemented prototype for analyzing a set of programs in Debian Linux is presented. The experiments confirm the actionability of the proposed method.
- Research Article
22
- 10.1007/bf00336923
- Nov 1, 1983
- Biological cybernetics
The role of synchronism in systems of threshold elements (such as neural networks) is examined. Some important differences between synchronous and asynchronous systems are outlined. In particular, important restrictions on limit cycles are found in asynchronous systems along with multi-frequency oscillations which do not appear in synchronous systems. The possible role of deterministic chaos in these systems is discussed.
- Research Article
24
- 10.1088/1741-2552/abf00c
- Apr 6, 2021
- Journal of Neural Engineering
Objective. For patients with disorders of consciousness (DOC), such as vegetative state (VS) and minimally conscious state (MCS), communication is challenging. Currently, the communication methods of DOC patients are limited to behavioral responses. However, patients with DOC cannot provide sufficient behavioral responses due to motor impairments and limited attention. In this study, we proposed a hybrid asynchronous brain–computer interface (BCI) system that provides a new communication channel for patients with DOC. Approach. Seven patients with DOC (3 VS and 4 MCS) and eleven healthy subjects participated in our experiment. Each subject was instructed to focus on the square with the Chinese words ‘Yes’ and ‘No’. Then, the BCI system determined the target square with both P300 and steady-state visual evoked potential (SSVEP) detections. For the healthy group, we tested the performance of the hybrid system and the single-modality BCI system. Main results. All healthy subjects achieved significant accuracy (ranging from 72% to 100%) in both the hybrid system and the single-modality system. The hybrid asynchronous BCI system outperformed the P300-only and SSVEP-only systems. Furthermore, we employed the asynchronous approach to dynamically collect the electroencephalography signal. Compared with the synchronous system, there was a 21% reduction in the average required rounds and a reduction of 105 s in the online experiment time. This asynchronous system was applied to detect the ‘yes/no’ communication function of seven patients with DOC, and the results showed that three of the patients (3 MCS) not only showed significant accuracies (67 ± 3%) in the online experiment, but also had improved Coma Recovery Scale-Revised scores compared with the scores before the experiment. This result demonstrated that 3 of 7 patients were able to communicate using our hybrid asynchronous BCI system. Significance. This hybrid asynchronous BCI system can be used as a useful auxiliary bedside tool for simple communication with DOC patients.
- Conference Article
9
- 10.5555/2818754.2818832
- May 16, 2015
Symbolic execution is a powerful, systematic analysis that has received much visibility in the last decade. Scalability however remains a major challenge for symbolic execution. Compositional analysis is a well-known general purpose methodology for increasing scalability. This paper introduces a new approach for compositional symbolic execution. Our key insight is that we can summarize each analyzed method as a memoization tree that captures the crucial elements of symbolic execution, and leverage these memoization trees to efficiently replay the symbolic execution of the corresponding methods with respect to their calling contexts. Memoization trees offer a natural way to compose in the presence of heap operations, which cannot be dealt with by previous work that uses logical formulas as summaries for compositional symbolic execution. Our approach also enables efficient target oriented symbolic execution for error detection or program coverage. Initial experimental evaluation based on a prototype implementation in Symbolic Path Finder shows that our approach can be up to an order of magnitude faster than traditional non-compositional symbolic execution.
- Conference Article
27
- 10.1109/icse.2015.79
- May 1, 2015
Symbolic execution is a powerful, systematic analysis that has received much visibility in the last decade. Scalability however remains a major challenge for symbolic execution. Compositional analysis is a well-known general purpose methodology for increasing scalability. This paper introduces a new approach for compositional symbolic execution. Our key insight is that we can summarize each analyzed method as a memoization tree that captures the crucial elements of symbolic execution, and leverage these memoization trees to efficiently replay the symbolic execution of the corresponding methods with respect to their calling contexts. Memoization trees offer a natural way to compose in the presence of heap operations, which cannot be dealt with by previous work that uses logical formulas as summaries for compositional symbolic execution. Our approach also enables efficient target oriented symbolic execution for error detection or program coverage. Initial experimental evaluation based on a prototype implementation in Symbolic Path Finder shows that our approach can be up to an order of magnitude faster than traditional non-compositional symbolic execution.