A Scala library for testing student assignments on concurrent programming
We present a lightweight library for testing concurrent Scala programs by systematically exploring multiple interleavings between user-specified operations on shared objects. Our library is targeted at beginners in concurrent programming in Scala, runs on a standard JVM, and supports conventional synchronization primitives such as wait, notify, and synchronized. The key component of the library is the trait SchedulableMonitor, which accepts a thread schedule and interleaves, as per the schedule, all user-specified operations invoked through multiple threads on objects implementing the trait. Using our library, we developed a unit test engine that tests concurrent operations on shared objects on thousands of schedules obtained by bounding the number of context switches. If a unit test fails on a schedule, the test engine offers as feedback the interleaved traces of execution that resulted in the failure. We used our test engine to automatically test and evaluate two assignments, (a) a lock-based producer/consumer problem and (b) a lock-free sorted list implementation, offered to a class of 150 undergraduate students at EPFL. Our evaluations show that the system is effective in detecting bugs in students' solutions.
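The idea of obtaining schedules by bounding the number of context switches can be illustrated with a minimal Python sketch. It is not the library's Scala API: the function name `bounded_schedules` and the representation of a schedule as a tuple of thread ids are assumptions made for illustration, and the sketch counts every change of running thread as a switch, including the one forced when a thread has finished.

```python
from typing import Iterator, List, Tuple


def bounded_schedules(ops_per_thread: List[int],
                      max_switches: int) -> Iterator[Tuple[int, ...]]:
    """Yield every schedule (one thread id per operation) that runs all
    operations and changes the running thread at most `max_switches` times.

    Hypothetical helper for illustration; not part of the library described above.
    """
    total = sum(ops_per_thread)

    def extend(prefix: List[int], remaining: List[int],
               switches: int) -> Iterator[Tuple[int, ...]]:
        if len(prefix) == total:
            yield tuple(prefix)
            return
        for tid, left in enumerate(remaining):
            if left == 0:
                continue  # this thread has no operations left
            # Running a different thread than in the previous step costs one switch.
            cost = 1 if prefix and prefix[-1] != tid else 0
            if switches + cost > max_switches:
                continue  # prune schedules that exceed the bound
            remaining[tid] -= 1
            prefix.append(tid)
            yield from extend(prefix, remaining, switches + cost)
            prefix.pop()
            remaining[tid] += 1

    yield from extend([], list(ops_per_thread), 0)


if __name__ == "__main__":
    # Two threads with two operations each, at most one context switch.
    for schedule in bounded_schedules([2, 2], 1):
        print(schedule)
```

With two threads of two operations each, a bound of one switch leaves only the two sequential orders, while a bound of three admits all six interleavings. The bound is what keeps the number of schedules manageable as the number of operations grows.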
- Conference Article
31
- 10.1109/wpc.1998.693354
- Jun 24, 1998
The Java language supports the use of monitors, sockets, and remote method invocation for concurrent programming. Also, Java classes can be defined to simulate other types of concurrent constructs. However, concurrent Java programs, like other concurrent programs, are difficult to specify, design, code, test and debug. In this paper, we describe the design of a toolset, called JaDA (Java Dynamic Analyzer), that provides testing and debugging tools for concurrent Java programs. To collect run-time information or control program execution, JaDA requires transformation of a concurrent Java program into a slightly different Java program. We show that by modifying Java classes that support concurrent programming, Java application programs only need minor modifications. We also present a novel approach to managing threads that are needed for testing and debugging of concurrent Java programs.
- Research Article
7
- 10.1145/3607182
- Nov 23, 2023
- ACM Transactions on Software Engineering and Methodology
Concurrent programs are normally composed of multiple concurrent threads sharing memory space. These threads are often interleaved, which may lead to some non-determinism in execution results, even for the same program input. This poses huge challenges to the testing of concurrent programs, especially to test result verification, that is, the prevalent existence of the oracle problem. In this article, we investigate the application of metamorphic testing (MT), a mainstream technique to address the oracle problem, to the testing of concurrent programs. Based on the unique features of interleaved executions in concurrent programming, we propose an extended notion of metamorphic relations, the core part of MT, which are particularly designed for the testing of concurrent programs. A comprehensive testing approach, namely ConMT, is thus developed and a tool is built to automate its implementation on concurrent programs written in Java. Empirical studies have been conducted to evaluate the performance of ConMT, and the experimental results show that in addition to addressing the oracle problem, ConMT outperforms the baseline traditional testing techniques with respect to a higher degree of automation, better bug detection capability, and shorter testing time. It is clear that ConMT can significantly improve the cost-effectiveness of testing concurrent programs and thus advances the state of the art in the field. The study also brings novelty into MT, hence promoting the fundamental research of software testing.
- Conference Article
17
- 10.1109/apsec.1996.566769
- Dec 24, 2002
Testing of concurrent programs is much more difficult than that of sequential programs. A concurrent program behaves nondeterministically, that is, the program may produce different results with the same input data according to execution timings of the program. In testing of concurrent programs, test data must specify not only input data but also sequences of statements. The Ordered Sequence Testing Criterion for length k (OSC_k), which was proposed by the authors, requires execution of all sequences of length k of concurrency statements which cause concurrent actions in a concurrent program. A monitoring tool has been developed for applying the testing criterion OSC_k to the testing of C concurrent programs on UNIX systems. The tool measures coverage with regard to k-tuples of concurrency statements (OSC_k) in the source code of a C concurrent program using a probe insertion method. The analysis of the tool's output for a practical C concurrent program shows not only the applicability of the tool for testing concurrent programs but also the necessity of a supporting tool for forcing execution of concurrency statements.
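Coverage of k-tuples of concurrency statements can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool from the abstract: the names `covered_sequences` and `osc_coverage` are hypothetical, a "sequence of length k" is taken to mean k consecutive entries in a recorded trace, and the denominator is every possible k-tuple, which overstates the requirement because many tuples may be infeasible in a real program.

```python
from itertools import product
from typing import Iterable, Sequence, Set, Tuple


def covered_sequences(trace: Sequence[str], k: int) -> Set[Tuple[str, ...]]:
    """Return all length-k windows of consecutive concurrency statements in one trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def osc_coverage(traces: Iterable[Sequence[str]],
                 statements: Sequence[str], k: int) -> float:
    """Fraction of all k-tuples over `statements` observed in at least one trace.

    Assumption: every k-tuple counts as required, feasible or not.
    """
    covered: Set[Tuple[str, ...]] = set()
    for trace in traces:
        covered |= covered_sequences(trace, k)
    universe = set(product(statements, repeat=k))
    return len(covered & universe) / len(universe)


if __name__ == "__main__":
    # One recorded run that executed send, then recv, then send.
    print(osc_coverage([["send", "recv", "send"]], ["send", "recv"], 2))
```

In the single trace above, only the pairs (send, recv) and (recv, send) occur, so two of the four possible pairs are covered. Reaching the remaining pairs requires forcing other execution orders, which is the need for a supporting tool that the abstract points out.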
- Research Article
7
- 10.1016/j.scico.2012.06.005
- Jun 27, 2012
- Science of Computer Programming
State-cover testing for nondeterministic terminating concurrent programs with an infinite number of synchronization sequences
- Research Article
37
- 10.1145/1353534.1346323
- Mar 1, 2008
- ACM SIGARCH Computer Architecture News
The reality of multi-core hardware has made concurrent programs pervasive. Unfortunately, writing correct concurrent programs is difficult. Addressing this challenge requires advances in multiple directions, including concurrency bug detection, concurrent program testing, concurrent programming model design, etc. Designing effective techniques in all these directions will significantly benefit from a deep understanding of real world concurrency bug characteristics. This paper provides the first (to the best of our knowledge) comprehensive real world concurrency bug characteristic study. Specifically, we have carefully examined concurrency bug patterns, manifestation, and fix strategies of 105 randomly selected real world concurrency bugs from 4 representative server and client open-source applications (MySQL, Apache, Mozilla and OpenOffice). Our study reveals several interesting findings and provides useful guidance for concurrency bug detection, testing, and concurrent programming language design. Some of our findings are as follows: (1) Around one third of the examined non-deadlock concurrency bugs are caused by violations of programmers' order intentions, which may not be easily expressed via synchronization primitives like locks and transactional memories; (2) Around 34% of the examined non-deadlock concurrency bugs involve multiple variables, which are not well addressed by existing bug detection tools; (3) About 92% of the examined concurrency bugs can be reliably triggered by enforcing certain orders among no more than 4 memory accesses. This indicates that testing concurrent programs can target exploring possible orders among every small group of memory accesses, instead of among all memory accesses; (4) About 73% of the examined non-deadlock concurrency bugs were not fixed by simply adding or changing locks, and many of the fixes were not correct at the first try, indicating the difficulty of reasoning about concurrent execution by programmers.
- Research Article
79
- 10.1145/1353536.1346323
- Mar 1, 2008
- ACM SIGPLAN Notices
- Research Article
38
- 10.1145/1353535.1346323
- Mar 1, 2008
- ACM SIGOPS Operating Systems Review
- Conference Article
938
- 10.1145/1346281.1346323
- Mar 1, 2008
- Conference Article
1
- 10.1145/1390841.1390852
- Jul 20, 2008
This lecture captures industrial and academic experience in teaching concurrent programming for several years. While the statement of a concurrent protocol is typically small, taking only a few pages to complete, the statement implicitly introduces an exponential or possibly infinite space of possible interleavings. The novice is not aware of this, and as a result sometimes does not understand the lecture. To overcome this pitfall, exploration of the protocol interleaving is practiced from day one. A classical example is: how many results can 100 threads executing i++ produce? Separation of concerns is used to minimize the interleaving space at design time. Specifically, abstract primitives, such as an atomic block and a conditional atomic block, are introduced as a design tool. They are typically not to be found in practical programming languages. This reduces the space of interleavings to be considered at design time and facilitates verification and review of the protocol. On the other hand, the approach introduces pitfalls encountered in the translation/implementation stage. Once the abstract design is verified and/or tested, it needs to be translated to a commercial language where the synchronization primitives, for performance reasons, are weaker. For example, an atomic block is implemented using a lock/unlock pair. But now atomicity needs to be ensured by the protocol implementer; a typical pitfall is the confusion of a lock/unlock code segment with an atomic block. Locations in the code that access the protected shared resources should be identified, and access needs to follow the lock/unlock protocol. Exceptions and signals need to be handled appropriately and adhere to the lock/unlock protocol. To avoid translation pitfalls, bug patterns are taught and precise understanding of the synchronization primitives is emphasized. Testing of the implemented protocol is taught. 
Testing should be aggressive, at unit test, and attempt to cover the interleaving space, instead of the common practice of testing concurrency at system test in a roundabout way through stress. Tools like ConTest or CHESS can be applied to this end.
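The i++ question can be explored by brute force for a small number of threads. The Python sketch below assumes that each thread performs a single i++ and that the increment is split into two steps, a load of the shared value into a thread-local copy followed by a store of that copy plus one. The function name `final_values` is hypothetical, and the model ignores memory-model effects beyond this two-step split.

```python
from typing import Set, Tuple


def final_values(num_threads: int) -> Set[int]:
    """Enumerate every interleaving of `num_threads` non-atomic increments
    and collect the possible final values of the shared counter."""
    results: Set[int] = set()

    def step(shared: int, local: Tuple[int, ...], pc: Tuple[int, ...]) -> None:
        # pc[t] is 0 before the load, 1 before the store, 2 when thread t is done.
        if all(p == 2 for p in pc):
            results.add(shared)
            return
        for t, p in enumerate(pc):
            if p == 0:
                # Thread t loads the current shared value into its local copy.
                step(shared,
                     local[:t] + (shared,) + local[t + 1:],
                     pc[:t] + (1,) + pc[t + 1:])
            elif p == 1:
                # Thread t stores its (possibly stale) local copy plus one.
                step(local[t] + 1, local, pc[:t] + (2,) + pc[t + 1:])

    step(0, (0,) * num_threads, (0,) * num_threads)
    return results


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, sorted(final_values(n)))
```

Under these assumptions, n threads can leave any value from 1 to n in the counter: 1 when every thread loads before any thread stores, and n when the increments happen one after another. The number of interleavings grows quickly with n, so the enumeration is only practical for a handful of threads.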
- Conference Article
17
- 10.1109/issre.1996.558726
- Oct 30, 1996
Software testing and metrics are two important approaches to assure the reliability and quality of software. The emergence of concurrent programming in recent years introduces new testing problems and difficulties that cannot be solved by testing techniques for traditional sequential programs. One of the difficult tasks is that concurrent programs can have many instances of execution for the same set of input data. Many concurrent program testing methodologies propose to solve controlled execution and determinism. There are few discussions of concurrent software testing from the inter-task viewpoints. Yet, the common characteristics of concurrent programming are explicit identification of the large grain parallel computation units (tasks), and the explicit inter-task communication via rendezvous-style mechanisms. In this paper, we focus on testing concurrent programs through task decomposition. We propose four testing criteria to test a concurrent program. The programmer can choose an appropriate testing strategy depending on the properties of the concurrent program. Associated with the strategies, four equations are provided to measure the complexity of concurrent programs.
- Conference Article
11
- 10.1109/apsec.1996.566770
- Dec 4, 1996
Software testing generally proceeds as follows: generating test cases, selecting test data, executing a test target program, inspecting the execution result, and evaluating whether testing has been sufficient. Among methods for structural testing of programs, the use of coverage, meaning the extent to which given testing criteria are satisfied, is notable. At the evaluation step, whether or not to finish the testing is determined in view of the coverage. This paper proposes a method for structural testing of concurrent programs written in the Ada programming language, especially test-case generation and execution of the programs. The Event InterActions Graph (EIAG) is used as a model for concurrent programs. The EIAG consists of Event Graphs and Interactions. An Event Graph is a control flow graph of a program unit in a concurrent program. The Interactions represent interactions between the program units. Program units include procedures, functions and task-types. After generating test cases on the EIAG, a method for selecting test data is described, and measures to cope with infeasible test cases that are generated in this step are clarified. A forced execution of a test target concurrent program, in order to resolve the nondeterministic execution, is also investigated. The nondeterministic execution is characteristic of concurrent programs.
- Conference Article
18
- 10.1109/cmpsac.1989.65057
- Sep 20, 1989
Although a lot of research has been done in software testing, how to test concurrent programs effectively has not received much attention. Two early papers on testing concurrent programs were written by P. Brinch Hansen (see Software-Practice and Experience, vol.8, p.145-50 and p.721-9 (1989)). K.C. Tai's paper (1985) addressed several issues on testing concurrent programs and started the work on deterministic execution testing and debugging of concurrent programs. These and other research results on testing concurrent programs are briefly examined. The following approaches to testing concurrent programs are discussed: single execution testing, multiple execution testing, and deterministic execution testing. Problems in deterministic execution testing and debugging of concurrent programs are examined.
- Research Article
11
- 10.1007/s10009-013-0277-y
- Apr 27, 2013
- International Journal on Software Tools for Technology Transfer
Testing concurrent programs is a challenging problem due to interleaving explosion: even for a fixed set of inputs, there is a huge number of concurrent runs that need to be tested to account for scheduler behavior. Testing all possible schedules is not practical. Consequently, most effective testing algorithms only test a select subset of runs. For example, limiting testing to runs that contain data races or atomicity violations has been shown to capture a large proportion of concurrency bugs. In this paper we present a general approach to concurrent program testing that is based on techniques from artificial intelligence (AI) automated planning. We propose a framework for predicting concurrent program runs that violate a collection of generic correctness specifications for concurrent programs, namely runs that contain data races, atomicity violations, or null-pointer dereferences. Our prediction is based on observing an arbitrary run of the program, and using information collected from this run to model the behavior of the program, and to predict new runs that contain bugs with one of the above noted violation patterns. We characterize the problem of predicting such new runs as an AI sequential planning problem with the temporally extended goal of achieving a particular violation pattern. In contrast to many state-of-the-art approaches, in our approach feasibility of the predicted runs is guaranteed and, therefore, all generated runs are fully usable for testing. Moreover, our planning-based approach has the merit that it can easily accommodate a variety of violation patterns which serve as the selection criteria for guiding search in the state space of concurrent runs. This is achieved by simply modifying the planning goal. We have implemented our approach using state-of-the-art AI planning techniques and tested it within the Penelope concurrent program testing framework [35]. 
Nevertheless, the approach is general and is amenable to a variety of program testing frameworks. Our experiments with a benchmark suite showed that our approach is very fast and highly effective, finding all known bugs.
- Conference Article
7
- 10.1109/icnsc.2008.4525469
- Apr 1, 2008
Internet-based concurrent programs, such as multiplayer online games in which players in a highly interactive domain behave unpredictably in space and time, offer more advantages but are difficult to test because of their non-deterministic behavior. One approach to testing concurrent programs is reachability testing. Lei and Carver did further research and proposed an algorithm for reachability testing. Lei's algorithm cannot ensure that a race variant is always feasible, because all the events that occur after the race receive event r can potentially be affected, which may induce exercising false SYN-sequences after the send partner of r of a synchronization pair is changed. To increase the feasibility of reachability testing, we can remove all the events that occur after the race receive event r after changing the send partner of r. The problem that we need to solve is how to remove them. We propose a feasible strategy to solve the problem. Our strategy uses vector timestamps to determine the happened-before relation between the race receive events of the synchronization pairs of the concurrent program. According to the happened-before relation, we change the send partners of more receive events in proper order. After we change the send partner of the race receive event r, we need to remove all the events that occur after r in the original execution. Our strategy can ensure the feasibility of race variants. The case study shows that our feasible strategy for reachability testing of concurrent programs can ensure the feasibility of the testing.
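Determining the happened-before relation from vector timestamps can be sketched with the standard vector-clock comparison. The Python below is a generic illustration, not the strategy from the abstract: the function names are hypothetical, and it assumes each event carries one counter per thread and that distinct events carry distinct timestamps.

```python
from typing import Sequence


def happened_before(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if the event stamped `a` happened before the event stamped `b`:
    every component of `a` is <= the matching component of `b`,
    and at least one component is strictly smaller."""
    if len(a) != len(b):
        raise ValueError("timestamps must have one entry per thread")
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if neither event happened before the other (and they differ)."""
    return (tuple(a) != tuple(b)
            and not happened_before(a, b)
            and not happened_before(b, a))


if __name__ == "__main__":
    print(happened_before([1, 0, 0], [2, 1, 0]))  # ordered events
    print(concurrent([2, 0, 0], [0, 1, 0]))       # unordered events
```

Two receive events whose timestamps are concurrent are not ordered by the execution, while ordered timestamps give the sequence in which events such as the race receive events above can be processed.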
- Research Article
14
- 10.1145/2528521.1508249
- Mar 1, 2009
- ACM SIGARCH Computer Architecture News
Multicore hardware is making concurrent programs pervasive. Unfortunately, concurrent programs are prone to bugs. Among different types of concurrency bugs, atomicity violation bugs are common and important. Existing techniques to detect atomicity violation bugs suffer from one limitation: requiring bugs to manifest during monitored runs, which is an open problem in concurrent program testing. This paper makes two contributions. First, it studies the interleaving characteristics of the common practice in concurrent program testing (i.e., running a program over and over) to understand why atomicity violation bugs are hard to expose. Second, it proposes CTrigger to effectively and efficiently expose atomicity violation bugs in large programs. CTrigger focuses on a special type of interleavings (i.e., unserializable interleavings) that are inherently correlated to atomicity violation bugs, and uses trace analysis to systematically identify (likely) feasible unserializable interleavings with low occurrence-probability. CTrigger then uses minimum execution perturbation to exercise low-probability interleavings and expose difficult-to-catch atomicity violations. We evaluate CTrigger with real-world atomicity violation bugs from four server/desktop applications (Apache, MySQL, Mozilla, and PBZIP2) and three SPLASH2 applications on 8-core machines. CTrigger efficiently exposes the tested bugs within 1 to 235 seconds, two to four orders of magnitude faster than stress testing. Without CTrigger, some of these bugs do not manifest even after 7 full days of stress testing. In addition, without deterministic replay support, once a bug is exposed, CTrigger can help programmers reliably reproduce it for diagnosis. Our tested bugs are reproduced by CTrigger mostly within 5 seconds, 300 to over 60000 times faster than stress testing.
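An unserializable interleaving can be illustrated on a single shared variable. The Python sketch below assumes the pattern table commonly used in the atomicity-violation literature: two consecutive accesses by one thread with one access from another thread in between, where four of the eight read/write combinations cannot be reordered into a serial execution. The names are hypothetical, and the sketch only classifies an already recorded trace. It does not predict feasible interleavings or perturb execution as CTrigger is described as doing.

```python
from typing import List, Tuple

# (first local access, interleaved remote access, second local access)
# Assumed table of combinations with no equivalent serial execution.
UNSERIALIZABLE = {
    ("R", "W", "R"),  # the two local reads see different values
    ("W", "W", "R"),  # the local read does not see the local write
    ("W", "R", "W"),  # the remote read sees an intermediate value
    ("R", "W", "W"),  # the remote write is overwritten based on a stale read
}


def is_unserializable(first_local: str, remote: str, second_local: str) -> bool:
    """Classify one local-remote-local access triple on a shared variable."""
    return (first_local, remote, second_local) in UNSERIALIZABLE


def find_unserializable(trace: List[Tuple[int, str]]) -> List[int]:
    """Return start indices of unserializable triples in a trace of
    (thread_id, "R" or "W") accesses to one variable, in execution order."""
    hits = []
    for i in range(len(trace) - 2):
        (t1, p), (t2, r), (t3, c) = trace[i:i + 3]
        if t1 == t3 and t1 != t2 and is_unserializable(p, r, c):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Thread 0 reads, thread 1 writes, thread 0 writes back a stale value.
    print(find_unserializable([(0, "R"), (1, "W"), (0, "W")]))
```

The read-write-write triple in the example is the lost-update pattern behind a non-atomic increment. A write-write-write triple, by contrast, is equivalent to running the remote write first and is not reported.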