Model checking for distributed reaction systems with temporal-epistemic properties

Abstract

Reaction systems are a model of computation inspired by the biochemistry exhibited by living cells. This paper introduces the notion of agency as an extension to the reaction systems formalism, leading to distributed reaction systems. Adding agents to the reaction systems setting allows for the natural modelling and representation of multi-agent and distributed systems. To support the specification of temporal-epistemic properties of distributed reaction systems, we introduce the logic rsCTLK and present experimental results of its associated model checking procedure, run on a biological benchmark of within-cell signal transduction networks. The experimental results are encouraging despite the complexity of the rsCTLK model checking problem, which is shown to be PSPACE-complete.

Similar Papers
  • Research Article
  • Cite Count Icon 2
  • 10.1007/s11047-024-09974-5
Variants of distributed reaction systems
  • Feb 25, 2024
  • Natural Computing
  • Erzsébet Csuhaj-Varjú + 1 more

A distributed reaction system consists of a finite set of reaction systems that either interact with a common environment or interact with each other by communicating products or reactions. A reaction system is a well-known qualitative formal model of interactions between biochemical reactions. A reaction is a triplet of nonempty sets representing chemicals, called the set of reactants, the set of inhibitors, and the set of products. A reaction corresponds to a chemical reaction performed on a set of chemicals, and a reaction system is a finite nonempty set of reactions. In this paper, we examine two variants of distributed reaction systems. We introduce the notion of a distributed reaction system with communication by request (a qDRS for short), where sets of products are communicated between the component reaction systems by queries. First, we show that every qDRS can be represented by a reaction system. After that we compare distributed reaction systems with communication by request to extended distributed reaction systems (EDRSs), models that were introduced in a previous paper. We prove that extended distributed reaction systems, where a context automaton provides input for the component reaction systems, simulate distributed reaction systems with communication by request and distributed reaction systems with communication by request simulate special variants of extended distributed reaction systems. Furthermore, we assign languages to these two variants of distributed reaction systems. We prove that the class of agreement languages of extended distributed reaction systems is equal to the class of languages of nondeterministic multihead finite automata and the agreement language of every distributed reaction system with communication by request is an element of a certain subregular language class.
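The definition of a reaction as a triplet (reactants, inhibitors, products) can be illustrated with a minimal sketch. It assumes the standard semantics from the reaction systems literature: a reaction is enabled on a state when all its reactants are present and none of its inhibitors are, and the next state is the union of the products of all enabled reactions. The names `Reaction`, `enabled`, and `result`, and the entities `a`, `b`, `c`, are hypothetical and not taken from the paper.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Reaction(NamedTuple):
    """A reaction as a triplet of sets (hypothetical helper type)."""
    reactants: FrozenSet[str]
    inhibitors: FrozenSet[str]
    products: FrozenSet[str]


def enabled(reaction: Reaction, state: FrozenSet[str]) -> bool:
    # Enabled when every reactant is present and no inhibitor is present.
    return reaction.reactants <= state and not (reaction.inhibitors & state)


def result(reactions: Iterable[Reaction], state: FrozenSet[str]) -> FrozenSet[str]:
    # The next state is the union of products of all enabled reactions;
    # entities not produced in this step do not persist.
    produced = set()
    for reaction in reactions:
        if enabled(reaction, state):
            produced |= reaction.products
    return frozenset(produced)


# Illustrative system: "a" yields "c" unless "b" is present,
# and "c" yields "b" unless "a" is present.
reactions = [
    Reaction(frozenset({"a"}), frozenset({"b"}), frozenset({"c"})),
    Reaction(frozenset({"c"}), frozenset({"a"}), frozenset({"b"})),
]

print(sorted(result(reactions, frozenset({"a"}))))
print(sorted(result(reactions, frozenset({"a", "b"}))))
```

In the second call the inhibitor `b` blocks the first reaction and the second lacks its reactant `c`, so the resulting state is empty. The communication mechanisms of qDRSs and EDRSs are not modelled here.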

  • Research Article
  • Cite Count Icon 1
  • 10.5075/epfl-thesis-4858
Model Checking of Distributed Algorithm Implementations
  • Jan 1, 2011
  • Maysam Yabandeh

It is notoriously difficult to develop reliable, high-performance distributed systems that run over asynchronous networks. Even if a distributed system is based on a well-understood distributed algorithm, its implementation can contain errors arising from complexities of realistic distributed environments or simply coding errors. Many of these errors can only manifest after the system has been running for a long time, has developed a complex topology, and has experienced a particular sequence of low-probability events such as node resets. Model checking or systematic state space exploration, which has been used for testing of centralized systems, is also not effective for testing of distributed applications. The aim of these techniques is to exhaustively explore all the reachable states and verify some user-specified invariants on them. Although effective for small software systems, for more complex systems such as distributed systems the exponential increase in the number of explored states manifests itself as a problem at the very early stages of the search. This phenomenon, also known as the exponential state space explosion problem, prevents the model checker from reaching the potentially erroneous states at deeper levels in a realistic time frame. This thesis proposes Dervish, a new approach to testing that makes use of a model checker in parallel with the running distributed system. Before the model checker's performance is hampered by the exponential explosion problem, the model checker restarts from the current live state of the system instead of the initial state. The continuously running model checker at each node predicts possible future inconsistencies before they actually manifest. This approach not only helps in testing by checking more relevant states that could occur in a real run, but also enables the application to steer the execution away from the predicted inconsistencies. 
We identified new bugs in mature Mace implementations of RandTree, Bullet', Paxos, and Chord distributed systems. Furthermore, we show that if the bug is not corrected during system development, Dervish is effective in steering the execution away from the inconsistent states at runtime. To be feasible in practice, the state exploration algorithm in Dervish should be efficient enough to explore some useful states in the period between each two restarts. Our default implementation of this approach benefits from a new search heuristic effective for distributed algorithms with short communications, termed consequence prediction, which selectively explores future event chains of the system. For consensus algorithms, however, which are known to be one of the most complex of distributed algorithms, the exploration algorithms built upon principles of model checking centralized systems are not scalable enough to be installed in Dervish. Those approaches reduce the problem of model checking distributed systems to that of centralized systems, by using the global state, which also includes the network state, as the model checking state. This thesis introduces LMC, a novel model checking algorithm designed specifically for distributed algorithms. The key insight in LMC is to treat the local nodes' states separately, instead of keeping track of the global states. We show how Dervish equipped with LMC enables us to find bugs in some complex consensus algorithms, including PaxosInside, the first consensus algorithm proposed and implemented for manycore environments. A modern manycore architecture can be viewed as a distributed system with explicit message passing to communicate between cores. Yet, doing this efficiently is very challenging given the non-uniform latency in inter-core communication and the unpredicted core response time. This thesis explores, for the first time, the feasibility of implementing a (non-blocking) consensus algorithm in a manycore system. 
We present PaxosInside, a new consensus algorithm that takes up the challenges of manycore environments, such as limited bandwidth of interconnect network as well as the consensus leader. A unique characteristic of PaxosInside is the use of a single acceptor role in steady state, which in our context, significantly reduces the number of exchanged messages between replicas.

  • Research Article
  • Cite Count Icon 14
  • 10.6100/ir716364
Formal modeling and verification of distributed failure detectors
  • Jan 1, 2011
  • Muhammad Atif

Model checking is a systematic way of checking the absence of errors in a distributed system, i.e., assessing the functional requirements in a distributed system. However, there are certain challenges in this field, e.g., developing true abstract models and on their basis generalizing/guaranteeing results, limited capacity of model checking tools and computational resources, identification of all requirements and their accurate specifications, etc. To understand and face such challenges, it is necessary to apply the prominent model checking techniques to different distributed systems designed for different communication models. This thesis takes up that challenge and discusses and addresses the issues encountered. The results reported are sufficient for advocating the need for applying model checking techniques for debugging. We therefore report bugs and propose fixes; where an algorithm is ambiguous, we reconstruct it. We model check both fixed and reconstructed algorithms. We assess the following protocols: accelerated heartbeat protocols, consensus protocols in asynchronous distributed systems, group membership protocols, and efficient algorithms to implement failure detectors in partially synchronous systems. We found that the accelerated heartbeat protocols proposed in [M.G. Gouda and T.M. McGuire, Accelerated Heartbeat Protocols, Proc. of ICDCS’98] violated some natural and essential properties. We proved the results by giving counterexamples, developed techniques to address time-triggered events in mCRL2, and investigated the correct time bounds for all the protocols. Regarding the consensus problem, we proved the correctness of the proposed algorithms where the failure detectors are unreliable (i.e., failure detectors may make mistakes). These algorithms are proposed in [T. Deepak Chandra and S. Toueg, Unreliable Failure Detectors for Reliable Distributed Systems, J. ACM, 1996]. 
For the group membership protocols proposed in [Y. Amir, D. Dolev, S. Kramer and D. Malki, Membership Algorithms for Multicast Communication Groups, Springer-Verlag, 1992], we found that the original specifications and the text explaining the protocols can be interpreted in different ways, and that even some natural interpretations contradict each other. Our formalization with respect to different interpretations showed the violation of claimed properties. To resolve the ambiguities, we reconstructed the protocols and model-checked them. For analyzing the algorithms proposed in [M. Larrea, S. Arevalo and A. Fernández, Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems, Proc. of DISC’99], we applied symmetry reduction techniques. We found that every algorithm encounters a deadlock if there is a bounded (yet arbitrarily large) buffer in the communication channel between a pair of nodes. We propose fixes for deadlock avoidance and model check the proposed algorithm in UPPAAL, FDR2 and mCRL2. We also present a comparison of these three tools for model checking one of the given four protocols.

  • Research Article
  • Cite Count Icon 2
  • 10.1142/s0129054123470044
Relating Various Types of Distributed Reaction Systems
  • Oct 18, 2023
  • International Journal of Foundations of Computer Science
  • Bogdan Aman

A distributed reaction system models a system composed of several reaction systems. Each reaction system has its own set of reactions, while the background set is the same for all reaction systems. At each transition of the distributed reaction system, the environment provides an arbitrary context containing symbols for each reaction system and also it specifies which reaction systems are active. On the other hand, a distributed communicating reaction system with direct communication models a system composed of several reaction systems that are able to communicate products or reactions, while the environment provides a context similar to that for distributed reaction systems. In this paper, it is proved that these distributed variants of reaction systems can be related by establishing translations of distributed reaction systems into distributed communicating reaction systems with direct communication and the other way round.

  • Research Article
  • 10.56415/csjm.v32.17
Generalized Distributed Reaction Systems
  • Nov 1, 2024
  • Computer Science Journal of Moldova
  • Artiom Alhazov + 2 more

In this paper, we introduce the notion of a generalized distributed reaction system with computations following the concept of the original reaction system: the resulting products in the individual components are obtained by applying rules which take into account the objects in the components of the system as reactants and inhibitors and yield results in specified components of the system. As specific variants, we investigate (i) generalized distributed reaction systems which look at all components for the presence or absence of objects, but the resulting products are only produced in the component the rule is assigned to as well as (ii) generalized distributed reaction systems which look for the presence or absence of objects only in the component the rule is assigned to, but the resulting products can be sent to specified components within the whole system. We first show how all these variants of generalized distributed reaction systems can be flattened to a reaction system having only one component. Moreover, we show how each of these two variants, which are restricted variants of the general model, can simulate even the general model. Finally, we prove that all these variants of generalized distributed reaction systems working with the standard, total parallel application of rules can be transferred into a usual reaction system working with the sequential application of rules.

  • Research Article
  • 10.6082/m1gf0rmt
Unearthing Concurrency and Scalability Bugs in Cloud-Scale Distributed Systems
  • Jan 1, 2017
  • Tanakorn Leesatapornwongsa

In the era of cloud computing, users move their data and computation from local machines to the cloud, and thus the services are expected to be dependable 24/7. Cloud services must be accessible anytime and anywhere, must not lose or corrupt users' data, and must scale as the user base continues to grow. Unfortunately, guaranteeing cloud services’ dependability is challenging because these cloud services are backed by large sophisticated distributed systems such as scalable data stores, data-parallel frameworks, and cluster management systems. Such cloud-scale distributed systems remain difficult to get right because they need to address data races among nodes, complex failures in commodity hardware, tremendous user requests, and much more. Addressing these cloud-specific challenges makes the systems more complex, and new intricate bugs continue to create dependability problems. This dissertation tries to answer a vital question of cloud dependability: “how can we make cloud-scale distributed systems more dependable?” We try to answer this question by focusing on the problems of distributed concurrency bugs and scalability bugs. We focus on these two problems because they are novel issues that occur only in cloud-scale environments and few works address them. The distributed concurrency bug (DC bug) is one unsolved reliability problem in cloud systems. DC bugs are caused by the non-deterministic order of distributed events such as message arrivals, machine crashes, and reboots. Cloud systems execute multiple complicated distributed protocols concurrently. The possible interleavings of the distributed events are beyond developers’ anticipation, and some interleavings might not be handled properly, which can lead to catastrophic failures. To combat DC bugs, we make two contributions. First, we conduct a formal study on DC bugs to gain foundational knowledge for DC-bug combating research. 
We study 104 DC bugs from various widely-deployed cloud-scale distributed systems in many characteristics along several axes of analysis such as the triggering timing condition, input preconditions, error and failure symptoms, and fix strategies. We present the first complete taxonomy of DC bugs, TaxDC, along with many findings on DC bugs that can guide future research. Second, we advance the state of the art of distributed system model checking by introducing “semantic-aware model checking” (SAMC). Distributed system model checkers (dmck) are used to test the reliability of real systems. Existing dmcks, however, rarely exercise multiple faults due to the state-space explosion problem, and thus do not address present reliability challenges of cloud systems in dealing with complex faults. SAMC pushes the boundary of dmcks by introducing a white-box principle that takes simple semantic information of the target system and incorporates that knowledge into state-space reduction policies. We show that SAMC can find deep bugs one to two orders of magnitude faster compared to state-of-the-art techniques. For the second aspect of system dependability, we focus on scalability bugs. Scale surpasses the limit of a single machine in meeting users’ increasing demands for computing and storage. On the negative side, scale creates new development and deployment issues. Developers must ensure that their algorithms and protocol designs are scalable. However, until real deployment takes place, unexpected bugs in the actual implementations are unforeseen. 
This new era of cloud-scale distributed systems has given birth to “scalability bugs”, latent bugs that are scale-dependent and only surface at large scale. To address scalability bugs, we conduct a study on scalability bugs to understand how they manifest and what their root causes are, and introduce SCK, a methodology that enables developers to scale-check distributed systems and find scalability bugs economically on one machine. SCK helps developers identify potentially buggy code and allows developers to colocate a large number of nodes to test the potentially buggy code without sacrificing accuracy. We remove the problem of hardware contention (i.e., CPU, memory, and thread) with four novel strategies, and we successfully integrate SCK into Cassandra, Riak, and Voldemort. With SCK, we achieve a high colocation factor (500 nodes), and can reproduce six scalability bugs and identify two new hidden bugs.

  • Book Chapter
  • Cite Count Icon 21
  • 10.1007/11560647_4
Stochastic Analysis of Graph Transformation Systems: A Case Study in P2P Networks
  • Jan 1, 2005
  • Reiko Heckel

In distributed and mobile systems with volatile bandwidth and fragile connectivity, non-functional aspects like performance and reliability become more and more important. To formalise, measure, and predict these properties, stochastic methods are required. At the same time, such systems are characterised by a high degree of architectural reconfiguration. Viewing the architecture of a distributed system as a graph, this is naturally modelled by graph transformations. To address these two concerns, stochastic graph transformation systems have been introduced, associating with each rule its application rate, i.e., the rate of the exponential distribution governing the delay of its application. Deriving continuous-time Markov chains, Continuous Stochastic Logic is used to specify reliability properties and verify them through model checking. In particular, we study a protocol for the reconfiguration of P2P networks intended to improve their reliability by adding redundant connections. The modelling of this protocol as a (stochastic) graph transformation system takes advantage of negative application conditions and path expressions. The ensuing high-level style of specification helps to reduce the number of states and increases the capabilities for automated analysis.
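The idea of attaching an application rate to each rule can be sketched as a race between exponentially distributed delays: every enabled rule samples a delay from its own exponential distribution, and the rule with the smallest delay fires. This is a minimal illustration of that standard continuous-time Markov chain interpretation, not the chapter's tooling; the rule names `add_link` and `drop_peer` and their rates are hypothetical.

```python
import random
from typing import Dict, Tuple


def next_rule(rates: Dict[str, float], rng: random.Random) -> Tuple[str, float]:
    # Each enabled rule samples an exponential delay with its own rate;
    # the rule with the smallest sampled delay is applied next.
    delays = {rule: rng.expovariate(rate) for rule, rate in rates.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]


def win_probabilities(rates: Dict[str, float]) -> Dict[str, float]:
    # For competing exponential delays, a rule wins the race with
    # probability rate / (sum of all rates).
    total = sum(rates.values())
    return {rule: rate / total for rule, rate in rates.items()}


# Hypothetical rules of a P2P reconfiguration model, with assumed rates.
rates = {"add_link": 3.0, "drop_peer": 1.0}

rule, delay = next_rule(rates, random.Random(0))
print(rule, delay)
print(win_probabilities(rates))
```

The closed-form probabilities are what a derived Markov chain would use as branching probabilities, while the sampling function corresponds to simulating a single step.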

  • Research Article
  • Cite Count Icon 38
  • 10.1016/j.ins.2015.03.048
Model checking temporal properties of reaction systems
  • Mar 28, 2015
  • Information Sciences
  • Artur Męski + 2 more


  • Research Article
  • 10.7717/peerj-cs.2995
Formal modeling of a causal consistent distributed system and verification of its history via model checking using colored Petri net
  • Jul 7, 2025
  • PeerJ Computer Science
  • Khalid Amjed Mohammed Alsaegg + 2 more

Various consistency models for replicated distributed systems (DSs) have been developed and are usually implemented in the middleware layer. Causal consistency (CC) is a widely used consistency model appropriate for distributed applications like discussion groups and forums. One of the known distributed algorithms for CC is based on logical time synchronization with Fidge vector clocks that use the concepts of the hold-back and delivery queues for each replica. The basics of the algorithm and its assumptions are presented in the article. Then, a novel formal hierarchical colored Petri net model of a DS with CC support and three constituting replicas is presented. The proposed model operates based on the presented distributed algorithm for CC support with potential randomness for delays in message delivery. The article tries to answer the question: is a given distributed history (DH) a valid image of a causal-consistent distributed system (CCDS)? The proposed model validates a DH via model checking. The question is answered by the execution of the proposed model and the generation of its state space graph (SSG). Required model checking functions are developed for automatically analyzing SSG for (1) extracting the existence of the answer and (2) extraction of the shortest proof scenarios that can generate the given input DH. The model was used to analyze four case study examples. The article presents three effective techniques for decreasing the state space explosion problem. Results show that the colored Petri net model of a CCDS can automatically validate a DH using model checking.
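The role of the hold-back queue in vector-clock-based causal delivery can be sketched as follows. This is the textbook causal-broadcast delivery condition, assumed here for illustration and not taken from the article's Petri net model: a message from sender `j` with timestamp `V` is deliverable at a replica with clock `C` when `V[j] == C[j] + 1` and `V[k] <= C[k]` for every other `k`. The class name `Replica` and the message payloads are hypothetical.

```python
from typing import List, Tuple

Message = Tuple[int, List[int], str]  # (sender index, vector timestamp, payload)


class Replica:
    """A replica that delivers messages in causal order (illustrative sketch)."""

    def __init__(self, n: int) -> None:
        self.clock = [0] * n                 # messages delivered so far, per sender
        self.hold_back: List[Message] = []   # received but not yet deliverable
        self.delivered: List[str] = []       # delivery queue, in causal order

    def _deliverable(self, sender: int, stamp: List[int]) -> bool:
        # Next message expected from the sender, and no missing causal
        # predecessors from any other replica.
        return stamp[sender] == self.clock[sender] + 1 and all(
            stamp[k] <= self.clock[k] for k in range(len(stamp)) if k != sender
        )

    def receive(self, sender: int, stamp: List[int], payload: str) -> None:
        self.hold_back.append((sender, stamp, payload))
        progress = True
        while progress:  # keep draining until nothing more becomes deliverable
            progress = False
            for msg in list(self.hold_back):
                s, st, p = msg
                if self._deliverable(s, st):
                    self.hold_back.remove(msg)
                    self.clock[s] = st[s]
                    self.delivered.append(p)
                    progress = True


replica = Replica(3)
# The reply arrives before the post it causally depends on, so it is held back.
replica.receive(1, [1, 1, 0], "reply")
replica.receive(0, [1, 0, 0], "post")
print(replica.delivered)
```

After the second call the post is delivered first and the held-back reply follows, which is the ordering a causally consistent history must respect.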

  • Research Article
  • Cite Count Icon 10
  • 10.1016/j.ces.2017.05.051
Generalization of the concept of extents to distributed reaction systems
  • May 31, 2017
  • Chemical Engineering Science
  • D Rodrigues + 2 more


  • Research Article
  • Cite Count Icon 1
  • 10.1142/s0129054123460024
Language Classes of Extended Distributed Reaction Systems
  • Jul 4, 2023
  • International Journal of Foundations of Computer Science
  • Lucie Ciencialová + 2 more

Reaction systems are well-known formal models of interactions between biochemical reactions. A reaction system is a finite set of triples (reactants, inhibitors, products) that represent chemical reactions, where the reactants, the inhibitors, and the products are objects corresponding to the chemicals. The reactions may facilitate or inhibit each other. A distributed reaction system consists of a finite set of reaction systems that interact with their environment (function in a given context). The environment is a finite set of reactants provided by a context automaton. In the preceding paper, we studied distributed reaction systems where in each step, the context automaton provided a separate set of reactants to the component reaction systems. We assigned languages to these distributed reaction systems and provided representations of some well-known language classes by these constructs. In this paper, the context is provided for the whole distributed reaction system and the component reaction systems distribute the context among each other in different ways (the same context is valid for each component, or the context is split among the components). As in the preceding paper, we assign languages to these new types of distributed reaction systems and provide representations of well-known language classes (the class of right-linear simple matrix languages, the recursively enumerable language class).

  • Research Article
  • 10.3390/electronics13061153
VConMC: Enabling Consistency Verification for Distributed Systems Using Implementation-Level Model Checkers and Consistency Oracles
  • Mar 21, 2024
  • Electronics
  • Beom-Heyn Kim

Many cloud services are relying on distributed key-value stores such as ZooKeeper, Cassandra, HBase, etc. However, distributed key-value stores are notoriously difficult to design and implement without any mistakes. Because data consistency is the contract for clients that defines what the correct values to read are for a given history of operations under a specific consistency model, consistency violations can confuse client applications by showing invalid values. As a result, serious consequences such as data loss, data corruption, and unexpected behavior of client applications can occur. Software bugs are one of main reasons why consistency violations may occur. Formal verification techniques may be used to make designs correct and minimize the risks of having bugs in the implementation. However, formal verification is not a panacea due to limitations such as the cost of verification, inability to verify existing implementations, and human errors involved. Implementation-level model checking has been heavily explored by researchers for the past decades to formally verify whether the underlying implementation of distributed systems have bugs or not. Nevertheless, previous proposals are limited because their invariant checking is not versatile enough to check for the wide spectrum of consistency models, from eventual consistency to strong consistency. In this work, consistency oracles are employed for consistency invariant checking that can be used by implementation-level model checkers to formally verify data consistency model implementations of distributed key-value stores. To integrate consistency oracles with implementation-level distributed system model checkers, the partial-order information obtained via API is leveraged to avoid the exhaustive search during consistency invariant checking. 
Our evaluation results show that, by using the proposed method for consistency invariant checking, our prototype model checker, VConMC, can detect consistency violations caused by several real-world software bugs in a well-known distributed key-value store, ZooKeeper.

  • Conference Article
  • 10.1109/icinfa.2014.6932622
Improving software model checking on program backbone within distributed system
  • Jul 1, 2014
  • Jiawei Yong + 4 more

Model checking technique currently has been applied to a wide range of problem domains. Among them, verifying the reliability of software systems becomes much more significant. However, as to software with complex structure and large scale, the verification process suffers from the state space explosion, thus leading to the resource exhaustion and low efficiency. In this paper, we propose a method of improving software model checking in both foreground and background of ANSI-C source program to verify the properties. In the foreground stage, we directly dispose of program itself by pruning the program with respect to the assertion property and compressing the circular paths to extract the program backbone. Subsequently, the program backbone is used to generate a simple CTL automaton model which will be applied afterwards. In the background stage, we redesign the CTL state automaton's data structure and improve the model checking algorithm to adapt the MapReduce framework in distributed system. The set of states which are satisfied with CTL property is output and checked for satisfiability based on the CTL automaton model. The example in each part illustrates the validity of the whole method, and the experiments show this method improves the efficiency of program verification substantially.
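Computing the set of states that satisfy a CTL property can be sketched with the standard fixpoint characterisation of the EF operator ("some path eventually reaches a goal state"). This is a minimal sequential, explicit-state illustration under that textbook definition; it is not the paper's MapReduce-based algorithm, and the function names and the small transition system are hypothetical.

```python
from typing import Dict, Set


def pre_exists(transitions: Dict[int, Set[int]], targets: Set[int]) -> Set[int]:
    # States with at least one successor inside `targets` (the EX step).
    return {s for s, succs in transitions.items() if succs & targets}


def check_ef(transitions: Dict[int, Set[int]], goal: Set[int]) -> Set[int]:
    # Least fixpoint: start from the goal states and repeatedly add
    # every state that can reach the current set in one step.
    sat = set(goal)
    while True:
        new = sat | pre_exists(transitions, sat)
        if new == sat:
            return sat
        sat = new


# Hypothetical transition system: 0 -> 1 -> 2 -> 2, and an isolated loop on 3.
transitions = {0: {1}, 1: {2}, 2: {2}, 3: {3}}

print(sorted(check_ef(transitions, {2})))
```

State 3 is excluded because its only successor is itself, so no path from it reaches the goal.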

  • Research Article
  • Cite Count Icon 1
  • 10.31577/cai_2019_5_1009
A More Faithful Formal Definition of the Desired Property for Distributed Snapshot Algorithms to Model Check the Property
  • Jan 1, 2019
  • Computing and Informatics
  • Ha Thi Thu Doan + 1 more

The first distributed snapshot algorithm was invented by Chandy and Lamport: Chandy-Lamport distributed snapshot algorithm (CLDSA). Distributed snapshot algorithms are crucial components to make distributed systems fault tolerant. Such algorithms are extremely important because many modern key software systems are in the form of distributed systems and should be fault tolerant. There are at least two desired properties such algorithms should satisfy: 1) the distributed snapshot reachability property (called the DSR property) and 2) the ability to run concurrently with, but not alter, an underlying distributed system (UDS). This paper identifies subtle errors in a paper on formalization of the DSR property and shows how to correct them. We give a more faithful formal definition of the DSR property; the definition involves two state machines - one state machine M_UDS that formalizes a UDS and the other M_CLDSA that formalizes the UDS on which CLDSA is superimposed (UDS-CLDSA) - and can be used to more precise model checking of the DSR property for CLDSA. We also prove a theorem on equivalence of our new definition and an existing one that only involves M_CLDSA to guarantee the validity of the existing model checking approach. Moreover, we prove the second property, namely that CLDSA does not alter the behaviors of UDS.

  • Book Chapter
  • Cite Count Icon 1
  • 10.1007/978-3-031-13502-6_5
Languages of Distributed Reaction Systems
  • Jan 1, 2022
  • Lucie Ciencialová + 2 more

Reaction systems are a formal model of interactions between biochemical reactions. The motivation for the concept of a reaction system was to model the behavior of biological systems in which a large number of individual reactions interact with each other. A reaction system consists of a finite set of objects that represent chemicals and a finite set of triplets (reactants, inhibitors, products) that represent chemical reactions; the reactions may facilitate or inhibit each other. An extension of the concept of the reaction system is the distributed reaction system, a model inspired by multi-agent systems in which agents (represented by reaction systems) interact with their environment (context provided by a context automaton). In this paper, we assign languages to distributed reaction systems and provide representations of some well-known language classes by these systems.
Keywords: reaction systems, distributed reaction systems, right-linear simple matrix language, recursively enumerable language
