Developing reliable distributed systems poses many challenges, including concurrency, failure handling, and scalability. These challenges stem from the non-deterministic scheduling of threads within processes and from the communication between processes. Formal verification methods, such as model checking, have been used to ensure the reliability of safety-critical systems. Model checking systematically explores the complete behavior of the system under test (SUT), investigating each reachable state under different thread schedules. Recent software model-checking tools have been applied to distributed systems using two techniques, caching and centralization. The caching technique can check only one process at a time, while the centralization technique verifies all processes simultaneously. In the centralization technique, two "ArrayByteQueue" buffers store communication and process data byte by byte. During backtracking, however, the read and write operations on these buffers, which insert data into and remove data from the queue, become resource-intensive. As a consequence, existing interprocess communication (IPC) models run into computational limits and suffer rapid state space explosion. To address these challenges, we propose remodeling the IPC models by introducing a request and response tree structure to store communication data, and by using pointers to navigate through that data during backtracking. In experimental evaluations, these implementation choices showed significantly improved performance across various metrics. The request and response tree makes storing communication data more efficient, while the pointers make navigation during backtracking cheaper. This remodeling of IPC models shows promise in mitigating computational limitations and state space explosion, thereby improving the model-checking process for distributed systems.
Our research contributes to advancing the field of model checking in distributed systems and offers potential solutions to the challenges associated with resource-intensive read and write operations during the model-checking process.
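The idea of replacing queue insertions and removals with pointer movement can be illustrated with a minimal sketch. This is not the authors' implementation: the names `MessageNode`, `Channel`, `snapshot`, and `restore` are hypothetical, and the sketch assumes one such tree per direction (one instance for requests, one for responses). Messages are stored once in an append-only tree, and a saved state consists only of two pointers into it, so backtracking restores the pointers instead of copying bytes back into a queue.

```python
from typing import List, Optional, Tuple


class MessageNode:
    """One stored message (hypothetical structure, for illustration only)."""

    def __init__(self, payload: bytes, parent: Optional["MessageNode"] = None) -> None:
        self.payload = payload
        self.parent = parent
        self.children: List["MessageNode"] = []

    def child(self, payload: bytes) -> "MessageNode":
        # Reuse an existing branch when the same message was already sent
        # from this point, so re-exploring a path stores nothing new.
        for node in self.children:
            if node.payload == payload:
                return node
        node = MessageNode(payload, parent=self)
        self.children.append(node)
        return node


class Channel:
    """Append-only message tree with a write pointer and a read pointer."""

    def __init__(self) -> None:
        self.root = MessageNode(b"")
        self.write_pos = self.root  # last message written
        self.read_pos = self.root   # last message consumed

    def send(self, payload: bytes) -> None:
        self.write_pos = self.write_pos.child(payload)

    def receive(self) -> Optional[bytes]:
        if self.read_pos is self.write_pos:
            return None  # nothing unread
        # Assumes read_pos is an ancestor of write_pos; walk up from the
        # write pointer to find the next unread node on the current path.
        node = self.write_pos
        while node.parent is not self.read_pos:
            node = node.parent
        self.read_pos = node
        return node.payload

    def snapshot(self) -> Tuple[MessageNode, MessageNode]:
        # A saved state is just two references, regardless of message size.
        return (self.write_pos, self.read_pos)

    def restore(self, snap: Tuple[MessageNode, MessageNode]) -> None:
        # Backtracking moves the pointers; no data is removed or re-inserted.
        self.write_pos, self.read_pos = snap
```

In this sketch a model checker would call `snapshot()` when it stores a state and `restore()` when it backtracks to that state. The walk in `receive` costs time proportional to the number of unread messages, which is a simplification chosen to keep the example short.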