Abstract

Centralized systems become less efficient, secure, and resilient as network size and heterogeneity increase, owing to their inherent single point of failure. Distributed consensus mechanisms, characterized by decentralization, autonomy, parallelism, and fault tolerance, can meet the growing safety and security demands of critical interconnected systems. This paper establishes a Node and Link probabilistic failure model, covering both node failures and communication link failures, for a representative crash-fault-tolerant distributed consensus protocol: RAFT. Analytical results are derived for the probability density function and the mean value of consensus reliability. Two reliability performance indicators, Reliability Gain and Tolerance Gain, are proposed to capture the linear relationship between consensus reliability and two basic parameters, namely the joint failure rate and the maximum number of tolerable faulty nodes; these indicators provide theoretical guidance for quickly deploying a RAFT system. The special case of a distributed consensus network that already has a certain number of failures is also evaluated, along with its adverse impact. The Markov probabilistic models, the definitions of Reliability Gain and Tolerance Gain, and the analysis methods proposed in this paper can be extended to other consensus mechanisms.
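
As a rough illustration of how consensus reliability depends on a failure rate and on the number of tolerable faulty nodes, the sketch below uses a simple binomial model. It is not the paper's Node and Link model: it assumes that each of `n` nodes fails independently with a single hypothetical probability `p` (standing in for a joint node-and-link failure rate), and that a crash-fault-tolerant majority protocol such as RAFT tolerates at most `(n - 1) // 2` failed nodes. The function names are illustrative only.

```python
from math import comb


def max_tolerable_faults(n: int) -> int:
    """Largest number of crashed nodes that still leaves a majority of n alive."""
    return (n - 1) // 2


def consensus_reliability(n: int, p: float) -> float:
    """Probability that at most max_tolerable_faults(n) of n nodes fail.

    Assumption (not from the paper): failures are independent and every node
    has the same failure probability p.
    """
    f = max_tolerable_faults(n)
    # Sum the binomial probabilities of exactly k failures for k = 0..f.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(f + 1))


if __name__ == "__main__":
    # Compare cluster sizes at a fixed, hypothetical failure probability.
    for n in (3, 5, 7):
        print(n, max_tolerable_faults(n), consensus_reliability(n, 0.1))
```

Under these assumptions, enlarging the cluster raises the number of tolerable faults, and the printed values show how reliability responds at a fixed failure probability.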
