BFPF-Cloud: Applying SVM for Byzantine Failure Prediction to Increase Availability and Failure Tolerance in Cloud Computing

Mahnaz Koorang Beheshti, Faramarz Safi-Esfahani

doi:10.1007/s42979-020-00299-5

Abstract

Automatic failure recovery is one of the most important aspects of distributed systems. In general, a system must be able to cope with any type of failure, yet one class is commonly overlooked when handling service failures: Byzantine failures, the worst kind of arbitrary failure. A client must be prepared for the worst case, in particular a server returning an answer that it should never give. Sometimes several servers even collude to produce false answers deliberately. Systems often have no plan to protect themselves against Byzantine failures, which arise when the processes do not all commit to the same decision. A server may return a response that it should not, with no direct way to detect that the response is incorrect. The complexity of such failures makes them a central problem in cloud computing systems. Several algorithms have been proposed that address the problem by inspecting responses. To detect a Byzantine failure in this way, the request must be executed, the responses must be produced and gathered, and the responses must then be compared with one another; a failure is identified when the processes are unable to reach consensus. In other words, the Byzantine failure must occur first, and only then is a remedy considered. In cloud computing, as a distributed infrastructure, the system should not be exposed to such severe failures. BFT-Cloud (Byzantine fault-tolerance cloud), the previous work in this area, guarantees the robustness of a system when up to f of a total of 3f + 1 resource providers are faulty. It uses replication to overcome failures, since a broad pool of nodes is available in the cloud. The drawback of this model is that a request must be executed several times to obtain a correct response, which increases both the number and the duration of executions, whereas ideally every response would be correct without repeating or re-executing the request. In this study, a framework called BFPF-Cloud is presented. It introduces several features for use in algorithms based on the support vector machine (SVM) to predict Byzantine failures. A reactive policy is applied together with a proactive one to handle Byzantine failures. The main goal is to maintain reliability alongside system availability. The experiments show that latency, PesNumber, MIPS, and the failure probability of the replicas are the best features for SVM-based prediction of Byzantine failures. In comparison to BFT-Cloud, the number of request re-executions and the execution time decrease by 69.91% on average, the number of repeated requests decreases by 69.78% on average, and throughput improves by 69.90% on average.
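To make the idea of SVM-based failure prediction concrete, the following is a minimal sketch of how the four replica features named in the abstract could feed a classifier. It is not the authors' implementation: it swaps in a plain linear soft-margin SVM trained by subgradient descent on the hinge loss, written in pure Python. The training rows, the min-max scaling, the hyperparameters, and every function name are assumptions made for illustration only, and PesNumber is assumed here to mean the number of processing elements of a replica.

```python
# Toy linear soft-margin SVM (hinge loss, subgradient descent), pure Python.
# Every number below is made-up illustration data, not a result from the paper.

# Each row: (latency_ms, pes_number, mips, failure_probability), label
# label +1 = replica assumed to have behaved Byzantine, -1 = behaved correctly
TRAIN = [
    ((20, 4, 2000, 0.05), -1),
    ((25, 4, 1800, 0.10), -1),
    ((30, 2, 1500, 0.08), -1),
    ((15, 8, 2500, 0.02), -1),
    ((120, 1, 500, 0.60), 1),
    ((150, 1, 400, 0.75), 1),
    ((100, 2, 600, 0.55), 1),
    ((180, 1, 300, 0.90), 1),
]


def fit_scaler(rows):
    """Return per-feature (min, max) bounds for min-max scaling."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, bounds):
    """Map each feature to roughly [0, 1] using the fitted bounds."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(row, bounds)]


def train_svm(samples, lr=0.1, lam=0.01, epochs=300):
    """Minimise lam/2 * |w|^2 + hinge loss with per-sample subgradient steps."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink the weights (gradient of the L2 regularisation term).
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Samples inside the margin or misclassified pull w and b toward them.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


bounds = fit_scaler([x for x, _ in TRAIN])
w, b = train_svm([(scale(x, bounds), y) for x, y in TRAIN])


def predict_replica(raw_features):
    """Return +1 if a Byzantine failure is predicted for the replica, else -1."""
    x = scale(raw_features, bounds)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical replicas: one slow and failure-prone, one fast and reliable.
print(predict_replica((160, 1, 350, 0.80)))
print(predict_replica((18, 4, 2200, 0.03)))
```

In a proactive policy of the kind the abstract describes, a replica predicted as +1 could be skipped when a request is dispatched, leaving replication and response comparison as the reactive fallback. A real deployment would more likely use an established SVM library and measured replica data rather than this hand-rolled trainer.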

Full Text