Abstract

Failure recovery is a nontrivial property for current distributed systems. Autonomous failure recovery in a distributed system is the ability of the system to execute self-corrective action when an instance or a subset of the system becomes faulty. However, autonomous failure recovery in current large distributed systems is a complicated procedure and often difficult to implement. To achieve a high level of reliability and availability in current distributed environments, this paper presents an autonomous, self-configured fail-stop failure recovery model. The model utilizes the advantages of the distributed neighbor replica technique (NRT). The algorithm and the theoretical framework for autonomous failure recovery are illustrated. The paper proposes a resource manager for optimal resource selection. In the event of a resource failure, the resource manager autonomously selects a resource from among the faulty resource's neighbors and auto-reconfigures the system. This selection is based on certain reliability parameters or criteria. The paper also illustrates a prototype implementation of the model. The prototype demonstrates that the model is theoretically sound, with the ability to perform autonomous recovery smoothly by quickly reconfiguring its services upon detection of failure.
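The selection step described in the abstract can be pictured with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the reliability parameters or the reconfiguration procedure, so the single `reliability` score, the `Resource` and `ResourceManager` classes, and the "move services to the best live neighbor" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A node in the system (hypothetical structure, not from the paper)."""
    name: str
    reliability: float                              # assumed score in [0, 1]
    alive: bool = True
    neighbors: list = field(default_factory=list)   # names of neighbor nodes
    services: set = field(default_factory=set)      # services hosted here


class ResourceManager:
    """Picks a replacement among a failed resource's neighbors."""

    def __init__(self, resources):
        self.resources = {r.name: r for r in resources}

    def select_replacement(self, failed_name):
        # Only live neighbors of the failed resource are candidates.
        failed = self.resources[failed_name]
        candidates = [
            self.resources[n] for n in failed.neighbors
            if self.resources[n].alive
        ]
        if not candidates:
            return None
        # Assumed criterion: the highest reliability score wins.
        return max(candidates, key=lambda r: r.reliability)

    def recover(self, failed_name):
        # Fail-stop assumption: the failed resource simply stops serving.
        failed = self.resources[failed_name]
        failed.alive = False
        target = self.select_replacement(failed_name)
        if target is None:
            return None
        # Reconfigure: the chosen neighbor takes over the failed services.
        target.services |= failed.services
        failed.services = set()
        return target.name
```

With three resources in a line (`a`, `b`, `c`, where `b` neighbors both others), calling `recover("b")` would hand `b`'s services to whichever of `a` and `c` has the higher assumed reliability score. A real system would replace the single score with the reliability criteria the paper defines.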
