Abstract

Cloud computing has emerged as one of the most promising technologies for meeting rising computing needs. However, high-performance computing systems become more likely to fail as the number of components and servers grows, and the failure of a single sub-system can leave the entire system non-functional. Such faults can be tolerated with an efficient fault-tolerance method. Because cloud computing stores data on a remote network, system failures and congestion are the most common causes of faults. This paper presents a new fault-tolerant mechanism that adaptively monitors health to detect faults, handles faults using a migration technique, and avoids network congestion. Using the Ant Colony Optimization (ACO) algorithm and active clustering, the approach distributes load evenly across data centers. Simulation results indicate that our algorithm outperforms previous algorithms in total execution time and imbalance degree by up to 10% and 17%, respectively.
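
To make the imbalance-degree metric concrete, the following is a minimal sketch assuming the definition commonly used in cloud scheduling work, (T_max - T_min) / T_avg over per-VM completion times. The abstract does not state the formula the paper uses, so both this definition and the function name `imbalance_degree` are assumptions made for illustration.

```python
from statistics import mean


def imbalance_degree(completion_times):
    """Return (T_max - T_min) / T_avg for a list of per-VM completion times.

    Assumed definition: the abstract does not spell out its formula.
    A value of 0 means the load is perfectly balanced.
    """
    if not completion_times:
        raise ValueError("need at least one completion time")
    avg = mean(completion_times)
    if avg == 0:
        # No VM did any work, so there is nothing to be imbalanced.
        return 0.0
    return (max(completion_times) - min(completion_times)) / avg
```

Under this definition, a lower value indicates a more even distribution of load across virtual machines, which is the direction of improvement the abstract reports.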
