Abstract

The volume of big data has grown explosively, placing demands on data centers that a single data center can no longer meet. Processing at this scale calls for a geographically distributed cloud system. Because the data centers in such a system sit in different geographic locations, however, data placement incurs greater overhead. To address this problem, this paper proposes a data placement strategy that jointly considers data transmission latency, bandwidth cost, cloud server storage capacity, and load capacity, and formulates data placement as a problem of minimizing the energy consumed by data transmission. A minimum set cover method based on Lagrangian relaxation is then used to solve this problem and obtain the optimal data placement scheme. In addition, in a geographically distributed cloud data center, straggler tasks delay the execution of jobs submitted by users. To address this second problem, this paper proposes a speculative execution strategy for the geographically distributed cloud system. The strategy chooses among different speculative execution operations according to the cluster load, calculates the load capacity of each node in the cluster, and performs speculative execution on the node with the strongest load capacity. Experimental results show that the proposed data placement strategy effectively reduces energy consumption, data storage cost, network transmission cost, and data transmission time, and that the proposed speculative execution strategy effectively improves job completion time, cluster throughput, and QoS satisfaction rate.
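To make the solution method named above more concrete, the following is a minimal sketch of Lagrangian relaxation applied to a generic weighted set cover problem. It is a textbook-style illustration, not the paper's formulation: the abstract does not give the actual cost model or constraints, so the mapping used here (elements are data items, each set is the group of items one candidate data center could host, and each cost is that center's placement cost) is an assumption. The function name `lagrangian_set_cover`, its parameters, the diminishing step rule, and the greedy repair step are likewise hypothetical choices. A heuristic of this kind returns a feasible cover together with a lower bound on the optimal cost; it does not by itself guarantee optimality.

```python
from typing import List, Set, Tuple


def lagrangian_set_cover(
    costs: List[float],
    sets: List[Set[int]],
    n_elements: int,
    iterations: int = 200,
    step: float = 1.0,
) -> Tuple[Set[int], float, float]:
    """Return (cover, cover_cost, lower_bound) for weighted set cover.

    Assumes elements are numbered 0..n_elements-1, every set contains only
    such elements, and every element appears in at least one set.
    """
    n_sets = len(sets)
    # One multiplier per element, started at the cheapest per-element cost
    # among the sets that contain it.
    u = [
        min(costs[j] / len(sets[j]) for j in range(n_sets) if i in sets[j])
        for i in range(n_elements)
    ]
    best_cover: Set[int] = set()
    best_cost = float("inf")
    best_bound = float("-inf")

    for k in range(iterations):
        # Relaxed problem: pick exactly the sets with negative reduced cost.
        reduced = [costs[j] - sum(u[i] for i in sets[j]) for j in range(n_sets)]
        chosen = {j for j in range(n_sets) if reduced[j] < 0}
        bound = sum(u) + sum(reduced[j] for j in chosen)
        best_bound = max(best_bound, bound)

        # Repair: add sets in order of reduced cost until all items are covered.
        cover = set(chosen)
        covered: Set[int] = set().union(*(sets[j] for j in cover))
        for j in sorted(range(n_sets), key=lambda j: reduced[j]):
            if len(covered) == n_elements:
                break
            if j not in cover and not sets[j] <= covered:
                cover.add(j)
                covered |= sets[j]
        # Prune: drop redundant sets, most expensive first.
        for j in sorted(cover, key=lambda j: -costs[j]):
            rest = set().union(*(sets[m] for m in cover if m != j))
            if len(rest) == n_elements:
                cover.remove(j)

        cost = sum(costs[j] for j in cover)
        if cost < best_cost:
            best_cover, best_cost = set(cover), cost

        # Subgradient step with a diminishing step size; multipliers stay >= 0.
        t = step / (k + 1)
        for i in range(n_elements):
            times_covered = sum(1 for j in chosen if i in sets[j])
            u[i] = max(0.0, u[i] + t * (1 - times_covered))

    return best_cover, best_cost, best_bound


# Hypothetical instance: 5 data items, 5 candidate data centers.
center_costs = [3.0, 2.0, 2.0, 6.0, 1.0]
center_items = [{0, 1, 2}, {2, 3}, {3, 4}, {0, 1, 2, 3, 4}, {0, 4}]
cover, cost, bound = lagrangian_set_cover(center_costs, center_items, 5)
print("chosen centers:", sorted(cover), "cost:", cost, "lower bound:", bound)
```

The gap between the returned cost and the lower bound indicates how far the heuristic solution could be from the optimum; a zero gap certifies optimality for that instance.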
