Owing to the advancement of the Internet of Things (IoT) and 5G mobile technologies, IoT devices produce massive amounts of data, which are usually transferred to nearby sites, such as edge nodes or datacenters. Many large-scale IoT applications need to analyze data distributed across multiple sites to obtain final results. A major challenge of this type of data analytics is the heterogeneity of resource capacities across geo-distributed sites. In this article, we find that both resource capacity and resource price differ among sites, and that price heterogeneity has a significant impact on geo-distributed IoT data analytics. Thus, each geo-distributed IoT data analytics job should minimize its execution cost while meeting its deadline under the resource constraints of the involved sites. Specifically, we propose to jointly consider the heterogeneity of both resource capacity and resource price, and to minimize the cost of each job subject to its deadline. We formulate this optimization problem as a quadratically constrained quadratic programming problem. To tackle this NP-hard problem, we propose MCGL, a method that minimizes the job completion cost before a given deadline. MCGL computes a task placement using a gradient adjustment strategy that exploits the strong negative correlation between job completion time and job completion cost in geo-distributed IoT data analytics jobs. The resulting task placement strategy optimizes resource cost subject to the deadline requirement of any geo-distributed data analytics job. Trace-driven evaluations indicate that MCGL significantly reduces the total cost compared with existing methods while still satisfying the deadline constraints.
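The trade-off between completion time and cost described above can be illustrated with a toy sketch. This is not the MCGL method or the article's quadratic formulation: it is a simple greedy placement under an assumed linear model in which each site has a fixed number of parallel slots, a fixed per-task duration, and a price per slot-second. The `Site` fields and the functions `place_tasks`, `job_cost`, and `completion_time` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    slots: int           # assumed capacity: tasks the site can run in parallel
    task_seconds: float  # assumed time to run one task on this site
    price: float         # assumed price per slot-second


def place_tasks(sites, num_tasks, deadline):
    """Fill the cheapest sites first, up to what each can finish by the deadline."""
    placement = {s.name: 0 for s in sites}
    remaining = num_tasks
    # Order sites by cost per task (price times task duration), cheapest first.
    for s in sorted(sites, key=lambda s: s.price * s.task_seconds):
        waves = int(deadline // s.task_seconds)  # full rounds of tasks that fit
        take = min(remaining, waves * s.slots)
        placement[s.name] = take
        remaining -= take
    if remaining > 0:
        raise ValueError("deadline cannot be met with the given capacities")
    return placement


def job_cost(sites, placement):
    """Total cost: every task pays its site's price for its duration."""
    return sum(placement[s.name] * s.price * s.task_seconds for s in sites)


def completion_time(sites, placement):
    """The job finishes when the slowest site finishes its last wave."""
    # -(-n // k) is integer ceiling division: number of waves needed.
    return max(-(-placement[s.name] // s.slots) * s.task_seconds for s in sites)


if __name__ == "__main__":
    sites = [Site("edge", 4, 10, 0.5), Site("datacenter", 16, 5, 2.0)]
    for deadline in (20, 60, 100):
        p = place_tasks(sites, 40, deadline)
        print(deadline, p, job_cost(sites, p), completion_time(sites, p))
```

In this toy setting, loosening the deadline lets more tasks run on the cheaper but slower site, so cost falls as the allowed completion time grows. That is the kind of negative correlation between completion time and cost that the abstract says MCGL exploits, although MCGL itself uses a gradient adjustment strategy rather than this greedy rule.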