Network Function Virtualization (NFV) is an emerging solution for virtualizing network services that traditionally run on proprietary, dedicated devices. By running a chain of ordered Virtual Network Functions (VNFs) on commodity hardware, it can effectively reduce costs for big data processing service providers and improve Quality of Service (QoS). A fundamental and critical problem in big data processing with NFV is how to deploy the chained VNFs and dispatch the corresponding network flows to process big data traffic so that the service cost is minimized while QoS is guaranteed. In this paper, we study the problem of VNF deployment and flow scheduling in distributed data centers, jointly considering service requirements and resource capacity, and prove its NP-hardness through a reduction from the k-level uncapacitated facility location problem. We also devise a two-phase algorithm that first balances VNF resource requirements and then selects VNF locations; it has a low complexity of $O(M^2 \log_2 M)$ and an approximation ratio of $K + \rho$. Finally, extensive simulation experiments show that our algorithm can efficiently reduce the total cost of VNF deployment and flow communication in various scenarios.