Abstract

Data center networks (DCNs) are the backbone of many emerging applications, from smart connected homes to smart traffic control, and are continuously evolving to meet the diverse and ever-increasing computing requirements of these applications. Data centers often have tens of thousands of components, such as servers and switches/routers, that work together to serve these applications. Managing such large data centers is a tedious process that demands automation, intelligent control, and decision making within the data center. Recently, both industry and academia have focused on bringing intelligence to the control, automation, and management of DCNs. Despite the variety of works that have surveyed machine learning (ML) for networking, to the best of our knowledge none has focused on DCNs, which makes this survey original. Readers in both the academic and industrial communities will benefit from a comprehensive discussion of the ML solutions applied in DCNs to address critical problems, including workload forecasting, traffic flow control, traffic classification and scheduling, topology management, network state prediction, root cause analysis, and network security. Furthermore, this article outlines the challenges and concludes with future research avenues for adopting ML in automatic, intelligent, and autonomous DCNs.

Highlights

  • Data center network (DCN) hosts multi-tenant and multi-objective applications with ever-growing compute and communication requirements

  • Undeniably, future DCNs will have to support the explosive growth in traffic volume driven by smart connected devices and by multi-tenant applications and services, with capabilities that meet their needs

  • Over the past decades, machine learning (ML) has proved its capability in different domains, including networking

Summary

INTRODUCTION

Data center networks (DCNs) host multi-tenant and multi-objective applications with ever-growing compute and communication requirements. The traffic matrices inside a data center change rapidly and unpredictably and are highly divergent. This is a key challenge that complicates network performance optimization and capacity planning. Such characteristics also complicate flow scheduling over a shared link: the scheduling process must achieve fast completion times, to reduce communication delays and improve application responsiveness, while simultaneously taking into account the different requirements of flows. Although a DCN provides links with high bandwidth and low transmission delays, its switches have small buffers, and it supports many-to-one communication patterns in which a large number of incoming flows can be transmitted simultaneously to a single end-point. If not appropriately and proactively controlled, this in turn overloads the switch buffer, leading to congestion, packet loss, higher latency, and reduced throughput.
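
The many-to-one pattern described above can be illustrated with a toy model. The following is only a minimal sketch under simplifying assumptions that do not come from the survey: a single switch output port, a synchronized burst from all senders in the first time step, a fixed per-step drain rate, and tail-drop when the buffer is full. The function name and all parameter values are hypothetical.

```python
def simulate_incast(num_senders, burst_packets, buffer_packets,
                    drain_per_tick, ticks):
    """Toy model of one switch output port under many-to-one traffic.

    Assumptions (illustrative only): every sender injects `burst_packets`
    in tick 0, the port forwards at most `drain_per_tick` packets per tick,
    and packets that do not fit in the buffer are dropped (tail-drop).
    """
    queue = 0
    dropped = 0
    delivered = 0
    for tick in range(ticks):
        # Synchronized burst: all senders transmit in the first tick only.
        arrivals = num_senders * burst_packets if tick == 0 else 0
        # Accept only as many packets as the remaining buffer space allows.
        accepted = min(arrivals, buffer_packets - queue)
        dropped += arrivals - accepted
        queue += accepted
        # Forward packets at the port's drain rate.
        sent = min(queue, drain_per_tick)
        queue -= sent
        delivered += sent
    return {"delivered": delivered, "dropped": dropped, "queued": queue}


if __name__ == "__main__":
    # Few senders: the burst fits in the buffer and nothing is dropped.
    print(simulate_incast(4, 10, 64, 8, 10))
    # Many senders: the same buffer overflows and most of the burst is lost.
    print(simulate_incast(40, 10, 64, 8, 10))
```

With the same buffer and drain rate, increasing the number of simultaneous senders from 4 to 40 turns a loss-free burst into one where most packets are dropped. This is the kind of overload that proactive control aims to avoid.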

WHY ML IS NEEDED FOR DCNS?
WORKLOAD FORECASTING Problem Definition
TRAFFIC FLOW CONTROL
TRAFFIC CLASSIFICATION AND SCHEDULING Problem Definition
TOPOLOGY MANAGEMENT Problem Definition
NETWORK STATE PREDICTION
ROOT CAUSE ANALYSIS Problem Definition
NETWORK SECURITY Problem Definition
CHALLENGES FOR ADOPTING ML IN DCN
Key Results
VISION AND OPEN RESEARCH DIRECTIONS
Findings
CONCLUSION
