In data center networks, accurate flow classification is key to optimal flow scheduling. However, existing classification methods cannot meet the demands of real networks in terms of classification performance, detection latency, and control overhead. This paper therefore proposes a fine-grained flow classification method that combines the ability of deep learning to describe multi-dimensional features with the advantage of software-defined networking (SDN) in centrally controlling the network from a global viewpoint. A random forest is used to select eight important features in three dimensions (the time distribution features of a flow, the real-time features of a flow, and packet header features) for the classification model. First, this paper proposes a 2-classification scheme with a two-level architecture of pre-classification and exact-classification to detect elephant and mice flows. The pre-classification model, which uses deep residual learning with A-Softmax and cost-sensitive learning, is deployed on an SDN switch at the network edge to filter out a large number of mice flows. The exact-classification model, which uses deep residual learning with AM-Softmax, is deployed on the SDN controller to accurately identify the elephant flows. Second, this paper proposes a 4-classification scheme based on the gated recurrent unit (GRU) to detect elephant, cheetah, tortoise, and porcupine flows. The experimental results show that, when the 5th packet of a flow arrives, the 2-classification scheme achieves a recall of up to 97.31%, an accuracy of up to 93.6%, a control overhead of 0.1 kbps, and a detection latency of 7 ms. Under the same condition, the 4-classification scheme achieves, on average, a recall of 83.58%, an accuracy of 86.53%, a false positive rate (FPR) of 5.02%, a Kappa of 0.797, a control overhead of 0.61 kbps, and a detection latency of 7.5 ms. Compared with existing methods (FlowSeer, NELLY, and ESCA), all performance measures are improved to varying degrees.
We also confirm that the selected features and the designed flow classification model generalize.
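For readers unfamiliar with the reported measures, the following is a minimal sketch of how recall, accuracy, FPR, and Cohen's Kappa can be computed from a confusion matrix. It is not the paper's evaluation code: the function name `classification_metrics`, the macro-averaging of recall and FPR across classes, and the example counts are all assumptions made for illustration, since the abstract does not state how the per-class values are averaged.

```python
def classification_metrics(cm):
    """Compute summary metrics from a square confusion matrix.

    cm[i][j] is the number of flows whose true class is i and whose
    predicted class is j. Recall and FPR are macro-averaged over classes
    (an assumption; the source does not specify the averaging).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                        # true counts per class
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts per class
    correct = sum(cm[i][i] for i in range(k))

    accuracy = correct / total

    recalls, fprs = [], []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        # Guard against classes with no positives or no negatives.
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
        fprs.append(fp / (fp + tn) if (fp + tn) else 0.0)

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    expected = sum(row_sums[c] * col_sums[c] for c in range(k)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / k,
        "fpr": sum(fprs) / k,
        "kappa": kappa,
    }


# Hypothetical 2-class example (rows: true elephant, true mice).
example_cm = [[8, 2],
              [1, 9]]
example_metrics = classification_metrics(example_cm)
```

The same function applies unchanged to a 4x4 matrix for the elephant, cheetah, tortoise, and porcupine classes.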