Abstract
In Internet traffic classification, the class imbalance problem is mainly addressed by adjusting the class distribution. However, feature selection is also a key factor contributing to this problem. Therefore, a new filter feature selection method called balanced feature selection (BFS) is proposed. Every feature is measured both locally and globally, and an optimal feature subset is then selected by our search model. A certainty coefficient is presented to measure the correlation between a feature and a certain class locally. The symmetric uncertainty is utilised to measure the correlation between a feature and all classes globally. Through experiments on two real traffic traces using three classification algorithms, BFS is compared with five existing feature selection methods. Results show that BFS improves g-mean by more than 15.29% over the other methods. Averaged over all datasets and classifiers, BFS achieves 59.54% g-mean, 86.35% Mauc and 91.42% overall accuracy.
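As an illustration of two quantities named above, the following minimal Python sketch computes the standard symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for a discrete feature and class labels, and the g-mean as the geometric mean of per-class recall. It assumes features are already discretised; the function names are hypothetical, and the sketch does not reproduce the certainty coefficient or the BFS search model, which are not defined in this abstract.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(feature, labels):
    """Standard symmetric uncertainty: 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy              # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1.0 / len(recalls))
```

Because the g-mean multiplies per-class recalls, a single class with zero recall drives the score to zero, which is why it is a stricter measure than overall accuracy on imbalanced traffic data.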