With the development of economic globalization and modern information and communication technology, communication fraud is becoming increasingly serious. Identifying fraudulent calls accurately and effectively has become an urgent task in telecommunications operations. Because of limitations in the sample set and in the current state of the art, existing machine learning methods achieve low recognition accuracy on datasets in which positive and negative samples are imbalanced. In this paper, we therefore propose a new hybrid model that combines feature construction, feature selection, and imbalanced-class handling. We adopt a stacking model fusion algorithm built on a two-layer stacking framework with several state-of-the-art machine learning classifiers. The results show that the risk user identification model, which is based on mobile network communication behavior and built with our stacking model fusion algorithm, accurately predicts the category labels of telecom users and improves the identification of risk users, with high generalization performance. This provides a reference for the telecommunications industry in identifying risk users based on mobile network communication behaviors.
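The two-layer stacking idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation. The synthetic data, the toy `CentroidScorer` base learners, the class-weighted logistic meta-learner, and all function names are assumptions made for illustration, and the feature construction and feature selection steps are omitted. Layer 1 produces out-of-fold scores so that the layer-2 learner never sees a score computed from a sample's own label.

```python
import math
import random


def make_data(n=300, seed=0):
    """Synthetic imbalanced data (assumption): every 10th sample is a 'risk user'."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 10 == 0 else 0
        shift = 2.0 if label else 0.0
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)])
        y.append(label)
    return X, y


class CentroidScorer:
    """Toy base learner: positive score when a feature is nearer the positive-class mean."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        self.mu_pos = sum(pos) / len(pos)
        self.mu_neg = sum(neg) / len(neg)
        return self

    def score(self, x):
        v = x[self.feature]
        return abs(v - self.mu_neg) - abs(v - self.mu_pos)


def make_learners():
    """Hypothetical layer-1 ensemble: one scorer per feature."""
    return [CentroidScorer(0), CentroidScorer(1)]


def stratified_folds(y, k):
    """Assign fold ids round-robin within each class so every fold keeps both classes."""
    counts = {0: 0, 1: 0}
    fold_of = []
    for t in y:
        fold_of.append(counts[t] % k)
        counts[t] += 1
    return fold_of


def out_of_fold_scores(X, y, k=5):
    """Layer 1: score each sample with learners trained on the other folds only."""
    fold_of = stratified_folds(y, k)
    Z = [None] * len(X)
    for f in range(k):
        train = [i for i in range(len(X)) if fold_of[i] != f]
        learners = [m.fit([X[i] for i in train], [y[i] for i in train])
                    for m in make_learners()]
        for i in range(len(X)):
            if fold_of[i] == f:
                Z[i] = [m.score(X[i]) for m in learners]
    return Z


def fit_weighted_logistic(Z, y, epochs=200, lr=0.1):
    """Layer 2: logistic regression with 'balanced' class weights n / (2 * class count)."""
    n, n_pos = len(y), sum(y)
    w_cls = {1: n / (2 * n_pos), 0: n / (2 * (n - n_pos))}
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            p = 1 / (1 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            g = w_cls[t] * (p - t)  # weighted gradient of the log loss
            gw = [a + g * zi for a, zi in zip(gw, z)]
            gb += g
        w = [wi - lr * a / n for wi, a in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, z):
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else 0


X, y = make_data()
Z = out_of_fold_scores(X, y)
w, b = fit_weighted_logistic(Z, y)
preds = [predict(w, b, z) for z in Z]
recall = sum(p for p, t in zip(preds, y) if t == 1) / sum(y)
print(f"minority-class recall on the training data: {recall:.2f}")
```

The class weights make each misclassified minority sample cost more than a majority one, which is one common way to handle imbalance. Resampling the training folds would be an alternative, and the abstract does not say which approach the authors use.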