The cellular Internet card (IC) has emerged as a new business model that is penetrating the market rapidly and has the potential to foster a large business market. However, with the explosive growth of IC users, user churn has become severe and significantly affects the IC business, while the literature lacks appropriate techniques to deal with the issue. In this paper, we take the lead in studying a large-scale data set from a provincial network operator in China, which contains about 4 million IC users and 22 million traditional card (TC) users. We first substantiate the IC user churn issue with data and categorize the reasons for churn. Then, we shed light on user portraits, which are the building block for efficient model design. In particular, we conduct a systematic analysis of usage data by studying the differences between the two types of users, examining the impact of user properties, and characterizing users' Internet usage behaviors. Finally, using the IC user portraits and usage patterns, we propose an IC user Churn Prediction model, named ICCP, which consists of a feature extraction component and a learning-based churn prediction architecture. For feature extraction, both static portrait features and temporal sequential features are captured.
In the learning architecture, we devise a principal component analysis (PCA) block and embedding/transformer layers to learn the information in the two types of features, respectively; their outputs are jointly fed into a multilayer perceptron (MLP) classification layer for churn prediction. A reference implementation of ICCP is carried out within the telecom system, and extensive experiments corroborate the efficiency of ICCP.
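To make the two-branch design more concrete, the following is a minimal NumPy sketch of a forward pass with the same overall shape: a PCA block, an embedding plus self-attention block, and a fused MLP classifier. It is not the authors' implementation. It assumes that PCA handles the static portrait features, that the temporal features are discretized into integer tokens (for example, bucketed daily traffic), and that a single attention head with mean pooling stands in for the transformer layers. All function names, dimensions, and the random weights are hypothetical, and the model is untrained.

```python
import numpy as np

def pca_project(x, k):
    """Project rows of x onto the top-k principal components (via SVD)."""
    centered = x - x.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(h, wq, wk, wv):
    """Single-head scaled dot-product self-attention on (batch, time, d)."""
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def init_params(rng, vocab=16, d=8, k=4, hidden=16, scale=0.1):
    """Random weights for illustration only; sizes are hypothetical."""
    return {
        "embed": rng.normal(0, scale, (vocab, d)),
        "wq": rng.normal(0, scale, (d, d)),
        "wk": rng.normal(0, scale, (d, d)),
        "wv": rng.normal(0, scale, (d, d)),
        "w1": rng.normal(0, scale, (k + d, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0, scale, (hidden, 1)),
        "b2": np.zeros(1),
    }

def churn_forward(static_x, seq_ids, params, k=4):
    """Return one churn probability per user (untrained sketch)."""
    # Static branch (assumed): compress portrait features with PCA.
    static_z = pca_project(static_x, k)                      # (batch, k)
    # Temporal branch (assumed): embed tokens, attend, then mean-pool.
    h = params["embed"][seq_ids]                             # (batch, time, d)
    h = h + self_attention(h, params["wq"], params["wk"], params["wv"])
    seq_z = h.mean(axis=1)                                   # (batch, d)
    # Fusion: concatenate both representations and classify with an MLP.
    fused = np.concatenate([static_z, seq_z], axis=1)
    hidden = np.maximum(0.0, fused @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))

rng = np.random.default_rng(0)
static_x = rng.normal(size=(32, 10))          # 32 users, 10 static features
seq_ids = rng.integers(0, 16, size=(32, 6))   # 6 time steps of usage tokens
params = init_params(rng)
probs = churn_forward(static_x, seq_ids, params)
```

In this sketch the PCA projection is fitted on the batch itself for brevity; in practice the projection would more plausibly be fitted on training data and reused at inference time, and the weights would be learned from labeled churn outcomes.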