Abstract

Identifying customer churn is a major issue for large telecom businesses. A substantial volume of data is generated every day in managing existing customers and in acquiring and managing new ones. It is therefore crucial to identify the causes of customer churn so that appropriate steps can be taken to lower it. Numerous researchers have discussed efforts to combine static and dynamic approaches to reduce churn in big data sets, but these systems still have many issues when it comes to actually identifying churn. In this paper, we propose two methods: the first is churn identification; in the second, we use Natural Language Processing (NLP) methods and machine learning techniques to make predictions on a vast telecommunication data set. The NLP process involves data pre-processing, normalization, feature extraction, and feature selection. For feature extraction, we employ distinct techniques such as TF-IDF, Stanford NLP, and occurrence correlation methods. Throughout, a machine learning classification algorithm is used for training and testing. Finally, the system applies a variety of cross-validation techniques to train and evaluate the machine learning algorithms. The experimental analysis shows the system's efficacy and accuracy.
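To make the TF-IDF feature-extraction step concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which TF-IDF variant is used, so the sketch assumes the textbook form (term frequency times log of inverse document frequency). The `tokenize` and `tf_idf` helpers and the sample messages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs (a simple normalization step)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant (not specified in the abstract):
      tf  = term count / number of tokens in the document
      idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical customer messages, invented for illustration only.
docs = [
    "network drops every day, I want to cancel",
    "happy with the network and the plan",
    "billing error again, cancel my plan",
]
weights = tf_idf(docs)
```

In this toy corpus, a term that appears in only one message (such as "drops") receives a higher weight than one shared across messages (such as "network"), which is the property that makes TF-IDF useful as input to a downstream classifier. Library implementations typically use smoothed variants of this formula, so their numeric values will differ from this sketch.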
