Abstract

Background: Customer churn is the rate at which customers stop doing business with an entity. In the field of digital health care, user churn prediction is important not only in terms of company revenue but also for improving the health of users. Churn prediction has been previously studied, but most studies applied time-invariant model structures and used structured data. However, additional unstructured data have become available; therefore, it has become essential to process daily time-series log data for churn predictions.

Objective: We aimed to apply a recurrent neural network structure to accept time-series patterns using lifelog data and text message data to predict the churn of digital health care users.

Methods: This study was based on the use data of a digital health care app that provides interactive messages with human coaches regarding food, exercise, and weight logs. Among the users in Korea who enrolled between January 1, 2017 and January 1, 2019, we defined churn users according to the following criteria: users who received a refund before the paid program ended and users who received a refund 7 days after the trial period. We used long short-term memory with a masking layer to receive sequence data with different lengths. We also performed topic modeling to vectorize text messages. To interpret the contributions of each variable to model predictions, we used integrated gradients, which is an attribution method.

Results: A total of 1868 eligible users were included in this study. The final performance of churn prediction was an F1 score of 0.89; that score decreased by 0.12 when the data of the final week were excluded (F1 score 0.77). Additionally, when text data were included, the mean predicted performance increased by approximately 0.085 at every time point. Steps per day had the largest contribution (0.1085). Among the topic variables, poor habits (eg, drinking alcohol, overeating, and late-night eating) showed the largest contribution (0.0875).

Conclusions: The model with a recurrent neural network architecture that used log data and message data demonstrated high performance for churn classification. Additionally, the analysis of the contribution of the variables is expected to help identify signs of user churn in advance and improve the adherence in digital health care.
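The Methods mention a masking layer that lets the network accept sequences of different lengths. As a rough illustration only, the sketch below pads per-user daily feature vectors to a common length and skips the padded days during a recurrent update. The function names, the sentinel `PAD_VALUE`, and the toy update rule are all assumptions made for this example; the update is a simple stand-in, not the long short-term memory cell used in the study.

```python
from typing import List, Sequence, Tuple

PAD_VALUE = -1.0  # hypothetical sentinel; assumed never to occur in real features


def pad_sequences(
    sequences: Sequence[Sequence[Sequence[float]]],
    pad_value: float = PAD_VALUE,
) -> Tuple[List[List[List[float]]], List[List[bool]]]:
    """Right-pad each user's daily feature vectors to the longest sequence.

    Returns the padded data and a boolean mask (True = real day, False = padding).
    """
    max_len = max(len(seq) for seq in sequences)
    n_features = len(sequences[0][0])
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(
            [list(day) for day in seq]
            + [[pad_value] * n_features for _ in range(n_pad)]
        )
        mask.append([True] * len(seq) + [False] * n_pad)
    return padded, mask


def masked_last_state(
    padded_seq: Sequence[Sequence[float]],
    mask_row: Sequence[bool],
    decay: float = 0.5,
) -> float:
    """Toy recurrent update (not an LSTM): state = decay * state + mean(day).

    Padded days are skipped, so the state is carried over unchanged,
    which is the effect a masking layer has on a recurrent layer.
    """
    state = 0.0
    for day, keep in zip(padded_seq, mask_row):
        if not keep:
            continue  # padded day: leave the state untouched
        state = decay * state + sum(day) / len(day)
    return state


# Two hypothetical users with 2 and 1 logged days, 2 features per day.
users = [[[1.0, 3.0], [2.0, 4.0]], [[2.0, 2.0]]]
padded, mask = pad_sequences(users)
states = [masked_last_state(p, m) for p, m in zip(padded, mask)]
```

Because the padded days never touch the state, a user with fewer logged days yields the same final state as if the sequence had simply ended early, and the sentinel values never leak into the prediction.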

Highlights

  • Customer churn prediction is one of the most important concerns for almost every company

  • We aimed to examine the impact of time-series data on model performance and whether the presence of text data affects the performance of churn prediction

  • We can examine the effect of each variable on the final output of the model; to explain the effect of each variable on the model, we investigated the average value of the integrated gradients for each variable
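The last highlight refers to averaging integrated gradients per variable. A minimal sketch of that idea follows, assuming a small logistic model in place of the recurrent network; the weights, the all-zero baseline, the example inputs, and every function name here are hypothetical and chosen only to keep the example self-contained. Integrated gradients for variable i is (x_i - baseline_i) multiplied by the average gradient along the straight path from the baseline to x, approximated here with a midpoint Riemann sum.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def model(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical stand-in model: logistic regression output."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def gradient(x: Sequence[float], weights: Sequence[float], bias: float) -> List[float]:
    """Analytic gradient of the logistic output with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(
    x: Sequence[float],
    baseline: Sequence[float],
    weights: Sequence[float],
    bias: float,
    steps: int = 200,
) -> List[float]:
    """Midpoint-rule approximation of integrated gradients for one input."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def mean_attribution(rows: Sequence[Sequence[float]]) -> List[float]:
    """Average the attributions of each variable across users."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical weights and two hypothetical users with three variables each.
weights, bias = [0.8, -0.5, 0.3], 0.1
baseline = [0.0, 0.0, 0.0]
inputs = [[1.0, 2.0, 0.5], [2.0, 0.0, 1.0]]
attributions = [integrated_gradients(x, baseline, weights, bias) for x in inputs]
average = mean_attribution(attributions)
```

A useful sanity check is the completeness property: for each user, the attributions should sum to approximately the difference between the model output at the input and at the baseline.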

Summary

Introduction

Customer churn prediction is one of the most important concerns for almost every company. With smartphone use becoming more common, the digital health care industry is growing, and numerous health-related apps have been launched [4]. The prediction of churn and the retention of digital health care service customers have significant implications for companies and for users. Objective: We aimed to apply a recurrent neural network structure to accept time-series patterns using lifelog data and text message data to predict the churn of digital health care users. Methods: This study was based on the use data of a digital health care app that provides interactive messages with human coaches regarding food, exercise, and weight logs. The analysis of the contribution of the variables is expected to help identify signs of user churn in advance and improve the adherence in digital health care.
