Abstract

Clustering analysis, or clustering, can be applied to user event log data to identify the types of users of a service and to gain insight into the client base from their behaviour. However, when applied to longitudinal user event log data, clustering can misclassify regular users as 'one-off' users if the last interaction of their tenure with the service falls at the beginning of the observable data set. The main objective of this study was to investigate whether user tenure within longitudinal data affects the accuracy of k-means clustering. A large telephony call log data set from a helpline was subjected to k-means clustering to determine the types of callers that contact the helpline based on their usage characteristics (number of calls, mean call duration and variability of call duration). Tenure thresholds were applied to the data in one-month increments (at each threshold, callers appearing before the threshold but not after it were removed), and the filtered data were then subjected to k-means clustering. Results showed that cluster structures remained stable under each threshold condition. Significant differences in cluster centers were found in one cluster across tenure conditions.
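
The thresholding and clustering procedure described above can be illustrated with a minimal sketch. It is not the study's implementation: the record layout `(caller_id, day, duration)`, the function names, the sample values and the use of the population standard deviation as the "variability" measure are all assumptions made for illustration, and the small k-means routine is a plain Lloyd-style loop written out so the example needs only the Python standard library.

```python
import math
import random
from collections import defaultdict
from statistics import mean, pstdev


def filter_by_tenure(calls, threshold):
    """Drop callers whose calls all fall before `threshold` (hypothetical helper)."""
    last_seen = {}
    for caller, day, _ in calls:
        last_seen[caller] = max(day, last_seen.get(caller, day))
    keep = {c for c, d in last_seen.items() if d >= threshold}
    return [row for row in calls if row[0] in keep]


def caller_features(calls):
    """Per caller: (number of calls, mean duration, std. dev. of duration)."""
    durations = defaultdict(list)
    for caller, _, duration in calls:
        durations[caller].append(duration)
    return {c: (len(d), mean(d), pstdev(d)) for c, d in durations.items()}


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd-style k-means on tuples; returns the final cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(mean(dim) for dim in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Made-up sample log: (caller_id, day of call, duration in minutes).
calls = [
    ("a", 3, 10.0),
    ("b", 5, 4.0), ("b", 40, 6.0),
    ("c", 45, 20.0), ("c", 70, 30.0), ("c", 80, 25.0),
]

# One threshold condition: caller "a" is seen only before day 30 and is removed.
filtered = filter_by_tenure(calls, threshold=30)
features = caller_features(filtered)
centers = kmeans(list(features.values()), k=2)
```

Repeating the last three lines with the threshold moved forward one month at a time gives one set of cluster centers per tenure condition, which can then be compared across conditions. In practice the three features would usually be standardised before clustering, since k-means is sensitive to scale; that step is omitted here for brevity.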
