Abstract

Data mining for client behavior analysis has become increasingly important in business; however, further analysis of transactions and sequential behaviors would be of even greater value, especially in financial services such as banking and insurance, and in government. In a real-world business application of taxation debt collection, to understand the relationship between taxpayers' sequential behaviors (payment, lodgment and actions) and their debt compliance, we need to find the sequential behavior patterns that contrast compliant and non-compliant taxpayers. Contrast Patterns (CP) are defined as itemsets that show the difference, or discrimination, between two classes or datasets (Dong and Li, 1999). However, existing CP mining methods, which can only mine itemset patterns, are not suitable for mining sequential patterns such as the time-ordered transactions in taxpayer sequential behaviors. Little work has been conducted on Contrast Sequential Pattern (CSP) mining so far. To address this issue, we develop a CSP mining approach, eCSP, based on an effective CSP-tree structure that improves the PrefixSpan tree (Pei et al., 2001) for mining contrast patterns. We propose heuristics and interestingness filtering criteria and integrate them into the CSP-tree to reduce the search space and to find patterns of business interest. The performance of the proposed approach is evaluated on three real-world datasets. In addition, we use a case study to show how the approach can be applied to analyze taxpayer behavior. The results show promising performance and clear business value.
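To make the notion of a contrast sequential pattern concrete, the following is a minimal sketch and not the eCSP algorithm or its CSP-tree. It assumes that a pattern matches a sequence when its events occur in the same order with gaps allowed, and it measures contrast as the difference in support between the two classes; that measure, the function names, and the toy sequences are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the pattern's events occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` advances the iterator, so order is enforced.
    return all(event in remaining for event in pattern)


def support(pattern: Sequence[str], dataset: List[Sequence[str]]) -> float:
    """Fraction of sequences in `dataset` that contain the pattern."""
    if not dataset:
        return 0.0
    return sum(is_subsequence(pattern, seq) for seq in dataset) / len(dataset)


def support_difference(pattern, class_a, class_b) -> float:
    """Assumed contrast measure: support in class A minus support in class B."""
    return support(pattern, class_a) - support(pattern, class_b)


# Hypothetical toy data: each sequence is one taxpayer's time-ordered events.
compliant = [
    ["lodgment", "payment"],
    ["action", "lodgment", "payment"],
    ["payment", "lodgment"],
]
non_compliant = [
    ["action", "action"],
    ["lodgment", "action"],
    ["payment", "action", "lodgment"],
]

pattern = ["lodgment", "payment"]
print(support(pattern, compliant), support(pattern, non_compliant))
print(support_difference(pattern, compliant, non_compliant))
```

An itemset-based method would treat "payment then lodgment" and "lodgment then payment" as the same pattern; the order-sensitive match above is what distinguishes the two and is the reason sequential contrast mining needs its own machinery.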
