Abstract

Discriminative sequential patterns are sub-sequences whose occurrences differ significantly across sequential data sets with different class labels. Discovering such patterns has practical applications in many fields, and various algorithms for mining them have been proposed. However, the patterns these methods report usually include many false positives that hold in the sample data only by chance. To alleviate this issue, we put forward the concept of significance-based discriminative sequential pattern mining and a corresponding algorithm, DSPM-MTC (Discriminative Sequential Pattern Mining with Multiple Testing Correction). The key idea of DSPM-MTC is to integrate a multiple hypothesis testing correction procedure into the pattern mining process so that the resulting pattern set comes with error rate control. To demonstrate the effectiveness of DSPM-MTC, we conduct a series of experiments on real and simulated sequential data sets. The experimental results show that DSPM-MTC can effectively identify false discoveries and generate a pattern set with statistical quality control.
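To make the general idea of significance testing with multiple testing correction concrete, the following is a minimal sketch and not the DSPM-MTC algorithm itself. It rests on several assumptions that the abstract does not state: a pattern "occurs" in a sequence if it is a (possibly non-contiguous) subsequence, each candidate pattern is scored with a two-sided Fisher exact test, and the error rate is controlled with the Benjamini-Hochberg procedure. It also applies the correction after the fact to a fixed candidate list, whereas DSPM-MTC integrates the correction into the mining process. All function names here are hypothetical.

```python
from math import comb


def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(p_values, alpha=0.05):
    """Indices of hypotheses rejected at false discovery rate `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank  # largest rank whose p-value is under its threshold
    return sorted(order[:k])


def significant_patterns(pos, neg, candidates, alpha=0.05):
    """Keep candidates whose class difference survives the BH correction."""
    p_values = []
    for pattern in candidates:
        a = sum(is_subsequence(pattern, s) for s in pos)  # positive, with pattern
        b = sum(is_subsequence(pattern, s) for s in neg)  # negative, with pattern
        p_values.append(fisher_exact_two_sided(a, b, len(pos) - a, len(neg) - b))
    return [candidates[i] for i in benjamini_hochberg(p_values, alpha)]


if __name__ == "__main__":
    # Toy data (assumed for illustration only).
    pos = [list("acb")] * 4 + [list("ab")] * 4
    neg = [list("bca")] * 4 + [list("ba")] * 4
    print(significant_patterns(pos, neg, [("a", "b"), ("c",), ("b", "a")]))
```

In this toy data the pattern ("c",) appears equally often in both classes, so it is the kind of candidate a correction step is meant to filter out, while the two order-dependent patterns separate the classes completely.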
