We are given a large database of customer transactions, where each transaction consists of customer- id, transaction time, and the items bought in the transaction. The discovery of frequent sequences in temporal databases is an important data mining problem. Most current work assumes that the database is static, and a database update requires rediscovering all the patterns by scanning the entire old and new database. We consider the problem of the incremental mining of sequential patterns when new transactions or new customers are added to an original database. In this paper, we propose novel techniques for maintaining sequences in the presence of a) database updates, and b) user interaction (e.g. modifying mining parameters). This is a very challenging task, since such updates can invalidate existing sequences or introduce new ones.
Read full abstract