Abstract

The main purpose of data mining is to extract hidden, important, and nontrivial information from a database. Sequential Pattern Mining is a data mining technique that aims to obtain and analyze frequent subsequences from sequences of events or items, with or without time constraints. The importance of a sequence can be measured by different factors, such as its frequency of occurrence, its length, and its profit. Pattern mining, the discovery of important and unexpected patterns and information, was first introduced in 1990 with the well-known Apriori algorithm. Then, after many studies on frequent pattern mining, a new approach appeared: Sequential Pattern Mining. In 1995, Agrawal et al. introduced a new Apriori algorithm supporting time constraints. The algorithm studied transactions over time in order to extract frequent patterns from the sequences of products related to a customer. Later, this technique became useful in many applications: DNA research, medical diagnosis and prevention, telecommunications, and so on. Other advanced algorithms and their extensions have appeared since then, such as GSP (1996), SPADE (2001), PrefixSpan (2001), SPAM (2002), CM-SPADE (2014), and CM-SPAM (2014) for sequential pattern mining; RuleGrowth (2011) and ERMiner (2015) for sequential rule mining; and CPT (2013) and CPT+ (2015) for sequence prediction. Surveying the evolution of sequential data mining techniques, this chapter discusses the multiple extensions of Sequential Pattern Mining algorithms and classifies them into Sequential Pattern Mining, Sequential Rule Mining, and Sequence Prediction. It elaborates on the different classes and some of their extensions.
