Abstract
Sequential pattern analysis aims to find statistically relevant temporal structures in data whose values are delivered in a sequence. With the growing complexity of real-world dynamic scenarios, more and more symbols are often needed to encode the sequential values. This is the so-called “curse of cardinality”, which poses significant challenges to the design of sequential analysis methods in terms of computational efficiency and practical use. Indeed, given the overwhelming scale and heterogeneous nature of sequential data, new visions and strategies are needed to face these challenges. To this end, we propose a “temporal skeletonization” approach that proactively reduces the cardinality of the sequence representation by uncovering significant, hidden temporal structures. The key idea is to summarize the temporal correlations in an undirected graph and use the “skeleton” of the graph as a higher granularity at which hidden temporal patterns are more likely to be identified. As a consequence, the embedding topology of the graph allows us to translate the rich temporal content into a metric space. This opens up new possibilities to explore, quantify, and visualize sequential data. Our approach is shown to greatly alleviate the curse of cardinality in challenging tasks of sequential pattern mining and clustering. Evaluation on a business-to-business (B2B) marketing application demonstrates that our approach can effectively discover critical buying paths from noisy customer event data.
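To make the key idea concrete, the following is a minimal sketch of one way it could be realized, not the authors' algorithm. It assumes symbols are integer-coded, links symbols that occur within a small window of each other (the `window` parameter is an assumption), embeds the resulting undirected graph with a standard normalized-Laplacian spectral embedding, and splits the symbols into two groups by the sign of the first non-trivial coordinate. The function names and the toy data are hypothetical.

```python
import numpy as np

def temporal_graph(seq, n_symbols, window=2):
    """Symmetric co-occurrence counts for symbols at most `window` steps apart."""
    W = np.zeros((n_symbols, n_symbols))
    for i, a in enumerate(seq):
        for b in seq[i + 1 : i + 1 + window]:
            if a != b:  # ignore self-loops
                W[a, b] += 1.0
                W[b, a] += 1.0
    return W

def spectral_embedding(W, dim=1):
    """Embed symbols using low-order eigenvectors of the normalized Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the trivial first eigenvector; rescale to random-walk coordinates.
    return vecs[:, 1 : 1 + dim] * d_inv_sqrt[:, None]

# Toy data (assumed): symbols 0-2 occur in one stage, symbols 3-5 in another.
rng = np.random.default_rng(0)
seq = []
for _ in range(20):
    seq += list(rng.integers(0, 3, size=10)) + list(rng.integers(3, 6, size=10))

W = temporal_graph(seq, n_symbols=6)
emb = spectral_embedding(W, dim=1)

# Two coarse groups from the sign of the embedding coordinate.
labels = (emb[:, 0] > 0).astype(int)

# Re-encode the sequence with the reduced alphabet (6 symbols -> 2 groups).
reduced = [int(labels[s]) for s in seq]
```

In this sketch the six original symbols collapse to two groups, so the re-encoded sequence uses a much smaller alphabet, and the embedding coordinates give each symbol a position in a metric space.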