Mining algorithms for sequential patterns in parallel: Hash based approach

Takahiko Shintani, Masaru Kitsuregawa

doi:10.1007/3-540-64383-4_24

Abstract

In this paper, we study the problem of mining sequential patterns in a large database of customer transactions. Because finding sequential patterns must handle a large amount of customer transaction data and requires multiple passes over the database, parallel algorithms are expected to improve performance significantly. We consider parallel algorithms for mining sequential patterns in a shared-nothing environment. Three parallel algorithms are proposed: Non Partitioned Sequential Pattern Mining (NPSPM), Simply Partitioned Sequential Pattern Mining (SPSPM), and Hash Partitioned Sequential Pattern Mining (HPSPM). In NPSPM, the candidate sequences are simply copied to all the nodes, which can lead to memory overflow for large databases. The remaining two algorithms partition the candidate sequences over the nodes, so they can efficiently exploit the total memory of the system as the number of nodes is increased. If the candidates are partitioned simply, as in SPSPM, customer transaction data has to be broadcast to all nodes. HPSPM partitions the candidate sequences among the nodes using a hash function, which eliminates the broadcasting of customer transaction data and reduces the comparison workload. We describe the implementation of these algorithms on a shared-nothing parallel computer, the IBM SP2, and report the results of its performance evaluation. Among the three algorithms, HPSPM attains the best performance.
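
The difference between the three candidate-placement strategies can be illustrated with a small single-process sketch. This is not the authors' implementation: nodes are simulated as list indices, each element of a sequence is a single item rather than an itemset, and the function names, the CRC32-based hash, and the probe counter are all assumptions made for illustration.

```python
from itertools import combinations
from zlib import crc32


def owner(candidate, num_nodes):
    """Map a candidate to a node id (hypothetical hash; any deterministic one works)."""
    return crc32(repr(candidate).encode()) % num_nodes


def k_subsequences(sequence, k):
    """Distinct order-preserving subsequences of length k of one customer sequence."""
    return set(combinations(sequence, k))


def count_replicated(local_dbs, candidates, k):
    """NPSPM-like: every node holds all candidates and scans only its local data."""
    counts = dict.fromkeys(candidates, 0)
    probes = 0
    for db in local_dbs:
        for seq in db:
            for sub in k_subsequences(seq, k):
                probes += 1  # one lookup in the full, replicated candidate set
                if sub in counts:
                    counts[sub] += 1
    return counts, probes


def count_simple_partition(local_dbs, candidates, k):
    """SPSPM-like: candidates split round-robin; each sequence goes to every node."""
    n = len(local_dbs)
    parts = [set(candidates[i::n]) for i in range(n)]
    counts = dict.fromkeys(candidates, 0)
    probes = 0
    for db in local_dbs:
        for seq in db:
            subs = k_subsequences(seq, k)
            for node in range(n):  # stands in for broadcasting the sequence
                for sub in subs:
                    probes += 1  # every node checks every subsequence
                    if sub in parts[node]:
                        counts[sub] += 1
    return counts, probes


def count_hash_partition(local_dbs, candidates, k):
    """HPSPM-like: a subsequence is sent only to the node that its hash selects."""
    n = len(local_dbs)
    parts = [set() for _ in range(n)]
    for cand in candidates:
        parts[owner(cand, n)].add(cand)
    counts = dict.fromkeys(candidates, 0)
    probes = 0
    for db in local_dbs:
        for seq in db:
            for sub in k_subsequences(seq, k):
                dest = owner(sub, n)  # the same hash locates the owning node
                probes += 1  # only the owning node checks this subsequence
                if sub in parts[dest]:
                    counts[sub] += 1
    return counts, probes
```

Under these assumptions all three functions return the same support counts. The partitioned variants hold only a share of the candidates per node, and the hash-based variant checks each subsequence on one node instead of on every node.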


