Abstract
Sequential pattern mining is widely used to solve business problems such as identifying frequent user click patterns, analyzing customer purchases, and analyzing gene microarray data. Many studies have examined this kind of pattern mining to extract insightful data. Most of them concentrate on high utility sequential pattern (HUSP) mining with positive item values and without a distributed approach. All existing solutions are centralized, which incurs greater computation and communication costs. In this paper, we introduce a novel algorithm for mining HUSPs that include negative item values using a distributed approach. We use Hadoop MapReduce to process the data in parallel. We propose several pruning techniques to minimize the search space in a distributed environment, thus reducing the processing cost. To our knowledge, no algorithm has been proposed to mine high utility sequential patterns with negative item values in a distributed environment. We therefore design a novel algorithm called DHUSP-N (Distributed High Utility Sequential Pattern mining with Negative values). DHUSP-N can mine high utility sequential patterns from big data while taking negative item utilities into account.
Highlights
The volume of data produced every day in the form of sequences is now hard to imagine [14] [15]
Utility was introduced into frequent pattern mining to resolve this issue by taking the profit and quantity of products into account. This gave rise to a new field of study, namely high utility itemset mining and high utility sequential pattern (HUSP) mining, which can mine insightful knowledge given a user-defined minimum utility instead of a minimum support
As DHUSP-N is the first algorithm of its kind, there is no suitable algorithm to compare it with. A generic algorithm such as USPAN [23] is not an appropriate baseline because it does not handle negative values and is a centralized approach
Summary
The volume of data produced every day in the form of sequences is now hard to imagine [14] [15]. Utility was introduced into frequent pattern mining to resolve this issue by taking the profit (quality) and quantity of products into account. This gave rise to a new field of study, namely high utility itemset mining and high utility sequential pattern (HUSP) mining, which can mine insightful knowledge given a user-defined minimum utility instead of a minimum support. HUSP mining [2] [23] is used to extract profitable and more beneficial sequential patterns from databases. It considers a business objective such as profit, user interest, or value. We propose a new method for mining high utility sequential patterns that include negative item values using a distributed approach. We also suggest a few pruning strategies that eliminate unpromising items and thereby reduce the search space in a distributed setting.
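To make the notion of utility concrete, the following is a minimal sketch of how a pattern's utility could be computed when some items carry negative unit profits. It is not the DHUSP-N algorithm and involves no distribution or pruning. The item names, profit values, toy database, and function names are all hypothetical. The sketch assumes a common convention in the HUSP literature: an item's utility is its unit profit times its quantity, and a pattern's utility in one sequence is the maximum over all of its occurrences. For brevity, a pattern here is an ordered list of single items, each matched in a strictly later itemset than the previous one.

```python
from typing import Dict, List, Optional

# Hypothetical unit profits; a negative value models an item sold at a loss.
UNIT_PROFIT: Dict[str, int] = {"a": 5, "b": -2, "c": 3}

# A quantitative sequence: an ordered list of itemsets mapping item -> quantity.
QSequence = List[Dict[str, int]]


def utility_in_sequence(pattern: List[str], seq: QSequence) -> Optional[int]:
    """Maximum utility of `pattern` over all its occurrences in `seq`.

    Returns None when the pattern does not occur. None (rather than 0) marks
    "no match" because a real occurrence may have zero or negative utility.
    """
    # best[k] = best utility of matching the first k pattern items so far.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in seq:
        # Descending k: each itemset extends a match by at most one item.
        for k in range(len(pattern), 0, -1):
            item = pattern[k - 1]
            if item in itemset and best[k - 1] is not None:
                candidate = best[k - 1] + UNIT_PROFIT[item] * itemset[item]
                if best[k] is None or candidate > best[k]:
                    best[k] = candidate
    return best[len(pattern)]


def pattern_utility(pattern: List[str], database: List[QSequence]) -> int:
    """Sum of the pattern's utility over every sequence that contains it."""
    utilities = (utility_in_sequence(pattern, seq) for seq in database)
    return sum(u for u in utilities if u is not None)


# Toy database (assumed data, for illustration only).
DATABASE: List[QSequence] = [
    [{"a": 2, "b": 1}, {"c": 3}, {"a": 1}],
    [{"b": 4}, {"a": 1, "c": 2}],
    [{"a": 1}, {"c": 1}, {"c": 4}],
]

MIN_UTILITY = 20  # user-defined threshold, used instead of a minimum support

for candidate_pattern in (["a", "c"], ["b", "c"]):
    total = pattern_utility(candidate_pattern, DATABASE)
    print(candidate_pattern, total, total >= MIN_UTILITY)
```

In this toy data, the pattern `["b", "c"]` occurs in two sequences, the same number as `["a", "c"]`, yet its utility is far lower because `b` has a negative unit profit. This is the kind of distinction that a support-only measure would miss.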
More From: International Journal of Advanced Computer Science and Applications