Abstract

Real-time data stream mining algorithms are largely designed for binary datasets and do not handle continuous quantitative data streams, especially in the medical data mining field. This paper therefore proposes a new single-scan weighted sliding window fuzzy frequent pattern mining algorithm over data streams based on interval type-2 fuzzy set theory (WSWFFP-T2), using artificial datasets of medical data streams. A weighted fuzzy frequent pattern tree based on type-2 fuzzy set theory (WFFPT2-tree) and a fuzzy-list sorted structure (FLSS) are designed to mine the fuzzy frequent patterns (FFPs) over the medical data stream. The experiments show that the proposed WSWFFP-T2 algorithm is optimal for mining the quantitative data stream and is not limited to fragile databases; its performance is reliable and stable under the weighted sliding window. Moreover, the proposed algorithm achieves high performance in mining FFPs compared with existing algorithms in terms of recall and precision rates.

Highlights

  • Frequent itemset mining (FIM) is the primary processing step of the association rule mining (ARM, Apriori 1993) algorithm, which mines associations in a dataset in the form of rules such as “IF eat more than regular have obesity”

  • For the designed WSWFFP-T2 algorithm to mine the fuzzy frequent patterns (FFPs) on a data stream, each quantitative attribute should first be fuzzified by the defined membership functions based on interval type-2 fuzzy theory

  • After the prefuzzification of interval type-2 fuzzy set theory, the fuzzified data structure is presented as (fv.upper, fv.lower)/Iname.ftp, where Iname represents the name of the item and (fv.upper, fv.lower) is the membership value calculated from Equations (3)–(5)
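
The fuzzification step described in the highlights can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the triangular membership shape, the three linguistic terms and their breakpoints in `TERMS`, the `LOWER_SCALE` factor used to derive the lower membership from the upper one, and the names `triangle`, `FuzzyItem`, and `fuzzify`. The paper's own membership functions are given by its Equations (3)–(5).

```python
from dataclasses import dataclass


def triangle(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


@dataclass(frozen=True)
class FuzzyItem:
    name: str     # item name (Iname)
    term: str     # linguistic term the quantity was mapped to
    upper: float  # upper membership value
    lower: float  # lower membership value

    def label(self):
        # Mirrors the (upper, lower)/name.term notation used in the text.
        return f"({self.upper}, {self.lower})/{self.name}.{self.term}"


# Hypothetical linguistic terms; breakpoints are illustrative only.
TERMS = {
    "Low": (0.0, 1.0, 3.0),
    "Middle": (1.0, 3.0, 5.0),
    "High": (3.0, 5.0, 7.0),
}

# Assumed footprint of uncertainty: lower membership is a scaled-down
# copy of the upper membership, so lower <= upper always holds.
LOWER_SCALE = 0.8


def fuzzify(name, quantity):
    """Map one quantitative item to its interval type-2 fuzzy items."""
    items = []
    for term, (a, b, c) in TERMS.items():
        upper = triangle(quantity, a, b, c)
        if upper > 0:
            lower = LOWER_SCALE * upper
            items.append(FuzzyItem(name, term, round(upper, 3), round(lower, 3)))
    return items


if __name__ == "__main__":
    for item in fuzzify("A", 2):
        print(item.label())
```

Each quantity may map to more than one linguistic term, which is why `fuzzify` returns a list of items rather than a single pair of membership values.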


Summary

Introduction

Frequent itemset mining (FIM) is the primary processing step of the association rule mining (ARM, Apriori 1993) algorithm, which mines associations in a dataset in the form of rules such as “IF eat more than regular have obesity.” This paper improves the model and proposes a new weighted sliding window quantitative data stream frequent pattern mining algorithm based on type-2 fuzzy set theory. Recall and precision rates are applied to control the influence of the decay factor on the frequent patterns and the critical frequent patterns of the sliding window. (2) A novel WSWFFP-T2 algorithm over medical data streams is proposed to efficiently mine the FFPs with only one scan using a fuzzy-list sorted structure (FLSS) and a WFFPT2-tree. Various examples and figures are provided to help explain the interval type-2 fuzzy set theory mining process over the quantitative data stream. (3) The proposed WSWFFP-T2 algorithm is compared with previous related algorithms to evaluate its performance.
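
To make the roles of the decay factor and of the recall and precision rates concrete, here is a small sketch. It is not the paper's formulation: the weighting scheme (each older batch in the window is discounted by one more power of `decay`), the normalisation by weighted batch size, and the function names `weighted_support` and `precision_recall` are all assumptions chosen for illustration.

```python
def weighted_support(batch_supports, batch_sizes, decay):
    """Decay-weighted support of one pattern over a sliding window.

    batch_supports: fuzzy support of the pattern in each batch, oldest first.
    batch_sizes:    number of transactions in each batch, oldest first.
    decay:          assumed decay factor in (0, 1]; the newest batch has weight 1.
    """
    n = len(batch_supports)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    numerator = sum(w * s for w, s in zip(weights, batch_supports))
    denominator = sum(w * m for w, m in zip(weights, batch_sizes))
    return numerator / denominator if denominator else 0.0


def precision_recall(mined, reference):
    """Compare mined patterns against a reference set of true patterns."""
    mined, reference = set(mined), set(reference)
    true_positives = len(mined & reference)
    precision = true_positives / len(mined) if mined else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Two batches of four transactions each; the older batch is discounted.
    print(weighted_support([2.0, 1.0], [4, 4], decay=0.5))

    mined = {frozenset({"A.Low"}), frozenset({"A.Low", "B.High"}), frozenset({"C.Middle"})}
    truth = {frozenset({"A.Low"}), frozenset({"A.Low", "B.High"}), frozenset({"D.Low"}), frozenset({"E.High"})}
    print(precision_recall(mined, truth))
```

Under this scheme a smaller decay factor makes the window forget old batches faster, which can drop patterns that are frequent only in older data. Precision and recall against a reference pattern set are one way to measure that effect.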

Related Works and Research Problems
Preliminaries
Proposed WSWFFP-T2 Algorithm
Preprocessing Fuzzification
Experimental Study
Conclusion