Abstract

Performance analysis techniques, comprising metrics such as accuracy, efficiency and time consumption, have been used to evaluate the properties and interestingness measures of association rule mining methods. These metrics, combined with different parameters (partitioning points, fuzzy sets), should therefore be analysed thoroughly and balanced simultaneously to enhance the overall performance (effectiveness, accuracy and efficiency) of an algorithm. As a result, most current algorithms face pressure from the tradeoff among these metrics and parameters, which becomes even harder when they are applied to different types of data (discrete, categorical and continuous data). Specifically, serial data (i.e., sequences or transactions of floating-point numbers), such as sensor, financial, medical and sentiment streaming data, differ from discrete variables, such as boolean data (e.g., sentiment: negative and positive represented as ‘0’ and ‘1’ respectively) and categorical data (e.g., ‘young age’, ‘middle age’, ‘old age’). The main difference is that serial data face the sharp boundary problem: it is hard to decide the boundary values (i.e., the single points that partition data into different value groups), a problem that few association rule mining methods address. This paper aims to resolve the sharp boundary problem and to balance multiple performance measures of our algorithm simultaneously by developing a novel dynamic optimisation (of parameters and metrics) based fuzzy association rule mining (DOFARM) method. The proposed method can be applied to a wide range of classification problems, such as the classification of sentiment strength (negative and positive).
In our DOFARM method, instead of single partitioning points, we use a range of values to smoothly separate two consecutive partitions and develop a corresponding membership function to generate fuzzy sets for original data sets of physical and emotional diseases. In particular, we design a dual compromise scheme: the first tradeoff balances better performance of the output association rules against a more widely applicable fuzzy membership function, while the second tradeoff reduces the time parameter while enhancing the overall performance of our DOFARM method. The feasibility and accuracy of the proposed DOFARM method have been verified theoretically and experimentally. In addition, we demonstrate the accuracy, effectiveness and efficiency of our DOFARM method through experiments on both synthetic and real datasets.
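
The idea of replacing a single partitioning point with a range of values can be illustrated with a minimal sketch. The abstract does not specify the membership function used in DOFARM, so the code below assumes a simple linear transition across the boundary range; the names `boundary_memberships`, `fuzzy_support`, `low` and `high` are hypothetical and chosen only for illustration. The fuzzy support shown is the commonly used mean membership degree, which may differ from the measure defined in the paper.

```python
def boundary_memberships(x, low, high):
    """Return (left, right) membership degrees of x for two consecutive partitions.

    Assumption: membership changes linearly across the boundary range
    [low, high]; this is an illustrative choice, not the DOFARM function.
    """
    if low >= high:
        raise ValueError("low must be strictly less than high")
    if x <= low:
        return 1.0, 0.0   # fully in the left partition
    if x >= high:
        return 0.0, 1.0   # fully in the right partition
    right = (x - low) / (high - low)
    return 1.0 - right, right  # degrees always sum to 1


def fuzzy_support(values, low, high):
    """Fuzzy support of each partition: mean membership degree over the data."""
    if not values:
        raise ValueError("values must be non-empty")
    degrees = [boundary_memberships(v, low, high) for v in values]
    n = len(degrees)
    left = sum(d[0] for d in degrees) / n
    right = sum(d[1] for d in degrees) / n
    return left, right
```

With a boundary range of 30 to 40, a value of 35 belongs to both partitions with degree 0.5 instead of being forced to one side of a single cut point, which is the sense in which a range avoids a sharp boundary.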

