Abstract

Class association rules (CARs) are primarily used to build classification models for prediction, but they can also describe correlations between itemsets and class labels. The latter use is very popular in mining medical data. For example, epidemiologists often consider rules that relate risk factors (itemsets) to HIV test results (class labels). In practice, however, end users are often interested in only a subset of class association rules. In particular, they may consider only rules whose antecedent contains at least one itemset from a user-defined set of itemsets. For example, when identifying populations at high risk of HIV infection, epidemiologists often concentrate on rules whose antecedents include demographic information such as sex, age, and marital status. Two naive strategies for solving this problem apply the itemset constraints in a pre-processing or a post-processing step. However, such approaches are time-intensive. This paper therefore proposes an efficient method that integrates the constraints into the class association rule mining process. Experimental results show that the proposed algorithm outperforms the two basic approaches in both mining time and memory consumption. The practical benefits of the method are demonstrated by a real-life application in the HIV/AIDS domain.
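
To make the itemset constraint concrete, the following minimal sketch illustrates the naive post-processing strategy mentioned above: mine all CARs first, then keep only the rules whose antecedent contains at least one of the user-defined itemsets. It is not the algorithm proposed in the paper. The brute-force miner, the function names, the thresholds, and the toy records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_cars(records, min_support, min_confidence, max_len=2):
    """Brute-force CAR mining (illustrative only, not the paper's algorithm).

    records: list of (set_of_items, class_label) pairs.
    Returns a list of (antecedent, label, support, confidence) tuples.
    """
    n = len(records)
    antecedent_counts = Counter()  # how often each antecedent occurs
    rule_counts = Counter()        # how often antecedent and label co-occur
    for items, label in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (antecedent, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(antecedent), label, support, confidence))
    return rules


def satisfies_constraint(antecedent, constraint_itemsets):
    """True if the antecedent contains at least one constraint itemset."""
    return any(c <= antecedent for c in constraint_itemsets)


def post_filter(rules, constraint_itemsets):
    """Naive post-processing: discard rules that violate the constraint."""
    return [r for r in rules if satisfies_constraint(r[0], constraint_itemsets)]


# Hypothetical toy records: (items, class label). Not real data.
records = [
    ({"sex=M", "risk=injecting"}, "positive"),
    ({"sex=M", "risk=injecting"}, "positive"),
    ({"sex=F", "risk=none"}, "negative"),
    ({"sex=F", "risk=injecting"}, "positive"),
    ({"sex=M", "risk=none"}, "negative"),
]

# The user only wants rules whose antecedent mentions the item "sex=M".
constraints = [frozenset({"sex=M"})]

all_rules = mine_cars(records, min_support=0.4, min_confidence=0.8)
kept_rules = post_filter(all_rules, constraints)

for antecedent, label, support, confidence in kept_rules:
    print(sorted(antecedent), "->", label, support, confidence)
```

In this sketch every rule is generated before any is discarded, so the work spent on rules that fail the constraint is wasted. That is the inefficiency that motivates integrating the constraints into the mining process itself.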
