Abstract
This paper presents the Filtered-Association Rules Network (Filtered-ARN), a method to structure, prune, and analyze a set of association rules in order to construct candidate hypotheses. The Filtered-ARN algorithm selects association rules using two asymmetric objective measures, Added Value and Gain, and then builds a network that supports further exploration of the information. The Filtered-ARN was validated using three datasets available online: Lenses, Hayes-Roth, and Soybean Large. We also carried out a proof-of-concept experiment on a real dataset on organic fertilization (Green Manure) to test the proposed method. The results were validated by comparing the Filtered-ARN with the conventional ARN and with a decision tree. The approach showed promising results: it was able to explain a set of objective items and helped build more consolidated hypotheses, since the objective measures guarantee statistical dependence.
Highlights
Data mining is often described as the process of discovering “interesting” patterns in large databases [1]
The results demonstrated that the Filtered-Association Rules Network (Filtered-ARN) could describe the elements that influence the target item more concisely than the ARN, allowing the user to observe cases where an object statistically interferes with a target item
When we calculate the Added Value (AV) of the rule “[prescription] = myope ⇒ [lenses] = hard”, we find AV = 0, which indicates total independence between the constituent elements of this rule; the rule is therefore a mistaken hypothesis about the behavior of patients who need rigid lenses
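To make the AV = 0 reading concrete, the following is a minimal sketch of Added Value under its commonly used definition, AV(A ⇒ B) = P(B | A) − P(B). The text above does not spell out the formula, so this definition is an assumption here, as are the helper names `support` and `added_value`. The four toy rows are invented for illustration and are not the Lenses dataset.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def added_value(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Added Value of antecedent => consequent, assumed to be P(B|A) - P(B)."""
    a, b = frozenset(antecedent), frozenset(consequent)
    p_a = support(transactions, a)
    if p_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    confidence = support(transactions, a | b) / p_a  # P(B | A)
    return confidence - support(transactions, b)     # P(B | A) - P(B)


# Toy rows invented for illustration (NOT the Lenses dataset).
# "hard" appears in half of all rows and in half of the "myope" rows,
# so knowing the prescription says nothing about the lens type.
toy = [
    frozenset({"prescription=myope", "lenses=hard"}),
    frozenset({"prescription=myope"}),
    frozenset({"lenses=hard"}),
    frozenset(),
]

av = added_value(toy, {"prescription=myope"}, {"lenses=hard"})
```

On these toy rows the confidence of the rule equals the support of its consequent, so `av` is zero: the antecedent does not change the probability of the consequent. The measure is asymmetric in general, because swapping antecedent and consequent changes both the conditional probability and the baseline that is subtracted.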
Summary
Data mining is often described as the process of discovering “interesting” patterns in large databases [1]. This work presents an association rule mining method that combines objective measures with a network structure to improve hypothesis formation. Filtered-ARN applies asymmetric objective measures to the Association Rules Network (ARN) proposed by Pandey [8] in order to structure the rules extracted from a dataset and assist in their analysis. To validate the Filtered-ARN, we performed three case studies and compared the results with the conventional ARN and with a decision tree algorithm, since both can be used to visualize degrees of dependence between elements of a dataset. The results demonstrated that the Filtered-ARN could describe the elements that influence the target item more concisely than the ARN, allowing the user to observe cases where an object statistically interferes with a target item.
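The general idea of filtering rules with an objective measure before arranging them around a target item can be sketched as follows. This is a rough illustration under stated assumptions, not the published Filtered-ARN algorithm: it assumes rules have a single-item consequent, that a rule is kept only when its Added Value exceeds a threshold (zero by default), and that the network is grown backward from the target. The `Rule` type, the `build_filtered_network` function, and the Added Value numbers are all hypothetical, and the Gain measure and any cycle pruning are omitted.

```python
from collections import deque
from typing import Deque, FrozenSet, List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    """Hypothetical rule record: antecedent items => one consequent item."""
    antecedent: FrozenSet[str]
    consequent: str
    added_value: float  # assumed to be precomputed for each rule


def build_filtered_network(
    rules: List[Rule], target: str, min_av: float = 0.0
) -> List[Tuple[FrozenSet[str], str]]:
    """Keep rules with Added Value > min_av, then walk backward from target.

    Returns (antecedent, consequent) edges reachable from the target item.
    Simplification: cycles among items are not pruned here.
    """
    kept = [r for r in rules if r.added_value > min_av]
    edges: List[Tuple[FrozenSet[str], str]] = []
    visited: Set[str] = {target}
    queue: Deque[str] = deque([target])
    while queue:
        node = queue.popleft()
        for rule in kept:
            if rule.consequent != node:
                continue
            edges.append((rule.antecedent, rule.consequent))
            # Each antecedent item becomes a node to explain at the next level.
            for item in rule.antecedent:
                if item not in visited:
                    visited.add(item)
                    queue.append(item)
    return edges


# Hypothetical rules with made-up Added Value scores.
rules = [
    Rule(frozenset({"a"}), "target", 0.30),
    Rule(frozenset({"b"}), "target", 0.00),  # independent of target: dropped
    Rule(frozenset({"c"}), "a", 0.10),
    Rule(frozenset({"d"}), "b", 0.50),       # kept, but unreachable from target
]
network = build_filtered_network(rules, "target")
```

In this toy run the rule from `b` is discarded because its Added Value is zero, which also leaves the rule from `d` outside the network: nothing that survives the filter connects it to the target. That is the kind of conciseness the comparison with the unfiltered ARN refers to.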