Mining range associations for classification and characterization

Jianhua Shao, Achilleas Tziatzios

doi:10.1016/j.datak.2018.10.001

Abstract

In this paper, we propose a method that is able to derive rules involving range associations from numerical attributes, and to use such rules to build comprehensible classification and characterization (data summary) models. Our approach follows the classification association rule mining paradigm, where rules are generated in a way similar to association rule mining, but search is guided by rule consequents. This allows many credible rules, not just some dominant rules, to be mined from the data to build models. In so doing, we propose several sub-range analysis and rule formation heuristics to deal with numerical attributes. Our experiments show that our method is able to derive range-based rules that offer both accurate classification and comprehensible characterization for numerical data.

Highlights

In many practical applications, it is desirable to be able to extract the following type of rule from numerical data: age ∈ [25, 30] ∧ loan ∈ [2000, 3000] ⇒ repay = yes. That is, we derive rules that contain ranges in their antecedents and a categorical value as a consequent.
A number of datasets selected from the UCI repository [9] are used in the experiments. These datasets are among the most popular in the research community for studying classification, and they vary in the number of tuples and attributes, the nature of their numerical attributes, and the number of different class labels.
Our work adopts the classification association rule mining (CARM) methodology [6]. This effectively allows multiple models to be discovered from data and used as a type of ensemble model for classification and characterization.
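The rule form in the first highlight can be made concrete with a small sketch. The code below is not the authors' mining method; it only evaluates one given range-based rule by computing its support and confidence, the standard association rule measures. The record layout, the toy data, and the names `RangeRule`, `covers`, and `support_confidence` are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeRule:
    """Hypothetical container for a rule: ranges in the antecedent, a class label as consequent."""
    ranges: dict   # attribute name -> (low, high), both ends inclusive
    target: str    # name of the class attribute
    label: str     # class value predicted by the rule

    def covers(self, record):
        # A record is covered when every ranged attribute falls inside its interval.
        return all(lo <= record[attr] <= hi for attr, (lo, hi) in self.ranges.items())


def support_confidence(rule, records):
    """Return (support, confidence) of the rule over a list of dict records."""
    covered = [r for r in records if rule.covers(r)]
    hits = [r for r in covered if r[rule.target] == rule.label]
    support = len(hits) / len(records) if records else 0.0
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence


# Toy data, invented for this example only.
records = [
    {"age": 26, "loan": 2500, "repay": "yes"},
    {"age": 29, "loan": 2900, "repay": "yes"},
    {"age": 27, "loan": 2100, "repay": "no"},
    {"age": 45, "loan": 2500, "repay": "yes"},  # age outside [25, 30]
    {"age": 28, "loan": 5000, "repay": "no"},   # loan outside [2000, 3000]
]

# age ∈ [25, 30] ∧ loan ∈ [2000, 3000] ⇒ repay = yes
rule = RangeRule({"age": (25, 30), "loan": (2000, 3000)}, "repay", "yes")
support, confidence = support_confidence(rule, records)
```

On this toy data the rule covers three records, two of which repay, so the support is 0.4 and the confidence is 2/3.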

Summary

Introduction

In the process industry, performance data is often analyzed to help determine how engineering processes may be optimized. Such data typically contains a large number of numerical attributes, and it is useful to be able to extract range-based rules that describe the relationships among variables, so that causality can be understood naturally and processes tuned.

This is repeated on the remaining data until all the data is covered this way. This strategy works well with categorical data, but it is not effective when dealing with numerical attributes, because there is a potentially very large number of ways to form ranges and to cover the data. These methods resort to discretization or point-based splits. Such mechanisms may not capture some relevant ranges and do not help in understanding the discovered rules [2].
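The concern that discretization may not capture some relevant ranges can be illustrated with a toy example. The sketch below uses equal-width binning, chosen here only as one common discretization scheme; the data, the bin count, and the helper names are assumptions and are not taken from the paper. It compares the confidence of the class "yes" inside each fixed bin with its confidence inside the range [25, 30].

```python
# Toy data (assumed): (age, class label). Every age in [25, 30] is labelled "yes".
rows = [(18, "no"), (22, "no"), (25, "yes"), (27, "yes"), (29, "yes"),
        (30, "yes"), (33, "no"), (38, "no"), (42, "no")]


def yes_fraction(labels):
    """Fraction of labels equal to "yes" (0.0 for an empty group)."""
    return sum(label == "yes" for label in labels) / len(labels) if labels else 0.0


def equal_width_bin_confidences(rows, k):
    """Split the value span into k equal-width bins and return the "yes" fraction per bin.

    Assumes the data contains at least two distinct values, so the bin width is non-zero.
    """
    values = [v for v, _ in rows]
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    groups = [[] for _ in range(k)]
    for v, label in rows:
        # The maximum value would land in bin k, so clamp it into the last bin.
        idx = min(int((v - lo) / width), k - 1)
        groups[idx].append(label)
    return [yes_fraction(g) for g in groups]


# Three bins over [18, 42]: [18, 26), [26, 34), [34, 42].
bin_conf = equal_width_bin_confidences(rows, 3)

# Confidence of "yes" inside the range [25, 30], ends inclusive.
range_conf = yes_fraction([label for v, label in rows if 25 <= v <= 30])
```

With this toy data the bin boundaries cut through the range: the best bin reaches a confidence of 0.75, while the range [25, 30] reaches 1.0.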

Journal: Data & Knowledge Engineering
Publication Date: Oct 24, 2018
Citations: 4
License type: cc-by

Similar Papers

An Adaptive Method of Numerical Attribute Merging for Quantitative Association Rule Mining
Jiuyong Li ... Rodney Topor
01 Jan 1998

Multi-objective PSO algorithm for mining numerical association rules without a priori discretization
Vahid Beiranvand ... Azuraliza Abu Bakar
Expert Systems with Applications | VOL. 41
14 Jan 2014

Mining optimized support rules for numeric attributes
Rajeev Rastogi ... Kyuseok Shim
Information Systems | VOL. 26
07 Aug 2001

An Efficient Approach for Frequent Pattern Mining Method Using Fuzzy Set Theory
Manmay Badheka ... Sagar Gajera
01 Jan 2015
