Abstract

Decision trees are particularly promising for symbolic representation and reasoning because of their comprehensible nature, which resembles the hierarchical process of human decision making. However, the drawbacks caused by their single-tree structure cannot be ignored. A rigid decision path may allow the majority class to overwhelm the other classes when dealing with imbalanced data sets, and pruning removes not only superfluous nodes but also entire subtrees. The proposed learning algorithm, flexible hybrid decision forest (FHDF), mines the information implicit in each instance to form logical rules on the basis of a chain rule of local mutual information, and then builds different decision tree structures and, subsequently, decision forests. The most credible decision path in the forest is selected to make a prediction. Furthermore, functional dependencies (FDs), which are extracted from the whole data set by association rule analysis, perform embedded attribute selection to remove nodes rather than subtrees, thus supporting different levels of knowledge representation and improving model comprehension in the framework of semi-supervised learning. Naive Bayes replaces the leaf nodes at the bottom of the tree hierarchy, where the conditional independence assumption is more likely to hold. This technique reduces the potential for overfitting and overtraining and improves prediction quality and generalization. Experimental results on UCI data sets demonstrate the efficacy of the proposed approach.
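The attribute ranking described above builds on mutual information between an attribute and the class variable. The sketch below computes the standard empirical mutual information for discrete data; it is illustrative only, since the paper's chain rule operates on *local* mutual information per instance, whose exact form is not given here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, between two
    discrete sequences of equal length."""
    n = len(xs)
    px = Counter(xs)                # marginal counts of X
    py = Counter(ys)                # marginal counts of Y
    pxy = Counter(zip(xs, ys))      # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        p_indep = (px[x] / n) * (py[y] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

# A perfectly informative attribute vs. an uninformative one:
labels = [0, 0, 1, 1]
print(mutual_information([0, 0, 1, 1], labels))  # 1.0 bit
print(mutual_information([0, 1, 0, 1], labels))  # 0.0 bits
```

An attribute with high mutual information against the class is a stronger candidate for an early split in the tree-building process.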

Highlights

  • The rapid development of information and web technology has made a significant amount of data readily available for knowledge discovery

  • Decision trees are promising in this regard, due to their comprehensible nature that resembles the hierarchical process of human decision making

  • In [8,9] we demonstrated functional dependency rules of probability that build a linkage between functional dependencies (FDs) and probability theory



Introduction

The rapid development of information and web technology has made a significant amount of data readily available for knowledge discovery. Naive Bayes (NB) [6,7] is an important classifier for data mining and is applied in many real-world classification problems because of its high classification performance. In FHDF, NB replaces the leaf nodes at the bottom of the tree hierarchy, where the conditional independence assumption is more likely to hold. This technique ensures that all attributes are utilized for prediction, improving prediction quality and generalization.
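To make the hybrid leaf concrete, the sketch below trains a small categorical Naive Bayes model on the instances routed to one leaf, as a tree with NB leaves would. It is a minimal sketch under assumed choices: the class name `LeafNaiveBayes`, the example attributes, and the Laplace smoothing are illustrative, not the authors' exact setup.

```python
import math
from collections import Counter

class LeafNaiveBayes:
    """Naive Bayes over the discrete-attribute instances that reach one leaf."""

    def __init__(self, rows, labels):
        self.classes = Counter(labels)          # class frequencies at this leaf
        self.n = len(labels)
        self.n_attrs = len(rows[0])
        # counts[c][j][v] = how often attribute j takes value v within class c
        self.counts = {c: [Counter() for _ in range(self.n_attrs)]
                       for c in self.classes}
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1

    def predict(self, row):
        best, best_lp = None, -math.inf
        for c, nc in self.classes.items():
            lp = math.log(nc / self.n)          # log prior P(c)
            for j, v in enumerate(row):
                # Laplace smoothing over the observed values of attribute j
                k = len(set().union(*(self.counts[cc][j] for cc in self.classes)))
                lp += math.log((self.counts[c][j][v] + 1) / (nc + k))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
nb = LeafNaiveBayes(rows, labels)
print(nb.predict(("rain", "mild")))  # "yes"
```

Because the leaf model multiplies evidence from every attribute, no attribute is discarded at prediction time, which is the stated motivation for replacing plain majority-vote leaves with NB.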

Information Theory
Functional Dependency Rules of Probability
Naive Bayes
Bias and Variance
Statistical Results on UCI Data Sets
Contraceptive Method Choice
Conclusions
