Using Rough Set Theory to Find Minimal Log with Rule Generation

Tahani Nawaf Alawneh, Mehmet Ali Tut

doi:10.3390/sym13101906

Abstract

Data pre-processing is a major difficulty in the knowledge discovery process, especially feature selection on large amounts of data. Various approaches have been suggested in the literature to overcome this difficulty. Unlike most approaches, Rough Set Theory (RST) can discover data dependencies and reduce the attributes without the need for further information. In RST, the discernibility matrix is the mathematical foundation for computing such reducts. Although RST has proved efficient in feature selection, it is computationally expensive on high-dimensional data. The algorithmic complexity comes from the search for the minimal subset of attributes, which requires examining an exponential number of possible subsets. To overcome this limitation, many RST enhancements have been proposed. In contrast to recent methods, this paper implements RST concepts in an iterated manner using the R language. First, the dataset was partitioned into a small number of subsets, and each subset was processed independently to generate its own minimal attribute set. Within the iterations, only minimal elements in the discernibility matrix were considered. Finally, the iterated outputs were compared, and the attributes common to all reducts formed the minimal one (the Core attributes). The approach was compared with another recently proposed algorithm on three benchmark datasets. The proposed approach computed the same minimal attribute sets in less execution time.
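
The iterated procedure described in the abstract can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R implementation: the function names, the toy decision table, and the equal-size partitioning are all assumptions, and the per-partition reduct is found with a simple greedy hitting-set heuristic that stands in for whatever reduct computation the paper uses.

```python
from itertools import combinations


def discernibility_entries(rows, decisions):
    """Collect the attribute-index sets that distinguish each pair of
    objects with different decision values (non-empty entries only)."""
    entries = set()
    for (r1, d1), (r2, d2) in combinations(zip(rows, decisions), 2):
        if d1 != d2:
            diff = frozenset(i for i, (a, b) in enumerate(zip(r1, r2)) if a != b)
            if diff:
                entries.add(diff)
    return entries


def minimal_elements(entries):
    """Keep only entries that have no proper subset among the entries."""
    return {e for e in entries if not any(other < e for other in entries)}


def greedy_reduct(entries):
    """Assumed heuristic: repeatedly pick the attribute that appears in
    the most remaining entries until every entry is covered."""
    remaining = set(entries)
    reduct = set()
    while remaining:
        counts = {}
        for entry in remaining:
            for attr in entry:
                counts[attr] = counts.get(attr, 0) + 1
        best = max(sorted(counts), key=counts.get)  # ties go to lowest index
        reduct.add(best)
        remaining = {e for e in remaining if best not in e}
    return reduct


def iterated_common_attributes(rows, decisions, n_parts):
    """Split the table into n_parts consecutive chunks, compute one reduct
    per chunk, and return the attributes common to all of them."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division
    reducts = []
    for start in range(0, len(rows), size):
        chunk_entries = discernibility_entries(
            rows[start:start + size], decisions[start:start + size]
        )
        reducts.append(greedy_reduct(minimal_elements(chunk_entries)))
    return set.intersection(*reducts) if reducts else set()


# Hypothetical toy table: three condition attributes; the decision
# depends only on attribute 0.
rows = [(0, 0, 1), (1, 0, 1), (0, 1, 0), (1, 1, 0)]
decisions = [0, 1, 0, 1]
print(iterated_common_attributes(rows, decisions, n_parts=2))  # prints {0}
```

Restricting each partition to the minimal elements of its discernibility matrix shrinks the set of entries that the reduct search has to cover, which is where the abstract locates the time saving. In this sketch, a partition whose objects all share one decision value yields an empty reduct, so the intersection also becomes empty; a real implementation would need to decide how to treat such partitions.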

Highlights

Information system security has been achieved using several security solutions, such as intrusion detection systems (IDS), intrusion prevention systems (IPS), antivirus software, and firewalls
The RoughSets package in R implements rough set theory (RST) and fuzzy rough set theory (FRST) to model and analyze data
We will first explain the motivation for proposing IRS by discussing the computational complexity of the traditional rough set theory when working with high dimensional datasets

Summary

Introduction

Information system security has been achieved using several security solutions, such as IDS, IPS, antivirus software, and firewalls. Based on the work in [16,17,18,19], our study proposes a novel algorithm that uses the rough set package in the R language to find the optimal minimal subset of attributes, rather than merely a smaller one, without sacrificing performance. The motivation for this methodology is to overcome the prohibitive complexity of RST concepts when searching for an optimal attribute subset, especially with big data. Offering such solutions will enhance the efficiency of real-time analysis in security algorithms, for example real-time IPS. The contributions of this work are: developing a new algorithm that uses basic RST concepts to create minimal reducts; offering a feasible feature selection methodology that scales to huge datasets without sacrificing performance; creating a minimal decision rule database that retains information content; using three benchmark UCI datasets to evaluate the performance of the methodology; and comparing the results of the proposed model to recent works.

Related Works

Rough Set

R Language

Research Methodology

Problem Statement and Motivation

Datasets

Generating Minimal Decision Rules

Execution Time Comparison with Existing Methods

Findings

Conclusions and Future Works

Journal: Symmetry	Publication Date: Oct 10, 2021
Citations: 2	License type: CC BY 4.0

Similar Papers

A novel condensing tree structure for rough set feature selection
Ming Yang ... Ping Yang
Neurocomputing | VOL. 71
09 Oct 2007

Fast algorithms of attribute reduction for covering decision systems with minimal elements in discernibility matrix
Ze Dong ... Ming Sun
International Journal of Machine Learning and Cybernetics | VOL. 7
16 Oct 2015

A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data
Suresh Dara ... Haider Banka
Annals of data science | VOL. 4
11 Mar 2017

A scalable and effective rough set theory-based approach for big data pre-processing
Zaineb Chelly Dagdia ... Mustapha Lebbah
Knowledge and information systems | VOL. 62
02 May 2020
