The Optimal Inference Rules Selection for Unstructured Data Multi-Classification

Mariem Bounabi, Karim El Moutaouakil, Khalid Satori

doi:10.19139/soic-2310-5070-1131


Open Access

https://doi.org/10.19139/soic-2310-5070-1131


Abstract

The Fuzzy Inference System (FIS) is frequently used in a variety of text mining applications. In text processing domains, where the processed databases are enormous, entering FIS rules manually remains a real issue. Therefore, an automated and optimal selection of inference rules (IR) strengthens the FIS process. In this work, we propose applying FP-Growth, an association model algorithm, as an automatic way to identify IR for fuzzy text vectorization. Once the fuzzy vectors are generated, we apply variable selection algorithms, e.g., Info Gain and Relief, to reduce the dimensionality of the resulting descriptor. To test the performance of the new descriptor, we propose multi-class text classification systems using several machine learning algorithms. On benchmark databases, the new technique for producing fuzzy descriptors achieves a significant gain in time, rule precision, and weighting quality. Moreover, across the compared classification systems, accuracy improves by 10% compared with other approaches.

Highlights

One use of the Fuzzy Inference System (FIS) in the text mining field is the feature weighting technique FTF-IDF [4], where fuzzy reasoning is used to extract the term frequency-inverse document frequency (TF-IDF) scores [5].
To compare different recognition systems based on the new fuzzy FTF-IDF representation and the classifiers, several experiments were conducted for all algorithms with different configurations on a Dell machine with an Intel(R) Core i5 CPU at 2.50 GHz and 4 GB of RAM.
The accuracy of 98% reported in Table 4 and the results presented in [6] indicate that automating the rules used to produce the FTF-IDF weight has an excellent impact on text representation and supervised decision-making.
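
For orientation, the crisp TF-IDF score that the FTF-IDF weighting builds on can be sketched as follows. This is a minimal illustration only: it assumes the common variant tf = count / document length and idf = log(N / df), which may differ from the exact formula used in the paper, and it does not include the fuzzy reasoning step. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not necessarily the paper's):
    tf = term count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy tokenized documents (hypothetical, not taken from the BBC corpora).
docs = [
    ["goal", "match", "goal"],
    ["match", "election"],
    ["election", "vote", "match"],
]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "match" here, receives a weight of zero, while a term concentrated in one document, such as "goal", receives a positive weight.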

Summary

Introduction

The automation of expert systems is a challenge in several areas [1] [2]. The main aim is to show, monitor, and provide relevant information using fast and intelligent technologies, in the context of artificial intelligence applied to textual data. Unlike Apriori, which produces candidate itemsets and tests them to keep only the frequent ones, FP-Growth constructs frequent itemsets without generating candidates [9]. In this contribution, FP-Growth produces more explicit rules, which minimizes the need for post-processing, a complicated step. The experiments show that the new technique permits an optimal, rapid, and interoperable selection of inference rules to produce relevant fuzzy descriptors. As the second part of our contribution, we generate fuzzy descriptors using this approach for several textual corpora for automatic multi-class text classification. This stage tests the performance of the resulting descriptor, where we compare several unstructured data categorization systems using:. To show the performance of the compared multi-class text classification systems, we present the experiments and results in the fourth section, before the conclusion.
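
To make the contrast with Apriori concrete, here is a minimal pure-Python sketch of FP-Growth: it compresses the transactions into a prefix tree and mines frequent itemsets recursively from conditional pattern bases, without ever enumerating candidate itemsets. Everything specific in it is an assumption made for illustration and is not taken from the paper: the fuzzy labels (`TF=high`, `IDF=low`, `W=medium`), the toy transactions, the support and confidence thresholds, and the helper `rules_for_output`, which shows one possible way to turn itemsets into "IF antecedent THEN weight" rules.

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted_transactions, min_count):
    """Build an FP-tree from (items, weight) pairs; return header table and supports."""
    support = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            support[item] += weight
    frequent = {i: c for i, c in support.items() if c >= min_count}
    root = Node(None, None)
    header = defaultdict(list)  # item -> tree nodes carrying that item
    for items, weight in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, frequent

def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    header, frequent = build_tree(weighted_transactions, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = frequent[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(fp_growth(conditional, min_count, itemset))
    return result

def rules_for_output(itemsets, prefix="W=", min_confidence=0.8):
    """Hypothetical helper: keep rules whose single consequent is an output label."""
    rules = []
    for itemset, count in itemsets.items():
        consequent = [i for i in itemset if i.startswith(prefix)]
        antecedent = itemset - set(consequent)
        if len(consequent) == 1 and antecedent:
            confidence = count / itemsets[antecedent]
            if confidence >= min_confidence:
                rules.append((sorted(antecedent), consequent[0], confidence))
    return rules

# Toy transactions of fuzzy labels (assumed for illustration only).
transactions = [
    ["TF=high", "IDF=high", "W=high"],
    ["TF=high", "IDF=high", "W=high"],
    ["TF=high", "IDF=low", "W=medium"],
    ["TF=low", "IDF=high", "W=medium"],
    ["TF=low", "IDF=low", "W=low"],
    ["TF=high", "IDF=high", "W=high"],
]
itemsets = fp_growth([(t, 1) for t in transactions], min_count=2)
rules = rules_for_output(itemsets)
```

With these toy inputs, the only rule that passes the confidence threshold reads "IF TF is high AND IDF is high THEN W is high". In a real setting the thresholds would need tuning, and the way rules are filtered here should not be read as the paper's procedure.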

Related Work

Main observations and motivations

The adopted Fuzzy TF-IDF approach

Machine Learning Models

Association Models

FP-Growth

Select attributes Methods

Classification Tools

Experiments & Results

Datasets We use as corpus the BBC News and BBC

Pre-processing

The Classification parameters

Performance Measures

BBC Sport Dataset

BBC News Dataset Using other databases, the BBC

Conclusion


Journal: Statistics, Optimization & Information Computing
Publication Date: Feb 8, 2022
License type: cc-by
