Abstract

This article presents methods for both clustering and outlier detection in complex data, such as rule-based knowledge bases. Two things distinguish this work from others: first, the application of clustering algorithms to rules in domain knowledge bases, and second, the use of outlier detection algorithms to detect unusual rules in knowledge bases. The aim of the paper is to analyze the use of four algorithms for outlier detection in rule-based knowledge bases: Local Outlier Factor (LOF), Connectivity-based Outlier Factor (COF), K-, and . Outlier mining is an important subject today. Outliers among If-Then rules are unusual rules that are rare in comparison with the others and should be explored by a domain expert as soon as possible. In the research, the authors use the outlier detection methods to find a given number of outliers among the rules (, , ), while in small groups the number of outliers covers no more than of the rule cluster. Subsequently, the authors analyze which of seven quality indices, computed both for all rules and after removing the selected outliers, show improved quality of the rule clusters. In the experimental stage, the authors use six different knowledge bases. The best results (cluster quality improved most often) are achieved for two outlier detection algorithms and .
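To make the Local Outlier Factor step mentioned above more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the rules have already been encoded as numeric vectors; the encoding, the sample points, the function names `lof_scores` and `top_outliers`, and the parameters `k` and `fraction` are all illustrative assumptions. It also uses a simplified LOF that takes exactly `k` neighbours and ignores distance ties.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of point i as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Simplified Local Outlier Factor: values near 1 are typical, larger is more unusual."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse of the mean reachability distance.
    # Assumes no exact duplicate points, so the sum is never zero.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def top_outliers(points, k, fraction):
    """Return indices of the highest-LOF points, a given fraction of the data (at least one)."""
    scores = lof_scores(points, k)
    n_out = max(1, round(fraction * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_out]


# Hypothetical rule vectors: two tight groups plus one isolated rule (index 8).
rules = [(0, 0), (0, 1), (1, 0), (1, 1),
         (5, 5), (5, 6), (6, 5), (6, 6),
         (10, 0)]
outliers = top_outliers(rules, k=3, fraction=0.1)  # → [8]
```

Quality indices could then be computed once for all rules and once with the rules in `outliers` removed, which mirrors the before-and-after comparison the abstract describes.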

Highlights

  • The data analyzed in this work are rule-based knowledge bases (KBs) on which Decision Support Systems (DSSs) are based, supporting organizational and business decision-making

  • When there are a lot of rules in the KB, it is necessary to use machine learning techniques that will allow for effective management of large amounts of data

  • The purpose of using clustering quality indicators is to seek an answer to the question: to what extent does the obtained group structure resulting from a given clustering method represent a good summary of the information contained in the data? The tables and charts cover selected aspects of the authors’ research

Summary

Introduction

The data analyzed in this work are rule-based knowledge bases (KBs) on which Decision Support Systems (DSSs) are based, supporting organizational and business decision-making. DSSs need artificial intelligence and machine learning for an effective decision support process. These techniques can accelerate inference and increase the quality of decisions through the use of effective learning methods. Many authors attempt to use machine learning methods, e.g., clustering, in the decision support process. The article [1] presents a DSS for diabetes prediction using machine learning and deep learning techniques. The authors compared conventional machine learning (Support Vector Machine (SVM) and Random Forest (RF)) with deep learning approaches to predict and detect diabetes patients. In [2], the authors present a clinical DSS for breast cancer screening using clustering and classification in which the Partition Around Medoid (PAM)
