Abstract

Multi-dimensional data, such as data cubes, are constructed by aggregating data in data warehouses and must be analyzed with high flexibility. Clustering, an unsupervised pattern recognition analysis, is particularly challenging to perform on a data cube. In this paper, two new density-based clustering methods are designed to recognize unsupervised patterns in the data cube. In the first, DBSCAN clustering is hybridized with a genetic algorithm (GA) and called the Improved DBSCAN (IDBSCAN). The motivation for IDBSCAN is to optimize DBSCAN’s parameters with a meta-heuristic algorithm such as the GA. The second, called the Soft Improved DBSCAN (SIDBSCAN), is based on fuzzy tuning of the GA's parameters in IDBSCAN. The fuzzy tuning is performed separately with two groups of fuzzy rules, Mamdani (SIDBSCAN-Mamdani) and Sugeno (SIDBSCAN-Sugeno). These ideas aim to provide efficient and flexible unsupervised analysis of a data cube by using a meta-heuristic algorithm to optimize DBSCAN’s parameters, and to increase efficiency further by tuning the algorithm's parameters dynamically. To evaluate efficiency, SIDBSCAN-Mamdani and SIDBSCAN-Sugeno are compared with IDBSCAN and DBSCAN. The experimental results, based on 20 runs, indicate that the proposed methods achieved their targets.
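The idea of tuning DBSCAN's parameters with a GA can be illustrated with a minimal sketch. It is not the paper's implementation: the fitness function (a simplified silhouette that counts noise points as zero), the parameter ranges, the GA operators (tournament selection, blend crossover, Gaussian mutation, elitism), and all function names are assumptions chosen for illustration, and the fuzzy tuning of the GA's own parameters used in SIDBSCAN is not shown.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:      # only core points expand the cluster
                queue.extend(nb)
    return labels

def fitness(points, labels):
    """Assumed fitness: silhouette summed over clustered points, divided by all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for lab, members in clusters.items():
        for p in members:
            a = sum(math.dist(p, q) for q in members) / max(len(members) - 1, 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for k, other in clusters.items() if k != lab)
            total += (b - a) / (max(a, b) or 1.0)
    return total / len(points)

def ga_tune_dbscan(points, pop_size=12, generations=10, mutation_rate=0.3,
                   eps_range=(0.05, 3.0), min_pts_range=(2, 8), seed=0):
    """Search (eps, min_pts) with a small GA; ranges and rates are illustrative."""
    rng = random.Random(seed)

    def clip(eps, m):
        eps = min(max(eps, eps_range[0]), eps_range[1])
        m = min(max(int(round(m)), min_pts_range[0]), min_pts_range[1])
        return (eps, m)

    def score(ind):
        return fitness(points, dbscan(points, ind[0], ind[1]))

    population = [(rng.uniform(*eps_range), rng.randint(*min_pts_range))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((score(ind), ind) for ind in population), reverse=True)
        next_pop = [scored[0][1]]                      # elitism: keep the best
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(scored, 3))[1]         # tournament selection
            p2 = max(rng.sample(scored, 3))[1]
            w = rng.random()                           # blend crossover on eps
            child = (w * p1[0] + (1 - w) * p2[0], rng.choice((p1[1], p2[1])))
            if rng.random() < mutation_rate:
                child = (child[0] + rng.gauss(0, 0.2),
                         child[1] + rng.choice((-1, 0, 1)))
            next_pop.append(clip(*child))
        population = next_pop
    best = max(population, key=score)
    return best, score(best)

# Toy data: two well-separated 2-D blobs (synthetic, not a data cube).
data_rng = random.Random(1)
points = [(data_rng.gauss(cx, 0.4), data_rng.gauss(cy, 0.4))
          for cx, cy in ((0.0, 0.0), (5.0, 5.0)) for _ in range(30)]
best, best_score = ga_tune_dbscan(points)
labels = dbscan(points, best[0], best[1])
print("best (eps, min_pts):", best, "fitness:", round(best_score, 3))
```

In this sketch each individual is an `(eps, min_pts)` pair and the clustering itself is the fitness evaluation, so the cost of the search grows with population size, number of generations, and the cost of one DBSCAN run. A fuzzy controller in the spirit of SIDBSCAN would adjust values such as `mutation_rate` between generations instead of keeping them fixed.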

Highlights

  • With the growth and expansion of data on different storage media, there is a natural need for effective methods for accessing data and extracting useful knowledge

  • Comparing the results shows that the Improved DBSCAN (IDBSCAN) improves the quality of data cube clustering by between 4% for “User Identification from Walking Activity” and 28% for “Dow Jones Index”

  • This paper focuses on density-based clustering of the data cube

Summary

Introduction

With the growth and expansion of data on different storage media, there is a natural need for effective methods for accessing data and extracting useful knowledge. The main aims of data mining are description and prediction. The second category, prediction, is based on data deduction, looking for unknown variables and values of the data [26]. Each of these categories includes different patterns, such as frequent pattern mining, classification and regression, clustering, and outlier pattern mining, each of which has its own applications and features. The aim of this study is to investigate clustering analysis, which is one of the descriptive patterns, with regard to the type of data used for data mining [29].
