Analyses of Indexing Techniques on Uncertain Data With High Dimensionality

Ma'Aruf Mohammed Lawal, Razali Yaakob, Nor Fazlida Mohd Sani, Hamidah Ibrahim

doi:10.1109/access.2020.2988487


Open Access

https://doi.org/10.1109/access.2020.2988487


Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 7	License type: CC BY 4.0

Affiliation: Universiti Putra Malaysia, Ahmadu Bello University

Abstract

Deploying a solution that handles critical decision-based problems efficiently requires the processing of high-dimensional data. Over the years, modern technological advancement has led to an unprecedented volume of uncertain data being captured, which has made it necessary to organize such data for better access performance. To this end, the use of indexing techniques for supporting, organizing, and storing uncertain data with high dimensionality has become pertinent. However, the choice of an indexing technique to improve search performance is highly influenced by the properties of the underlying data set, the data construction methods employed by the indexing structure, and the query types it supports. This paper conducts an extensive performance analysis of three existing indexing techniques, namely R-tree, R*-tree, and X-tree, in order to identify the most efficient indexing structure for organizing, storing, and ultimately improving search performance over uncertain data with high dimensionality. The results of the analyses, measured by CPU processing time and number of nodes visited, clearly show the superiority of X-tree over R-tree and R*-tree; this superiority holds across different data set sizes, data distributions, numbers of dimensions, and selectivity ratios.

Highlights

Modern technological advancements in computing have been responsible for generating or capturing high-dimensional data sets with a huge number of objects.
The quest to process high-dimensional data has resulted in a number of innovative indexing techniques that are useful in many applications, such as Geographical Information Systems (GIS), robotics, environmental protection, metric spaces, medical imaging, and geosciences, as they are geometrically suited for both point and spatial data [2], [3], [5]–[7], [9], [11], [12], [14], [16], [18], [24], [28], [30], [41], [43], [46].
The number of nodes visited by the R-tree structure remains very high for the synthetic data set, while for the real data set it remains slightly higher than that of the R*-tree and X-tree structures. This is due to the margin-minimizing optimization heuristic employed by both the X-tree and R*-tree indexing structures.
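
To make the margin-minimizing heuristic mentioned in the last highlight more concrete, the following Python sketch picks a two-way split of a set of points by minimizing the combined margin (the sum of side lengths) of the two resulting bounding rectangles. It is a simplified illustration only, not the full R*-tree or X-tree split algorithm and not the authors' implementation; the names `bounding_box`, `margin`, `min_margin_split`, and the `min_fill` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
# An MBR is stored as (lower corner, upper corner); this layout is an assumption.
MBR = Tuple[Tuple[float, ...], Tuple[float, ...]]


def bounding_box(points: List[Point]) -> MBR:
    """Smallest axis-aligned rectangle enclosing all points."""
    dims = range(len(points[0]))
    lows = tuple(min(p[d] for p in points) for d in dims)
    highs = tuple(max(p[d] for p in points) for d in dims)
    return lows, highs


def margin(mbr: MBR) -> float:
    """Sum of the rectangle's side lengths over all dimensions."""
    lows, highs = mbr
    return sum(h - l for l, h in zip(lows, highs))


def min_margin_split(points: List[Point], min_fill: int = 1):
    """Try every axis and every cut position along that axis, and keep
    the split whose two bounding boxes have the smallest combined margin.
    Returns (cost, axis, left_group, right_group)."""
    best = None
    for axis in range(len(points[0])):
        ordered = sorted(points, key=lambda p: p[axis])
        for k in range(min_fill, len(ordered) - min_fill + 1):
            left, right = ordered[:k], ordered[k:]
            cost = margin(bounding_box(left)) + margin(bounding_box(right))
            if best is None or cost < best[0]:
                best = (cost, axis, left, right)
    return best


# Two visually separate clusters end up in separate groups.
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
cost, axis, left, right = min_margin_split(pts)
```

Splits with a small combined margin tend to produce compact, more square-like rectangles, which is one plausible reason fewer nodes need to be visited during a search.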

Summary

Introduction

Modern technological advancements in computing have been responsible for generating or capturing high-dimensional data sets with a huge number of objects. The choice of indexing structures for organizing large data sets is motivated by their ability to support both point and spatial data, where the spatial data structures require no transformation to improve data storage via better spatial clustering. This is further supported by the work of [12], whose authors demonstrated the performance advantage of index-based structures over sort-based methods when data objects are clustered together in a minimum bounding rectangle (MBR) for effective pruning.
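
As a rough illustration of how clustering objects in an MBR enables pruning, the Python sketch below runs a range query over a tiny hand-built tree and counts the nodes visited, one of the two metrics used in the paper. It is a minimal sketch under stated assumptions, not the authors' experimental code: the `Node` layout, the `range_query` function, and the `stats` counter are all hypothetical, and the tree is constructed by hand rather than by an R-tree, R*-tree, or X-tree insertion algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A rectangle is (lower corner, upper corner); this layout is an assumption.
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]
Pt = Tuple[float, ...]


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a[0], a[1], b[0], b[1]))


def contains(rect: Rect, point: Pt) -> bool:
    """True if the point lies inside the rectangle (borders included)."""
    return all(lo <= x <= hi for lo, x, hi in zip(rect[0], point, rect[1]))


@dataclass
class Node:
    mbr: Rect
    children: List["Node"] = field(default_factory=list)  # inner node
    points: List[Pt] = field(default_factory=list)        # leaf node


def range_query(node: Node, query: Rect, stats: Dict[str, int]) -> List[Pt]:
    """Collect all points inside `query`, skipping any subtree whose
    MBR does not intersect the query rectangle."""
    if not intersects(node.mbr, query):
        return []  # pruned: nothing below this node can match
    stats["visited"] += 1
    hits = [p for p in node.points if contains(query, p)]
    for child in node.children:
        hits.extend(range_query(child, query, stats))
    return hits


# Hand-built two-leaf tree (illustrative data only).
leaf_a = Node(mbr=((1, 1), (2, 2)), points=[(1, 1), (2, 2)])
leaf_b = Node(mbr=((8, 8), (9, 9)), points=[(8, 8), (9, 9)])
root = Node(mbr=((1, 1), (9, 9)), children=[leaf_a, leaf_b])

stats = {"visited": 0}
result = range_query(root, ((0, 0), (3, 3)), stats)
# The root and leaf_a are visited; leaf_b is pruned by its MBR.
```

The tighter and less overlapping the MBRs are, the more subtrees a query can skip, which is why node-visit counts are a natural way to compare index structures.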



Similar Papers

On High Dimensional Indexing of Uncertain Data
Charu C Aggarwal ... Philip S Yu
01 Apr 2008

On Indexing High Dimensional Data with Uncertainty
Charu C Aggarwal ... Philip S Yu
24 Apr 2008

Query Processing over Uncertain and Probabilistic Databases
Lei Chen ... Xiang Lian
01 Jan 2012

A Framework on Data Mining on Uncertain Data with Related Research Issues in Service Industry
Edward Hung
01 Jan 2013
