Parallel and distributed clustering framework for big spatial data mining

Malika Bendechache, A-Kamel Tari, M-Tahar Kechadi

doi:10.1080/17445760.2018.1446210


Open Access

https://doi.org/10.1080/17445760.2018.1446210


Abstract

Clustering techniques are very attractive for identifying and extracting patterns of interest from datasets. However, their application to very large spatial datasets presents numerous challenges, such as high dimensionality, heterogeneity, and the high complexity of some algorithms. Distributed clustering techniques are a promising way to address Big Data challenges (e.g., Volume, Variety, Veracity, and Velocity). In this paper, we developed and implemented a Dynamic Parallel and Distributed Clustering (DPDC) approach that can analyse Big Data within a reasonable response time and produce accurate results by using existing computing and storage infrastructure, such as cloud computing. The DPDC approach consists of two phases. The first phase is fully parallel and generates local clusters; the second phase aggregates the local results to obtain global clusters. The aggregation phase is designed so that the final clusters are compact and accurate while the overall process is efficient in time and memory allocation. DPDC was thoroughly tested and compared with the well-known clustering algorithms BIRCH and CURE. The results show that the approach not only produces high-quality results but also scales well by taking advantage of the Hadoop MapReduce paradigm or any other distributed system.
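The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the local clustering algorithm or the aggregation rule, so everything concrete here is an assumption chosen for illustration: the function names, the distance-chaining local clustering with threshold `eps`, and the centroid-distance merge with threshold `merge_dist` are hypothetical stand-ins, not the authors' DPDC method. A thread pool stands in for the distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def local_clusters(points, eps):
    """Phase 1 (one partition): chain together points lying within eps of each other.

    Assumption: a simple distance-chaining rule, used only as a placeholder
    for whatever local clustering algorithm each node would actually run.
    """
    clusters = []
    unvisited = set(range(len(points)))
    while unvisited:
        seed = unvisited.pop()
        members, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited if dist(points[i], points[j]) <= eps]
            unvisited.difference_update(near)
            members.extend(near)
            frontier.extend(near)
        clusters.append([points[i] for i in members])
    return clusters


def centroid(cluster):
    """Mean point of a cluster, used here as its compact representative."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))


def aggregate(all_local, merge_dist):
    """Phase 2: merge local clusters whose centroids lie within merge_dist.

    Assumption: centroid proximity is a hypothetical merge criterion.
    A union-find structure tracks which local clusters end up together.
    """
    parent = list(range(len(all_local)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    cents = [centroid(c) for c in all_local]
    for i in range(len(cents)):
        for j in range(i + 1, len(cents)):
            if dist(cents[i], cents[j]) <= merge_dist:
                parent[find(i)] = find(j)

    merged = {}
    for i, cluster in enumerate(all_local):
        merged.setdefault(find(i), []).extend(cluster)
    return list(merged.values())


def two_phase_clustering(partitions, eps, merge_dist):
    """Cluster each partition independently, then aggregate the local results."""
    with ThreadPoolExecutor() as pool:
        per_partition = pool.map(lambda p: local_clusters(p, eps), partitions)
        all_local = [c for clusters in per_partition for c in clusters]
    return aggregate(all_local, merge_dist)
```

In this sketch only the cluster representatives are compared during aggregation, which reflects the general idea of exchanging compact local results rather than raw points; how DPDC actually represents and merges local clusters is described in the paper itself, not in the abstract.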


Journal: International Journal of Parallel, Emergent and Distributed Systems
Publication Date: Mar 16, 2018
Citations: 26
License type: CC BY-NC-SA


Similar Papers

Cloud computing and big data: Technologies and applications
Mostapha Zbakh ... Mohamed Bakhouya
Concurrency and Computation: Practice and Experience | VOL. 30
20 May 2018

Legal Governance of Brain Data Derived from Artificial Intelligence
Mahika Ahluwalia
Voices in Bioethics | VOL. 7
02 Jun 2021

Utilizing Cloud Computing to address big geospatial data challenges
Chaowei Yang ... Yun Li
Computers, Environment and Urban Systems | VOL. 61
02 Nov 2016

Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks
Alberto Fernández ... José M Benítez
WIREs Data Mining and Knowledge Discovery | VOL. 4
01 Sep 2014
