Sampling-based consensus fuzzy clustering on Big Data

Mohamed Ali Zoghlami,Minyar Sassi Hidri,Rahma Ben Ayed

doi:10.1109/fuzz-ieee.2016.7737868

Abstract

Many companies spend vast amounts of resources to collect, transform and store the massive amounts of data that flow through their business processes. When it comes to analysis and machine learning tasks such as clustering on this data, time and compute speed determine how much data can be analyzed. Moreover, most Big Data clustering algorithms do not look at a complete, large dataset. Instead, they look at a subsample and work on approximations. However, working only on samples can miss useful data that may be a source of value. In this paper, we combine sampling with a consensus strategy: the whole dataset is split into small subsets, from which basic partitions are generated locally using parallel processing. For the sampling part, we propose a partial data clustering (PDC) step, run on different nodes, that clusters the current sub-sample obtained by partial data access (PDA) merged with the optimal prototypes generated by the previous PDC and condensed into weighted points. For the consensus part, we apply a split-and-merge fuzzy clustering to transform the consensus clustering problem into an equivalent clustering optimization problem. Extensive experiments on several datasets demonstrate the ability to handle massive data, and the consensus computation makes the proposed classifier a promising candidate for Big Data clustering.
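
The sampling part described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the PDC step behaves like a weighted fuzzy c-means in which each chunk is clustered together with the prototypes carried over from the previous chunk, and each prototype's weight is the sum of the weighted memberships it attracted. The function names (`weighted_fcm`, `pdc_stream`, `make_chunks`), the fuzzifier `m=2.0`, the farthest-point initialisation, and the toy data are all hypothetical choices made for illustration. The split-and-merge consensus step is not covered.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_init(points, c):
    """Deterministic start (an assumption): first point, then farthest points."""
    protos = [list(points[0])]
    while len(protos) < c:
        nxt = max(points, key=lambda p: min(sqdist(p, v) for v in protos))
        protos.append(list(nxt))
    return protos

def memberships(points, protos, m):
    """Fuzzy memberships u[i][k], each row summing to 1."""
    u = []
    for p in points:
        inv = [max(sqdist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in protos]
        s = sum(inv)
        u.append([x / s for x in inv])
    return u

def weighted_fcm(points, weights, c, m=2.0, iters=50, init=None):
    """Weighted fuzzy c-means; returns (prototypes, prototype_weights)."""
    dim = len(points[0])
    protos = [list(v) for v in init] if init else farthest_init(points, c)
    for _ in range(iters):
        u = memberships(points, protos, m)
        for k in range(c):
            coef = [w * (ui[k] ** m) for w, ui in zip(weights, u)]
            total = sum(coef)
            protos[k] = [sum(cf * p[j] for cf, p in zip(coef, points)) / total
                         for j in range(dim)]
    # Condense: each prototype carries the weighted mass assigned to it.
    u = memberships(points, protos, m)
    pw = [sum(w * ui[k] for w, ui in zip(weights, u)) for k in range(c)]
    return protos, pw

def pdc_stream(chunks, c, m=2.0):
    """Cluster chunks one at a time, carrying weighted prototypes forward."""
    protos, pw = [], []
    for chunk in chunks:
        pts = protos + [list(p) for p in chunk]
        wts = pw + [1.0] * len(chunk)
        protos, pw = weighted_fcm(pts, wts, c, m, init=protos or None)
    return protos, pw

def make_chunks(n_chunks=4, per_blob=25, seed=0):
    """Toy data (hypothetical): two Gaussian blobs near (0, 0) and (10, 10)."""
    rng = random.Random(seed)
    return [[(rng.gauss(mu, 1.0), rng.gauss(mu, 1.0))
             for mu in (0.0, 10.0) for _ in range(per_blob)]
            for _ in range(n_chunks)]

if __name__ == "__main__":
    prototypes, proto_weights = pdc_stream(make_chunks(), c=2)
    print(prototypes, proto_weights)
```

Because the memberships of each point sum to one, the prototype weights returned by `pdc_stream` always add up to the number of points seen so far, so the condensed points stand in for the full history when the next chunk is clustered.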
