Performance analysis of efficient data distribution in P2P environment using hybrid clustering techniques

S Raju, M Chandrasekaran

doi:10.1007/s00500-019-03796-9

Open Access

Abstract

In this paper, the K-means algorithm is applied to large distributed data using hybrid clustering techniques. K-means is a simple and scalable algorithm that can be applied to large datasets. It is one of the well-known unsupervised clustering algorithms that fail to give structure to unstructured data in a way that enables extraction of valuable information. Peer-to-peer (P2P) technologies divide data or resources among the peers to manage network bandwidth, network participants and processing power. During data distribution in P2P environments, accuracy, computational complexity and distributed clustering accuracy are important issues because they reduce overall system performance. The authors therefore consider a system that distributes data in a P2P environment using mining techniques. The data are distributed using a hybrid map-reduce method, which analyzes large volumes of data by filtering and sorting. The clustering approach analyzes and manages the neighboring relationships among the peer nodes, which helps manage the cluster distribution in a dynamic environment. The efficiency of the clusters formed is determined with the hybrid clustering algorithm, and the related system architecture is proposed. Clustering efficiency in the P2P environment is enhanced using the distributed data network. The efficiency of the formed clusters was evaluated in terms of Jaccard index, F-measure, mutual information and Rand measure. System performance was analyzed experimentally in terms of error rate, accuracy and time. The multi-objective system eases the difficulties of implementing a P2P environment that is sensitive to initial solutions.

Highlights

Peer-to-peer (P2P) has recently become one of the most common technologies for processing different types of data in a distributed environment.
Peer-to-peer-based clustering is scalable, can operate over a routerless network, and keeps functioning despite changes in any node or peer.
The K-means clustering algorithm (Chen and Ho 2006) shares data by exchanging messages between the peers, thereby reducing the problems seen in the normal clustering process.

Summary

Motivation

Data analysis is performed with data mining, which analyzes the data and clusters similar data to make distribution efficient (Nghiem et al 2014). Peer-to-peer-based clustering is scalable, can operate over a routerless network, and keeps functioning despite changes in any node or peer. Using these characteristics, similar data in the network are identified through the neighborhood relationship and clustered together. The data mining process computes over the dataset using either an exact local algorithm or an approximate local algorithm. The K-means clustering algorithm (Chen and Ho 2006) shares data by exchanging messages between the peers, thereby reducing the problems seen in the normal clustering process.
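The message exchange described above can be illustrated with a minimal sketch. It assumes each peer summarizes its local points as per-centroid sums and counts (a map step) and that those summaries are merged to update the centroids (a reduce step). The function names `local_stats`, `merge_and_update` and `distributed_kmeans` are hypothetical, and this is not the authors' hybrid algorithm, only one plausible reading of K-means over peers.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_stats(points: Sequence[Point], centroids: Sequence[Point]):
    """Map step on one peer: per-centroid coordinate sums and point counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        # Assign the point to its nearest centroid.
        k = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += p[d]
    return sums, counts


def merge_and_update(stats, centroids: Sequence[Point]) -> List[Point]:
    """Reduce step: combine the peers' messages and recompute the centroids."""
    dim = len(centroids[0])
    updated = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in stats)
        if total == 0:
            updated.append(tuple(old))  # empty cluster: keep the old centroid
            continue
        updated.append(
            tuple(sum(sums[k][d] for sums, _ in stats) / total for d in range(dim))
        )
    return updated


def distributed_kmeans(peers, centroids, iterations: int = 10) -> List[Point]:
    """Iterate map and reduce steps until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        stats = [local_stats(points, centroids) for points in peers]
        updated = merge_and_update(stats, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

Only the sums and counts cross the network in this sketch, so the raw points never leave their peer; whether the paper exchanges the same summaries is not stated in the text above.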

Methodology

Problem statement

Related works

Objectives

Proposed system

Clustering the selected data

Functions of K-means

Estimating the accuracy of cluster using harmonic search

Performance analysis

Skin segmentation dataset

Adult dataset

Jaccard index

F-measure

Rand measure
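The three measures listed above are commonly defined by counting pairs of items that two labelings place together or apart. The sketch below assumes those pair-counting definitions; the exact formulas used in the paper are not given here, and the function names are hypothetical. Mutual information is left out of this sketch.

```python
from itertools import combinations
from typing import Sequence, Tuple


def pair_counts(truth: Sequence, pred: Sequence) -> Tuple[int, int, int, int]:
    """Count item pairs as (tp, fp, fn, tn) by cluster co-membership."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            tp += 1  # together in both labelings
        elif same_pred:
            fp += 1  # together only in the predicted clustering
        elif same_truth:
            fn += 1  # together only in the reference labeling
        else:
            tn += 1  # apart in both labelings
    return tp, fp, fn, tn


def jaccard_index(truth: Sequence, pred: Sequence) -> float:
    tp, fp, fn, _ = pair_counts(truth, pred)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0


def rand_index(truth: Sequence, pred: Sequence) -> float:
    tp, fp, fn, tn = pair_counts(truth, pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def f_measure(truth: Sequence, pred: Sequence, beta: float = 1.0) -> float:
    tp, fp, fn, _ = pair_counts(truth, pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, with reference labels `[0, 0, 1, 1]` and predicted labels `[0, 0, 1, 2]`, one pair is together in both labelings, one is split by the prediction, and four are apart in both, which gives a Jaccard index of 0.5 and a Rand index of 5/6.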

Methods

Conclusion

Compliance with ethical standards

Journal: Soft Computing - A Fusion of Foundations, Methodologies and Applications
Publication Date: Feb 5, 2019
Citations: 8
License type: open-access

Similar Papers

An energy efficient based hybrid clustering (EEBHC) approaches in wireless sensor network
V Perumal ... K Meenakshi Sundaram
International Journal of Engineering & Technology | VOL. 7
29 May 2018

Effects of some design factors on the distribution of similarity indices in cluster analysis
Ahmed N Albatineh ... Golam B M Kibria
Communications in Statistics - Simulation and Computation | VOL. 46
23 Oct 2015

A minimum spanning tree based partitioning and merging technique for clustering heterogeneous data sets
Gaurav Mishra ... Sraban Kumar Mohanty
Journal of Intelligent Information Systems | VOL. 55
22 Apr 2020

Combined hybrid clustering techniques and neural fuzzy networks to predict diesel engine emissions
Jiamei Deng ...
01 Oct 2007
