Abstract

This paper develops an approach intended to return the best available answer to any query. The existing system cannot classify data according to different patterns, which compromises both the efficiency of the system and the quality of the final result. In this project, a Parallel Clustering optimization method is formed by combining MapReduce with the Artificial Bee Colony (ABC) Optimization technique to improve the efficiency and success of the data science method. In addition, related running services on a Hadoop network are predicted with the help of the MapReduce algorithm.

Highlights

  • The scope of data science and analytics is getting stronger day by day

  • A novel hybrid optimization algorithm called BBO-PSO is proposed, which combines biogeography-based optimization (BBO) with particle swarm optimization (PSO). In this algorithm, BBO is used for local search while PSO is used for mapping global data, which gives the algorithm powerful search capabilities in the solution space [3]

  • PROBLEM DEFINITION: The data should be processed with maximum efficiency and minimal processing time, so the MapReduce technique is used on simplified datasets obtained by applying patterns to the original datasets, and Artificial Bee Colony optimization is then applied
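
As a rough illustration of the Artificial Bee Colony optimization named in the last highlight, the sketch below minimizes a toy objective with the usual employed, onlooker, and scout bee phases. It is not the paper's implementation: the function names (`abc_minimize`, `sphere`), the parameter defaults, and the toy objective are all assumptions chosen for illustration. In a clustering setting the objective would more plausibly be a clustering cost, such as the within-cluster distance.

```python
import random

def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum at 0."""
    return sum(v * v for v in x)

def abc_minimize(objective, dim, bounds, n_sources=10, limit=20,
                 iterations=100, seed=0):
    """Minimal Artificial Bee Colony sketch; all defaults are hypothetical."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources          # failed improvement attempts per source
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to a random other source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        candidate = list(sources[i])
        candidate[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        candidate[d] = min(hi, max(lo, candidate[d]))   # stay inside the bounds
        cost = objective(candidate)
        if cost < costs[i]:           # greedy selection: keep only improvements
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(iterations):
        # Employed bees: one local search around every food source.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker bees: revisit sources with probability proportional to fitness.
        weights = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for _ in range(n_sources):
            try_neighbor(rng.choices(range(n_sources), weights=weights)[0])
        # Record the best source before scouts may replace it.
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:
            best_cost, best = costs[i_best], list(sources[i_best])
        # Scout bees: abandon sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_cost

best, cost = abc_minimize(sphere, dim=2, bounds=(-5.0, 5.0), seed=1)
print(best, cost)
```

The greedy selection step means a food source never gets worse, while the scout phase keeps the search from stalling on sources that have stopped improving.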


Summary

INTRODUCTION

The scope of data science and analytics is getting stronger day by day. Knowledge created by analyzing and processing data obtained from different sources can be useful in many ways. Instead of developing new and different algorithms to reach efficient and satisfactory results, optimizing the available techniques can be an option. The K-Means range algorithm is suited to e-commerce application development. The clustering algorithm works efficiently for numerical and categorical data as well as for clustering large data sets [2]. Particle Swarm Optimization techniques can be used instead of Artificial Bee Colony Optimization with MapReduce to maintain clustering quality. Algorithms capable of handling large and semi-structured data include K-Means and ISODATA. Clustering is an unsupervised learning technique that splits data into meaningful groups of similar items. Various clustering approaches have been evaluated experimentally for retrieving the required information from massive data repositories in order to support successful decisions across multiple applications. Our program aims to improve the current system by making both the clustering techniques and the existing system more efficient.
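
To make the combination of K-Means and MapReduce more concrete, here is a minimal single-process sketch of one way a K-Means iteration can be split into a map step (assign each point to its nearest centroid) and a reduce step (average the points of each cluster). It is an illustration under stated assumptions, not the authors' Hadoop job: the function names, the sample points, and the initial centroids are all hypothetical, and on a real cluster the two phases would run distributed across nodes.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))

def map_phase(points, centroids):
    """Map: emit a (cluster_id, point) pair for every input point."""
    for point in points:
        yield nearest(point, centroids), point

def reduce_phase(pairs, centroids):
    """Reduce: average all points that share a cluster_id."""
    groups = defaultdict(list)
    for cluster_id, point in pairs:
        groups[cluster_id].append(point)
    new_centroids = list(centroids)   # an empty cluster keeps its old centroid
    for cluster_id, members in groups.items():
        new_centroids[cluster_id] = tuple(
            sum(column) / len(members) for column in zip(*members))
    return new_centroids

def kmeans(points, centroids, max_iter=20):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
print(kmeans(points, [(0.0, 0.0), (10.0, 10.0)]))
# → [(1.0, 1.5), (9.5, 9.0)]
```

Because the reduce step only needs the points grouped by cluster identifier, the averaging can be spread across reducers, which is what makes this formulation a natural fit for MapReduce.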

LITERATURE SURVEY
Parallel Particle
A parallel
Performance
PROBLEM DEFINITION
PROPOSED METHODOLOGY
Reduce
IMPLEMENTATION AND RESULTS