Abstract

This study addresses the problem of detecting anomalies in big data. A Border-based Grid Partition (BGP) algorithm was proposed for computing the Local Outlier Factor (LOF) over big data in a distributed environment. BGP splits the data into intersecting subsets and allocates these subsets to the slave nodes of the distributed environment; the border regions of neighboring subsets are replicated across slave nodes, so that each slave node can compute the LOF for its own subset locally. However, BGP partitions the data on a grid basis without considering the amount of data assigned to each slave node, which results in an unbalanced distribution of subsets across the nodes. To overcome this problem, a modification of BGP is proposed that takes into account the size of the data assigned to each slave node. The modified algorithm is called the Balanced Border-based Grid Partition (BBGP) algorithm. BBGP splits the data equally among the slave nodes, so that all nodes perform a balanced share of the LOF computation. Finally, we evaluate the performance of the two algorithms through a series of simulation experiments on real data sets.
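The core idea of the grid split with border replication can be illustrated with a minimal sketch. The code below is an assumption-laden simplification, not the paper's actual implementation: it partitions one-dimensional points into equal-width grid cells (standing in for the slave nodes) and replicates any point lying within a `border` distance of a cell boundary into the adjacent cell, so that LOF neighborhoods near cell edges remain computable locally. The function name, the 1-D setting, and the fixed border width are all illustrative choices.

```python
def grid_partition(points, n_cells, border):
    """Split 1-D points into n_cells equal-width intervals.

    Each cell additionally receives replicas of points that lie within
    `border` of its boundary with a neighboring cell, mimicking the
    border replication described for the BGP-style split.
    """
    lo, hi = min(points), max(points)
    width = (hi - lo) / n_cells
    cells = [[] for _ in range(n_cells)]
    for p in points:
        # Primary cell index; clamp the maximum point into the last cell.
        i = min(int((p - lo) / width), n_cells - 1)
        cells[i].append(p)
        # Replicate into the left neighbor if p is near the left boundary.
        if i > 0 and p - (lo + i * width) < border:
            cells[i - 1].append(p)
        # Replicate into the right neighbor if p is near the right boundary.
        if i < n_cells - 1 and (lo + (i + 1) * width) - p < border:
            cells[i + 1].append(p)
    return cells

pts = [0.5, 1.9, 2.1, 4.0, 5.8, 6.2, 7.9]
cells = grid_partition(pts, n_cells=4, border=0.3)
```

Note that the replicated border points inflate the total data volume held across cells, which is exactly the per-node load that BBGP's balanced split is meant to equalize.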
