Abstract

Distributed systems play a vital role in Frequent Subgraph Mining (FSM), which extracts frequent subgraphs from large graph databases. Distributing resources across sites, which may be homogeneous or heterogeneous, helps to reduce memory requirements and computational costs and to improve data security. In this paper, we focus on the data complexity problems that arise in centralized systems and address them using the MapReduce framework. We propose a MapReduce-based Optimized Frequent Subgraph Mining (MOFSM) algorithm for large graph databases. We also compare our algorithm with existing methods on four real-world standard datasets to verify that it offers a better solution with respect to performance and scalability. Such algorithms extract subgraphs in distributed systems, which is important in real-world applications such as computer vision, social network analysis, bioinformatics, and financial and transportation networks.

Highlights

  • The algorithms used to enhance the performance of graph data mining are classified into two groups

  • Map: The map (or mapper) task organizes the input data, which takes the form of files or directories stored in a Distributed File System (DFS) or the Hadoop Distributed File System (HDFS). The mapper function processes the input file line by line and generates several small chunks of data

  • We propose a model, MapReduce-based Optimized Frequent Subgraph Mining (MOFSM), shown in Fig. 2, that uses an iterative MapReduce framework with Optimized Frequent Subgraph Mining applied dynamically
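The map, shuffle, and reduce roles mentioned in the highlights can be illustrated with a minimal single-process sketch. This is not the MOFSM algorithm itself: it only counts how many graphs contain each single-edge pattern, and the input line format (`graph_id src_label dst_label`), the function names, and the `min_support` parameter are assumptions made for illustration.

```python
from collections import defaultdict

def mapper(line):
    """Emit (edge_pattern, graph_id) for one input line (assumed format)."""
    gid, a, b = line.split()
    # Sort the labels so that A-B and B-A map to the same undirected pattern.
    pattern = tuple(sorted((a, b)))
    yield pattern, gid

def shuffle(pairs):
    """Group the graph ids by pattern, as the shuffle phase would."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return groups

def reducer(pattern, gids, min_support):
    """Keep a pattern only if it occurs in at least min_support graphs."""
    support = len(gids)
    if support >= min_support:
        yield pattern, support

def run(lines, min_support):
    """Run map, shuffle, and reduce in sequence in a single process."""
    pairs = [kv for line in lines for kv in mapper(line)]
    result = {}
    for pattern, gids in shuffle(pairs).items():
        for p, s in reducer(pattern, gids, min_support):
            result[p] = s
    return result

# Hypothetical toy database: three graphs with labelled edges.
lines = ["g1 C O", "g1 C N", "g2 O C", "g2 C C", "g3 C O"]
frequent = run(lines, min_support=2)
print(frequent)
```

In a real deployment the mapper and reducer would run as separate tasks over HDFS splits; the sketch keeps them in one process so that the data flow between the phases is easy to follow.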


Summary

INTRODUCTION

The algorithms used to enhance the performance of graph data mining are classified into two groups. As data size is increasing very fast, the main challenge is to deal with large graphs that grow to terabyte or petabyte scale. To overcome this problem, we use graph division, which reduces the complexity of the graph mining algorithm, helps to secure the most sensitive data, and lowers the cost of memory, computation, and transmission in a distributed system. A system for distributed graph mining follows the “think like a vertex” (TLV) programming paradigm [16] and provides a high-level filter-process computational framework that covers frequent subgraph mining, motif counting, and clique finding. The framework provides a higher level of data abstraction and hides system-level details from programmers, so that they can concentrate on problem-oriented computation logic. Recently, scientists have placed more emphasis on the analysis and design of large network graph databases to overcome the major challenges that arise in Big Data, such as data capture, storage, searching, sharing, transfer, analysis, and presentation.
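As a rough illustration of the graph division idea, the sketch below spreads the graphs of a database across a fixed number of partitions so that each site handles only its own share. The round-robin policy, the function name `split_database`, and the graph ids are assumptions for illustration; the paper's actual division strategy is not described in this excerpt.

```python
def split_database(graph_ids, num_partitions):
    """Assign graph ids to partitions round-robin (assumed policy)."""
    partitions = [[] for _ in range(num_partitions)]
    for i, gid in enumerate(graph_ids):
        # Graph i goes to partition i mod num_partitions.
        partitions[i % num_partitions].append(gid)
    return partitions

# Hypothetical database of five graphs divided between two sites.
parts = split_database(["g1", "g2", "g3", "g4", "g5"], 2)
print(parts)
```

Each partition can then be mined independently, which is what keeps per-site memory and computation costs lower than in a centralized run.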

RELATED WORK
PRELIMINARIES
MapReduce Framework
OVERVIEW OF PROPOSED MODEL
Result
SPLITTING
MAPPING
SHUFFLING
REDUCING
EXPERIMENTAL RESULT AND DISCUSSION
CONCLUSION
