Abstract
Distributed systems play a vital role in Frequent Subgraph Mining (FSM), the task of extracting frequent subgraphs from large graph databases. Distributing resources across sites, which may be homogeneous or heterogeneous, reduces memory requirements and computational cost and improves data security. In this paper, we address the data-complexity problems that arise in centralized systems by using the MapReduce framework. We propose a MapReduce-based Optimized Frequent Subgraph Mining (MOFSM) algorithm for large graph databases. We also compare our algorithm with existing methods on four real-world standard datasets to verify that it offers a better solution with respect to performance and scalability. Such algorithms extract subgraphs in a distributed system, which is important in real-world applications such as computer vision, social network analysis, bioinformatics, and financial and transportation networks.
Highlights
The algorithms used to enhance the performance of graph data mining are classified into two groups.
Map: The map (or mapper) task organizes the input data, which takes the form of files or directories stored in a Distributed File System (DFS) or the Hadoop Distributed File System (HDFS). The mapper function processes the input file line by line and generates several small chunks of data.
We propose a model, MapReduce-based Optimized Frequent Subgraph Mining (MOFSM), shown in Fig. 2, that dynamically applies Optimized Frequent Subgraph Mining within an iterative MapReduce framework.
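The map step described above, followed by a reduce step, can be illustrated with a minimal single-machine sketch. It is not the MOFSM algorithm itself: it only counts the support of single-edge patterns, which is the usual starting point for frequent subgraph mining. The input line format (`graph_id src_label edge_label dst_label`) and all function names are assumptions made for this illustration, and the `shuffle` function stands in for the grouping that a MapReduce framework would perform between the two phases.

```python
from collections import defaultdict


def map_edges(line):
    """Mapper: turn one input line into a (pattern, graph_id) pair.

    Assumed (hypothetical) line format:
    "<graph_id> <src_label> <edge_label> <dst_label>"
    """
    graph_id, src_label, edge_label, dst_label = line.split()
    # Sort the endpoint labels so an undirected edge has one canonical key.
    a, b = sorted((src_label, dst_label))
    yield (a, edge_label, b), graph_id


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(set)
    for pattern, graph_id in pairs:
        groups[pattern].add(graph_id)
    return groups


def reduce_support(pattern, graph_ids, min_support):
    """Reducer: keep a pattern if enough distinct graphs contain it."""
    support = len(graph_ids)
    if support >= min_support:
        yield pattern, support


def frequent_edges(lines, min_support):
    """Run one map/shuffle/reduce round over the input lines."""
    pairs = (pair for line in lines for pair in map_edges(line))
    result = {}
    for pattern, graph_ids in shuffle(pairs).items():
        for kept_pattern, support in reduce_support(pattern, graph_ids, min_support):
            result[kept_pattern] = support
    return result


# Tiny illustrative input: three graphs with labeled edges.
SAMPLE_LINES = [
    "g1 C single O",
    "g1 C double O",
    "g2 O single C",
    "g3 C single N",
    "g3 C single O",
]

if __name__ == "__main__":
    print(frequent_edges(SAMPLE_LINES, min_support=2))
```

In an iterative scheme, the frequent patterns kept by one round would serve as seeds for generating larger candidate subgraphs in the next round; that candidate-generation step is omitted here.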
Summary
The algorithms used to enhance the performance of graph data mining are classified into two groups. As data sizes increase rapidly, the main challenge is dealing with large graphs that grow to terabyte or petabyte scale. To overcome this problem, we use graph division, which reduces the complexity of the graph mining algorithm, helps to secure the most sensitive data, and lowers memory, computation, and transmission costs in the distributed system. A system for distributed graph mining follows the “think like a vertex” (TLV) programming paradigm [16], which provides a high-level filter-process computational framework that supports frequent subgraph mining, motif counting, and clique finding. The framework provides a higher level of data abstraction and hides system-level details from programmers, so that they can concentrate on problem-oriented computation logic. Recently, researchers have placed greater emphasis on the analysis and design of large network graph databases to overcome the major challenges that arise in big data, such as data capture, storage, searching, sharing, transfer, analysis, and presentation.
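To make the vertex-centric idea concrete, the following is a minimal sketch of counting one motif (triangles) in the TLV style: each vertex acts only on its own neighbour set and on the messages it receives. It is an illustration under stated assumptions, not the API of the system cited in [16]; the function name, the two-superstep layout, and the adjacency-dictionary representation are all hypothetical choices, and vertex identifiers are assumed to be comparable.

```python
def count_triangles(adjacency):
    """Count triangles in an undirected graph, vertex by vertex.

    adjacency maps each vertex to the set of its neighbours.
    """
    # Superstep 1: every vertex sends its neighbour set to each
    # higher-numbered neighbour, so each edge carries one message.
    inbox = {v: [] for v in adjacency}
    for v, neighbours in adjacency.items():
        for u in neighbours:
            if u > v:
                inbox[u].append((v, neighbours))

    # Superstep 2: each vertex u combines its local neighbour set with
    # the received messages. A common neighbour w > u of v and u closes
    # the triangle v < u < w, so every triangle is counted exactly once.
    total = 0
    for u, messages in inbox.items():
        for v, v_neighbours in messages:
            total += sum(1 for w in adjacency[u] & v_neighbours if w > u)
    return total


# Small illustrative graph: one triangle (1, 2, 3) plus a pendant edge 3-4.
SAMPLE_GRAPH = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

if __name__ == "__main__":
    print(count_triangles(SAMPLE_GRAPH))
```

In a distributed setting, the vertices would be partitioned across workers and the inbox would be filled by message passing between them; the per-vertex logic stays the same.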
More From: International Journal of Engineering and Advanced Technology