Abstract

The process of integration through classification provides a unified representation of diverse data sources in big data. The main challenges of big data analysis stem from varying granularities, irreconcilable data models, and multipart interdependencies among data content. Previously designed models have struggled to integrate and analyze big data because multi-source, heterogeneous information is highly complex and dynamic, and they have also struggled to process and classify the associations among the attributes in a schema. In this paper, we propose an integration and classification approach built on a Probabilistic Semantic Association (PSA) method that generates feature patterns for big data sources. The PSA approach is trained to learn the association and dependency patterns between data classes and incoming data so that data objects are mapped accurately. It first builds a data integration mechanism that transforms data into a structured form and learns to use the trained knowledge to classify the probabilistic associations between the data and the knowledge patterns. It then builds a data analysis mechanism that analyzes the mapped data through PSA to evaluate integration efficiency. An experimental evaluation is performed on a real-time crime dataset generated from multiple locations with various event classes. The results confirm that using knowledge patterns from accurate classification is an appropriate way to enhance the integration of multi-source data. The precision, recall, fall-out rate, and F-measure support the efficiency of the proposed PSA method. In comparison with the state-of-the-art classification method and with the SC-LDA algorithm, PSA shows improved prediction accuracy and enhanced data integration.
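The four evaluation measures named in the abstract can all be derived from binary confusion-matrix counts. The following is a minimal sketch using the standard textbook definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, fall-out, and F-measure from raw counts.

    tp, fp, fn, tn are true-positive, false-positive, false-negative, and
    true-negative counts. Any ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Fall-out is the false-positive rate: negatives wrongly labeled positive.
    fall_out = fp / (fp + tn) if (fp + tn) else 0.0
    denom = precision + recall
    # F-measure here is the balanced F1: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "fall_out": fall_out,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A lower fall-out is better, while higher values are better for the other three measures. For a multi-class setting such as the event classes mentioned above, these quantities would typically be computed per class and then averaged; which averaging scheme the authors used is not stated here.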

Highlights

  • Big data has spread widely and is used in many fields, such as computer vision, machine learning, finance, and social analytics.

  • The objective of this paper is to propose a new approach to data integration and classification for uncertain and unstructured data through a Probabilistic Semantic Association (PSA) method.

  • Reddy et al [38] analyze the performance of ML-based classifiers on big data using dimensionality reduction techniques. Their analysis suggests a considerable improvement in classification when using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).


Summary

Introduction

Big data has spread widely and is used in many fields, such as computer vision, machine learning, finance, and social analytics. Sun et al [15] introduce a semantic-based structural similarity for the first time and propose an approach to measure the semantic-based structural similarity between networks, with the computing theory for semantic relations as its foundation. Reddy et al [38] analyze the performance of ML-based classifiers on big data using dimensionality reduction techniques. Their analysis suggests a considerable improvement in classification when using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).
