Abstract

<abstract> With the abundance of raw data generated by various sources, including social networks, big data technologies have become essential for acquiring, processing, and analyzing heterogeneous data from multiple sources for real-time applications. In this paper, we propose a big data framework suitable for pre-processing and classification of images as well as text analytics, built on two key workflows: a big data (BD) pipeline and a machine learning (ML) pipeline. Our end-to-end workflow integrates data cleansing, data integration, data transformation, and data reduction, followed by analytics using suitable machine learning techniques. Further, our model is the first of its kind to augment facial recognition with sentiment analysis in a distributed big data framework. The implementation of our model uses state-of-the-art distributed technologies to ingest, prepare, process, and analyze big data, generating actionable insights with relevant ML algorithms such as k-NN, logistic regression, and decision trees. In addition, we demonstrate the application of our big data framework to a facial recognition system by developing a prototype from open-source resources as a use case. We also apply sentiment analysis to non-repetitive, semi-structured public text data such as user comments, image tags, and other information associated with the facial images. We believe our work provides a novel approach at the intersection of big data, ML, and face recognition, and that it will stimulate new research to alleviate some of the challenges associated with big data processing in real-world applications. </abstract>
