Abstract
Integrating big data platforms with input data from their corresponding sources supports the analysis and prediction of massive data according to its nature. This paper presents a mechanism for combining and configuring data processing frameworks to generate input data and to transform and analyze that data according to its nature and requirements. We examine tasks in three categories. The first is the classification of a large local dataset using a machine learning algorithm. The second is the integration and operation of SQL data generated from relational database management systems. The third is the analysis of incoming data streams, sourced from the Twitter app manager, in Apache Spark. The expected results of the experiment are properly categorized and classified output for the local dataset, an integrated relational database, and the most frequently used hashtags in real-life Twitter post data. Overall, the experiment demonstrates and evaluates the combination of technologies and corresponding platforms for handling and analyzing big data streams.
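To illustrate the third task, the following minimal Python sketch counts the most frequent hashtags in a small batch of posts. It is not the paper's implementation, which consumes a live Twitter stream in Apache Spark. The function name `top_hashtags`, the hashtag pattern, and the sample posts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(posts, n=3):
    """Return the n most frequent hashtags (lowercased) across the posts."""
    counts = Counter()
    for text in posts:
        # Lowercase so '#Spark' and '#spark' are counted together.
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(n)


# Hypothetical sample data standing in for a batch of streamed posts.
sample_posts = [
    "Learning #Spark for #BigData",
    "Streaming with #spark is fun",
    "#bigdata pipelines and #SQL",
    "More #Spark tips",
]

print(top_hashtags(sample_posts, n=2))  # [('spark', 3), ('bigdata', 2)]
```

In a streaming setting the same counting step would typically be applied per micro-batch or over a sliding window rather than to a fixed list.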