In past decades, the analysis of structured, consistent data has seen huge success, but analysing multimedia data, which is unstructured, remains a challenging task. Here, big data refers to huge volumes of data that can be processed in a distributed manner. Big data can be analysed using Hadoop, which comprises the Hadoop Distributed File System (HDFS) for storage and several built-in components. Hadoop manages data that is distributed across a cluster. This paper demonstrates the working of Sqoop and Hive in Hadoop. Sqoop (SQL-to-Hadoop) is a Hadoop component designed to efficiently import large volumes of data from traditional databases into HDFS and to export it back. Hive is open-source software for managing large data files stored in HDFS. To show how they work, we take Instagram, one of the most popular social media applications, as a case study. We analyse data generated from Instagram, which can be mined and utilized using Sqoop and Hive. In this way, we show that Sqoop and Hive can deliver results efficiently. The paper gives the details of how Sqoop and Hive work in Hadoop.