Abstract
Data volumes grow continuously due to a rapidly increasing population and the widespread use of sensors, social media, the Internet of Things, and similar sources. The data generated from these sources may be small or large, and it may be structured or unstructured. Small amounts of data can easily be processed and analyzed on a single system, but the data generated today is typically large in size and mostly unstructured, so analyzing it requires mechanisms that can store and process data at that scale. Hadoop and Spark are technologies that can handle and process data of any kind, and their ecosystems provide a range of tools for storing, processing, and analyzing data. In this paper, we focus on Hadoop ecosystem tools and technologies such as MapReduce, Apache Flume, and Apache Pig, along with Apache Spark, and present a comparative analysis of them.