Abstract

Big Data refers to volumes of data too large to be stored or processed with conventional approaches within a limited period. Managing it and extracting value from it therefore requires new architectures, methods and analyses. Big Data poses many challenges and is characterized by properties such as volume, velocity, variety and veracity. The goal of Big Data is not only to collect, store and organize huge volumes of data, but also to evaluate, extract and visualize useful information for further processing. Big Data has the potential to provide great benefits to businesses and organizations in different fields around the world, and demand for it is expected to grow in the next few years. This work describes the importance of Big Data, the challenges it faces in adapting to the modern era, its characteristics and architecture, the technologies used with it and the applications built on it. The paper also explains MapReduce and the Hadoop Distributed File System (HDFS) as two important models of Big Data.

Highlights

  • The volume of data worldwide is increasing rapidly

  • Big Data is beneficial in many areas, both economically and nationally, such as financial services, health care and medicine, education, banking, location information services, telecommunications and media

  • This paper reviews the importance of Big Data, the challenges it faces in adapting to the modern era, its characteristics and architecture, the technologies used with it and the applications built on it

Summary

Introduction

The volume of data worldwide is increasing rapidly. Big Data is generated from sources such as audio, video, text, mobile phones, images, e-mails, health records, sensor machines, social networks, scientific data, businesses, websites and applications. In addition to traditional data, data can be retrieved from both active and passive devices, including web pages, logs, files, e-mails, documents and sensor devices (Madden, 2012). These data are not of a single type: they include raw, structured, unstructured and semi-structured data, which are hard to process with existing conventional systems. For most media agencies and gaming and technology companies, analyzing large data sets is a way to retain customers, improve advertising, study the geographical distribution of their users in order to serve relevant content, and show content according to the time of day in different countries (Bakshi, 2012). Some of the technologies and tools for handling and managing Big Data are, for storing: Simple Storage Service (S3) and the Hadoop Distributed File System (HDFS). Parallel computation helps to decrease the time needed to analyze and process Big Data.
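To make the idea of parallel computation in the MapReduce model more concrete, the following is a minimal single-machine sketch of a word count written in the map, shuffle and reduce style. It is not Hadoop code and is not taken from the paper: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`), the sample input and the use of a thread pool are illustrative assumptions. On a real cluster, each input chunk would typically be a block stored in a distributed file system, and the map and reduce tasks would run on different machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_chunks):
    """Group all emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Chunks are mapped independently of each other, which is what allows
    # the work to be spread across workers (threads here, machines on a cluster).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical input: two small text chunks standing in for file blocks.
    sample = ["big data needs new tools", "big clusters process big data"]
    print(word_count(sample))
```

The thread pool here only illustrates the structure of the computation. The reduction in processing time described above comes from running many such independent tasks on separate cores or machines.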
