Abstract

In recent years, new data sources have appeared, such as social networks, mobile devices, the Internet of Things, and open data, and the volume of data is increasing rapidly as a result. These data are voluminous, varied, and difficult to measure and analyze, which has given rise to the concept of Big Data. The vast amount of data makes the ETL (Extract-Transform-Load) process in data warehousing heavy, makes the data mining process more complex, and slows the loading of data into database management systems. The solution for making these processes more efficient is to use parallelization technologies; many researchers opt for the MapReduce paradigm because of its flexibility and power. In this paper, we provide an overview of the state of the art in MapReduce research and present its various axes.

