Abstract

Big data grows larger every day. Even for a simple application such as the Digital Bibliography & Library Project (DBLP) database, the data is becoming unmanageable with conventional databases because of its size. As a result, big data processing frameworks such as Hadoop and Spark are becoming more popular. In this work, we investigate the use of Hadoop and Spark for querying big data and compare their performance in terms of execution time, using the DBLP database as a case study. Results show that both Hadoop and Spark reduce query execution time significantly compared with conventional database management systems. We also found that Spark achieves shorter execution times than Hadoop.
