Abstract

Traditional database systems such as relational databases store structured data with a predefined schema, but big data arrives in different formats and is collected from diverse sources. Distributed databases such as Not only SQL (NoSQL) repositories are often used for big data analytics. Businesses, however, need continual updates, because streaming data arrives in real time from stock trading, the online activity of website visitors, and mobile applications. They should not have to wait for a report to appear before assessing and analysing the current situation and moving on to the next business decision. Apache Spark's Structured Streaming offers capabilities for handling streaming data in a batch processing mode, with faster responses than MongoDB, a document-based NoSQL database. This study runs similar queries to evaluate the performance of Spark SQL and a NoSQL database, focusing on the advantages of Spark SQL over NoSQL databases in streaming data exploration. The queries are run on streaming data stored in batch mode.
