Abstract

Advances in technology have resulted in the generation of a large volume of heterogeneous big data for large enterprises engaged in e-commerce, healthcare, education, etc. This is being created at a rapid rate but is low in its veracity. This big data includes large sets of semi-structured and unstructured data and is stored over a distributed file system (DFS). This data can be processed in a fault tolerant manner using several frameworks, tools, and advanced database technologies. Big data can provide important information, which can be used for business decision making. View materialization, which has been widely studied for structured databases or data warehouse, has been extended to big data to enhance efficiency of big data query processing. This paper focuses on the selection of big data views for materialization. The big data views can be identified by extracting a set of query attributes from the set of query workload of an enterprise. The query attributes are interrelated resulting in the creation of alternate access paths for query evaluation. The cost of query processing using big data views involves the integrity of different data types of heterogeneous big data, frequency of queries, change in the size of big data, selected sets of big data materialized views, and updates on big data and these sets of materialized views. The cost of query processing is computed using the stored size of big data views on the DFS system, which is a consistent processing framework of DFS. A big data view selection algorithm that is capable of selecting views from structured, semi-structured, and unstructured data has been proposed in this paper. The proposed algorithm would select big data views that would result in faster processing of most user queries resulting in efficient decision making.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.