A Model for Enhancing Unstructured Big Data Warehouse Execution Time

Marwa Salah Farhan, Amira Youssef, Laila Abdelhamid

doi:10.3390/bdcc8020017

Abstract

Traditional data warehouses (DWs) have played a key role in business intelligence and decision support systems. However, the rapid growth of data generated by current applications requires new data warehousing systems. In the big data setting, existing warehouse systems must be adapted to overcome new issues and limitations. The main drawbacks of traditional Extract–Transform–Load (ETL) are that it cannot process very large amounts of data and that its execution time is very high when the data are unstructured. This paper proposes a new four-layer model, Extract–Clean–Load–Transform (ECLT), designed for processing unstructured big data, with specific emphasis on text. The model aims to reduce execution time, and this is evaluated experimentally. ECLT is implemented and tested using Spark, a framework used here with Python. Finally, this paper compares the execution time of ECLT with that of other models on two datasets. Experimental results show that for a data size of 1 TB, the execution time of ECLT is 41.8 s. When the data size increases to 1 million articles, the execution time is 119.6 s. These findings indicate that ECLT outperforms ETL, ELT, DELT, ELTL, and ELTA in terms of execution time.
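The ordering of the four ECLT layers can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names only the layers and their order, so what each stage does here (tag stripping, whitespace normalization, term counting), the in-memory dict standing in for the warehouse, and every function name are assumptions made for illustration. Plain Python is used instead of Spark so the example stays self-contained.

```python
import re
from collections import Counter


def extract(sources):
    """Extract: pull raw text records from a source without modifying them."""
    for doc_id, text in sources:
        yield {"id": doc_id, "raw": text}


def clean(records):
    """Clean (assumed behavior): strip markup, normalize whitespace, drop empties."""
    for rec in records:
        text = re.sub(r"<[^>]+>", " ", rec["raw"])      # remove HTML-like tags
        text = re.sub(r"\s+", " ", text).strip().lower()  # collapse whitespace
        if text:
            yield {"id": rec["id"], "text": text}


def load(records, warehouse):
    """Load: store cleaned records in the warehouse (a dict stands in for it)."""
    for rec in records:
        warehouse[rec["id"]] = rec["text"]
    return warehouse


def transform(warehouse):
    """Transform (assumed behavior): compute term counts on already-loaded data."""
    return {doc_id: Counter(text.split()) for doc_id, text in warehouse.items()}


def run_eclt(sources):
    """Run the stages in ECLT order: Extract, Clean, Load, then Transform."""
    warehouse = load(clean(extract(sources)), {})
    return warehouse, transform(warehouse)


if __name__ == "__main__":
    docs = [("a", "<p>Big  data  big</p>"), ("b", "   ")]
    stored, term_counts = run_eclt(docs)
    print(stored)
    print(term_counts)
```

The point of the sketch is the stage order: cleaning happens before loading, so empty or noisy records never reach the store, and transformation runs last, on data that is already loaded.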

Journal: Big Data and Cognitive Computing
Publication Date: Feb 6, 2024
License type: CC BY 4.0
