Comparing NetCDF and SciDB on managing and querying 5D hydrologic dataset

Haicheng Liu, Xiao Xiao

doi:10.1088/1755-1315/46/1/012031

Abstract

Efficiently extracting information from high-dimensional hydro-meteorological modelling datasets requires smart solutions. Traditional methods are mostly based on files, which are easy to edit and access but suffer from efficiency problems caused by their contiguous storage structure. Databases have been proposed as an alternative, offering native functionality for manipulating multidimensional (MD) arrays, smart caching strategies and scalability. In this research, NetCDF file-based solutions and SciDB, a multidimensional array database management system (DBMS) that applies a chunked storage structure, are benchmarked to determine the best solution for storing and querying a large 5D hydrologic modelling dataset. The effect of data storage configurations, including chunk size, dimension order and compression, on query performance is explored. Results indicate that the dimension order used to organize the storage of 5D data has a significant influence on query performance if the chunk size is very large, but the effect becomes insignificant when the chunk size is properly set. Compression in SciDB mostly has a negative influence on query performance. Caching is an advantage, but its benefit may be influenced by the execution of different query processes. Overall, the NetCDF solution without compression is generally more efficient than the SciDB DBMS.
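
One rough way to see why dimension order and chunk size interact is to count how many separate reads a query needs under each layout. The Python sketch below is illustrative only: the array shape, the dimension names (time, member, level, y, x), the chunk shapes and the query are all hypothetical and are not taken from the benchmark, and the functions count reads rather than measure performance.

```python
from math import prod


def chunks_touched(start, stop, chunk_shape):
    """Count the chunks intersected by the half-open box [start, stop)."""
    return prod(
        (hi - 1) // c - lo // c + 1
        for lo, hi, c in zip(start, stop, chunk_shape)
    )


def contiguous_runs(shape, start, stop):
    """Count separate contiguous reads for a box in a row-major array."""
    i = len(shape) - 1
    # Trailing dimensions that are selected in full merge into a single run.
    while i >= 0 and start[i] == 0 and stop[i] == shape[i]:
        i -= 1
    # Dimension i (if any) is read as one range; every index combination
    # of the dimensions before it needs its own run.
    lead = max(i, 0)
    return prod(hi - lo for lo, hi in zip(start[:lead], stop[:lead]))


# Hypothetical query: a full time series (100 steps) at one grid cell,
# for one ensemble member and one vertical level.

# Layout A: time is the slowest-varying (first) dimension.
shape_a = (100, 10, 5, 200, 200)          # (time, member, level, y, x)
start_a, stop_a = (0, 3, 2, 50, 60), (100, 4, 3, 51, 61)

# Layout B: time is the fastest-varying (last) dimension.
shape_b = (10, 5, 200, 200, 100)          # (member, level, y, x, time)
start_b, stop_b = (3, 2, 50, 60, 0), (4, 3, 51, 61, 100)

runs_a = contiguous_runs(shape_a, start_a, stop_a)
runs_b = contiguous_runs(shape_b, start_b, stop_b)

# Two hypothetical chunk shapes applied to layout A.
chunks_flat = chunks_touched(start_a, stop_a, (10, 1, 1, 20, 20))
chunks_deep = chunks_touched(start_a, stop_a, (100, 1, 1, 2, 2))

print(runs_a, runs_b, chunks_flat, chunks_deep)
```

In this toy setting the unchunked time-first layout needs one read per time step, while the time-last layout needs a single read. A chunk shape aligned with the query reduces the time-first case to a single chunk, which is consistent with the abstract's observation that dimension order matters less once the chunk size is properly set.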

Journal: IOP Conference Series: Earth and Environmental Science
Publication Date: Nov 1, 2016
License type: cc-by

