Towards efficient data search and subsetting of large-scale atmospheric datasets

Sangmi Lee Pallickara, Shrideep Pallickara, Milija Zupanski

doi:10.1016/j.future.2011.05.010

Abstract

Discovering the correct dataset efficiently is critical for effective simulations in the atmospheric sciences. Unlike text-based web documents, many large scientific datasets contain binary-encoded data that is hard to discover using popular search engines. In the atmospheric sciences, public data hosting services have grown significantly. However, the ability to index and search these datasets has been limited by the metadata that each data host provides. We have developed an infrastructure, the Atmospheric Data Discovery System (ADDS), that provides an efficient data discovery environment for observational datasets in the atmospheric sciences. To support complex querying capabilities, we automatically extract and index fine-grained metadata. Indexing covers both datasets found by periodically crawling popular sites and files requested by users. Users can access subsets of a large dataset through our data customization feature. This paper focuses on the overall architecture, the data subsetting scheme, and a performance evaluation of our system.
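As a rough illustration of the two ideas named in the abstract, indexing fine-grained metadata and returning only a subset of a dataset, the following Python sketch may help. It is not the ADDS implementation: the `Observation` record, the metadata fields, and the functions `extract_metadata`, `search`, and `subset` are all hypothetical names chosen for this example, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One record of a hypothetical observational dataset (assumed layout)."""
    lat: float
    lon: float
    time: str      # ISO 8601 timestamp
    variable: str  # e.g. "temperature"
    value: float


def extract_metadata(name, records):
    """Summarize a dataset as metadata: spatial/temporal extents and variables."""
    return {
        "name": name,
        "lat_range": (min(r.lat for r in records), max(r.lat for r in records)),
        "lon_range": (min(r.lon for r in records), max(r.lon for r in records)),
        "time_range": (min(r.time for r in records), max(r.time for r in records)),
        "variables": sorted({r.variable for r in records}),
    }


def overlaps(extent, low, high):
    """True if the closed interval `extent` intersects [low, high]."""
    return extent[0] <= high and low <= extent[1]


def search(index, variable, lat, lon):
    """Return names of indexed datasets that may hold `variable` in the box."""
    return [
        meta["name"]
        for meta in index
        if variable in meta["variables"]
        and overlaps(meta["lat_range"], *lat)
        and overlaps(meta["lon_range"], *lon)
    ]


def subset(records, variable, lat, lon):
    """Keep only the records for `variable` that fall inside the box."""
    return [
        r for r in records
        if r.variable == variable
        and lat[0] <= r.lat <= lat[1]
        and lon[0] <= r.lon <= lon[1]
    ]


# Made-up sample data, for illustration only.
records = [
    Observation(40.6, -105.1, "2011-06-12T00:00:00", "temperature", 291.2),
    Observation(35.0, -97.5, "2011-06-12T00:00:00", "temperature", 299.8),
    Observation(40.6, -105.1, "2011-06-12T00:00:00", "humidity", 0.41),
]
index = [extract_metadata("demo", records)]
hits = search(index, "temperature", lat=(39.0, 42.0), lon=(-106.0, -104.0))
part = subset(records, "temperature", lat=(39.0, 42.0), lon=(-106.0, -104.0))
```

In this sketch the search step consults only the extracted metadata, so candidate datasets can be found without reading their contents, and the subsetting step then returns just the matching records instead of the whole dataset.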

Journal: Future Generation Computer Systems
Publication Date: Jun 12, 2011
Citations: 29
License type: public-domain
