Abstract

NetCDF (Network Common Data Form) is widely used in the terrestrial, marine, and atmospheric sciences. A new parallel storage and access method for large-scale NetCDF scientific data is implemented on Hadoop, with a retrieval method based on MapReduce. Argo data are used to demonstrate the method. Performance is compared in a distributed environment built from PCs, using different data scales and different numbers of tasks. The experimental results show that the parallel method can store and access large-scale NetCDF data efficiently.
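
To make the MapReduce-style retrieval concrete, the following is a minimal in-memory sketch and not the implementation described in the paper. It assumes Argo-like profile records have already been read from NetCDF files and divided into splits. The record fields (`float_id`, `lat`, `lon`, `temp_c`), the sample values, the bounding-box query, and all function names are hypothetical and chosen only for illustration. No Hadoop or NetCDF library is used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Argo-like records (illustrative values, not real data).
RECORDS = [
    {"float_id": "A1", "lat": 10.0, "lon": 140.0, "temp_c": 28.0},
    {"float_id": "A1", "lat": 10.5, "lon": 140.5, "temp_c": 27.0},
    {"float_id": "B2", "lat": 45.0, "lon": -30.0, "temp_c": 12.0},
    {"float_id": "C3", "lat": 12.0, "lon": 142.0, "temp_c": 26.0},
]


def map_record(record, bbox):
    """Map step: emit (float_id, temperature) for records inside the box."""
    lat_min, lat_max, lon_min, lon_max = bbox
    in_lat = lat_min <= record["lat"] <= lat_max
    in_lon = lon_min <= record["lon"] <= lon_max
    if in_lat and in_lon:
        yield record["float_id"], record["temp_c"]


def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_group(key, values):
    """Reduce step: average the temperatures collected for one float."""
    return key, mean(values)


def run_query(splits, bbox):
    """Run map over every split, then shuffle and reduce the results."""
    pairs = [
        pair
        for split in splits
        for record in split
        for pair in map_record(record, bbox)
    ]
    return dict(reduce_group(k, v) for k, v in shuffle(pairs).items())


# Two splits stand in for data blocks handled by separate map tasks.
SPLITS = [RECORDS[:2], RECORDS[2:]]
RESULT = run_query(SPLITS, (0.0, 20.0, 130.0, 150.0))
```

In a real Hadoop deployment the splits would correspond to blocks stored in the distributed file system and the map tasks would run in parallel on different nodes; here they run sequentially in one process.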
