Abstract

Data assimilation is an analysis technique that combines observations with the numerical results of theoretical models to produce more realistic and accurate data. It is widely used in studies of the atmosphere, ocean, and land surface. Because the inputs from dynamical models have a complicated data structure and the volume of model data keeps growing, parallel data assimilation suffers from high overhead in file reading and data communication. In this paper, we first propose a flexible parallel data access approach for reading large amounts of data from disk. This approach avoids data access conflicts and significantly reduces the frequency of disk addressing operations. Next, we design a communication-avoiding strategy that reduces the communication volume at the cost of some additional computation. Furthermore, we present a “pipe-flow” scheme for data exchange that performs conflict-free message passing. Together, these yield a fast data-obtaining algorithm for data assimilation. Our experiments show that the fast data-obtaining algorithm achieves a $5\times$ speedup over the baseline in the data-obtaining phase of parallel data assimilation. Owing to the reduction in disk addressing operations, the new approach achieves a $6\times$ speedup on average for file reading. Because a large amount of data movement is avoided, it achieves a $2.7\times$ speedup on average for communication between processors.
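To make the idea of conflict-free message passing concrete, the sketch below builds a generic shifted-ring exchange schedule. It is not the paper's “pipe-flow” scheme, whose details the abstract does not give; it is a standard textbook pattern used here only as an illustration. The function names `ring_schedule` and `is_conflict_free` are hypothetical. In each step every process sends to exactly one partner and receives from exactly one partner, so no process is the target of two messages at once.

```python
def ring_schedule(num_procs):
    """Build a shifted-ring exchange schedule (illustrative, not the paper's scheme).

    Returns a list of steps; each step is a list of (sender, receiver) pairs.
    In step k (shift = k), process p sends to (p + shift) mod num_procs.
    """
    steps = []
    for shift in range(1, num_procs):
        steps.append([(p, (p + shift) % num_procs) for p in range(num_procs)])
    return steps


def is_conflict_free(step):
    """True if no process sends twice and no process receives twice in the step."""
    senders = [s for s, _ in step]
    receivers = [r for _, r in step]
    return len(set(senders)) == len(senders) and len(set(receivers)) == len(receivers)


if __name__ == "__main__":
    # Print the schedule for 4 processes and check each step for conflicts.
    for index, step in enumerate(ring_schedule(4), start=1):
        print(index, step, is_conflict_free(step))
```

With `num_procs` processes the schedule has `num_procs - 1` steps and covers every ordered pair of distinct processes exactly once, which is what an all-to-all exchange needs.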
