Abstract

In this paper, we describe an emerging tool called DAWN (short for Distributed Analytics, Workflows and Numeric), a model for simulating, analyzing, and optimizing system architectures that execute arbitrary data processing pipelines. As an example, we apply DAWN to a real-life Big Data use case in climate science: the evaluation of simulated rainfall characteristics against high-resolution observational data. We show how DAWN can help determine the optimal architecture and science algorithms for running this case study over distributed datasets, framed as a tradeoff between the overall time cost and the uncertainty of the metrics calculated for model evaluation. We also show how DAWN can guide architectural decisions for future research, specifically how data should be generated and analyzed to cope with projected future data volumes.

