Abstract

The emergence of Big Data applications provides new challenges in data management such as processing and movement of masses of data. Volunteer computing has proven itself as a distributed paradigm that can fully support Big Data generation. This paradigm uses a large number of heterogeneous and unreliable Internet-connected hosts to provide Peta-scale computing power for scientific projects. With the increase in data size and number of devices that can potentially join a volunteer computing project, the host bandwidth can become a main hindrance to the analysis of the data generated by these projects, especially if the analysis is a concurrent approach based on either in-situ or in-transit processing. In this paper, we propose a bandwidth model for volunteer computing projects based on the real trace data taken from the Docking@Home project with more than 280,000 hosts over a 5-year period. We validate the proposed statistical model using model-based and simulation-based techniques. Our modeling provides us with valuable insights on the concurrent integration of data generation with in-situ and in-transit analysis in the volunteer computing paradigm.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.