Abstract

Twenty-first-century scientific and engineering enterprises are increasingly characterized by their geographic dispersion and their reliance on large data archives. These characteristics bring with them unique challenges. First, the increasing size and complexity of modern data collections require significant investments in information technologies to store, retrieve and analyse them. Second, the increased distribution of people and resources in these projects has made resource sharing and collaboration across significant geographic and organizational boundaries critical to their success. In this paper I explore how computing infrastructures based on Data Grids offer data-intensive enterprises a comprehensive, scalable framework for collaboration and resource sharing. A detailed example of a Data Grid framework is presented for a Large Hadron Collider experiment, where a hierarchical set of laboratory and university resources comprising petaflops of processing power and a multi-petabyte data archive must be efficiently used by a global collaboration. The experience gained with these new information systems, providing transparent managed access to massive distributed data collections, will be applicable to large-scale, data-intensive problems in a wide spectrum of scientific and engineering disciplines, and eventually in industry and commerce. Such systems will be needed in the coming decades as a central element of our information-based society.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call