Abstract

NASA's complex IT environment tends to mask nuances in the generation, transport, and management of data sets for its Mission Communities, making it increasingly difficult to understand how, when, and where scientists use computing and storage resources. We therefore studied Mission Scientists' computing behaviors and workflows to learn how science gets done, on which computing and storage resources, and where and when throughout project lifecycles. We conducted in-depth interviews with scientists who actively use the spectrum of computing and storage resources across the agency, identifying, discussing, and analyzing behavioral patterns to discern computing and storage needs. In each interview we learned the stepwise processes (workflows) used to generate, transport, manage, and use data, including scientists' interactions with computing and storage resources; the frequencies, durations, and types of those interactions; and the dependencies and constraints that govern progression through the steps. We identified a wide variety of usage scenarios across the target communities. However, we also found underlying behaviors among scientists, the computing and storage resources they use, and the data sets they create, in particular in how scientists use their data and how they interact with various computing resources at each stage of their projects. These behaviors are influenced by limiting factors in the IT system, such as queues, consistency across resources, data set sizes, storage allocations, and retrieval of data from offline storage. These limiting factors increase scientists' times to solution and reduce their effectiveness by forcing them to adapt their behaviors. Adaptations include creatively using queues, reprocessing jobs instead of storing data sets, shuffling data sets across resources during processing, saving data at fewer intervals, and reducing the scales, resolutions, and numbers of parameters of jobs.
We found that sets of dynamics among scientists, the computing and storage resources they use, and the data sets they create underlie both scientists' computing and storage behaviors and the ways they adapt to limiting factors in the IT system. We use the term behavioral dynamics to indicate behaviors that are conditioned by working environments, which in our study consist of subsets of scientists' IT (computing, storage, and networking) environments. We describe dynamics between performance and control, as well as between security and collaboration. We believe that behavioral dynamics such as these can be used to improve the effectiveness of science workflows and times to solution by tuning and balancing the IT system for Mission Communities; identifying major short- and long-term drivers for computing, storage, and networking; and providing input into architectural decisions. Possible refinements to the IT system include adopting a more consistent queue strategy across MRC and HEC resources; developing a brokering capability for these resources; exploring the architectural implications of large data sets; and providing open and flexible environments for collaboration via proxy access, open computing and storage, or outsourcing to public cloud services.

