Abstract

Wide-area data transfer is central to distributed science. Network capacity, data movement infrastructure, and tools in science environments continuously evolve to meet the requirements of distributed-science applications. Research and education (R&E) networks such as the U.S. Department of Energy’s Energy Sciences network and Internet2 provide multiple 100 Gbps backbone networks. Large scientific facilities and research institutions have 100 Gbps wide-area network connectivity, and 10 Gbps wide-area network connectivity is common for a lot of R&E institutions. Many of these institutions employ Science DMZs, dedicated data transfer node(s), and high performance data movement tools to improve wide area data transfer performance. Large facilities may use 10 or more dedicated data transfer nodes to meet the needs of their users. In this work, we analyze various logs pertaining to wide area data transfers in and out of a large scientific facility to obtain insights on data transfer characteristics and behavior. We also show some of the inefficiencies in the state-of-the-art data movement tool and discuss approaches to address these inefficiencies.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call