Abstract

Distributing big data for science is pushing the limits of networks and computing systems. However, the fundamental concept of copying data from one machine to another has not been challenged in collaborative science. Recent storage systems use modern fabrics to provide faster remote data access with lower overhead, so traditional data movement using Data Transfer Nodes (DTNs) must cope with a paradigm shift from a store-and-forward model to streaming data with direct storage access over the network. This study evaluates NVMe-over-TCP (NVMe-TCP) in a long-distance network, using different file systems and configurations, to characterize remote NVMe file system access performance in metropolitan-area network (MAN) and wide-area network (WAN) data movement scenarios. We found that NVMe-TCP is better suited to remote reads than to remote writes over the network, and that using RAID0 can significantly improve performance in a long-distance network. Additionally, fine-tuning the file system can improve remote write performance on DTNs over a long-distance network.

Highlights

  • Big data movement plays a critical role in science collaboration

  • Data Transfer Nodes (DTNs) are widely deployed in science facilities such as the Pacific Research Platform (PRP) (https://pacificresearchplatform.org/) and the Metropolitan Research and Education Network (MREN) Research Platform (MRP) (http://mren.org/)

  • We set up an NVMe-TCP target on the DTN at Northwestern University and use the DTN at the University of Illinois at Chicago

Introduction

Big data movement plays a critical role in science collaboration. To operate DTNs and keep up with scientific research, we need to continuously improve big data movement services and adopt new technologies that enhance data transfer performance. DTNs are specialized systems for high-performance data movement, equipped with a high-performance network interface, a local storage system, and processors to enhance network data transfer. Modern DTNs often have connections of 100 Gbps or faster and sit at the perimeter of a campus or lab network, where they are configured with a specific network policy that allows high-performance data exchange. DTNs are a key component in data sharing and collaborative science, and they are widely deployed in science facilities such as the Pacific Research Platform (PRP) (https://pacificresearchplatform.org/ (accessed on 6 October 2021)) and the MREN Research Platform (MRP) (http://mren.org/ (accessed on 6 October 2021)).
