Abstract

A variety of workloads on wide-area networks (WANs) can benefit from parallel data streams. Whether between public and private clouds or between high-performance computing centres, high bandwidth-delay-product (BDP) networks with even small amounts of packet loss (e.g., due to congestion) can suffer reduced TCP/IP throughput. We therefore design, implement, and evaluate Parallel Data Streams (PDS), an open-source, user-level tool. PDS makes trade-offs (e.g., in TCP fairness) for specific workloads where performance is the key goal. Unlike the well-known GridFTP, PDS also supports tools such as rsync, git, Virtual Network Computing (VNC), and the Network File System (NFS). We contribute a quantitative performance evaluation of PDS on a combination of emulated and real WANs. We establish that PDS achieves performance comparable to GridFTP for file transfer, while offering additional functionality via the other tools it supports. For example, PDS can transfer a 14 GB file over a WAN between Alberta and Quebec (maximum 1 Gbps; over 3,100 km) at 861 Mbps, using rsync and 8 parallel, cleartext TCP streams. In comparison, rsync over a single SSH stream achieves 274 Mbps.
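To put the reported figures in perspective, the following minimal sketch works through the arithmetic behind the Alberta-to-Quebec example. It is not part of PDS. It assumes that 1 GB means 10^9 bytes and that the quoted rates are averages over the whole transfer; the function and variable names are hypothetical and chosen only for illustration.

```python
def transfer_seconds(size_gb: float, rate_mbps: float) -> float:
    """Time to move size_gb gigabytes at an average rate of rate_mbps.

    Assumption: 1 GB = 10**9 bytes and 1 Mbps = 10**6 bits per second.
    """
    bits = size_gb * 1e9 * 8
    return bits / (rate_mbps * 1e6)


# Figures quoted in the abstract.
FILE_GB = 14      # file size
PDS_MBPS = 861    # rsync over PDS, 8 parallel cleartext TCP streams
SSH_MBPS = 274    # rsync over a single SSH stream
LINK_MBPS = 1000  # stated maximum of the WAN path (1 Gbps)

pds_time = transfer_seconds(FILE_GB, PDS_MBPS)
ssh_time = transfer_seconds(FILE_GB, SSH_MBPS)
speedup = PDS_MBPS / SSH_MBPS        # ratio of the two average rates
utilisation = PDS_MBPS / LINK_MBPS   # fraction of the 1 Gbps maximum

print(f"PDS transfer time:  {pds_time:.0f} s")
print(f"SSH transfer time:  {ssh_time:.0f} s")
print(f"Speedup:            {speedup:.2f}x")
print(f"Link utilisation:   {utilisation:.1%}")
```

Under these assumptions, the parallel-stream transfer finishes in a little over two minutes, whereas the single SSH stream needs close to seven, a speedup of roughly 3.1x at about 86% of the stated link maximum.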
