Fat-tree Network Research Articles

The parallel performance of the three-level fast and robust overlapping Schwarz (FROSch) preconditioners is investigated for linear elasticity. The FROSch framework is part of the Trilinos software library and contains a parallel implementation of different preconditioners with energy minimizing coarse spaces of generalized Dryja--Smith--Widlund type. The three-level extension is constructed by a recursive application of the FROSch preconditioner to the coarse problem. In this paper, the additional steps in the implementation in order to apply the FROSch preconditioner recursively are described in detail. Furthermore, it is shown that no explicit geometric information is needed in the recursive application of the preconditioner. In particular, the rigid body modes, including the rotations, can be interpolated on the coarse level without additional geometric information. Parallel results for a three-dimensional linear elasticity problem obtained on the Theta supercomputer (Argonne Leadership Computing Facility, Argonne, IL) using up to 220 000 cores are discussed and compared to results obtained on the SuperMUC-NG supercomputer (Leibniz Supercomputing Centre, Garching, Germany). Notably, it can be observed that a hierarchical communication operation in FROSch related to the coarse operator starts to dominate the computing time on Theta, which has a dragonfly interconnect, for 100 000 message passing interface (MPI) ranks or more. The same operation, however, scales well and stays within the order of a second in all experiments performed on SuperMUC-NG, which uses a fat tree network. Using hybrid MPI/OpenMP parallelization, the onset of the MPI communication problem on Theta can be delayed. Further analysis of the performance of FROSch on large supercomputers with dragonfly interconnects will be necessary.

We investigate the advantages of using co-packaged optics for building low-diameter, large-scale high-performance computing (HPC) and data center networks. The increased escape bandwidth offered by co-packaged optics can enable high-radix switch implementations of more than 150 switch ports, which can be combined with data rates of up to 400 Gb/s per port. From the network architecture perspective, the key benefits of using co-packaged optics in future fat-tree networks include (a) the ability to implement large-scale topologies of > --> 11 , 000 end points by eliminating the need for a third switching layer and (b) the ability to provide up to 4 × higher bisection bandwidth compared to existing solutions, reducing at the same time the number of required switch application-specific integrated circuits by > --> 80 % . From the network operation perspective, both reduced energy consumption and lower packet delays can be achieved since fewer hops are required; i.e., packets need to traverse fewer serializer/deserializer lanes and fewer switch buffers, which reduces the probability of contending with other packets and improves the tolerance of network congestion. The performance of the proposed architecture is evaluated via discrete-event simulations for a wide range of representative HPC synthetic-traffic cases that include both hotspot and non-hotspot scenarios. The simulation results suggest that co-packaged optics form a promising solution to keep up with bandwidth scaling in future networks, while the reduced number of switching layers can lead to significant mean packet delay improvements that start from 30% and reach up to 74% for high-load conditions.

Fat-tree Network Research Articles

Related Topics

Articles published on Fat-tree Network

3D DFT by block tensor-matrix multiplication via a modified Cannon's algorithm: Implementation and scaling on distributed-memory clusters with fat tree networks

Load balancing and topology dynamic adjustment strategy for power information system network: a deep reinforcement learning-based approach

Modular Control Plane Verification via Temporal Invariants

Parallel Scalability of Three-Level FROSch Preconditioners to 220000 Cores using the Theta Supercomputer

Deep Reinforcement Learning-Based Network Routing Technology for Data Recovery in Exa-Scale Cloud Distributed Clustering Systems

A Software Defined Network Scheme for Intra Datacenter Network Based on Fat-Tree Topology

A network-aware and power-efficient virtual machine placement scheme in cloud datacenters based on chemical reaction optimization

Enhancing Robustness of Per-Packet Load-Balancing for Fat-Tree

Segment Switching: A New Switching Strategy for Optical HPC Networks

Toward lower-diameter large-scale HPC and data center networks with co-packaged optics

Load balancing using openday light SDN controller: Case study

Data privacy‐based coordinated placement method of workflows and data

Towards Network-Aware Service Composition in the Cloud

Towards an efficient combination of adaptive routing and queuing schemes in Fat-Tree topologies

A Scalable, High-Performance, and Fault-Tolerant Network Architecture for Distributed Machine Learning

The high-speed networks of the Summit and Sierra supercomputers

Evaluation of Firewall and Load balance in Fat-Tree Topology Based on Floodlight Controller

Performance drop at executing communication-intensive parallel algorithms

OPS: A Fairness Link Allocation Based on SDN in Datacenter Networks

Path2SL: Leveraging InfiniBand Resources to Reduce Head-of-Line Blocking in Fat Trees

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Fat-tree Network Research Articles

Related Topics

Articles published on Fat-tree Network

3D DFT by block tensor-matrix multiplication via a modified Cannon's algorithm: Implementation and scaling on distributed-memory clusters with fat tree networks

Load balancing and topology dynamic adjustment strategy for power information system network: a deep reinforcement learning-based approach

Modular Control Plane Verification via Temporal Invariants

Parallel Scalability of Three-Level FROSch Preconditioners to 220000 Cores using the Theta Supercomputer

Deep Reinforcement Learning-Based Network Routing Technology for Data Recovery in Exa-Scale Cloud Distributed Clustering Systems

A Software Defined Network Scheme for Intra Datacenter Network Based on Fat-Tree Topology

A network-aware and power-efficient virtual machine placement scheme in cloud datacenters based on chemical reaction optimization

Enhancing Robustness of Per-Packet Load-Balancing for Fat-Tree

Segment Switching: A New Switching Strategy for Optical HPC Networks

Toward lower-diameter large-scale HPC and data center networks with co-packaged optics

Load balancing using openday light SDN controller: Case study

Data privacy‐based coordinated placement method of workflows and data

Towards Network-Aware Service Composition in the Cloud

Towards an efficient combination of adaptive routing and queuing schemes in Fat-Tree topologies

A Scalable, High-Performance, and Fault-Tolerant Network Architecture for Distributed Machine Learning

The high-speed networks of the Summit and Sierra supercomputers

Evaluation of Firewall and Load balance in Fat-Tree Topology Based on Floodlight Controller

Performance drop at executing communication-intensive parallel algorithms

OPS: A Fairness Link Allocation Based on SDN in Datacenter Networks

Path2SL: Leveraging InfiniBand Resources to Reduce Head-of-Line Blocking in Fat Trees