Abstract

The existing global–local multiscale computational methods, which use finite element discretization at both the macro-scale and the micro-scale, are intensive in terms of both computational time and memory requirements, and their parallelization using domain decomposition methods incurs substantial communication overhead, limiting their application. We are interested in a class of explicit global–local multiscale methods whose architecture significantly reduces this communication overhead on massively parallel machines. However, a naïve task decomposition that distributes individual macro-scale integration points to a single group of processors is not optimal: it leads to communication overhead and idling of processors. To overcome this problem, we have developed a novel coarse-grained parallel algorithm in which groups of macro-scale integration points are distributed to a layer of processors. Each processor in this layer communicates locally with a group of processors responsible for the micro-scale computations. These overlapping groups of processors are shown to achieve optimal concurrency at significantly reduced communication overhead. Several example problems are presented to demonstrate the efficiency of the proposed algorithm. Copyright © 2009 John Wiley & Sons, Ltd.
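The following is a minimal sketch, not the authors' implementation, of how such an overlapping two-layer processor grouping could be expressed with MPI communicators in C. The layer size NUM_MACRO, the contiguous block partitioning of ranks, and the placeholder micro-scale computation are all illustrative assumptions.

```c
/* Hypothetical sketch of a two-layer processor grouping: each micro
 * group shares one rank with the macro layer, so macro-micro exchange
 * stays local to the group. Not the paper's actual code. */
#include <mpi.h>
#include <stdio.h>

#define NUM_MACRO 4   /* assumed size of the macro-scale processor layer */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Partition all ranks into NUM_MACRO contiguous blocks; the lowest
     * rank of each block also serves as that group's macro-layer
     * processor, so each micro group overlaps the macro layer. */
    int block    = (size + NUM_MACRO - 1) / NUM_MACRO;
    int group    = rank / block;          /* micro group this rank joins */
    int is_macro = (rank % block == 0);   /* block head = macro-layer rank */

    MPI_Comm micro_comm, macro_comm;
    /* Micro-scale communication stays inside this local communicator,
     * avoiding global collectives across all processors. */
    MPI_Comm_split(MPI_COMM_WORLD, group, rank, &micro_comm);
    /* Macro-layer ranks additionally form their own communicator for
     * macro-scale exchange; all other ranks opt out. */
    MPI_Comm_split(MPI_COMM_WORLD, is_macro ? 0 : MPI_UNDEFINED,
                   rank, &macro_comm);

    /* Stand-in for one coupled step: the group head broadcasts a
     * macro-scale quantity to its micro group, each rank performs a
     * placeholder micro-scale computation, and the results are reduced
     * back to the head (which has rank 0 in micro_comm by construction). */
    double macro_state = is_macro ? 1.0 : 0.0;
    MPI_Bcast(&macro_state, 1, MPI_DOUBLE, 0, micro_comm);
    double local_result = macro_state * (rank + 1);  /* dummy micro result */
    double group_result = 0.0;
    MPI_Reduce(&local_result, &group_result, 1, MPI_DOUBLE,
               MPI_SUM, 0, micro_comm);
    if (is_macro)
        printf("macro rank %d: reduced micro result %g\n", rank, group_result);

    MPI_Comm_free(&micro_comm);
    if (macro_comm != MPI_COMM_NULL) MPI_Comm_free(&macro_comm);
    MPI_Finalize();
    return 0;
}
```

Because each block head belongs to both its micro-group communicator and the macro-layer communicator, macro-to-micro data exchange never crosses group boundaries, which is the locality property the abstract attributes to the overlapping-groups design.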