Training a deep learning model generally requires substantial memory and processing power, yet once trained, the model can make predictions quickly with very little resource consumption. The learned weights can be fitted into a microcontroller to build affordable embedded intelligence systems, an approach also known as TinyML. Although a few attempts have been made, the state of the art in training deep learning models directly on a microcontroller can be pushed further. Deep learning models are generally trained with gradient-based optimizers, which achieve high accuracy but demand considerable resources. Nature-inspired meta-heuristic optimizers, on the other hand, can build a fast approximation of the model's optimal solution with low resources. After rigorous testing, we found that the Grey Wolf Optimizer can be modified for enhanced use of main memory, paging, and swap space among the α, β, δ, and ω wolves. This modification saved up to 71% of the memory required by gradient-based optimizers. We used this modification to train a TinyML model within a microcontroller with 256 KB of RAM. The performance of the proposed framework has been meticulously benchmarked on 13 open-source datasets.
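To illustrate the optimizer family the abstract refers to, the sketch below shows a minimal standard Grey Wolf Optimizer in NumPy. This is an illustrative assumption, not the paper's memory-optimized variant: function names, population size, and the linearly decreasing exploration factor are generic GWO conventions. The key property relevant to low-memory training is visible here: the algorithm keeps only the current population and its three leaders (α, β, δ), with the remaining ω wolves updated in place and no gradients stored.

```python
import numpy as np

def grey_wolf_optimize(loss, dim, bounds, n_wolves=8, n_iters=50, seed=0):
    """Minimal Grey Wolf Optimizer sketch (generic GWO, not the paper's
    modified memory/paging scheme). Minimizes `loss` over a box."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # Population of candidate weight vectors (the "wolves").
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    for t in range(n_iters):
        scores = np.array([loss(w) for w in wolves])
        order = np.argsort(scores)
        # The three best wolves lead the pack; all others are omega wolves.
        alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]
        a = 2.0 * (1 - t / n_iters)  # exploration factor, decays 2 -> 0
        for i in range(n_wolves):
            x = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A = 2 * a * r1 - a          # step scaling
                C = 2 * r2                  # leader-position weighting
                D = np.abs(C * leader - wolves[i])
                x += leader - A * D         # move toward this leader
            # New position: average of the three leader-guided moves.
            wolves[i] = np.clip(x / 3.0, lo, hi)
    scores = np.array([loss(w) for w in wolves])
    best = wolves[np.argmin(scores)]
    return best, float(scores.min())
```

Because the update rule needs only the population array and three leader vectors, its working-set size is fixed by `n_wolves * dim`, which is what makes gradient-free optimizers of this kind attractive on RAM-constrained microcontrollers.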