RACE: One-sided RDMA-conscious Extendible Hashing

Abstract

Memory disaggregation is a promising technique in datacenters that improves resource utilization, failure isolation, and elasticity. Hashing indexes have been widely used to provide fast lookup services in distributed memory systems. However, traditional hashing indexes become inefficient for disaggregated memory, since the computing power in the memory pool is too weak to execute complex index requests. To provide efficient indexing services in disaggregated memory scenarios, this article proposes RACE hashing, a one-sided RDMA-Conscious Extendible hashing index with lock-free remote concurrency control and efficient remote resizing. RACE hashing enables all index operations to be executed efficiently using only one-sided RDMA verbs, without involving any compute resource in the memory pool. To support high-performance remote concurrent access, RACE hashing uses a lock-free remote concurrency control scheme that lets different clients operate concurrently on the same hashing index in the memory pool. To resize the hash table with low overhead, RACE hashing uses an extendible remote resizing scheme that reduces the extra RDMA accesses caused by extendible resizing and allows requests to execute concurrently during resizing. Extensive experimental results demonstrate that RACE hashing outperforms state-of-the-art distributed in-memory hashing indexes by 1.4–13.7× in YCSB hybrid workloads.
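
For readers unfamiliar with the extendible hashing structure that RACE hashing builds on, the following is a minimal single-process sketch of the classic scheme. It is not RACE hashing itself: it uses no RDMA, no remote concurrency control, and none of the paper's resizing optimizations. The class names, the bucket capacity, and the use of the low hash bits as the directory index are all illustrative assumptions.

```python
class Bucket:
    """A fixed-capacity bucket; local_depth counts the hash bits it owns."""

    def __init__(self, local_depth, capacity):
        self.local_depth = local_depth
        self.capacity = capacity
        self.items = {}


class ExtendibleHashTable:
    """Toy in-memory extendible hash table (hypothetical names throughout)."""

    def __init__(self, bucket_capacity=4):
        self.global_depth = 1
        self.directory = [Bucket(1, bucket_capacity), Bucket(1, bucket_capacity)]

    def _index(self, key):
        # The low global_depth bits of the hash select a directory entry.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        # Assumes keys do not all collide on their low hash bits;
        # otherwise splitting would never make room.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; the new half mirrors the old half.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.capacity)
        high_bit = 1 << (bucket.local_depth - 1)
        # Re-point the directory entries whose newly significant bit is set.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Only the overflowing bucket is rehashed, never the whole table.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            (sibling if hash(k) & high_bit else bucket).items[k] = v
```

The point of the structure is visible in `_split`: growth touches one bucket and, at most, copies the directory of pointers, which is why resizing can in principle proceed while other requests continue.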

Similar Papers
  • Conference Article
  • Citations: 16
  • 10.1145/3448016.3452817
CoRM
  • Jun 9, 2021
  • Konstantin Taranov + 2 more

Distributed memory systems are becoming increasingly important since they provide a system-scale abstraction where physically separated memories can be addressed as a single logical one. This abstraction enables memory disaggregation, allowing systems such as in-memory databases, caching services, and ephemeral storage to be naturally deployed at large scales. While this abstraction effectively increases the memory capacity of these systems, it incurs additional overheads for remote memory accesses. To narrow the difference between local and remote accesses, low-latency RDMA networks are a key element for efficient memory disaggregation. However, RDMA acceleration poses new obstacles to efficient memory management and particularly to memory compaction: network controllers and CPUs can concurrently access memory, potentially leading to inconsistencies if memory management operations are not synchronized. To ensure consistency, most distributed memory systems do not provide memory compaction and are exposed to memory fragmentation. We introduce CoRM, an RDMA-accelerated shared memory system that supports memory compaction and ensures strict consistency while providing one-sided RDMA accesses. We show that CoRM sustains high read throughput during normal operations, comparable to similar systems that do not provide memory compaction, while experiencing minimal overheads during compaction. CoRM never disrupts RDMA connections and can reduce applications' active memory by up to 6× by performing memory compaction.

  • Conference Article
  • Citations: 3
  • 10.23919/date54114.2022.9774546
REH: Redesigning Extendible Hashing for Commercial Non-Volatile Memory
  • Mar 14, 2022
  • Zhengtao Li + 2 more

Emerging Non-volatile Memory (NVM) is attractive because of its byte-addressability, durability, and DRAM-scale latency. Hashing indexes have been extensively used to provide fast query services in storage systems. Recent research proposes crash-consistent and write-optimized hashing indexes for NVM. However, existing NVM-based hashing indexes suffer from limited scalability when running on a commercial non-volatile memory product, the Intel Optane DC Persistent Memory Module (DCPMM), due to the limited bandwidth of Optane DCPMM. To achieve a high load factor, existing NVM-based hashing indexes often evict an existing item to its alternative position, which incurs extra writes and consumes the limited bandwidth. Moreover, lock operations and metadata updates further saturate the limited bandwidth and prevent the hash table from scaling. To achieve both scalable performance and a high load factor for NVM-based hashing indexes, we design a new persistent hashing index, called REH, based on extendible hashing. REH (1) proposes a selective persistence scheme that stores buckets in NVM and places the directory and metadata in DRAM to reduce both unnecessary NVM reads and writes, (2) uses 256 B buckets, as 256 B is the internal data access size in Optane DCPMM, with the buckets directly pointed to by directory entries, (3) leverages fingerprinting to further reduce unnecessary NVM reads, and (4) employs failure-atomic bucket splits to reduce bucket split overhead. Evaluations show that REH outperforms state-of-the-art NVM-based hashing indexes by 1.68–7.78×, while also achieving a high load factor.

  • Research Article
  • Citations: 17
  • 10.1145/3322096
Level Hashing
  • May 31, 2019
  • ACM Transactions on Storage
  • Pengfei Zuo + 2 more

Non-volatile memory (NVM) technologies as persistent memory are promising candidates to complement or replace DRAM for building future memory systems, owing to their high density, low power, and non-volatility. In main memory systems, hashing index structures are fundamental building blocks for providing fast query responses. However, hashing index structures originally designed for dynamic random access memory (DRAM) become inefficient for persistent memory due to new challenges, including the hardware limitations of NVM and the requirement of data consistency. To address these challenges, this article proposes level hashing, a write-optimized and high-performance hashing index scheme with a low-overhead consistency guarantee and cost-efficient resizing. Level hashing provides a sharing-based two-level hash table, which achieves constant-scale worst-case time complexity for search, insertion, deletion, and update operations, and rarely incurs extra NVM writes. To guarantee consistency with low overhead, level hashing leverages log-free consistency schemes for deletion, insertion, and resizing operations, and an opportunistic log-free scheme for update operations. To resize the hash table cost-efficiently, level hashing leverages an in-place resizing scheme that only needs to rehash 1/3 of the buckets, instead of the entire table, to expand a hash table and 2/3 of the buckets to shrink it, thus significantly improving resizing performance and reducing the number of rehashed buckets. Extensive experimental results show that level hashing speeds up insertions by 1.4×–3.0×, updates by 1.2×–2.1×, expanding by over 4.3×, and shrinking by over 1.4×, while maintaining high search and deletion performance compared with state-of-the-art hashing schemes.
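
The 1/3 and 2/3 fractions can be checked by simple bucket counting. The abstract does not state the level sizes, so this is a sketch under two assumptions: that the top level holds N buckets and the bottom level N/2, and that expansion rehashes only the old bottom level while shrinking rehashes only the old top level.

$$
\frac{N/2}{N + N/2} = \frac{1}{3}, \qquad \frac{N}{N + N/2} = \frac{2}{3}.
$$

Under these assumptions, the remaining level is reused in place as a level of the resized table, which is what avoids rehashing the entire table.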

  • Conference Article
  • Citations: 1
  • 10.1117/12.2628698
A remote access and control scheme for smart home based on blockchain
  • Apr 22, 2022
  • Lijuan He + 2 more

Smart homes use remote access and control technologies, but user identities are easily spoofed, user information is easily captured, and home gateways are vulnerable. A blockchain-based smart home remote access and control scheme is proposed. By combining blockchain, group signatures, and message authentication codes (MACs), the scheme ensures secure access to and control of home devices by smart home users and secure communication with the home gateway. The security analysis shows that the scheme offers tamper resistance as well as resistance to replay and DDoS attacks. Compared with other schemes, the proposed scheme has obvious advantages.

  • Research Article
  • Citations: 5
  • 10.14778/3641204.3641218
SepHash: A Write-Optimized Hash Index On Disaggregated Memory via Separate Segment Structure
  • Jan 1, 2024
  • Proceedings of the VLDB Endowment
  • Xinhao Min + 7 more

Disaggregated memory separates compute and memory resources into independent pools connected by fast RDMA (Remote Direct Memory Access) networks, which can improve memory utilization, reduce cost, and enable elastic scaling of compute and memory resources. Hash indexes provide high-performance single-point operations and are widely used in distributed systems and databases. However, under disaggregated memory, existing hash indexes suffer from write performance degradation due to high resize overhead and concurrency control overhead. Traditional write-optimized hash indexes are not efficient for disaggregated memory and sacrifice read performance. In this paper, we propose SepHash, a write-optimized hash index for disaggregated memory. First, SepHash proposes a two-level separate segment structure that significantly reduces the bandwidth consumption of resize operations. Second, SepHash employs a low-latency concurrency control strategy to eliminate unnecessary mutual exclusion and check overhead during insert operations. Finally, SepHash designs an efficient cache and filter to accelerate read operations. The evaluation results show that, compared to state-of-the-art distributed hash indexes, SepHash achieves a 3.3X higher write performance while maintaining comparable read performance.

  • Book Chapter
  • 10.1007/978-3-642-35419-9_22
Application-Oriented Designing for Remote Control Scheme of Robot Integrated Machine
  • Jan 1, 2013
  • Wenming Wang

This paper introduces the architecture and principles of a network-based remotely operated robot system. It designs a remote login control scheme and implements a local communication framework between the upper and lower computers based on USB 2.0. The robot is controlled by a single-chip microcomputer and can support wireless networks such as Bluetooth. With the aid of remote login, the integrated robot system can be controlled locally: the operator can follow the robot's movement from its video information and can obtain the real robot's specific parameters from a simulated robot. The CAMSHIFT algorithm is used to track and position the robot's moving target, so that the remote control center can carry out remote control tasks. The aim is to design a practical remote control system for robots and to provide a reference for robot applications in different fields.

Keywords: Robot integrated machine; Remote operation; USB communication; Single chip microcomputer

  • Research Article
  • Citations: 14
  • 10.1145/3634916
Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL
  • Jan 19, 2024
  • ACM Transactions on Architecture and Code Optimization
  • Zhonghua Wang + 6 more

Memory disaggregation is a promising architecture for modern datacenters that separates compute and memory resources into independent pools connected by ultra-fast networks, which can improve memory utilization, reduce cost, and enable elastic scaling of compute and memory resources. However, existing memory disaggregation solutions based on remote direct memory access (RDMA) suffer from high latency and additional overheads, including page faults and code refactoring. Emerging cache-coherent interconnects such as CXL offer opportunities to reconstruct high-performance memory disaggregation. However, existing CXL-based approaches are limited by physical distance and cannot be deployed across racks. In this article, we propose Rcmp, a novel low-latency and highly scalable memory disaggregation system based on RDMA and CXL. Its significant feature is that Rcmp improves the performance of RDMA-based systems via CXL and leverages RDMA to overcome CXL’s distance limitation. To address the mismatch between RDMA and CXL in terms of granularity, communication, and performance, Rcmp (1) provides global page-based memory space management and enables fine-grained data access, (2) designs an efficient communication mechanism to avoid communication blocking issues, (3) proposes a hot-page identification and swapping strategy to reduce RDMA communications, and (4) designs an RDMA-optimized RPC framework to accelerate RDMA transfers. We implement a prototype of Rcmp and evaluate its performance by using micro-benchmarks and running a key-value store with YCSB benchmarks. The results show that Rcmp can achieve 5.2× lower latency and 3.8× higher throughput than RDMA-based systems. We also demonstrate that Rcmp scales well with an increasing number of nodes without compromising performance.

  • Research Article
  • Citations: 5
  • 10.1155/2015/928174
A Novel Digital Certificate Based Remote Data Access Control Scheme in WSN
  • Jan 1, 2015
  • Journal of Sensors
  • Wei Liang + 3 more

A digital certificate based remote data access control scheme is proposed for the secure authentication of accessors in wireless sensor networks (WSNs). The scheme builds on an access control scheme based on characteristic expressions (named the CEB scheme). Data is divided by characteristics, and the encryption key is related to the characteristic expression. Only a key matching the characteristic expression can decrypt the data. Meanwhile, three distributed certificate detection methods are designed to prevent certificates from being misappropriated by hostile anonymous users. When a user starts a query, the key access control method can judge whether the query is valid. In this case, the scheme can achieve public certificate of users and effectively protect query privacy as well. The security analysis and experiments show that the proposed scheme is superior in communication overhead, storage overhead, and detection probability.

  • Conference Article
  • Citations: 6
  • 10.1145/3476886.3477507
DiLOS
  • Aug 24, 2021
  • Wonsup Yoon + 4 more

Memory disaggregation places computing and memory in physically separate nodes and achieves improved memory utilization in datacenters. Kernel-based approaches to memory disaggregation offer transparent virtual memory by using paging schemes but suffer from expensive page fault handling. As an alternative, library-based approaches incorporate application semantics into memory disaggregation and can even eliminate page fault handling on their data path. However, their lack of compatibility harms generality and obstructs wide adoption.

  • Conference Article
  • 10.1109/mec.2011.6025927
Scheme design and evaluation of anti-outburst remote control rig
  • Aug 1, 2011
  • Hou Zhi

Three remote control schemes were designed for an anti-outburst rig used in coal mines: remote control by converter, remote control by multi-speed motor, and remote control by hydraulics. To handle the uncertainty of the evaluation indexes and their weights in the evaluation process, the evaluation indexes were reduced by principal component analysis, weight uncertainty was eliminated using information entropy, and the final score was obtained by the weighted multiplication method. Finally, an evaluation index system for remote control schemes of the anti-outburst rig was established, the three schemes were evaluated by the principal component analysis and information entropy method, and remote control by multi-speed motor was found to be the best solution.

  • Research Article
  • 10.1155/2016/3676582
Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems
  • Jan 1, 2016
  • Computational Intelligence and Neuroscience
  • Danish Shehzad + 1 more

The increasing complexity of neuronal network models has escalated efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets across multiple processors to achieve better hardware performance. On parallel machines, interprocessor spike exchange consumes a large share of the overall simulation time for neuronal networks. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is used to exchange spikes after each interval across distributed memory systems. Although increasing the number of processors achieves concurrency and better performance, it adversely affects MPI_Allgather, increasing the communication time between processors. This necessitates improving the communication methodology to decrease the spike exchange time over distributed memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving from two-sided to one-sided communication, and the use of a recursive doubling mechanism achieves efficient communication between the processors in a precise number of steps. This approach enhances communication concurrency and improves overall runtime, making NEURON more efficient for the simulation of large neuronal network models.
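
The recursive doubling pattern mentioned above can be illustrated with a small pure-Python simulation. This is only a sketch of the general communication pattern, not the paper's implementation: it uses no MPI or RMA calls, the function name is hypothetical, and it assumes the number of ranks is a power of two.

```python
def recursive_doubling_allgather(local_values):
    """Simulate an allgather over len(local_values) ranks (a power of two).

    Returns (result, steps): result[r] is the full list of values as seen
    by rank r, and steps is the number of exchange rounds performed.
    """
    p = len(local_values)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two number of ranks")

    # gathered[r] maps source rank -> value known to rank r so far.
    gathered = [{r: v} for r, v in enumerate(local_values)]
    steps = 0
    distance = 1
    while distance < p:
        # Snapshot so every exchange in a round sees pre-round data,
        # mimicking simultaneous pairwise communication.
        snapshot = [dict(g) for g in gathered]
        for r in range(p):
            partner = r ^ distance  # rank differing in exactly one bit
            gathered[r].update(snapshot[partner])
        distance <<= 1  # the amount of data exchanged doubles each round
        steps += 1

    return [[g[s] for s in range(p)] for g in gathered], steps
```

Each rank exchanges everything it holds with a partner whose rank differs in one bit, so P ranks finish in log2(P) rounds rather than the P - 1 rounds a simple ring exchange would need.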

  • Research Article
  • Citations: 26
  • 10.1007/s00500-021-06512-8
ECC-based lightweight authentication and access control scheme for IoT E-healthcare
  • Nov 26, 2021
  • Soft Computing
  • Hailong Yao + 4 more

The E-healthcare system has a complex architecture, diverse business types, and sensitive data security. To meet the secure communication and access control requirements in the user–medical server, user–patient, patient–medical server, and other scenarios in the E-healthcare system, secure and efficient authenticated key agreement and access authorization scheme need to be studied. However, the existing multi-server solutions do not consider the authentication requirements of the Wireless Body Area Network (WBAN) and are not suitable for user–patient, patient–medical server scenarios; most of the existing WBAN authentication schemes are single-server type, which are difficult to meet the requirements of multi-server applications, and the study of user–patient real-time scenarios has not received due attention. This work first reveals the structural flaws and security vulnerabilities of the existing typical schemes and then proposes an authentication and access control architecture suitable for multiple scenarios of the E-healthcare system with separate management and business and designs a novel ECC-based multi-factor remote authentication and access control scheme for E-healthcare using physically unclonable function (PUF) and hash. Security analysis and efficiency analysis show that the new scheme has achieved improved functionality and higher security while maintaining low computational and communication overhead.

  • Research Article
  • Citations: 1
  • 10.1109/tsc.2020.3026138
ANNPDP: An Efficient and Stable Evaluation Engine for Large-Scale Policy Sets
  • Sep 24, 2020
  • IEEE Transactions on Services Computing
  • Fan Deng + 7 more

As interactions between individuals and services increase, requests are more frequent and policy sets are larger. The evaluation performance of PDP (Policy Decision Point) plays a key role in the operation of a system. In order to solve bottlenecks of improving the PDP evaluation performance for large-scale policy sets, we propose an evaluation engine based on artificial neural networks, namely ANNPDP. We transform rules in a large-scale policy set described in the XACML (eXtensible Access Control Markup Language) into numerical rules. Evaluation networks are established and trained by the numerical rules. In order to ensure the accuracy, a misjudgment set is constructed for error corrections and stored by hash indexes. By simulating the arrival of requests, ANNPDP is compared with the Sun PDP, HPEngine, XEngine, and SBA-XACML. The experiment results show that ANNPDP has: 1) high performance: if the number of requests reaches 10,000, the evaluation time of ANNPDP on the large-scale policy set with 100,000 rules is approximately 0.46, 0.93, 0.71, and 1.43 percent of that of the Sun PDP, HPEngine, XEngine, and SBA-XACML, respectively, and 2) stability: as the size of the large-scale policy set and the number of requests increase, the evaluation time of ANNPDP grows linearly. ANNPDP can satisfy the requirements of an authorization system with large-scale policy sets.

  • Conference Article
  • Citations: 5
  • 10.1145/3552326.3587434
DyTIS: A Dynamic Dataset Targeted Index Structure Simultaneously Efficient for Search, Insert, and Scan
  • May 8, 2023
  • Jin Yang + 4 more

Many datasets in real life are complex and dynamic, that is, their key densities are varied over the whole key space and their key distributions change over time. It is challenging for an index structure to efficiently support all key operations for data management, in particular, search, insert, and scan, for such dynamic datasets. In this paper, we present DyTIS (Dynamic dataset Targeted Index Structure), an index that targets dynamic datasets. DyTIS, though based on the structure of Extendible hashing, leverages the CDF of the key distribution of a dataset, and learns and adjusts its structure as the dataset grows. The key novelty behind DyTIS is to group keys by the natural key order and maintain keys in sorted order in each bucket to support scan operations within a hash index. We also define what we refer to as a dynamic dataset and propose a means to quantify its dynamic characteristics. Our experimental results show that DyTIS provides higher performance than the state-of-the-art learned index for the dynamic datasets considered.

  • Research Article
  • 10.1145/3707642
A Dynamic Characteristic Aware Index Structure Optimized for Real-world Datasets
  • Feb 8, 2025
  • ACM Transactions on Storage
  • Jin Yang + 4 more

Many datasets in real life are complex and dynamic, that is, their key densities are varied over the whole key space and their key distributions change over time. It is challenging for an index structure to efficiently support all key operations for data management, in particular, search, insert, and scan, for such dynamic datasets. In this article, we present DyTIS (Dynamic dataset Targeted Index Structure), an index that targets dynamic datasets. DyTIS, although based on the structure of Extendible hashing, leverages the CDF of the key distribution of a dataset, and learns and adjusts its structure as the dataset grows. The key novelty behind DyTIS is to group keys by the natural key order and maintain keys in sorted order in each bucket to support scan operations within a hash index. We also define what we refer to as a dynamic dataset and propose a means to quantify its dynamic characteristics. Our experimental results show that DyTIS provides higher performance than the state-of-the-art learned index for the dynamic datasets considered. We also analyze the effects of the dynamic characteristics of datasets, including sequential datasets, as well as the effect of multiple threads on the performance of the indexes.
