Key-value Cache Research Articles

Due to large data volume and low latency requirements of modern web services, the use of an in-memory key-value (KV) cache often becomes an inevitable choice (e.g., Redis and Memcached). The in-memory cache holds hot data, reduces request latency, and alleviates the load on background databases. Inheriting from the traditional hardware cache design, many existing KV cache systems still use recency-based cache replacement algorithms, e.g., least recently used or its approximations. However, the diversity of miss penalty distinguishes a KV cache from a hardware cache. Inadequate consideration of penalty can substantially compromise space utilization and request service time. KV accesses also demonstrate locality, which needs to be coordinated with miss penalty to guide cache management. In this article, we first discuss how to enhance the existing cache model, the Average Eviction Time model, so that it can adapt to modeling a KV cache. After that, we apply the model to Redis and propose pRedis, Penalty- and Locality-aware Memory Allocation in Redis, which synthesizes data locality and miss penalty, in a quantitative manner, to guide memory allocation and replacement in Redis. At the same time, we also explore the diurnal behavior of a KV store and exploit long-term reuse. We replace the original passive eviction mechanism with an automatic dump/load mechanism, to smooth the transition between access peaks and valleys. Our evaluation shows that pRedis effectively reduces the average and tail access latency with minimal time and space overhead. For both real-world and synthetic workloads, our approach delivers an average of 14.0%∼52.3% latency reduction over a state-of-the-art penalty-aware cache management scheme, Hyperbolic Caching (HC), and shows more quantitative predictability of performance. Moreover, we can obtain even lower average latency (1.1%∼5.5%) when dynamically switching policies between pRedis and HC.

Read full abstract

Remote Direct Memory Access (RDMA) fabrics such as InfiniBand and Converged Ethernet report latency shorter by a factor of 50 than TCP. As such, RDMA is a potential replacement for TCP in datacenters (DCs) running low-latency applications, such as Web search and memcached. InfiniBand’s Shared Receive Queues (SRQs), which use two-sided send/recv verbs (i.e., channel semantics ), reduce the amount of pre-allocated, pinned memory (despite optimizations such as InfiniBand’s on-demand paging (ODP)) for message buffers. However, SRQs are limited fundamentally to a single message size per queue, which incurs either memory wastage or significant programmer burden for typical DC traffic of an arbitrary number (level of burstiness) of messages of arbitrary size. We propose remote indirect memory access (RIMA) , which avoids these pitfalls by providing (1) network interface card (NIC) microarchitecture support for novel queue semantics and (2) a new “verb” called append . To append a sender’s message to a shared queue, the receiver NIC atomically increments the queue’s tail pointer by the incoming message’s size and places the message in the newly created space. As in traditional RDMA, the NIC is responsible for pointer lookup, address translation, and enforcing virtual memory protections. This indirection of specifying a queue (and not its tail pointer, which remains hidden from senders) handles the typical DC traffic of an arbitrary sender sending an arbitrary number of messages of arbitrary size. Because RIMA’s simple hardware adds only 1--2 ns to the multi-\mu s message latency, RIMA achieves the same message latency and throughput as InfiniBand SRQ with unlimited buffering. Running memcached traffic on a 30-node InfiniBand cluster, we show that at similar, low programmer effort, RIMA achieves significantly smaller memory footprint than SRQ. However, while SRQ can be crafted to minimize memory footprint by expending significant programming effort, RIMA provides those benefits with little programmer effort. For memcached traffic, a high-performance key-value cache ( FastKV ) using RIMA achieves either 3× lower 96 th-percentile latency or significantly better throughput or memory footprint than FastKV using RDMA.

Read full abstract

Key-value Cache Research Articles

Related Topics

Articles published on Key-value Cache

Key-Value Cache Quantization in Large Language Models: A Safety Benchmark

Open-AI model Efficient Memory Reduce Management for the Large Language Models (LLMs) Serving with Paged Attention of sharing the KV Cashes

Proto-Adapter: Efficient Training-Free CLIP-Adapter for Few-Shot Image Classification.

Improving key-value cache performance with heterogeneous memory tiering: A case study of CXL-based memory expansion

Catalyst: Optimizing Cache Management for Large In-memory Key-value Systems

SPOPB: Reducing solid state drive write traffic for flash‐based key‐value caching

A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at Twitter

Penalty- and Locality-aware Memory Allocation in Redis Using Enhanced AET

Prism-SSD: A Flexible Storage Interface for SSDs

SlimCache

Network Interface Architecture for Remote Indirect Memory Access (RIMA) in Datacenters

Fast In-Memory Key–Value Cache System with RDMA

Lightweight and Accurate Memory Allocation in Key-Value Cache

DIDACache

MemSC: A Scan-Resistant and Compact Cache Replacement Framework for Memory-Based Key-Value Cache Systems

Bluecache

Key-value cache based IFC model implementation for web environments

R-memcached: A reliable in-memory cache for big key-value stores

Building a high-performance key–value cache as an energy-efficient appliance

Characterizing Facebook's Memcached Workload

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Key-value Cache Research Articles

Related Topics

Articles published on Key-value Cache

Key-Value Cache Quantization in Large Language Models: A Safety Benchmark

Open-AI model Efficient Memory Reduce Management for the Large Language Models (LLMs) Serving with Paged Attention of sharing the KV Cashes

Proto-Adapter: Efficient Training-Free CLIP-Adapter for Few-Shot Image Classification.

Improving key-value cache performance with heterogeneous memory tiering: A case study of CXL-based memory expansion

Catalyst: Optimizing Cache Management for Large In-memory Key-value Systems

SPOPB: Reducing solid state drive write traffic for flash‐based key‐value caching

A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at Twitter

Penalty- and Locality-aware Memory Allocation in Redis Using Enhanced AET

Prism-SSD: A Flexible Storage Interface for SSDs

SlimCache

Network Interface Architecture for Remote Indirect Memory Access (RIMA) in Datacenters

Fast In-Memory Key–Value Cache System with RDMA

Lightweight and Accurate Memory Allocation in Key-Value Cache

DIDACache

MemSC: A Scan-Resistant and Compact Cache Replacement Framework for Memory-Based Key-Value Cache Systems

Bluecache

Key-value cache based IFC model implementation for web environments

R-memcached: A reliable in-memory cache for big key-value stores

Building a high-performance key–value cache as an energy-efficient appliance

Characterizing Facebook's Memcached Workload