Abstract
Flash memory technologies rely on a flash translation layer (FTL) to manage the lack of in-place updates and to perform garbage collection. Current FTL management schemes do not exploit the semantics of the accessed data. In this paper, we explore how semantic knowledge can be exploited to build and maintain indexes for stored data automatically. Data indexing is a critical enabler for accelerating many database applications and big data analytics. Unlike traditional per-table or per-file indexes that are managed separately from the data, we propose to maintain indexes on a per-flash-page basis. Our approach, called FLash IndeXeR (FLIXR), builds and maintains page-level indexes whenever a page is written to flash. FLIXR updates the indexes alongside any data updates at page granularity, so the cost of the index update is hidden in the page write delay. FLIXR stores the index data for each page within the FTL entry associated with that page, thereby piggybacking index access on a page access request. FLIXR then consults the index data in each FTL entry to determine whether the associated page stores data with a given key. FLIXR achieves a 52.6% performance improvement on the TPC-C and TPC-H benchmarks compared to the conventional host-side indexing mechanism.
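The idea of keeping index data inside each FTL entry can be illustrated with a small sketch. This is not the FLIXR implementation: the abstract does not say how the per-page index is encoded, so the sketch assumes a simple minimum/maximum key range per page. The names `FTLEntry`, `TinyFTL`, `write_page`, `candidate_pages`, and `lookup` are hypothetical, and flash is modeled as an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FTLEntry:
    """Hypothetical FTL mapping entry extended with a per-page key summary.

    Assumption: the summary is a min/max key range; the source does not
    specify the actual index encoding.
    """
    physical_page: int
    min_key: Optional[int] = None
    max_key: Optional[int] = None

    def may_contain(self, key: int) -> bool:
        # An empty page can never match; otherwise test the key range.
        if self.min_key is None or self.max_key is None:
            return False
        return self.min_key <= key <= self.max_key


class TinyFTL:
    """Toy logical-to-physical page map with out-of-place writes."""

    def __init__(self) -> None:
        self.table: Dict[int, FTLEntry] = {}   # logical page -> FTL entry
        self.flash: Dict[int, List[int]] = {}  # physical page -> stored keys
        self.next_free = 0

    def write_page(self, lpn: int, keys: List[int]) -> None:
        # Out-of-place update: every write goes to a fresh physical page.
        ppn = self.next_free
        self.next_free += 1
        self.flash[ppn] = list(keys)
        # The index summary is rebuilt as part of the same page write.
        self.table[lpn] = FTLEntry(
            ppn, min(keys, default=None), max(keys, default=None)
        )

    def candidate_pages(self, key: int) -> List[int]:
        # Only FTL entries are inspected here; no flash page is read.
        return [lpn for lpn, e in self.table.items() if e.may_contain(key)]

    def lookup(self, key: int) -> List[int]:
        # Read only the candidate pages to confirm the key is present.
        return [
            lpn
            for lpn in self.candidate_pages(key)
            if key in self.flash[self.table[lpn].physical_page]
        ]
```

With a range summary, `candidate_pages` can return false positives (a key inside the range but not stored on the page), which is why `lookup` still reads the candidate pages. It cannot miss a page that holds the key, because the summary is rewritten on every page write.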
Highlights
Big data analytics and database operations rely on efficiently finding records to speed up query processing
Even with a multi-level index structure, such as a B+-tree, where the root nodes may be cached on the host DRAM, the remaining levels need to be accessed from storage
To demonstrate the above-mentioned challenges concretely, we describe how filtering and join processing operations in database management systems (DBMS) suffer from index access and maintenance overheads
Summary
Big data analytics and database operations rely on efficiently finding records to speed up query processing. When a server keeps its index structures in storage, accessing them requires a large number of I/O operations before the desired data is reached. To process a large number of I/O requests and manage flash memory, modern SSDs are equipped with a general-purpose multicore embedded processor that executes the SSD controller firmware. The firmware handles NVMe commands and data transfers between the host system and the NAND flash memory, manages the FTL table, and performs garbage collection (GC) and wear-leveling (WL). Modern SSDs provision GB-scale DRAM to cache FTL tables and hot pages for fast access. With these features, SSDs can complete NVMe operations within tens of microseconds [7], [8].