Abstract

Log-Structured-Merge (LSM) Tree gains much attention recently because of its superior performance in write-intensive workloads. LSM Tree uses an append-only structure in memory to achieve low write latency; at memory capacity, in-memory data are flushed to other storage media (e.g. disk). Consequently, read access is slower comparing to write. These specific features of LSM, including no in-place update and asymmetric read/write performance raise unique challenges in index maintenance for LSM. The structural difference between LSM and B-Tree also prevents mature B-Tree based approaches from being directly applied. To address the issues of index maintenance for LSM, we propose Diff-Index to support a spectrum of index maintenance schemes to suit different objectives in index consistency and performance. The schemes consist of sync-full, sync-insert, async-simple and async-session. Experiments on our HBase implementation quantitatively demonstrate that Diff-Index offers various performance/consistency balance and satisfactory scalability while avoiding global coordination. Syncinsert and async-simple can reduce 60%-80% of the overall index update latency when compared to the baseline syncfull ; async-simple can achieve superior index update performance with an acceptable inconsistency. Diff-Index exploits LSM features such as versioning and the flush-compact process to achieve goals of concurrency control and failure ∗Work done while author was at IBM Almaden Research Center. †Work done while author was an intern at IBM T. J. Watson Research Center. (c) 2014, Copyright is with the authors. Published in Proc. EDBT on OpenProceedings.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0. recovery with low complexity and overhead. Diff-Index is included in IBM InfoSphere BigInsights, an IBM big data offering.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call