Write optimization of log-structured flash file system for parallel I/O on manycore servers

Chang-Gyu Lee,Youngjae Kim,Hyeongu Kang,Sunghyun Noh,Hyunki Byun

doi:10.1145/3319647.3325828

Chang-Gyu Lee, Youngjae Kim + Show 3 more

https://doi.org/10.1145/3319647.3325828

Copy DOI

Export

Save

Cite

Publication Date: May 22, 2019

Citations: 10

Affiliation: Sogang University

Abstract
Full-Text
Similar Papers

Abstract

Listen

In Manycore server environment, we observe the performance degradation in parallel writes and identify the causes as follows - (i) When multiple threads write to a single file simultaneously, the current POSIX-based F2FS file system does not allow this parallel write even though ranges are distinct where threads are writing. (ii) The high processing time of Fsync at file system layer degrades the I/O throughput as multiple threads call Fsync simultaneously. (iii) The file system periodically checkpoints to recover from system crashes. All incoming I/O requests are blocked while the checkpoint is running, which significantly degrades overall file system performance. To solve these problems, first, we propose file systems to employ a fine-grained file-level Range Lock that allows multiple threads to write on mutually exclusive ranges of files rather than the course-grained inode mutex lock. Second, we propose NVM Node Logging that uses NVM as an extended storage space to store file metadata and file system metadata at high speed during Fsync and checkpoint operations. In particular, the NVM Node Logging consists of (i) a fine-grained inode structure to solve the write amplification problem caused by flushing the file metadata in block units and (ii) a Pin Point NAT (Node Address Table) Update, which can allow flushing only modified NAT entries. We implemented Range Lock and NVM Node Logging for F2FS in Linux kernel 4.14.11. Our extensive evaluation at two different types of servers (single socket 10 cores CPU server, multi-socket 120 cores NUMA CPU server) shows significant write throughput improvements in both real and synthetic workloads.

Full Text