Data Access Performance Research Articles

With data-intensive artificial intelligence (AI) and machine learning (ML) applications rapidly surging, modern high-performance embedded systems with heterogeneous computing resources critically demand low-latency, high-bandwidth data communication. As such, the emerging NVMe (Non-Volatile Memory Express) protocol, with parallel queuing, access prioritization, and optimized I/O arbitration, is being widely adopted as a de facto fast I/O communication interface. However, effectively leveraging the potential of modern NVMe storage proves nontrivial and demands fine-grained control, high processing concurrency, and application-specific optimization. Fortunately, modern FPGA devices, capable of efficient parallel processing and application-specific programmability, readily meet the underlying physical-layer requirements of the NVMe protocol, therefore providing unprecedented opportunities to implement a feature-rich NVMe middleware to benefit modern high-performance embedded computing. In this article, we present how to rethink existing access mechanisms for NVMe storage and devise innovative hardware-assisted solutions to accelerate NVMe data access for high-performance embedded computing systems. Our key idea is to exploit the massively parallel I/O queuing capability provided by the NVMe storage system by leveraging FPGAs' reconfigurability and native hardware computing power to operate transparently to the main processor.
Specifically, our DirectNVM system aims at providing effective hardware constructs for facilitating high-performance and scalable userspace storage applications through (1) hardening all the essential NVMe driver functionalities, therefore avoiding expensive OS syscalls and enabling zero-copy data access from the application, (2) relying on hardware rather than OS-level interrupts for I/O communication control, which can significantly reduce both total I/O latency and its variance, and (3) exposing cutting-edge and application-specific weighted-round-robin I/O traffic scheduling to the userspace. To validate our design methodology, we developed a complete DirectNVM system utilizing the Xilinx Zynq MPSoC architecture, which incorporates a high-performance application processor (APU) equipped with DDR4 system memory and a hardened configurable PCIe Gen3 block in its programmable logic part. We then measured the storage bandwidth and I/O latency of both our DirectNVM system and a conventional OS-based system when executing the standard FIO benchmark suite [2]. Specifically, compared against the PetaLinux built-in kernel driver code running on a Zynq MPSoC, our DirectNVM has been shown to achieve up to 18.4× higher throughput and up to 4.5× lower latency. To ensure the fairness of our performance comparison, we also measured our DirectNVM system against the Intel SPDK [26], a highly optimized userspace asynchronous NVMe I/O framework running on an x86 PC system. Our experimental results show that our DirectNVM, even running on a considerably less powerful embedded ARM processor than a full-scale AMD processor, achieved up to 2.2× higher throughput and 1.3× lower latency. Furthermore, by experimenting with a multi-threading test case, we have demonstrated that our DirectNVM's weighted-round-robin scheduling can significantly optimize the bandwidth allocation between latency-constrained frontend applications and other backend applications in real-time systems.
Finally, we have developed a theoretical framework of performance modeling with classic queuing theory that can quantitatively define the relationship between a system’s I/O performance and its I/O implementation.
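
The weighted-round-robin I/O scheduling mentioned in the abstract above can be illustrated with a small software sketch. This is only a generic model of the technique, not DirectNVM's hardware arbiter: the function name `weighted_round_robin`, the queue names `frontend` and `backend`, and the weights are all hypothetical. In each round the arbiter serves up to `weights[name]` commands from each non-empty queue, so a higher weight gets a larger share of the I/O bandwidth while lower-weight queues are never starved.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield (queue_name, command) pairs in weighted-round-robin order.

    queues:  dict mapping a queue name to a list of pending commands
    weights: dict mapping a queue name to the maximum number of commands
             served from that queue per round (hypothetical values)
    """
    pending = {name: deque(cmds) for name, cmds in queues.items()}
    # Keep cycling over the queues until every queue has been drained.
    while any(pending.values()):
        for name, q in pending.items():
            # Serve at most `weights[name]` commands from this queue.
            for _ in range(weights[name]):
                if not q:
                    break
                yield name, q.popleft()


# Hypothetical workload: a latency-sensitive frontend and a bulk backend.
queues = {
    "frontend": [f"f{i}" for i in range(6)],
    "backend": [f"b{i}" for i in range(6)],
}
weights = {"frontend": 3, "backend": 1}

order = [cmd for _, cmd in weighted_round_robin(queues, weights)]
print(order)
```

With a 3:1 weighting, the frontend queue drains roughly three times as fast as the backend queue while both have work pending; once the frontend queue is empty, the backend receives all remaining service.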

Named data networking (NDN), as a specific architecture design of information-centric networking (ICN), has quickly become a promising candidate for future Internet architecture, where communications are driven by data names instead of IP addresses. To realize the NDN communication paradigm in the future Internet, two important features, stateful forwarding and in-network caching, have been proposed to cope with the drawbacks of host-based communication protocols. Stateful forwarding is designed to maintain the state of pending Interest packets to guide Data packets back to requesting consumers, while in-network caching is used to reduce both network traffic and data access delay to improve the overall performance of data access. However, the conventional stateful forwarding approach is not adaptive and responsive to diverse network conditions because it fails to consider multiple network metrics when making Interest forwarding decisions. In addition, the default in-network caching strategy stores each received Data packet regardless of caching constraints and criteria, which causes the routers in the vicinity of data producers to suffer from excessive caching overhead. In this paper, we propose Pro NDN, a novel stateful forwarding and in-network caching strategy for NDN networks. The Pro NDN consists of multicriteria decision-making (MCDM) based Interest forwarding and cooperative data caching. The basic idea of the MCDM-based Interest forwarding is to employ the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to dynamically evaluate outgoing interface alternatives based on multiple network metrics and objectively select an optimal outgoing interface to forward the Interest packet. In addition, the cooperative data caching consists of two schemes: CacheData, which caches the data, and CacheFace, which caches the outgoing interface.
We conduct extensive simulation experiments for performance evaluation and comparison with prior schemes. The simulation results show that the Pro NDN can improve Interest satisfaction ratio and Interest satisfaction latency as well as reduce hop count and Content Store utilization ratio.
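
The TOPSIS step named in the abstract above can be sketched as follows. This is a textbook TOPSIS ranking under stated assumptions, not the paper's implementation: the function name `topsis`, the three metrics (round-trip time, satisfaction ratio, pending Interests), the weights, and the per-face numbers are all hypothetical, since the abstract does not list the metrics actually used. Each outgoing interface is scored by its relative closeness to an ideal alternative, and the highest-scoring interface is selected.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).

    matrix:  one row per alternative, one column per criterion
    weights: criterion weights
    benefit: True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_crit)]
        for row in matrix
    ]
    # Ideal and anti-ideal points, taking the criterion direction into account.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]

    scores = []
    for row in v:
        d_best = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, ideal)))
        d_worst = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Hypothetical per-face metrics: [RTT in ms, satisfaction ratio, pending Interests]
faces = {
    "face1": [20, 0.95, 5],
    "face2": [50, 0.80, 12],
    "face3": [35, 0.90, 3],
}
weights = [0.4, 0.4, 0.2]        # assumed criterion weights
benefit = [False, True, False]   # lower RTT and fewer pending Interests are better

scores = dict(zip(faces, topsis(list(faces.values()), weights, benefit)))
best = max(scores, key=scores.get)
print(best, scores)
```

With these made-up numbers the sketch selects `face1`, which has the lowest round-trip time and the highest satisfaction ratio; changing the weights shifts the trade-off between criteria.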

Articles published on Data Access Performance

PROPEL Discharge: An Interdisciplinary Throughput Initiative

Design of software system architecture for reading and publishing amateur literary works

Data-access performance anti-patterns in data-intensive systems

Blockchain-based data sharing algorithm in distributed network data storage

Optimizing the Performance of Data Warehouse by Query Cache Mechanism in Big Data

Granularity-Driven Management for Reliable and Efficient Skyrmion Racetrack Memories

Optimization Design and Performance Analysis of Improved IEEE802.11p MAC Mechanism Based on High Mobility of Vehicle

Dimensional Modeling Method Discussion for the Profits from Mineral Rights Transfer Management

WukaStore: Scalable, Configurable and Reliable Data Storage on Hybrid Volunteered Cloud and Desktop Systems

DirectNVM: Hardware-accelerated NVMe SSDs for High-performance Embedded Computing

A data grouping model based on cache transaction for unstructured data storage systems

A machine learning assisted data placement mechanism for hybrid storage systems

Robustness of the Storage in Cloud Data Centers Based on Simple Swarm Optimization Algorithm

Blackbox Testing On E-Commerce System Web-Based Evermos (Feature: Registration Experiment & Revamp)

Pro NDN: MCDM-Based Interest Forwarding and Cooperative Data Caching for Named Data Networking

Improving in-memory file system reading performance by fine-grained user-space cache mechanisms

RETRACTED: Cloud music teaching database based on opencl design and neural network

A popularity-aware reconstruction technique in erasure-coded storage systems

Integrating cellular automata and discrete global grid systems: a case study into wildfire modelling

Learning-based dynamic cache management in a cloud
