Abstract
Read–copy update (RCU) is a synchronization mechanism used heavily in key components of the Linux kernel, such as the virtual filesystem (VFS), to achieve scalability by exploiting RCU’s ability to allow concurrent reads and updates. RCU’s design is non-trivial: it takes significant effort to understand fully, let alone to become convinced that its implementation is faithful to its specification and provides its claimed properties. The situation is not made any easier by the fact that Linux kernels are becoming increasingly complex over time and are deployed on machines with more and more cores and weak memory. This article presents an approach to systematically test the code of the main implementation of RCU used in the Linux kernel (Tree RCU) for concurrency errors, under both sequentially consistent and weak memory. Our modeling allows Nidhugg, a stateless model checking tool, to reproduce, within seconds, safety and liveness bugs that have been reported for RCU. Additionally, we present the real cause behind some failures that have been observed in production systems in the past. More importantly, we were able to verify both the publish–subscribe and the grace-period guarantee, the latter being the basic and most important guarantee that RCU offers, on several Linux kernel versions, for particular configurations. Our approach is effective, both in dealing with the increased complexity of recent Linux kernels and in terms of the time that the process requires. We hold that our effort constitutes a good first step toward making tools such as Nidhugg part of the standard testing infrastructure of the Linux kernel.
Highlights
The Linux kernel is used in a surprisingly large number of devices: from PCs and servers to routers and smart TVs
This article reports on the use of stateless model checking for testing the core of Tiny RCU and Tree RCU, two read–copy update (RCU) implementations used in the Linux kernel
Our effort concentrated on particular kernel configurations, but we investigated the effects that weak memory models (TSO and PSO) may have on RCU’s operation
Summary
The Linux kernel is used in a surprisingly large number of devices: from PCs and servers to routers and smart TVs. This article reports on the use of stateless model checking (also known as systematic concurrency testing) for testing the core of Tiny RCU and Tree RCU, two RCU implementations used in the Linux kernel. Using our model, as well as the source code from five different kernel versions directly, we verified both a part of the publish–subscribe guarantee and the grace-period guarantee. We were able to demonstrate that a submitted patch, intended to impose a locking design, in reality fixed a much more serious bug that was responsible for failures observed in production systems some years back, a fact that was previously unknown. We report on this issue and present the exact conditions under which this bug occurs. In non-preemptible kernels, which are the ones we focus on in this work, RCU imposes zero overhead on readers.
More From: International Journal on Software Tools for Technology Transfer