In real-time systems, the execution-time overrun of a thread may cause that thread, or even other threads in the system, to miss a deadline. From a fault tolerance perspective, both execution-time overruns and deadline misses can be considered timing errors that could cause a failure in the system's ability to deliver its services in a timely manner. In this context, the ideal is to detect the error as soon as possible, so that its propagation can be limited and error recovery strategies can act on more accurate information. The run-time support mechanism usually deployed for monitoring the timing requirements of real-time systems is deadline monitoring: the system calls specific application code whenever a deadline is violated. Recognizing that deadline monitoring may not provide an adequate level of fault tolerance for timing errors, major real-time programming standards, such as Ada, POSIX and the Real-Time Specification for Java (RTSJ), have proposed different mechanisms for monitoring the execution time of threads. Nevertheless, to provide a complete fault tolerance approach for timing errors, the potential blocking time of threads also has to be monitored. In this article, we propose mechanisms for measuring and policing the blocking time of threads in the context of both the basic priority inheritance and priority ceiling protocols. The notion of blocking-time clocks and timers for the POSIX standard is proposed, implemented and evaluated in the open-source real-time operating system MaRTE OS. A blocking-time monitoring model for measuring and policing blocking times in the RTSJ framework is also specified. This model is implemented and evaluated in the RTSJ-compliant open-source middleware jRate, running on top of MaRTE OS.