Generation of a fault-tolerant clock through redundant crystal oscillators

Wolfgang Dür, Matthias Függer, Andreas Steininger

doi:10.1016/j.microrel.2021.114088

Abstract

A precise and stable clock that is also fault tolerant is a fundamental prerequisite in safety-critical real-time systems. However, combining redundant, independent clock sources to form a unified fault-tolerant clock supply is non-trivial, especially when redundant clock outputs are required, e.g., for supplying the replicated nodes within a TMR architecture through a clock network that does not suffer from a single point of failure. Having these outputs fail independently while still keeping them tightly synchronized is highly desirable, as it substantially eases the design of the overall architecture. In this paper we address exactly this challenge. Our approach extends an existing, ring-oscillator-like distributed clock generation scheme by augmenting each of its constituent nodes with a stable clock reference. We introduce the appropriately modified algorithm and illustrate its operation by simulation experiments. These experiments further demonstrate that the four clock outputs of our circuit do not share a single point of failure, have small and bounded skew, remain stabilized to one crystal source during normal operation, do not propagate glitches from one failed clock to a correct one, and only exhibit slightly extended clock cycles during a short stabilization period after a component failure. In addition, we give a rigorous formal proof for the correctness of the algorithm on an abstraction level that is close to the implementation.

Highlights

Computers are being entrusted with safety-critical functions in a rapidly increasing number of applications, with autonomous vehicles being just one recent example.
One threat to triple-modular redundant (TMR) architectures is the so-called common-mode failure: if two of the three redundant nodes fail in the same way, the voter will select the erroneous result.
Our envisioned use case is a TMR system whose redundant nodes shall be supplied with a clock that does not constitute a single point of failure.

Summary

INTRODUCTION

Computers are being entrusted with safety-critical functions in a rapidly increasing number of applications, with autonomous vehicles being just one recent example. While many alternative fault-tolerance techniques are available, (coarse-grain) triple-modular redundant (TMR) architectures have gained much popularity. This is partly due to the high error detection coverage they can attain through their “output centric” approach: no matter what the actual cause of an error may be, the voter simply takes the majority of matching outputs and masks the faulty one. Another beneficial feature of TMR is its simplicity: the redundant nodes can be off-the-shelf components (or IP modules) without any special features or extensions. The whole architecture is operated in lock-step, which significantly simplifies the voter. At this point the clock potentially becomes a single point of failure, unless it can be furnished with fault tolerance as well.
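The majority voting described above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `tmr_vote` and the integer node outputs are hypothetical, chosen only to show how a single fault is masked and how a common-mode failure defeats the voter.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def tmr_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by at least two of three nodes, else None.

    Hypothetical helper for illustration; a real voter operates on
    hardware signals in lock-step, not on Python values.
    """
    if len(outputs) != 3:
        raise ValueError("a TMR voter expects exactly three outputs")
    # Find the most frequent output and how many nodes produced it.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


# One faulty node is masked by the two matching outputs.
single_fault = tmr_vote([42, 42, 7])

# Common-mode failure: two nodes fail identically and outvote the correct one.
common_mode = tmr_vote([7, 7, 42])

# Three mutually different outputs: no majority exists.
no_majority = tmr_vote([1, 2, 3])
```

Note that the voter compares outputs only and never inspects the cause of a mismatch, which is why it cannot distinguish two identically failed nodes from two correct ones.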

AND RELATED WORK

REQUIREMENTS

Starting point

Proposed extension with stable sources

Formalization of the modified algorithm

Steady-state operation of the extended algorithm

Model and preliminaries

Correctness analysis

EXPERIMENTAL EVALUATION

Steady state operation

Failure of the fastest TS node

Changing frequency

Discussion

CONCLUSION


Journal: Microelectronics and Reliability
Publication Date: Mar 26, 2021
License type: cc-by

Similar Papers

Merging Redundant Crystal Oscillators into a Fault-Tolerant Clock
Wolfgang Duer ... Andreas Steininger
01 Apr 2020

A Flexible Communication Protocol With Guaranteed Determinism for Distributed, Safety-Critical Real-Time Systems
Fawad Riasat Raja ... Rene Hexel
IEEE access : practical innovations, open solutions | VOL. 10
01 Jan 2021

Safety Critical Computer Systems: Failure Independence and Software Diversity Effects on Reliability of Dual Channel Structures
Hristo Hristov ... Wang Bo
Information Technologies and Control | VOL. 12
01 Jun 2014

Study of Architectural Design Patterns in Concurrence with Analysis of Design Pattern in Safety Critical Systems
Feby A Vinisha ... R Selvarani
01 Jan 2012
