Abstract

First-generation, highly available computer systems deployed a two-level physical hierarchy in which a shelf was composed of field-replaceable units (FRUs), and the FRU was the unit of fault detection, fault isolation, fault containment, fault recovery, fault repair, and sparing. In 1995, IEEE introduced the non-hot-swappable PCI Mezzanine Card (PMC) draft standard [1], which allows fault detection, isolation, containment, recovery, and sparing to be implemented at the mezzanine card level but requires fault repair to occur at the carrier board level. In 2005, the PCI Industrial Computer Manufacturers Group (PICMG®) introduced the hot-swappable Advanced Mezzanine Card (AMC) standard [2], which extends the PMC model to allow all fault management functions, including fault repair, to be implemented at the mezzanine card level. This paper develops fault management strategies and availability models for the monolithic, non-hot-swap partitioned, and hot-swap partitioned hardware architectures.
