Abstract

Software health management is defined as a technology that applies the principles and techniques of system health management to software systems. It is motivated by the apparent gap between the importance and complexity of software in today’s cyber-physical systems and the rare but undoubtedly real occurrence of software malfunctions in those systems. While engineers strive to create dependable systems, unforeseen environmental conditions or faults in the hardware can trigger latent defects in the software, with potentially negative consequences. The goal of software health management is to maintain system function and performance even when software fails in unexpected ways. System health management is a well-established discipline in aerospace systems: many air and space vehicles today have quite elaborate health management systems on board. Software fault tolerance techniques have also been well known and practiced since the days when computers were first used in critical applications. Software health management combines these directions: it borrows techniques such as anomaly detection, fault diagnostics, and mitigation from the first, and techniques such as triple modular redundancy and checkpoints and restarts from the second, to manage the ‘health’ of the software system and maintain its functionality and performance. As in system health management, the goal of software health management is to prevent a software fault from becoming a system failure.

