Abstract

In many computer systems, the contents of memory are protected by an error detection and correction (EDAC) code. Bit-flips caused by single-event upsets (SEUs) are a well-known problem in memory chips, and EDAC codes have been an effective solution to this problem. These codes are usually implemented in hardware using extra memory bits and encoding/decoding circuitry. In systems where EDAC hardware is not available, the reliability of the system can be improved by providing protection through software. Codes and techniques that can be used for software implementation of EDAC are discussed and compared. The implementation requirements and issues are discussed, and some solutions are presented. The paper discusses in detail how system-level and chip-level structures relate to multiple error correction. A simple solution is presented to make the EDAC scheme independent of these structures. The technique in this paper was implemented and used effectively in an actual space experiment. We have observed that SEUs corrupt the operating system or programs of a computer system that does not have any EDAC for memory, forcing the system to be reset frequently. Protecting the entire memory (code and data) might not be practical in software. However, this paper demonstrates that software-implemented EDAC is a low-cost solution that provides protection for code segments and can appreciably enhance system availability in a low-radiation space environment.
