Abstract

This letter proposes a self-healing architecture called the automatic monitoring architecture (AMA), which helps a RISC-V server self-heal from hardware errors. AMA reduces redundant error classification and monitors only the hardware devices that account for a relatively large proportion of errors in a RISC-V server, thereby reducing its own resource consumption while maintaining performance. In addition, AMA uses the Correctable-error Dynamic Threshold technology to further reduce the probability of serious uncorrectable hardware errors. Compared with a RISC-V server without a self-healing system, a server using AMA has about 80% less downtime per year, while AMA itself consumes very few hardware resources. Compared with other self-healing architectures, such as Intel’s machine check architecture, AMA can reduce server downtime by an additional 5% per year. AMA is therefore highly efficient with little resource consumption.

