Abstract

Emerging high-density non-volatile random access memories (NVRAMs) can significantly enhance server main memory by providing both higher memory density and fast persistent memory. An unique design requirement for server main memory is strong reliability because uncorrectable errors can cause a system crash or permanent data loss. Traditional dynamic random access memory (DRAM) subsystems have used chipkill-correct to provide this reliability, while storage systems provide similar protection using very long ECC words (VLEWs). This paper presents an efficient chipkill-correct scheme for persistent memory based on high-density NVRAMs. For efficiency, the scheme decouples error correction at boot time from error correction at runtime. At boot time, when bit error rates are higher, the scheme uses VLEWs to efficiently ensure reliable data survival for a week to a year without refresh by correcting a large number of bit errors at low storage cost. At runtime, when bit error rates are lower, it reuses each memory block's chip failure protection bits to opportunistically correct bit errors at high performance. The proposal incurs a total storage cost of 27%. Compared to a bit error correction scheme, the proposal adds chip failure protection at no additional storage cost and at 2% average performance overhead.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.