Unreliable Bits Research Articles

Large-granularity memory failures continue to be a critical impediment to system reliability. To make matters worse, as DRAM scales to smaller nodes, the frequency of unreliable bits in DRAM chips continues to increase. To mitigate such scaling-related failures, memory vendors are planning to equip existing DRAM chips with On-Die ECC. To maintain compatibility with memory standards, On-Die ECC is kept invisible to the memory controller. This paper explores how to design high-reliability memory systems in the presence of On-Die ECC. We show that if On-Die ECC is not exposed to the memory system, having a 9-chip ECC-DIMM (implementing SECDED) provides almost no reliability benefits compared to an 8-chip non-ECC DIMM. We also show that if the error detection of On-Die ECC can be exposed to the memory controller, then Chipkill-level reliability can be achieved even with a 9-chip ECC-DIMM. To this end, we propose eXposed On-Die Error Detection (XED), which exposes the On-Die error detection information without requiring changes to the memory standards or incurring bandwidth overheads. When the On-Die ECC detects an error, XED transmits a pre-defined "catch-word" instead of the corrected data value. On receiving the catch-word, the memory controller uses the parity stored in the ninth chip of the ECC-DIMM to correct the faulty chip (similar to RAID-3). Our studies show that XED provides Chipkill-level reliability (172x higher than SECDED), while incurring negligible overheads, with a 21% lower execution time than Chipkill. We also show that XED can enable Chipkill systems to provide Double-Chipkill level reliability while avoiding the associated storage, performance, and power overheads.
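The catch-word and parity step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the catch-word value, the one-integer-per-chip layout, and the names `write_line` and `read_line` are assumptions made for illustration, and the sketch ignores the case where legitimate data happens to equal the catch-word.

```python
from functools import reduce
from operator import xor

# Hypothetical catch-word; the abstract does not specify its value.
CATCH_WORD = 0xDEADBEEFCAFEF00D


def write_line(data_words):
    """Store 8 data words plus their XOR parity in a ninth word (RAID-3 style)."""
    if len(data_words) != 8:
        raise ValueError("expected 8 data words, one per data chip")
    parity = reduce(xor, data_words, 0)
    return list(data_words) + [parity]


def read_line(chip_words):
    """Controller-side read: rebuild the chip that sent the catch-word, if any."""
    flagged = [i for i, w in enumerate(chip_words) if w == CATCH_WORD]
    if not flagged:
        return list(chip_words[:8])
    if len(flagged) > 1:
        # A single parity word can only rebuild one missing chip.
        raise ValueError("more than one chip flagged an error")
    bad = flagged[0]
    # XOR of the other eight words equals the missing word.
    rebuilt = reduce(xor, (w for i, w in enumerate(chip_words) if i != bad), 0)
    words = list(chip_words)
    words[bad] = rebuilt
    return words[:8]
```

The catch-word turns an unknown error into an erasure at a known position, which is why a single parity word is enough to recover the faulty chip's contribution in this sketch.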


In communication systems employing a serially concatenated cyclic redundancy check (CRC) code along with a convolutional code (CC), erroneous packets after CC decoding are usually discarded. The list Viterbi algorithm (LVA) and the iterative Viterbi algorithm (IVA) are two existing approaches capable of recovering erroneously decoded packets. We here employ a soft decoding algorithm for CC decoding, and introduce several schemes to identify error patterns using the posterior information from the CC soft decoding module. The resultant iterative decoding-detecting (IDD) algorithm improves error performance by iteratively updating the extrinsic information based on the CRC parity check matrix. Assuming errors only happen in unreliable bits characterized by small absolute values of the log-likelihood ratio (LLR), we also develop a partial IDD (P-IDD) alternative which exhibits comparable performance to IDD by updating only a subset of unreliable bits. We further derive a soft-decision syndrome decoding (SDSD) algorithm, which identifies error patterns from a set of binary linear equations derived from CRC syndrome equations. Being noniterative, SDSD is able to estimate error patterns directly from the decoder output. The packet error rate (PER) performance of SDSD is analyzed following the union bound approach on pairwise errors. Simulations indicate that both IDD and IVA are better tailored for single parity check (PC) codes than for CRC codes. SDSD outperforms both IDD and LVA with weak CC and strong CRC. Applicable to AWGN and flat fading channels, our algorithms can also be extended to turbo coded systems.
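The idea of restricting the search to unreliable bits, that is, bits with small |LLR|, can be sketched as follows. This is a brute-force flip-and-check illustration, not the IDD, P-IDD, or SDSD algorithms from the abstract. The function names, the toy generator polynomial, and the sign convention (a non-negative LLR means bit 0) are all assumptions.

```python
from itertools import combinations


def crc_remainder(bits, poly):
    """Remainder of bits (MSB first) divided by poly over GF(2); poly starts with 1."""
    reg = list(bits)
    n = len(poly)
    for i in range(len(reg) - n + 1):
        if reg[i]:
            for j in range(n):
                reg[i + j] ^= poly[j]
    return reg[-(n - 1):]


def encode(msg, poly):
    """Append the CRC so that the whole codeword divides evenly by poly."""
    padded = list(msg) + [0] * (len(poly) - 1)
    return list(msg) + crc_remainder(padded, poly)


def least_reliable(llrs, k):
    """Indices of the k bits with the smallest |LLR|."""
    return sorted(range(len(llrs)), key=lambda i: abs(llrs[i]))[:k]


def recover(llrs, poly, k=4):
    """Flip subsets of the k least reliable bits until the CRC checks, else None."""
    hard = [0 if l >= 0 else 1 for l in llrs]  # assumed sign convention
    candidates = least_reliable(llrs, k)
    for r in range(k + 1):  # try fewer flips first
        for flips in combinations(candidates, r):
            trial = list(hard)
            for i in flips:
                trial[i] ^= 1
            if not any(crc_remainder(trial, poly)):
                return trial
    return None
```

With a short CRC such a search can accept a wrong pattern that happens to pass the check, so `k` trades recovery ability against the risk of miscorrection.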



Articles published on Unreliable Bits

XED

Highly Reliable Spin-Transfer Torque Magnetic RAM-Based Physical Unclonable Function With Multi-Response-Bits Per Cell

Synergistic High Charge-Storage Capacity for Multi-level Flexible Organic Flash Memory

39 fJ/bit On-Chip Identification of Wireless Sensors Based on Manufacturing Variation

Iterative Soft-Decision Decoding of Hermitian Codes

Restoration of embedded image from corrupted stego image

IDMA-based cooperative partial packet recovery: principles and applications

Threshold-Based Relaying in Coded Cooperative Networks

CRC-assisted error correction in a convolutionally coded system

Reliability-Based Hybrid ARQ Scheme with Encoded Parity Bit Retransmissions and Message Passing Decoding

Algebraic Soft-Decision Decoding of Reed-Solomon Codes with Erasures on Gaussian Channels

Reliability-based hybrid ARQ

High rate concatenated coding systems using bandwidth efficient trellis inner codes

