Set Of Codewords Research Articles

As interest in DNA-based information storage grows, the cost of synthesis has been identified as a key bottleneck. A potential direction is to tune synthesis for data. Data strands tend to be composed of a small set of recurring code word sequences, and they contain longer sequences of repeated data. To exploit these properties, we propose a new framework called DINOS. DINOS consists of three key parts: (i) The first is a hierarchical strand assembly algorithm, inspired by gene assembly techniques, that can assemble arbitrary data strands from a small set of primitive blocks. (ii) The assembly algorithm relies on our novel formulation for constructing primitive blocks, spanning a variety of useful configurations, from a set of code words and overhangs. Each primitive block is a code word flanked by a pair of overhangs that are created by a cyclic pairing process that keeps the number of primitive blocks small. In theory, these primitive blocks suffice to assemble a data strand of arbitrary length. We show a minimal system for a binary code with as few as six primitive blocks, and we generalize our processes to support an arbitrary set of overhangs and code words. (iii) We exploit our hierarchical assembly approach to identify redundant sequences and coalesce the reactions that create them, making assembly more efficient. We evaluate DINOS and describe its key characteristics. For example, the number of reactions needed to make a strand can be reduced by increasing either the number of overhangs or the number of code words, but increasing the number of overhangs offers a small advantage over increasing code words while requiring substantially fewer primitive blocks. Density, however, is improved more by increasing the number of code words. We also find that a simple redundancy coalescing technique reduces reactions by 90.6% and 41.2% on average for decompressed and compressed data, respectively, even when the smallest data fragments being assembled are 16 bits. With a simple padding heuristic that finds even more redundancy, reactions at the same operating point decrease further, by up to 91.1% and 59% on average for decompressed and compressed data, respectively. Our approach offers up to 80% greater density than a prior general-purpose gene assembly technique. Finally, in an analysis of synthesis costs that compares making a 1 GB volume entirely with de novo synthesis against making only the primitive blocks with de novo synthesis and assembling the rest with DINOS, we estimate DINOS to be 10^5× cheaper than de novo synthesis.
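The effect of redundancy coalescing can be illustrated with a toy model. The sketch below is not the DINOS algorithm: it assumes a plain pairwise (binary-tree) assembly of fixed-length fragments and ignores overhang compatibility, which in a real system would constrain which intermediate products can be reused. The function name `count_reactions` and the parameter `fragment_len` are hypothetical.

```python
def count_reactions(data, fragment_len=2):
    """Return (naive, coalesced) counts of pairwise join reactions.

    Toy model only: fragments are joined two at a time, level by level.
    A reaction is identified by its (left, right) inputs, and coalescing
    means that identical reactions are performed once and shared.
    """
    # Split the data string into fixed-length starting fragments.
    level = [data[i:i + fragment_len] for i in range(0, len(data), fragment_len)]
    naive = 0          # every join counted separately
    unique = set()     # distinct (left, right) joins
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            left, right = level[i], level[i + 1]
            naive += 1
            unique.add((left, right))
            next_level.append(left + right)
        if len(level) % 2:
            # An unpaired fragment is carried up to the next level unchanged.
            next_level.append(level[-1])
        level = next_level
    return naive, len(unique)


# Highly repetitive data: 7 joins collapse to 3 distinct reactions.
repetitive = count_reactions("01" * 8, fragment_len=2)
# Data with no repeated pairs: nothing can be coalesced.
distinct = count_reactions("00011011", fragment_len=2)
```

In this model, repetitive input benefits far more from coalescing than input with no repeated pairs, which is consistent with the abstract's observation that savings are smaller for compressed data.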


Smart IoT sensors are characterized by their ability to sense and process signals, producing high-level information that is usually sent wirelessly while minimising energy consumption and maximising communication efficiency. Systems are getting smarter, meaning that they are providing ever richer information from the same raw data. This increasing intelligence can occur at various levels, including in the sensor itself, at the edge, and in the cloud. As sending one byte of data is several orders of magnitude more energy-expensive than processing it, data must be handled as near as possible to its generation. Thus, the intelligence should be located in the sensor; nevertheless, it is not always possible to do so because real data is not always available for designing the algorithms or the hardware capacity is limited. Smart devices detecting data coming from inertial sensors are a good example of this. They generate hundreds of bytes per second (100 Hz, 12-bit sampling of a triaxial accelerometer) but useful information comes out in just a few bytes per minute (number of steps, type of activity, and so forth). We propose a lossy compression method to reduce the dimensionality of raw data from accelerometers, gyroscopes, and magnetometers, while maintaining a high quality of information in the reconstructed signal coming from an embedded device. The implemented method uses an adaptive vector-quantisation algorithm that represents the input data with a limited set of codewords. The adaptive process generates a codebook that evolves to become highly specific for the input data, while providing high compression rates. The codebook’s reconstruction quality is measured with a peak signal-to-noise ratio (PSNR) above 40 dB for a 12-bit representation.
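For scale, a triaxial accelerometer sampled at 100 Hz with 12 bits per axis produces 100 × 3 × 12 = 3600 bits, or 450 bytes, per second. The sketch below shows one way an adaptive vector quantiser of this general kind could look; it is an illustration under stated assumptions, not the method from the article. The update rule, the threshold, the learning rate, and all function names are hypothetical, and the test signal is synthetic.

```python
import math


def nearest(codebook, v):
    """Return (index, squared distance) of the codeword closest to v."""
    best_i, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(c, v))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def adapt_codebook(vectors, max_size=16, new_threshold=50.0 ** 2, rate=0.1):
    """Grow and refine a codebook online (assumed update rule).

    A vector far from every codeword becomes a new codeword while there
    is room; otherwise the nearest codeword is nudged toward the vector.
    """
    codebook = [list(vectors[0])]
    for v in vectors[1:]:
        i, d = nearest(codebook, v)
        if d > new_threshold and len(codebook) < max_size:
            codebook.append(list(v))
        else:
            codebook[i] = [c + rate * (x - c) for c, x in zip(codebook[i], v)]
    return codebook


def psnr(original, reconstructed, peak=4095):
    """PSNR in dB; peak=4095 is the largest value of a 12-bit sample."""
    n = sum(len(v) for v in original)
    mse = sum((a - b) ** 2
              for u, w in zip(original, reconstructed)
              for a, b in zip(u, w)) / n
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Synthetic 12-bit triaxial samples (not real sensor data).
samples = [(2048 + round(500 * math.sin(t / 5)),
            2048 + round(500 * math.cos(t / 5)),
            1024) for t in range(200)]
codebook = adapt_codebook(samples)
# Each sample is replaced by the index of its nearest codeword.
indices = [nearest(codebook, v)[0] for v in samples]
reconstructed = [codebook[i] for i in indices]
quality_db = psnr(samples, reconstructed)
```

With at most 16 codewords, each three-axis sample is represented by a 4-bit index instead of 36 bits of raw data. The quality obtained depends on the codebook size and the signal, so this sketch should not be expected to reproduce the PSNR figure reported above.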

Articles published on Set Of Codewords

On Iiro Honkala’s Contributions to Identifying Codes

Compression of Natural Language Texts by Reverse Multi-Delimiter Codes

Minimal codewords in Norm-Trace codes

Approximate Autonomous Quantum Error Correction with Reinforcement Learning

CeRA-eSP: Code-Expanded Random Access to Enhance Success Probability of Massive MTC

On the Generalized Covering Radii of Reed-Muller Codes

Free Resolutions and Generalized Hamming Weights of Binary Linear Codes

DINOS: Data INspired Oligo Synthesis for DNA Data Storage

Reversible Self-Dual Method for Constructing DNA Codes over the Finite Field GF(4)

Computing generalized Hamming weights of binary linear codes via free resolutions

Adaptive Modulation SCMA Codebook Design Based on Constellation Rotation

Estimator design for complex networks with encoding decoding mechanism and buffer-aided strategy: A partial-nodes accessible case

Binary Cyclic Pearson Codes

Constant Transmission Efficiency Dimming Control Scheme for VLC Systems

Dimensionality Reduction for Smart IoT Sensors

On Bases of BCH Codes with Designed Distance 3 and Their Extensions

Code design for run-length control in visible light communication

Sparsifying parity-check matrices

Covering Codes Using Insertions or Deletions

Code Design for Non-Coherent Index Modulation
