Fountain codes, originally developed for reliable multicasting in communication networks, are effectively applied in various data transmission and storage systems. Their recent use in DNA data storage systems poses unique challenges, since the DNA storage channel deviates from the traditional Gaussian white noise erasure model considered in communication networks and has several restrictions as well as special properties. Optimizing fountain codes to address these challenges therefore promises to improve their overall usability in DNA data storage systems. In this article, we present several methods for optimizing fountain codes for DNA data storage. Apart from generally applicable optimizations for fountain codes, we propose optimization algorithms that create tailored distribution functions for fountain codes, an approach that is novel in the context of DNA data storage. We evaluate the proposed methods in terms of various metrics related to the DNA storage channel. Our evaluation shows that optimizing fountain codes for DNA data storage can significantly enhance the reliability and capacity of DNA data storage systems. The developed methods represent a step forward in harnessing the full potential of fountain codes for DNA-based data storage applications. The new coding schemes and all developed methods are available under a free and open-source software license.