Reversible DNA data hiding using multiple difference expansions for DNA authentication and storage

Suk-Hwan Lee, Won-Joo Hwang, Eung-Joo Lee, Ki-Ryong Kwon

doi:10.1007/s11042-017-5379-1

Abstract

Data hiding techniques for DNA sequences have recently attracted interest for DNA authentication and high-capacity DNA storage. However, because a DNA sequence carries the primary information that directs the functions of an organism, DNA data hiding must be distortion-free (so-called reversible DNA data hiding) and must offer high capacity, a low change rate of nucleotide bases, and biological preservation. In this paper, we present two approaches to reversible DNA data hiding using multiple difference expansions. Reversible DNA data hiding should consider the string structure of a DNA sequence, biological functionality, efficient recovery, and optimal embedding capacity. Our method converts the string sequence of four characters (A, T, C, G) of noncoding DNA sequences into decimal-coded values and embeds the watermark into the coded value sequence using two approaches: DE-based multiple bits embedding (DE-MBE), which uses pairs of neighboring values, and consecutive DE-MBE (CDE-MBE), which uses the previously embedded coded values as the current estimates. Both approaches use comparison searching to prevent false start codons that produce false coding regions (exons), and both embed multiple bits for maximal expandability of differences within the range of coded values. In experiments using bacterial and archaeal sequences, we verified that CDE-MBE has an embedding capacity 1.13 to 9.03 times higher than conventional methods, produces no false start codons, provides security through secure numerical coding, and recovers the host sequence perfectly without a reference sequence. In particular, CDE-MBE has an embedding capacity two times greater than that of DE-MBE.
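
The core idea named in the abstract, difference expansion on decimal-coded base words, can be illustrated with a minimal sketch. This is not the DE-MBE or CDE-MBE algorithm itself: it shows only the classic single-bit difference expansion on one pair of values. The base-to-digit mapping, the word length, and all function names are assumptions chosen for illustration, and the false start codon check is not shown.

```python
# Hypothetical base-4 mapping; the paper's actual (secure) numerical coding is not given here.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}
DIGIT_TO_BASE = "ACGT"


def encode_word(word):
    """Convert a short base string into a decimal-coded value (base-4 positional code)."""
    value = 0
    for base in word:
        value = value * 4 + BASE_TO_DIGIT[base]
    return value


def decode_word(value, length):
    """Convert a decimal-coded value back into a base string of the given length."""
    bases = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        bases.append(DIGIT_TO_BASE[digit])
    return "".join(reversed(bases))


def de_embed(x, y, bit):
    """Embed one bit into the pair (x, y) by expanding their difference."""
    low = (x + y) // 2            # integer average, unchanged by embedding
    diff = 2 * (x - y) + bit      # expanded difference carries the bit in its parity
    return low + (diff + 1) // 2, low - diff // 2


def de_extract(x_marked, y_marked):
    """Recover the original pair and the embedded bit from a marked pair."""
    low = (x_marked + y_marked) // 2
    diff = x_marked - y_marked
    bit = diff % 2                # parity of the expanded difference
    orig = diff // 2              # original difference (floor division)
    return low + (orig + 1) // 2, low - orig // 2, bit


def is_expandable(x, y, bit, length):
    """Check that the marked pair stays within the range of coded values."""
    x_marked, y_marked = de_embed(x, y, bit)
    top = 4 ** length - 1
    return 0 <= x_marked <= top and 0 <= y_marked <= top


if __name__ == "__main__":
    x, y = encode_word("ACGT"), encode_word("ACGA")
    if is_expandable(x, y, 1, 4):
        xm, ym = de_embed(x, y, 1)
        print(decode_word(xm, 4), decode_word(ym, 4))
        print(de_extract(xm, ym) == (x, y, 1))
```

The range check reflects the abstract's point that differences are expanded only within the range of coded values; a pair whose marked values would leave that range cannot carry a bit in this simplified scheme.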

Full Text