Nowadays, with the increase in cyber-attacks, hacking, and data theft, maintaining data security and confidentiality is of paramount importance. Several techniques are used in cryptography and steganography to ensure their safety during the transfer of information between the two parties without interference from an unauthorized third party. This paper proposes a modern approach to cryptography and steganography based on exploiting a new environment: bases and protein chains used to encrypt and hide sensitive data. The protein bases are used to form a cipher key whose length is twice the length of the data to be encrypted. During the encryption process, the plain data and the cipher key are represented in several forms, including hexadecimal and binary representation, and several arithmetic operations are performed on them, in addition to the use of logic gates in the encryption process to increase encrypted data randomness. As for the protein chains, they are used as a cover to hide the encrypted data. The process of hiding inside the protein bases will be performed in a sophisticated manner that is undetectable by statistical analysis methods, where each byte will be fragmented into three groups of bits in a special order, and each group will be included in one specific protein base that will be allocated to this group only, depending on the classifications of bits that have been previously stored in special databases. Each byte of the encrypted data will be hidden in three protein bases, and these protein bases will be distributed randomly over the protein chain, depending on an equation designed for this purpose. The advantages of these proposed algorithms are that they are fast in encrypting and hiding data, scalable, i.e., insensitive to the size of plain data, and lossless algorithms. The experiments showed that the proposed cryptography algorithm outperforms the most recent algorithms in terms of entropy and correlation values that reach −0.6778 and 7.99941, and the proposed steganography algorithm has the highest payload of 2.666 among five well-known hiding algorithms that used DNA sequences as the cover of the data.
Read full abstract