Abstract

Text steganography has received considerable attention as a means of covert communication. Ensuring both high hidden capacity and imperceptibility has become a key issue in text steganography. There are two typical approaches, i.e., text-selection-based steganography and text-generation-based steganography. However, text-selection-based approaches generally have very low hidden capacity and are therefore impractical in real scenarios. Although text-generation-based approaches can embed secret messages with higher capacity during text generation, they are prone to semantic incoherence and semantic errors when generating long texts. To address these issues, this article proposes a novel text steganography approach based on long readable text generation. It first determines the topic of the stego-text according to the scenario of the communicating parties. Then, the plug and play language model (PPLM) is used to generate a long, readable, and semantically coherent stego-text that conforms to the topic. A given secret message is hidden during text generation by selecting appropriate words from an embeddable candidate word pool (ECWP). Establishing the ECWP prevents the language model (LM) from selecting low-probability words during text generation, thereby avoiding low-quality or even grammatically incorrect stego-text. Experimental results show that, compared with existing approaches, the proposed approach significantly increases hidden capacity while maintaining good imperceptibility.
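
The embedding idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the ECWP is constructed or how bits are mapped to words, so the sketch assumes a probability threshold (`min_prob`), a pool truncated to a power-of-two size, and fixed-length bit coding. A small hand-written table (`TOY_LM`, `DEFAULT`) stands in for the topic-steered PPLM, and all names are hypothetical. The receiver is assumed to share the same model and to know the message length.

```python
# Toy next-word distributions standing in for a topic-steered LM (hypothetical data).
DEFAULT = {"the": 0.40, "a": 0.30, "and": 0.20, "of": 0.08, "zyx": 0.02}
TOY_LM = {
    "the": {"team": 0.45, "match": 0.30, "coach": 0.15, "fans": 0.07, "qua": 0.03},
    "a": {"goal": 0.50, "player": 0.30, "win": 0.17, "zzz": 0.03},
}


def next_word_probs(prev):
    """Return the toy distribution over the next word given the previous word."""
    return TOY_LM.get(prev, DEFAULT)


def build_pool(probs, min_prob=0.05, max_bits=2):
    """Assumed ECWP rule: drop words below min_prob, keep the top 2**k words."""
    ranked = sorted((w for w in probs if probs[w] >= min_prob),
                    key=lambda w: (-probs[w], w))
    if not ranked:
        # No word passes the threshold: fall back to the single most likely word.
        ranked = [max(probs, key=probs.get)]
    k = min(max_bits, len(ranked).bit_length() - 1)  # bits embeddable at this step
    return ranked[: 2 ** k], k


def embed(bits, start="<s>", max_words=100):
    """Hide a bit string by choosing, at each step, the pool word indexed by the next k bits."""
    words, prev, pos = [], start, 0
    while pos < len(bits):
        if len(words) >= max_words:
            raise ValueError("message does not fit within max_words")
        pool, k = build_pool(next_word_probs(prev))
        chunk = bits[pos:pos + k].ljust(k, "0")  # zero-pad the final chunk
        word = pool[int(chunk, 2)] if k else pool[0]
        words.append(word)
        prev, pos = word, pos + k
    return words


def extract(words, n_bits, start="<s>"):
    """Recover the bits by rebuilding each pool and reading off the chosen word's index."""
    out, prev = [], start
    for word in words:
        pool, k = build_pool(next_word_probs(prev))
        if k:
            out.append(format(pool.index(word), f"0{k}b"))
        prev = word
    return "".join(out)[:n_bits]  # strip the zero padding
```

Because words below the threshold never enter the pool, no secret bit pattern can force the generator to emit them. In this sketch that is the mechanism by which the pool protects text quality, at the cost of fewer bits per word when few candidates qualify.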
