Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding

Tianyu Yang, Xinpeng Zhang, Guorui Feng, Hanzhou Wu, Biao Yi

doi:10.1109/tdsc.2023.3247493

Abstract

Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication. It can be roughly divided into two main categories: modification-based LS (MLS) and generation-based LS (GLS). MLS embeds secret data by slightly modifying a given text without impairing its meaning, whereas GLS uses a well-trained language model to directly generate a text carrying secret data. A common disadvantage of MLS methods is that the embedding payload is very small, although in return the semantic quality of the text is well preserved. In contrast, GLS allows the data hider to embed a large payload, but at the high price of uncontrollable semantics. In this paper, we propose a novel LS method that modifies a given text by pivoting it between two different languages and embeds secret data using a semantic-aware information encoding strategy. Our purpose is to alter the expression of the given text, enabling a large payload to be embedded while keeping the semantic information unchanged. Experiments show that the proposed method not only achieves a large embedding payload but also shows superior performance in maintaining semantic consistency and resisting linguistic steganalysis.
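
To make the idea of bins coding concrete, the following is a minimal sketch of generic bins coding, not the semantic-aware scheme proposed in the paper, whose details the abstract does not give. It assumes that sender and receiver share a partition of the vocabulary into 2^k bins, and that at each step some model (for example, the second leg of a pivot translation) offers a ranked list of candidate words. The function names `build_bins`, `embed`, and `extract`, the round-robin partition, and the toy vocabulary are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def build_bins(vocab: Sequence[str], bits_per_token: int) -> Dict[str, int]:
    """Assign each token to one of 2**bits_per_token bins.

    Assumption: a simple round-robin split over the sorted vocabulary.
    A semantic-aware scheme would choose the partition differently.
    """
    n_bins = 1 << bits_per_token
    return {tok: i % n_bins for i, tok in enumerate(sorted(set(vocab)))}


def embed(bits: str, candidates: List[List[str]],
          bins: Dict[str, int], bits_per_token: int) -> List[str]:
    """At each step, pick the best-ranked candidate whose bin matches the secret bits."""
    out = []
    for step, ranked in enumerate(candidates):
        chunk = bits[step * bits_per_token:(step + 1) * bits_per_token]
        if len(chunk) < bits_per_token:
            raise ValueError("not enough secret bits for this step")
        target = int(chunk, 2)
        choice = next((t for t in ranked if bins[t] == target), None)
        if choice is None:
            raise ValueError(f"no candidate in bin {target} at step {step}")
        out.append(choice)
    return out


def extract(tokens: Sequence[str], bins: Dict[str, int],
            bits_per_token: int) -> str:
    """Recover the secret bits by reading off the bin index of every token."""
    return "".join(format(bins[t], f"0{bits_per_token}b") for t in tokens)


# Toy example: one bit per word, two candidate words per position.
vocab = ["a", "the", "cat", "kitten", "sat", "rested"]
bins = build_bins(vocab, bits_per_token=1)
candidates = [["the", "a"], ["cat", "kitten"], ["sat", "rested"]]
stego = embed("010", candidates, bins, bits_per_token=1)
recovered = extract(stego, bins, bits_per_token=1)
```

In this toy run the receiver needs only the shared bin table to read the bits back from the chosen words. The sketch does nothing to keep the candidates in each bin semantically equivalent, which is the gap a semantic-aware coding strategy is meant to address.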

Full Text