Abstract
Linguistic steganography studies how to hide secret messages in natural language cover texts. Traditional methods aim to transform a secret message into an innocent text via lexical substitution or syntactical modification. Recently, advances in neural language models (LMs) enable us to directly generate cover text conditioned on the secret message. In this study, we present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model. We formally analyze the statistical imperceptibility of this method and empirically show it outperforms the previous state-of-the-art methods on four datasets by 15.3% and 38.9% in terms of bits/word and KL metrics, respectively. Finally, human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
Highlights
Privacy is central to modern communication systems such as email services and online social networks.
Our new method builds on a previous study (Ziegler et al., 2019) that views each secret message as a binary fractional number and encodes it using arithmetic coding (Rissanen and Langdon, 1979) with a pretrained neural language model (LM).
We theoretically prove that the SAAC algorithm is near-imperceptible for linguistic steganography and empirically demonstrate its effectiveness on four datasets from various domains.
Summary
Privacy is central to modern communication systems such as email services and online social networks. Our new method builds on a previous study (Ziegler et al., 2019) that views each secret message as a binary fractional number and encodes it using arithmetic coding (Rissanen and Langdon, 1979) with a pretrained neural LM. This method generates cover text tokens one at a time (cf. Fig. 2). Our contributions are threefold: (1) we formally analyze the statistical imperceptibility of LM-based linguistic steganography algorithms; (2) we propose SAAC, a new near-imperceptible linguistic steganography method that encodes secret messages using self-adjusting arithmetic coding with a neural LM; and (3) extensive experiments on four datasets demonstrate that our approach can on average outperform the previous state-of-the-art method by 15.3% and 38.9% in terms of bits/word and KL metrics, respectively.
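To make the interval-narrowing idea concrete, below is a minimal Python sketch of arithmetic-coding-based cover text generation. This is a simplified illustration, not the paper's SAAC algorithm: the names `encode_message` and `toy_lm` are hypothetical, a fixed toy distribution stands in for a neural LM such as GPT-2, and floating-point intervals are used where a real coder would use fixed-precision integer arithmetic.

```python
from typing import Callable, Dict, List


def bits_to_fraction(bits: str) -> float:
    """Interpret a bit string as a binary fraction in [0, 1)."""
    return sum(int(b) * 2 ** -(i + 1) for i, b in enumerate(bits))


def encode_message(
    bits: str,
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    max_tokens: int = 20,
) -> List[str]:
    """Generate cover text by narrowing an interval around the secret fraction.

    At each step, the current interval [low, high) is partitioned among the
    vocabulary in proportion to the LM's next-token probabilities; the token
    whose sub-interval contains the secret fraction is emitted as cover text.
    """
    target = bits_to_fraction(bits)
    low, high = 0.0, 1.0
    cover: List[str] = []
    for _ in range(max_tokens):
        probs = next_token_probs(cover)
        cursor = low
        for token, p in probs.items():
            width = (high - low) * p
            if cursor <= target < cursor + width:
                cover.append(token)
                low, high = cursor, cursor + width
                break
            cursor += width
        # Stop once the interval is narrow enough to pin down all the bits.
        if high - low < 2 ** -len(bits):
            break
    return cover


# Toy stand-in for a neural LM: a fixed next-token distribution
# (a real system would condition on the prefix, e.g. with GPT-2).
def toy_lm(prefix: List[str]) -> Dict[str, float]:
    return {"the": 0.4, "cat": 0.3, "sat": 0.2, "quietly": 0.1}


print(encode_message("1011", toy_lm))  # e.g. ['cat', 'quietly']
```

Decoding mirrors this process: a receiver holding the same LM re-partitions each interval, locates the sub-interval of each received token, and reads the secret bits off the final interval.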