Abstract

With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones. While the latter’s payload capacity is impressive, generating genuine-looking texts remains challenging. In this paper, we revisit edit-based linguistic steganography, with the idea that a masked language model offers an off-the-shelf solution. The proposed method eliminates painstaking rule construction and has a high payload capacity for an edit-based model. It is also shown to be more secure against automatic detection than a generation-based method while offering better control of the security/payload capacity trade-off.
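As a rough illustration of the edit-based idea (a sketch, not the authors' exact algorithm), a word in the cover text is masked, a masked LM proposes the top 2^k fillers for the slot, and the next k secret bits select which candidate to write; the receiver, querying the same LM, recovers the bits from the chosen word's rank. Here `top_candidates` is a hypothetical stand-in for a real masked LM such as BERT, implemented as a toy lookup so the example is self-contained:

```python
# Sketch of edit-based hiding with a masked LM (illustrative only).
# `top_candidates` stands in for a real masked LM (e.g. BERT's fill-mask
# head); here it is a toy table keyed by position so the code runs as-is.

K = 2  # bits hidden per edited position (2**K candidate words)

def top_candidates(words, position):
    """Hypothetical masked-LM query: top 2**K fillers for the masked slot."""
    toy = {
        1: ["quick", "small", "sly", "red"],
        6: ["lazy", "sleepy", "old", "tired"],
    }
    return toy[position]

def encode(cover_words, positions, bits):
    """Embed `bits` by rewriting the words at `positions`."""
    stego = list(cover_words)
    for pos in positions:
        chunk, bits = bits[:K], bits[K:]          # next K bits pick a candidate
        stego[pos] = top_candidates(stego, pos)[int(chunk, 2)]
    return stego

def decode(stego_words, positions):
    """Recover the bits from the candidate ranks of the edited words."""
    bits = ""
    for pos in positions:
        cands = top_candidates(stego_words, pos)
        bits += format(cands.index(stego_words[pos]), f"0{K}b")
    return bits

cover = "the quick fox jumps over the lazy dog".split()
stego = encode(cover, positions=[1, 6], bits="1001")
assert decode(stego, positions=[1, 6]) == "1001"
```

In the real setting, the sender and receiver must mask the same positions and obtain the same candidate lists from the shared LM; the toy lookup sidesteps that synchronization issue for clarity.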

Highlights

  • Steganography is the practice of concealing a message in some cover data such that an eavesdropper is not even aware of the existence of the secret message (Simmons, 1984; Anderson and Petitcolas, 1998)

  • With advances in neural language models (LMs), edit-based approaches have been replaced by generation-based ones (Fang et al., 2017; Yang et al., 2019; Dai and Cai, 2019; Ziegler et al., 2019; Shen et al., 2020)

  • We showed that the proposed method had a high payload capacity for an edit-based model


Summary

Introduction

Steganography is the practice of concealing a message in some cover data such that an eavesdropper is not even aware of the existence of the secret message (Simmons, 1984; Anderson and Petitcolas, 1998). Formally, the goal of linguistic steganography is to create a steganographic system (stegosystem) with which the sender Alice encodes a secret message, usually in the form of a bit sequence, into a text and the receiver Bob decodes the message, with the requirement that the text is so natural that, even if transmitted over a public channel, it does not arouse the suspicion of the eavesdropper Eve. Edit-based approaches were characterized by painstaking rule construction to preserve text quality and context sensitivity. With advances in neural language models (LMs), they have been replaced by generation-based ones (Fang et al., 2017; Yang et al., 2019; Dai and Cai, 2019; Ziegler et al., 2019; Shen et al., 2020). In these approaches, bit chunks are directly assigned to the conditional probability distribution over words estimated by the LM, yielding impressive payload capacities of 1–5 bits per word (Shen et al., 2020).
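The bit-chunk assignment described above can be sketched as a simple block-encoding scheme (a simplification of the cited generation-based methods): at each step, the 2^k most probable next words under the LM are ranked, and the next k secret bits index into that ranking. The `next_word_ranking` function below is a hypothetical toy stand-in for a real neural LM:

```python
# Toy sketch of generation-based steganography via block encoding.
# At each step, the next K secret bits index into the 2**K most probable
# next words under the LM; the receiver, running the same LM, inverts this.

K = 2  # bits per generated word

def next_word_ranking(prefix):
    """Hypothetical LM: candidate next words ranked by probability."""
    toy = {
        (): ["the", "a", "one", "some"],
        ("the",): ["cat", "dog", "sun", "car"],
        ("the", "dog"): ["ran", "sat", "barked", "slept"],
    }
    return toy[tuple(prefix)]

def generate(bits):
    """Consume `bits` K at a time, emitting the selected words."""
    words = []
    while bits:
        chunk, bits = bits[:K], bits[K:]
        words.append(next_word_ranking(words)[int(chunk, 2)])
    return words

def recover(words):
    """Invert `generate` by re-ranking candidates with the same LM."""
    bits, prefix = "", []
    for w in words:
        bits += format(next_word_ranking(prefix).index(w), f"0{K}b")
        prefix.append(w)
    return bits

stego = generate("000110")            # ["the", "dog", "barked"]
assert recover(stego) == "000110"
```

Real systems refine this (variable-length codes, arithmetic coding) so that common words carry more bits, but the core idea is the same: the secret bits steer which high-probability word is emitted.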


