Practical Linguistic Steganography using Contextual Synonym Substitution and a Novel Vertex Coding Method

Ching-Yun Chang, Stephen Clark

doi:10.1162/coli_a_00176

Open Access

https://doi.org/10.1162/coli_a_00176

Journal: Computational Linguistics
Publication Date: Jun 1, 2014
Citations: 68
License type: public-domain

Affiliation: University of Cambridge

Abstract

Linguistic steganography is concerned with hiding information in natural language text. One of the major transformations used in linguistic steganography is synonym substitution. However, few existing studies have examined the practical application of this approach. In this article we propose two improvements to the use of synonym substitution for encoding hidden bits of information. First, we use the Google n-gram corpus for checking the applicability of a synonym in context, and we evaluate this method using data from the SemEval lexical substitution task and human-annotated data. Second, we address the problem that arises from words with more than one sense, which creates a potential ambiguity in terms of which bits are represented by a particular word. We develop a novel method in which words are the vertices in a graph, synonyms are linked by edges, and the bits assigned to a word are determined by a vertex coding algorithm. This method ensures that each word represents a unique sequence of bits, without cutting out large numbers of synonyms, and thus maintains a reasonable embedding capacity.
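
The graph view described in the abstract can be illustrated with a small sketch. It is not the article's vertex coding algorithm: it swaps in plain greedy graph colouring as a stand-in, only to show how treating words as vertices gives every word a single code even when it belongs to several synsets. The synsets, the function names `build_synonym_graph` and `greedy_vertex_codes`, and the fixed-width bit strings are all assumptions made for illustration.

```python
from collections import defaultdict


def build_synonym_graph(synsets):
    """Link every pair of words that share at least one synset."""
    graph = defaultdict(set)
    for synset in synsets:
        for word in synset:
            graph[word].update(w for w in synset if w != word)
    return graph


def greedy_vertex_codes(graph):
    """Assign each word one bit string that differs from its synonyms' codes.

    Assumption: greedy colouring stands in for the article's vertex coding.
    """
    colours = {}
    # Visit high-degree words first; break ties alphabetically so the
    # result is deterministic.
    for word in sorted(graph, key=lambda w: (-len(graph[w]), w)):
        used = {colours[n] for n in graph[word] if n in colours}
        colour = 0
        while colour in used:
            colour += 1
        colours[word] = colour
    if not colours:
        return {}
    # Render every colour index as a bit string of the same width.
    width = max(1, max(colours.values()).bit_length())
    return {w: format(c, f"0{width}b") for w, c in colours.items()}


# Toy synsets invented for this sketch; "splice" has two senses.
synsets = [
    {"marry", "wed", "splice"},
    {"splice", "join"},
]
codes = greedy_vertex_codes(build_synonym_graph(synsets))
```

Here `splice` appears in both synsets but still receives exactly one code, and no two words within a synset share a code, so a receiver who rebuilds the same graph can read the bits back from whichever word appears in the text. Reusing codes across non-adjacent words (`marry` and `join` here) is what keeps the code space small without discarding synonyms.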

Full Text