This paper proposes an experimental system that generates slang-style casual English sentences from regular English input using a phonetic database approach, primarily as an AI task, with real-life applications such as social media marketing. An original database consisting of multiple candidates of casual English phonemes was constructed, and linguistic analysis of Twitter data was used to establish the optimum frequency of slang tokens per sentence. The human-likeness and legibility of the system’s output sentences were evaluated in an experiment based on the classical definition of the Turing test, in which fifty human evaluators attempted to distinguish sentences produced by the system from genuine human-authored sentences. The results showed that the gap in human-likeness scores between the “human” and “machine” sentences was small, and that some “machine” sentences actually outperformed several of the “human” sentences. The “machine” sentences’ average score of 3.1 on a 5-point scale, where 3 indicated complete uncertainty as to whether a sentence was human-authored or machine-authored, can be considered a pass of the Turing test under the established definition. In this paper, we describe the potential approaches to the task, the construction of the phonetic database and the proposed system, and discuss the evaluation results.