Abstract

The technology of automatic text generation has long been an important task in natural language processing, but low-quality machine-generated text seriously degrades the user experience due to poor readability and vague informational content. Machine-generated text detection methods based on traditional machine learning rely on large numbers of handcrafted features and detection rules. General deep-learning approaches to text classification tend to capture the topical orientation of a text, but the logical information between text sequences is not well utilized. To address this problem, we propose an end-to-end model that uses the self-information of text sequences to compensate for information loss in the modeling process and to learn the logical information between text sequences for machine-generated text detection. We frame this as a text classification task. We experiment on a Chinese question-and-answer dataset collected from a biomedical social media platform, which includes both human-written and machine-generated text. The results show that our method is effective and exceeds most baseline models.
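The paper's end-to-end model is not reproduced here, but the core intuition behind self-information features can be illustrated with a toy sketch: score a sequence by the average self-information (negative log probability) of its symbols under a reference language model, so that text whose distribution diverges from human-written reference text scores higher. The character unigram model, Laplace smoothing, and example strings below are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

def self_information(text, model_counts, total):
    """Average self-information (bits per character) of `text`
    under a Laplace-smoothed character unigram model.
    A toy stand-in for sequence self-information features
    (assumption: the paper uses a far richer sequence model)."""
    vocab = len(model_counts) + 1  # +1 slot for unseen characters
    return sum(
        -math.log2((model_counts[c] + 1) / (total + vocab))
        for c in text
    ) / max(len(text), 1)

# Toy "human-written" reference corpus used to fit the unigram model.
corpus = "the quick brown fox jumps over the lazy dog " * 20
counts = Counter(corpus)
total = sum(counts.values())

# Text matching the reference distribution carries less average
# self-information than out-of-distribution text.
human_like = "the lazy dog jumps over the brown fox"
out_of_dist = "zzqx zzqx zzqx"
assert self_information(human_like, counts, total) < \
       self_information(out_of_dist, counts, total)
```

In the paper's setting these per-sequence scores would be one signal among learned representations, not a standalone threshold classifier.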

Highlights

  • Have you ever encountered this situation: you ask a question on social media, and the answers seem substantial and on-topic, but after reading carefully you find that the confused logic makes them hard to read and the content worthless?

  • The technology of automatic text generation can be applied to question-and-answer systems [8], machine translation [9], and automatic text summarization [10]. The development of this technology will enable more intelligent and natural human-computer interaction

  • We look forward to the day when computers can write like humans, but generating high-quality text sequences is still a challenge


Summary

Weikuan Wang and Ao Feng

Received 6 November 2020; Revised 17 December 2020; Accepted 8 February 2021; Published 19 February 2021.

Introduction
Related Work
Experiments
