A Framework Model of Mining Potential Public Opinion Events Pertaining to Suspected Research Integrity Issues with the Text Convolutional Neural Network model and a Mixed Event Extractor

Zongfeng Zou, Xiaochen Ji, Yingying Li

doi:10.3390/info15060303

Abstract

With the development of the Internet, oversight of research integrity issues has extended beyond the scientific community to society as a whole. If these issues are not addressed promptly, they can significantly damage the research credibility of both institutions and scholars. This article proposes a SMOTE-based text convolutional neural network (TextCNN) to identify, among common short texts, those that describe potential public opinion events related to suspected scientific integrity issues. The SMOTE comprehensive sampling technique is employed to handle imbalanced datasets. To mitigate the impact of short text length on the quality of text representation, the Doc2vec embedding model is used to represent each short text as a one-dimensional dense vector. Additionally, the dimensions of the input layer and convolution kernel of TextCNN are adjusted. Subsequently, a short text event extraction model based on TF-IDF and TextRank is proposed to extract crucial information from events, such as names and research-related institutions, and to facilitate the identification of potential public opinion events related to suspected scientific integrity issues. Experimental results demonstrate that balancing the dataset with SMOTE improves the classification results of TextCNN classifiers. Compared to traditional classifiers, TextCNN is more robust to imbalanced datasets. However, challenges such as low information content, non-standard writing, and polysemy in short texts may reduce the accuracy of event extraction. The framework can be further optimized to address these issues in the future.
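To make the class-balancing step more concrete, the following is a minimal sketch of the basic SMOTE idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the authors' implementation. The function name `smote`, the parameter `k`, and the toy two-dimensional vectors (standing in for Doc2vec embeddings) are all hypothetical, and the abstract does not specify which SMOTE variant was used.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base` among the other
        # minority samples.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy vectors standing in for document embeddings (hypothetical data).
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[0.0, 10.0], [1.0, 11.0], [2.0, 10.5]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class rather than being exact duplicates. In practice, a maintained library implementation would normally be preferred over a hand-written one.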

Journal: Information
Publication Date: May 24, 2024
License type: CC BY 4.0
