Research on Key Technologies of Deep Learning Techniques in Unstructured Data Processing

Guorong Zhang, Chengli Fu, Huiqin Zhou

doi:10.2478/amns-2024-3175

Abstract

The rise of the Internet has brought rapid growth in unstructured data recorded as text and audio. This study applies deep learning to unstructured data processing and proposes two key techniques for processing text data. First, a transformer feature extractor is used to produce dynamic word vectors, and an MCNN network is combined with it to screen key information, yielding a text classification model based on the MCNN-transformer. Second, text features extracted by the BERT model are fed into a VAE-GRU module and combined with the self-attention mechanism and the K-Means algorithm to construct a text clustering model based on VAE-GRU. Experimental analysis shows that the MCNN-transformer model achieves accuracy and Macro-F1 values exceeding 0.880 and outperforms the other text classification models. The ACC and NMI results of the VAE-GRU model are both greater than 70% on the Stack Overflow and SearchSnippets datasets and greater than 48% on the Chinese dataset, and the model outperforms the three ablation models by 15.03% to 85.67%. The MCNN-transformer and VAE-GRU models are therefore capable of classifying and clustering unstructured text data, which helps improve the efficiency with which information in unstructured data is understood and used.
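
The abstract reports Macro-F1 for the classification model and ACC and NMI for the clustering model without defining them. The following is a minimal sketch of the conventional definitions, not the authors' evaluation code. The function names and toy labels are illustrative assumptions. Clustering ACC is computed here by brute-force search over one-to-one cluster-to-class mappings, which is only practical for a handful of clusters (the Hungarian algorithm is the usual choice at scale), and NMI is normalized by the arithmetic mean of the two entropies, which is one common variant; the abstract does not say which normalization was used.

```python
import math
from collections import Counter
from itertools import permutations


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def clustering_accuracy(y_true, y_cluster):
    """Accuracy under the best one-to-one mapping of cluster ids to classes.

    Brute force over permutations: an assumption made for brevity,
    suitable only for a small number of clusters.
    """
    classes = sorted(set(y_true))
    clusters = sorted(set(y_cluster))
    # Pad with placeholders so every cluster can receive a distinct target.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    counts = Counter(zip(y_cluster, y_true))
    best = 0
    for assignment in permutations(targets, len(clusters)):
        hits = sum(counts[(k, c)] for k, c in zip(clusters, assignment))
        best = max(best, hits)
    return best / len(y_true)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(y_true, y_cluster):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(y_true)
    p_true, p_clu = Counter(y_true), Counter(y_cluster)
    joint = Counter(zip(y_true, y_cluster))
    mi = sum(
        (nij / n) * math.log(n * nij / (p_true[t] * p_clu[c]))
        for (t, c), nij in joint.items()
    )
    denom = (entropy(y_true) + entropy(y_cluster)) / 2
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    truth = [0, 0, 1, 1, 2, 2]
    clusters = [1, 1, 0, 0, 2, 2]  # same grouping, different cluster ids
    print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
    print(clustering_accuracy(truth, clusters))
    print(nmi(truth, clusters))
```

Because K-Means assigns arbitrary cluster ids, ACC has to be computed after matching clusters to classes, whereas NMI is invariant to how clusters are numbered; that is why the two are usually reported together.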

Journal: Applied Mathematics and Nonlinear Sciences
Publication Date: Jan 1, 2024
License type: CC BY 4.0

Similar Papers

Finding the best trade-off between performance and interpretability in predicting hospital length of stay using structured and unstructured data.
Franck Jaotombo ... Badih Ghattas
PLOS ONE | VOL. 18
30 Nov 2023

On the Power of Massive Text Data
Jiawei Han
02 Feb 2018

BERT-BiGRU Intelligent Classification of Metro On-Board Equipment Faults Based on Key Layer Fusion
Endong Liu ... Junting Lin
Wireless Communications and Mobile Computing | VOL. 2022
25 Jun 2022

Processing of 3D Unstructured Measurement Data for Reverse Engineering
Yongmin Zhong
International Journal of Intelligent Mechatronics and Robotics | VOL. 1
01 Apr 2011
