Intelligent Segmentation Framework and Data Hierarchy of Chinese Language and Literature Based on Semantic Recognition

Dan Wang

doi:10.1109/icssit53264.2022.9716300

Abstract

This paper studies an intelligent segmentation framework and data stratification for Chinese language and literature based on semantic recognition. First, to predict Chinese semantic word-formation patterns, Chinese sentence segmentation patterns are constructed with binary logistic regression and naive Bayes classifiers for intelligent simulation experiments. The eight types of data in the annotated corpus are then grouped in pairs, and the sample set of each group is stratified into a training set and a test set. The binary logistic regression model and the naive Bayes model first learn semantic word-formation rules on the training set of each group and then predict the semantic word-formation patterns on the test set of each group. The results show that sentence segmentation efficiency increases by 6.5%.
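The grouping and stratification step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that "grouped in pairs" means every unordered pair of the eight types (28 binary tasks); if four disjoint pairs are intended, only the pairing function would change. The function names, the 20% test fraction, and the toy corpus are hypothetical placeholders.

```python
import random
from itertools import combinations


def pairwise_groups(labels):
    """Return every unordered pair of distinct class labels (assumed pairing)."""
    return list(combinations(sorted(set(labels)), 2))


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (text, label) samples so each label contributes the same test share."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        # Keep at least one test sample per label.
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def build_binary_tasks(samples, test_fraction=0.2, seed=0):
    """Map each label pair to its own stratified (train, test) split."""
    labels = [label for _, label in samples]
    tasks = {}
    for a, b in pairwise_groups(labels):
        subset = [s for s in samples if s[1] in (a, b)]
        tasks[(a, b)] = stratified_split(subset, test_fraction, seed)
    return tasks


# Hypothetical toy corpus: 8 placeholder pattern types, 10 samples each.
TOY_CORPUS = [(f"sample_{t}_{i}", f"type_{t}") for t in range(8) for i in range(10)]
TASKS = build_binary_tasks(TOY_CORPUS)
```

Each entry in `TASKS` would then feed one binary classifier: a logistic regression model or a naive Bayes model is fitted on the pair's training split and evaluated on its test split.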

Full Text