Abstract

Automatic word segmentation is a fundamental task in Chinese information processing. Complex networks are widely used to model and analyze complex systems across many disciplines, which has increased academic interest in the structure of complex networks and in the relationship between structure and function. Unlike Western languages, Chinese shows some syllabic variation in speech that forms a degree of boundary, but in writing it appears as continuous character strings with no obvious word segmentation marks and no morphological segmentation marks. An in-depth study of network news information processing through complex networks can provide effective means to purify the network environment and reduce the massive waste of communication resources and users' time, which makes it a topic of significant research importance. This paper uses the characteristics of complex networks to study the Chinese automatic word segmentation system. Complex networks and complex systems have emerged as a new interdisciplinary subject, and researchers from various disciplines are attempting to study complex systems through the lens of system complexity theory and complex network theory in order to better understand the mechanism of network fault diffusion.

Highlights

  • With the rapid expansion of Chinese web pages on the Internet and the rapid popularization of Chinese electronic publications and Chinese digital libraries, research on Chinese natural language processing with unrestricted text as its main object is becoming increasingly important [1]. There has long been no final segmentation standard, and this has gradually become one of the hot issues in Chinese information processing research.

  • Unlike in Western languages, spoken Chinese shows some syllabic variation that forms a degree of boundary, but written Chinese appears as continuous character strings without obvious segmentation marks and even without morphological segmentation marks [2]. Therefore, Chinese natural language processing must first segment the text, dividing the Chinese character string into the correct word string; this is called Chinese automatic word segmentation.

  • In English writing, words are naturally delimited by spaces, so word boundaries are intuitive; Chinese uses punctuation and paragraph breaks only to mark boundaries between sentences, and there is no such delimiter between words. The proposed complex network feature algorithm for Chinese automatic word segmentation is compared with three other algorithms, namely TextRank, K-means clustering, and a decision tree algorithm, in three experiments, with the evaluation results shown in Figures 7, 8, and 9.
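To make the idea of a network over Chinese text concrete, the following is a minimal sketch, not the paper's algorithm. It assumes one simple construction: each character is a node, and two characters are linked when they appear next to each other in a sentence. The function names and the toy sentences are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_cooccurrence_network(sentences):
    """Count how often each ordered pair of adjacent characters occurs.

    Assumption: adjacency within a sentence defines an edge. The source
    does not specify how its network is built.
    """
    edges = Counter()
    for sentence in sentences:
        for left, right in zip(sentence, sentence[1:]):
            edges[(left, right)] += 1
    return edges


def node_degrees(edges):
    """Return the number of distinct neighbours of each character."""
    neighbours = defaultdict(set)
    for left, right in edges:
        neighbours[left].add(right)
        neighbours[right].add(left)
    return {char: len(adjacent) for char, adjacent in neighbours.items()}


# Hypothetical toy corpus: three short character strings.
SENTENCES = ["中文分词", "中文信息", "信息处理"]

edges = build_cooccurrence_network(SENTENCES)
degrees = node_degrees(edges)

print(edges[("中", "文")])  # weight of the edge 中 -> 文
print(degrees["文"])        # number of distinct neighbours of 文
```

In this toy corpus, character pairs that recur across sentences (such as 中文 and 信息) get heavier edges than pairs that occur once. Edge weight and node degree are the kind of network features a segmenter might use as evidence that two characters belong to the same word.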

Summary

Introduction

With the rapid expansion of Chinese web pages on the Internet and the rapid popularization of Chinese electronic publications and Chinese digital libraries, research on Chinese natural language processing with unrestricted text as its main object is becoming increasingly important [1]. There has long been no final segmentation standard, and this has gradually become one of the hot issues in Chinese information processing research. Therefore, Chinese natural language processing must first segment the text, dividing the Chinese character string into the correct word string; this is called Chinese automatic word segmentation. Research on Chinese automatic word segmentation is conducted mainly at the word level, and many word segmentation methods have been implemented. Over this long process of research and practice, the determination of word segmentation units, the processing of ambiguous fields, and the recognition of unknown words have emerged as three major persistent problems [5]. How to design a practical, high-performance Chinese automatic word segmentation system with high segmentation speed, good segmentation accuracy, and good maintainability has attracted much attention, and many of these questions have become research focuses in the field of computer applications.
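As a point of reference for what segmentation involves, here is a minimal sketch of forward maximum matching, a classic dictionary-based baseline. It is not the method proposed in the paper. The lexicon, the `max_len` parameter, and the function name are assumptions made for illustration.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len
    characters) found in the lexicon. A character that starts no
    lexicon word is emitted on its own, which is how unknown words
    end up split into single characters.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon.
LEXICON = {"研究", "研究生", "生命", "起源", "的"}

result = forward_max_match("研究生命的起源", LEXICON)
print(result)
```

With this toy lexicon the greedy matcher takes 研究生 ("graduate student") first and leaves 命 stranded, although the intended reading is 研究 / 生命 / 的 / 起源 ("study the origin of life"). This is a small instance of the ambiguous-field problem described above.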
