Abstract

Text categorization and search have grown more complex in the era of big data, and Chinese vocabulary is highly diverse. To address the task of constructing a feature lexicon for text classification and search, this paper designs a feature lexicon construction method based on word levels. The method learns from existing samples, identifies new words using a conditional random field (CRF) model, discriminates the importance of words, divides them into reasonable levels, and assigns weights accordingly, so as to construct an efficient and accurate feature lexicon. The method can obtain stable word segmentation results and effectively improve the accuracy of subsequent classification.
