Abstract

To address non-standard, repeated, and redundant information in the selection of base class phrases, an improved Suffix Tree Clustering (STC) method is proposed. First, a phrase mutual information algorithm is introduced to select base class phrases that conform to Uyghur grammar. Second, a phrase reduction algorithm based on Uyghur grammar is proposed to reduce repeated base class phrases. Third, building on the first two steps, a phrase redundancy algorithm based on Uyghur grammar is constructed to remove redundant phrases. Experimental results show that the method improves both recall and precision compared with STC, indicating that the improved algorithm effectively enhances clustering performance.
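
The abstract does not give the formula behind the phrase mutual information step. As a rough illustration only, the sketch below scores adjacent word pairs with standard pointwise mutual information (PMI) and keeps the pairs above a threshold. The function names `phrase_pmi` and `select_base_phrases`, the restriction to two-word phrases, and the `threshold` parameter are assumptions made for this example, not the authors' algorithm, and the Uyghur grammar constraints are not modelled.

```python
import math
from collections import Counter


def phrase_pmi(documents):
    """Score each adjacent word pair by pointwise mutual information.

    `documents` is a list of token lists. This is plain PMI, assumed here
    for illustration; the paper's own scoring may differ.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        word_counts.update(tokens)
        # Adjacent pairs within one document only.
        pair_counts.update(zip(tokens, tokens[1:]))

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (left, right), count in pair_counts.items():
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


def select_base_phrases(documents, threshold=0.0):
    """Keep pairs whose PMI exceeds `threshold` (a hypothetical cutoff)."""
    scores = phrase_pmi(documents)
    return {pair for pair, score in scores.items() if score > threshold}


# Placeholder tokens stand in for segmented Uyghur words.
docs = [["w1", "w2", "w3"], ["w1", "w2", "w4"]]
candidates = select_base_phrases(docs, threshold=0.0)
```

In a fuller pipeline, the surviving candidates would then pass through the reduction and redundancy-removal steps that the abstract describes.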
