Abstract
To address the non-standard, repeated, and redundant information that arises when selecting base class phrases, an improved Suffix Tree Clustering (STC) method is proposed. First, a phrase mutual information algorithm is introduced to select base class phrases that conform to Uyghur grammar. Second, a phrase reduction algorithm based on Uyghur grammar is proposed to reduce repeated base class phrases. Third, building on the first two steps, a phrase redundancy algorithm based on Uyghur grammar is constructed to remove redundant phrases. Experimental results show that the method improves both recall and precision compared with STC, indicating that the improved algorithm effectively enhances clustering performance.