Abstract

Text clustering is important in many applications of information retrieval. This paper presents a study of clustering short texts in Bahasa Indonesia using a semantic similarity approach, in which a dictionary of synonyms and hyponyms provides information on word relatedness. We compare sentence similarity calculations based on lexical matching with those based on word similarity. The experiments use more than 250 sentences. Our results show that clustering with sentence similarity based on lexical matching performs better, in terms of precision and F-measure, than clustering with sentence similarity based on the semantic approach.
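To make the two kinds of sentence similarity concrete, the following is a minimal sketch and not the paper's actual method. It assumes Jaccard overlap as the lexical-matching measure and a best-match average over a tiny hand-made table of related word pairs as the semantic measure. The names `RELATED`, `lexical_similarity`, and `semantic_similarity`, the toy dictionary entries, and the 0.8 relatedness score are all illustrative assumptions.

```python
# Illustrative sketch only: the measures, the dictionary entries, and the
# 0.8 score are assumptions, not taken from the paper.

def tokenize(sentence):
    """Lowercase and split on whitespace (no stemming or stopword removal)."""
    return sentence.lower().split()

def lexical_similarity(a, b):
    """Jaccard overlap of the two sentences' word sets."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# Hypothetical toy dictionary of related word pairs (synonyms or hyponyms).
RELATED = {
    frozenset(("cantik", "indah")),     # assumed synonym pair
    frozenset(("mobil", "kendaraan")),  # assumed hyponym pair
}

def word_similarity(w1, w2):
    """1.0 for identical words, 0.8 (assumed) for related words, else 0.0."""
    if w1 == w2:
        return 1.0
    if frozenset((w1, w2)) in RELATED:
        return 0.8
    return 0.0

def semantic_similarity(a, b):
    """Average, in both directions, of each word's best match in the other sentence."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0

    def directed(src, dst):
        return sum(max(word_similarity(w, v) for v in dst) for w in src) / len(src)

    return (directed(ta, tb) + directed(tb, ta)) / 2

s1 = "mobil itu cantik"
s2 = "kendaraan itu indah"
print(lexical_similarity(s1, s2))
print(semantic_similarity(s1, s2))
```

In this sketch the two sentences share only one surface word, so the lexical score is low, while the dictionary lookup raises the semantic score. Either score could then be fed to a clustering algorithm as a pairwise similarity.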
