Abstract

Detecting the theme word or keyword that describes a collection of words is an important text processing task in natural language processing, known as topic detection (TD). The accuracy of a topic detection method depends on the quality of the underlying topic modelling technique. Several topic modelling techniques have been implemented successfully; prominent examples are LSA, LDA, the Hierarchical Dirichlet Process and Non-Negative Matrix Factorization. Most topic modelling and detection techniques have been applied successfully to English corpora, but very little work is available for Indian languages. In this paper we design and apply a novel method for detecting topics in a Hindi corpus. The proposed method discovers topics by clustering a semantic space generated using word2vec word embeddings. The results obtained are encouraging.
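
To make the idea of clustering a semantic space concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes k-means as the clustering algorithm and labels each cluster with the word closest to its centroid; the abstract specifies neither choice. The two-dimensional vectors in EMBEDDINGS are hypothetical stand-ins for word2vec embeddings trained on a Hindi corpus, and all function names are illustrative.

```python
import math

# Hypothetical 2-D vectors standing in for word2vec embeddings of Hindi words.
# Real word2vec vectors would be learned from a corpus and have far more dimensions.
EMBEDDINGS = {
    "क्रिकेट": (0.90, 0.10),   # cricket
    "बल्लेबाज": (0.80, 0.20),  # batsman
    "गेंदबाज": (0.85, 0.15),   # bowler
    "चुनाव": (0.10, 0.90),     # election
    "मतदान": (0.20, 0.80),     # voting
    "सरकार": (0.15, 0.85),     # government
}


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dims))


def farthest_first_seeds(vectors, k):
    """Deterministic seeding: start at the first vector, then repeatedly
    add the vector farthest from all seeds chosen so far."""
    seeds = [vectors[0]]
    while len(seeds) < k:
        seeds.append(max(vectors, key=lambda v: min(distance(v, s) for s in seeds)))
    return seeds


def cluster_words(embeddings, k, iterations=10):
    """Plain k-means over the word vectors (an assumed clustering choice)."""
    words = list(embeddings)
    centroids = farthest_first_seeds([embeddings[w] for w in words], k)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w in words:
            nearest = min(range(k), key=lambda i: distance(embeddings[w], centroids[i]))
            clusters[nearest].append(w)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            centroid([embeddings[w] for w in members]) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return clusters, centroids


def detect_topics(embeddings, k):
    """Label each cluster with the member word nearest its centroid
    (an assumed labelling rule, not taken from the paper)."""
    clusters, centroids = cluster_words(embeddings, k)
    topics = {}
    for members, center in zip(clusters, centroids):
        if members:
            label = min(members, key=lambda w: distance(embeddings[w], center))
            topics[label] = members
    return topics


topics = detect_topics(EMBEDDINGS, k=2)
for label, members in topics.items():
    print(label, "->", members)
```

On this toy data the sports words and the politics words fall into separate clusters, and each cluster's label is simply its most central word. With real embeddings, the number of clusters and the labelling rule would need to be chosen and evaluated against the corpus.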
