Abstract

Latent Dirichlet Allocation (LDA) is a fundamental method in text mining. We propose strategies for topic and model selection based on LDA that exploit the semantic coherence of the inferred topics, boosting the quality of the models found. We then study how our boosted topic models perform in ad-hoc information retrieval tasks. Experimental results on four datasets show that our proposal improves the quality of the topics found, which in turn favors document retrieval. Our method outperforms traditional LDA-based methods, showing that model selection based on semantic coherence is useful for document modeling and information retrieval tasks.
