Abstract
Concerning the Web document annotation techniques available have weakness in integrity annotation,Latent Dirichlet Allocation(LDA) model was applied to semantic annotation.By embedding document domain information to LDA model,a new LDA model called domain-enabled LDA was introduced.An association between the statistical topical model and domain ontology was established,so the implied topic generated could be interpreted by concepts and an explicit semantic in document was acquired.Because the LDA model assigned a topic to each word in document,a multi-granularity annotation strategy was proposed.The experiments on 20news-group and WebKB show that the domain-enabled LDA model proposed can improve the annotation effectiveness and the multi-granularity annotation method helps different types of query in information retrieval.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.