Abstract
It is well known that synonymous and polysemous terms often bring in some noises when calculating the similarity between documents. Existing ontology-based document representation methods are static, hence, the chosen semantic concept set for representing a document has a fixed resolution and it is not adaptable to the characteristics of a document collection and the text mining problem in hand. We propose an Adaptive Concept Resolution (ACR) model to overcome this issue. ACR can learn a concept border from an ontology taking into consideration of the characteristics of a particular document collection. Then this border can provide a tailor-made semantic concept representation for a document coming from the same domain. Another advantage of ACR is that it is applicable in both classification task where the groups are given in the training document set, and clustering task where no group information is available. Furthermore, the result of this model is not sensitive to the model parameter. The experimental results show that ACR outperforms an existing static method significantly.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.