Abstract

We propose an intelligent document title classification agent based on a theory of information inference. Information is represented as vector spaces computed by a cognitively motivated model, the Hyperspace Analogue to Language (HAL). A combination heuristic merges a group of concepts into a single combination vector. Information inference is performed on the HAL spaces by computing the information flow between vectors or combination vectors. Based on this theory, a document title is treated as a combination vector obtained by applying the combination heuristic to all the non-stop terms in the title. Two methodologies for learning and assigning categories to document titles are presented. Experimental results on the Reuters-21578 corpus show that our framework is promising: its performance reaches 71% of the upper bound, which is approximated by classifying whole documents.
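
To make the representation concrete, the following is a minimal sketch of how a HAL-style space and a title vector might be built. It rests on several assumptions that the abstract does not state: the window length, the linear distance weighting, the symmetric treatment of preceding and following context, and the tiny stop list are all illustrative choices. In particular, `combine` is a plain sum of unit-normalised vectors used as a stand-in, not the paper's combination heuristic, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stop list for illustration only.
STOP_WORDS = {"the", "of", "a", "in", "to", "and", "on"}


def build_hal(tokens, window=3):
    """Build a HAL-style space as {term: {context_term: weight}}.

    A window of `window` positions slides over the token sequence.
    Two terms at distance d (1 <= d <= window) add weight
    window - d + 1 to each other's vectors, so closer neighbours
    contribute more. Direction is ignored (an assumption).
    """
    hal = defaultdict(lambda: defaultdict(float))
    for i, term in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            weight = window - d + 1
            hal[term][tokens[j]] += weight
            hal[tokens[j]][term] += weight
    return hal


def normalize(vec):
    """Scale a sparse vector to unit Euclidean length."""
    norm = sum(w * w for w in vec.values()) ** 0.5
    return {k: w / norm for k, w in vec.items()} if norm else {}


def combine(hal, terms):
    """Stand-in combination: sum the unit vectors, then renormalise."""
    combined = defaultdict(float)
    for t in terms:
        for k, w in normalize(hal.get(t, {})).items():
            combined[k] += w
    return normalize(combined)


def title_vector(hal, title):
    """Represent a title by combining the vectors of its non-stop terms."""
    terms = [t for t in title.lower().split() if t not in STOP_WORDS]
    return combine(hal, terms)
```

For example, `title_vector(build_hal(corpus_tokens), "the price rise")` would combine the vectors of `price` and `rise` only, since `the` is dropped as a stop term. A classifier in the spirit of the abstract would then compare such title vectors against category representations.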
