Abstract

This article presents the Semantic Vector Space Model (SVSM), a text representation and search technique that combines the Vector Space Model (VSM) with heuristic syntax parsing and a distributed representation of semantic case structures. In this model, both documents and queries are represented as semantic matrices, and a search mechanism computes the similarity between two semantic matrices to predict relevance. A prototype system implementing the model was built by modifying the SMART system and using the Xerox Part-Of-Speech (P-O-S) tagger as a pre-processor for indexing. The prototype was used in an experimental study to evaluate the technique in terms of precision, recall, and effectiveness of relevance ranking. The results showed that when documents and queries were too short (typically fewer than two lines), the technique was less effective than VSM. With longer documents and queries, however, and especially when original documents were used as queries, the system based on our technique performed significantly better than SMART. © 1997 John Wiley & Sons, Inc.
