The application of automatic natural language processing techniques to the indexing and retrieval of text information has been a goal of information retrieval researchers for some time. Incorporating semantic-level processing of language into retrieval has led to conceptual information retrieval, which is effective but usually restricted in its domain. Syntactic-level analysis is domain-independent, but has not yet yielded significant improvements in retrieval quality. This paper describes a process in which a morpho-syntactic analysis of phrases or user queries is used to generate a structured representation of text. A process for matching these structured representations is then described; it generates a score indicating the degree of match between phrases, and this score can then be used to rank the phrases. To evaluate the quality of the matching and scoring of phrases, some experiments are described whose results indicate that the method is quite useful. Ultimately, the phrase-matching technique described here would be used as part of an overall document retrieval strategy, and some future work in this direction is outlined.