An approach to the automatic construction of global thesauri

C.J. Crouch

doi:10.1016/0306-4573(90)90106-c

Abstract

The benefits of a well-constructed thesaurus to an information retrieval system have long been recognized by both researchers and practitioners in the field. Previous experiments have investigated the construction of thesauri by manual, semiautomatic, and automatic means. Automatic thesaurus generation in particular has proven to be an especially difficult problem. This paper examines both early and current approaches to automatic thesaurus construction and describes an approach to the automatic generation of global thesauri based on the term discrimination value model of Salton, Yang, and Yu and on an appropriate clustering algorithm. This method has been implemented and applied to two document collections. Preliminary results indicate that this method, which produces improvements in retrieval performance in excess of 10 and 15 percent in the two test collections, is viable and worthy of continued investigation.
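
As a rough illustration of the term discrimination value idea named above: a term's discrimination value is commonly described as the change in the density of the document space when that term is removed, so that a term whose removal makes documents look more alike is a good discriminator. The sketch below is not the paper's implementation. It assumes documents are plain term-weight vectors, that density is the average cosine similarity of each document to the collection centroid, and that the function names (`cosine`, `density`, `discrimination_values`) and the tiny example collection are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(docs):
    """Average similarity of each document to the collection centroid (assumed measure)."""
    n = len(docs)
    centroid = [sum(column) / n for column in zip(*docs)]
    return sum(cosine(doc, centroid) for doc in docs) / n


def discrimination_values(docs):
    """Density with term k removed minus density with all terms present, for each k."""
    base = density(docs)
    values = []
    for k in range(len(docs[0])):
        reduced = [doc[:k] + doc[k + 1:] for doc in docs]
        values.append(density(reduced) - base)
    return values


# Made-up collection: term 0 occurs in every document, terms 1-3 in one document each.
example_docs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
example_values = discrimination_values(example_docs)
```

In this toy collection the term shared by every document receives a negative value (removing it spreads the documents apart), while the terms that occur in a single document receive positive values. How such values are combined with clustering to form thesaurus classes is described in the paper itself and is not reproduced here.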

Full Text