Abstract

The Historical Thesaurus of English (HT) categorizes the English vocabulary into meaning-based categories containing synonymous or near-synonymous lexical items. Its extensive hierarchical categorization scheme (c. 235,000 categories) is, however, unwieldy for new and more casual users. Additionally, it can be too fine-grained for implementation in natural language processing tools, leading them to produce over-specified results where humans recognize ambiguity. This article outlines the reasoning and methodology behind the creation of a truncated semantic hierarchy, known by the HT editors as the thematic category set. Development of the category set was guided by evaluation of which concepts are too technical or too general to be relevant to the average user, with cut-offs imposed on the hierarchy at an intermediate human-scale level. The result is a hybrid category set which combines organisation by synonymy at higher levels of abstraction and organisation by something approaching a conceptual field at lower, less abstract levels.

