A Novel Hierarchical Document Clustering Framework on Large TREC Biomedical Documents

Pilli Lalitha Kumari,Ch Satyanarayana,M Jeeva

doi:10.5815/ijitcs.2022.03.02

Abstract

The growth of microblogging sites such as Biomedical, biomedical, defect, or bug databases makes it difficult for web users to share and express their context identification of sequential key phrases and their categories on text clustering applications. In the traditional document classification and clustering models, the features associated with TREC texts are more complex to analyze. Finding relevant feature-based key phrase patterns in the large collection of unstructured documents is becoming increasingly difficult, as the repository's size increases. The purpose of this study is to develop and implement a new hierarchical document clustering framework on a large TREC data repository. A document feature selection and clustered model are used to identify and extract MeSH related documents from TREC biomedical clinical benchmark datasets. Efficiencies of the proposed model are indicated in terms of computational memory, accuracy, and error rate, as demonstrated by experimental results.

Full Text