Abstract

<p>Subject indexing is the act of describing or classifying a document by index terms or other symbols in order to indicate what the document is about, to summarize its content, or to increase its findability. The selection of term candidates in automatic subject indexing is very important because it influences the result of topic extraction from a document. Recent approaches to automatic subject indexing, especially in term candidate selection, consider only terms that appear in the document collection. In contrast, in manual subject indexing, indexers prefer to choose general terms as term candidates. In this paper, we propose a new strategy for selecting term candidates in automatic subject indexing in order to extract the main topic of a document. The proposed method combines Term Frequency Inverse Document Frequency (TF*IDF) with a Random Walk on the structure of a thesaurus. Experimental results show that the proposed method can select term candidates that are relevant to the topic of the document, with an F-Measure of 0.24.</p>
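The abstract does not say how the two scores are combined, so the following is only a minimal sketch under stated assumptions: TF*IDF scores of the document's terms are used as restart weights for a random walk with restart over an undirected thesaurus graph. The function names, the toy documents, the toy thesaurus, and the damping value are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized, non-empty document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def random_walk(graph, start_weights, damping=0.85, iterations=50):
    """Random walk with restart on an undirected graph.

    graph: {node: [neighbours]}, assumed symmetric with no isolated nodes.
    start_weights: {node: weight}; at least one graph node must have weight > 0.
    """
    total = sum(start_weights.get(node, 0.0) for node in graph)
    restart = {node: start_weights.get(node, 0.0) / total for node in graph}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {}
        for node in graph:
            # Each neighbour passes on an equal share of its current rank.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) * restart[node] + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy data, for illustration only.
docs = [
    ["thesaurus", "indexing", "thesaurus", "term"],
    ["indexing", "retrieval"],
    ["retrieval", "query"],
]
thesaurus_graph = {
    "controlled vocabulary": ["thesaurus", "indexing"],
    "thesaurus": ["controlled vocabulary", "term"],
    "indexing": ["controlled vocabulary"],
    "term": ["thesaurus"],
}

scores = tf_idf(docs)
rank = random_walk(thesaurus_graph, scores[0])
candidates = sorted(rank, key=rank.get, reverse=True)
print(candidates)
```

In this sketch, a more general thesaurus term such as "controlled vocabulary" can receive a non-zero rank even though it never occurs in the document, which mirrors the abstract's point that manual indexers prefer general terms.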

Highlights

  • Studies of manual subject indexing using controlled vocabulary began in the 1950s and 1960s; since then, researchers have developed methods for automatic subject indexing of documents [1]

  • That formal language usually refers to the "semantic vocabulary" or "lexical dictionary", which contains rules that help map natural-language index terms, or terms from a document collection, to controlled index terms

  • We propose a new strategy for selecting term candidates in automatic subject indexing


Summary

Introduction

Concern over women’s subordination in law is not new. Beginning in the nineteenth century and continuing throughout the twentieth, the world has been witness to innumerable women's movements seeking to pressure governments and societies to recognize women's civil rights and to accept that women should enjoy equal working conditions and wages with men. It was not until feminist movements gained recognition in the 'seventies, and the United Nations' Women's Decade achieved significant advances, that it became possible to conduct a series of studies on rural Latin-American women.

The quality of the extracted terms in this study was evaluated with precision, recall, and F-Measure. These measures are effective for evaluating the quality of the extraction of terms relevant to the topic of a document [11]. Here, an extracted term is a term or phrase generated by the automatic subject indexing system.
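As a rough illustration of the evaluation measures named above, the sketch below computes precision, recall, and F-Measure by comparing a set of extracted terms with a set of terms judged relevant. It assumes the balanced F-Measure (equal weight on precision and recall), which the text does not confirm; the function name and the example term lists are hypothetical.

```python
def evaluate_terms(extracted, relevant):
    """Return (precision, recall, f_measure) for two collections of terms."""
    extracted, relevant = set(extracted), set(relevant)
    matched = len(extracted & relevant)
    precision = matched / len(extracted) if extracted else 0.0
    recall = matched / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-Measure: harmonic mean of precision and recall (assumed).
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: terms produced by the system vs. terms judged relevant.
extracted_terms = ["thesaurus", "indexing", "retrieval", "query"]
relevant_terms = ["thesaurus", "indexing", "classification"]

precision, recall, f_measure = evaluate_terms(extracted_terms, relevant_terms)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

Here two of the four extracted terms are relevant, and two of the three relevant terms are found, so precision is 2/4 and recall is 2/3.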

TF*IDF
Random Walk
Vertex ranking bypassed by agent
Proposed Method
Method
Result and Analysis
Conclusion
