Abstract

Index terms in a back-of-book index help readers find information, such as important names or terms, on the pages of a book. A good back-of-book index guides the reader to the pages where a term is relevant. Creating such an index requires great effort: the important words must be extracted from the large collection of words in the book, and which words are indexed depends on the subjectivity and knowledge of the author. An inaccurate indexing process can produce a back-of-book index that refers to irrelevant page numbers. This research aims to identify the relevant page numbers using a syntactic similarity approach and a semantic approach based on the WordNet thesaurus. With these approaches, we measure the relatedness between a sentence and the index term to identify the relevant page number. We use the Kappa statistic to measure the reliability of our methods. Our experimental results show that the semantic approach performs better than the syntactic approach. The average Kappa value in the semantic similarity framework is 0.619, which implies that the semantic similarity approach could identify the relevant page numbers as well as the book author.
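As a rough illustration of the reliability measure mentioned above, the sketch below computes a two-rater Kappa between an author's page judgments and a method's page judgments. It assumes the Kappa statistic is Cohen's kappa over binary relevant/irrelevant labels, which the abstract does not specify; the function name `cohen_kappa` and the sample labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: the paper's "Kappa statistic" is taken to be Cohen's
    kappa; the abstract does not say which variant was used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)
    # Observed agreement: fraction of items where both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = page judged relevant to the index term, 0 = not.
author_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
method_labels = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohen_kappa(author_labels, method_labels)
```

A value near 1 indicates that the method's page judgments agree with the author's well beyond chance, while a value near 0 indicates agreement no better than chance.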
