Abstract
Keywords have played an important role not only for searchers who formulate a query, but also for search engines that index documents and evaluate the query. Recently, tags chosen by users to annotate web resources are gaining significance for improving information retrieval (IR) tasks, in that they can act as meaningful keywords bridging the gap between humans and machines. One critical aspect of tagging (besides the tag and the resource) is the user (or tagger); there exists a ternary relationship among the tag, resource, and user. The traditional inverted index, however, does not consider the user aspect, and is based on the binary relationship between term and document. In this paper we propose a social inverted index – a novel inverted index extended for social-tagging-based IR – that maintains a separate user sublist for each resource in a resource-posting list to contain each user’s various features as weights. The social inverted index is different from the normal inverted index in that it regards each user as a unique person, rather than simply count the number of users, and highlights the value of a user who has participated in tagging. This extended structure facilitates the use of dynamic resource weights, which are expected to be more meaningful than simple user-frequency-based weights. It also allows a flexible response to the conditional queries that are increasingly required in tag-based IR. Our experiments have shown that this user-considering indexing performs better in IR tasks than a normal inverted index with no user sublists. The time and space overhead required for index construction and maintenance was also acceptable.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.