Abstract

This chapter puts forward the hypothesis that the Fregean sense of a URI can be constructed out of user-defined tags, that is, natural language terms applied to a web page accessible via a URI on a ‘collaborative’ tagging site. We use data from the social bookmarking site http://del.icio.us to empirically examine the dynamics of collaborative tagging systems and to study how coherent categorization schemes emerge from unsupervised tagging by individual users; such a scheme can then be considered a ‘stable’ Fregean sense manufactured by social consensus. First, we study the formation of stable distributions in tagging systems, seen as an implicit form of consensus reached by the users of the system around the tags that best describe a resource. We show that final tag frequencies for most resources converge to power-law distributions, and we propose an empirical method to examine the dynamics of the convergence process, based on the Kullback-Leibler divergence measure. The convergence analysis is performed both for the most utilized tags at the top of tag distributions and for the long tail. It is commonly assumed that such a power law arises because the tagging system supports the user in the tag selection process by providing tag suggestions, or recommendations, based on the popularity of the tags other users provided when tagging the same resource. We therefore investigate the influence of tag suggestions on the emergence of power-law distributions as a result of collaborative tagging behavior. Although previous research has already shown that power laws emerge in tagging systems, the reason they emerge is not understood empirically. The majority of theories and mathematical models of tagging found in the literature assume that the emergence of power laws in tagging systems is mainly driven by the imitation behavior of users when observing tag suggestions provided by the user interface of the tagging system.
We present experimental results showing that the power-law distribution forms when tag suggestions are not presented to the users, and that it does not hold when tag suggestions are presented. Moving beyond individual tags in our search for sense, we then study the information structures that emerge from collaborative tagging, namely tag correlation (or folksonomy) graphs. We show how community-based network techniques can be used to extract simple tag vocabularies from the tag correlation graphs by partitioning them into subsets of related tags. Furthermore, we show, for a specialized domain, that shared vocabularies produced by collaborative tagging are richer than the vocabularies that can be extracted from large-scale query logs provided by a major search engine. Although the empirical analysis presented in this chapter is based on a set of tagging data obtained from del.icio.us, the methods developed are general, and the conclusions should be applicable across all websites that employ tagging.
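
As a rough illustration of the convergence analysis mentioned above, the following Python sketch computes the Kullback-Leibler divergence between the tag distribution of a resource after each batch of tag assignments and its final distribution. It is a minimal sketch under stated assumptions, not the procedure used in the chapter: the function names, the batching by a fixed `step`, and the toy tag stream are all hypothetical, and the divergence is measured against the final distribution only (comparing consecutive time steps would be an equally plausible choice).

```python
import math
from collections import Counter


def tag_distribution(tags):
    """Relative frequency of each tag in a list of tag assignments."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def kl_divergence(p, q):
    """D_KL(P || Q) in nats.

    Assumes q[t] > 0 wherever p[t] > 0, which holds when P is built
    from a prefix of the same tag stream that Q was built from.
    """
    return sum(p_t * math.log(p_t / q[t]) for t, p_t in p.items() if p_t > 0)


def divergence_to_final(tag_stream, step):
    """KL divergence of each growing prefix of the stream from the final distribution.

    Prefix lengths are step, 2*step, ...; a trailing partial batch is not sampled.
    """
    final = tag_distribution(tag_stream)
    return [
        kl_divergence(tag_distribution(tag_stream[:n]), final)
        for n in range(step, len(tag_stream) + 1, step)
    ]


# Hypothetical tag stream for a single resource, in order of assignment.
stream = ["web", "design", "web", "css", "web", "design"]
values = divergence_to_final(stream, step=2)
```

A series of values that falls toward zero and stays there would suggest that the tag distribution of the resource has stabilized; the last value is zero by construction when the stream length is a multiple of `step`, because the final prefix is the whole stream.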
