Tag affinity is the relationship between tags. It is useful information for search and recommendation in social tagging systems. Tag affinity is measured by several types of tag cooccurrence frequency. Computing tag affinity becomes increasingly time-consuming as tagging information accumulates. To alleviate this problem, we propose a parallel tag affinity computation method using MapReduce. We present MapReduce algorithms for computing three types of tag affinity measures: macro, micro, and bigram tag cooccurrence frequency. Our experimental results show that the proposed MapReduce-based approach not only significantly outperforms existing methods based on a relational database but also scales well. To the best of our knowledge, this is the first approach to compute tag affinity on MapReduce.
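
As a rough illustration of the general idea, cooccurrence counting fits the map/reduce pattern naturally: a map step emits a count of 1 for every pair of tags found on the same tagged item, and a reduce step sums the counts per pair. The abstract does not define the macro, micro, and bigram measures, so the sketch below does not implement any of them; it only counts plain pairwise cooccurrence. The function names (`map_post`, `reduce_counts`, `cooccurrence`) and the sample data are hypothetical, and the code runs in a single Python process rather than on a MapReduce cluster.

```python
from collections import defaultdict
from itertools import combinations


def map_post(tags):
    """Map step (hypothetical): emit ((tag_a, tag_b), 1) for each
    unordered pair of distinct tags attached to one tagged item."""
    unique = sorted(set(tags))  # sort so each pair has one canonical key
    for a, b in combinations(unique, 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Shuffle and reduce steps combined: group by key and sum the values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def cooccurrence(posts):
    """Run the map step over every item, then reduce the emitted pairs."""
    emitted = (kv for tags in posts for kv in map_post(tags))
    return reduce_counts(emitted)


# Made-up sample data: each inner list is the tag set of one item.
posts = [
    ["python", "mapreduce", "hadoop"],
    ["mapreduce", "hadoop"],
    ["python", "hadoop"],
]
counts = cooccurrence(posts)
```

Sorting the tags before pairing gives every pair a single canonical key, so `("hadoop", "python")` and `("python", "hadoop")` are not counted separately. In a real MapReduce job the framework would perform the grouping that `reduce_counts` does here in memory.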