In recent times, the use of social networking and micro-blogging sites has grown rapidly. Identifying the right topical experts in a particular domain is a tedious task because of the huge volume of data that must be processed. Existing approaches are limited in accuracy, which leads to irrelevant results. Furthermore, a few address only syntactic similarity and not semantic analysis. In this paper, we introduce an integrated approach that combines focused crawling, semantic analysis and the personalized PageRank algorithm to provide top-k ranked results for a topic. Twitter data is taken as input, and Twitter lists are extracted to construct the endorsement graph. Links in the graph are weighted using personalized PageRank. Because a social network holds a huge amount of data, the endorsement graph naturally contains a very large number of nodes. To improve scalability and performance, focused crawling is applied before the PageRank calculation. The semantic similarity between the input query and each Twitter list name/description is calculated to yield better results. We compare the results with Twitter's WTF (Who To Follow) algorithm to show that the proposed system provides more obvious and serendipitous topical experts.
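The pipeline summarized above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each Twitter list arrives as a hypothetical `(curator, members, relevance)` tuple, where `relevance` stands in for the semantic similarity between the query and the list name/description, and where dropping lists with zero relevance is a crude stand-in for focused crawling. The function names, the damping factor and the iteration count are all illustrative assumptions.

```python
from collections import defaultdict


def build_endorsement_graph(lists):
    """Build curator -> member edges from (curator, members, relevance) tuples.

    `relevance` is an assumed, precomputed query-to-list similarity score.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for curator, members, relevance in lists:
        if relevance <= 0:
            continue  # prune off-topic lists (stand-in for focused crawling)
        for member in members:
            if member != curator:
                graph[curator][member] += relevance
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration with a teleport vector biased toward `seeds`.

    `seeds` maps node -> positive weight; it is normalized internally.
    """
    nodes = set(graph) | set(seeds)
    for targets in graph.values():
        nodes |= set(targets)
    total_seed = sum(seeds.values())
    teleport = {n: seeds.get(n, 0.0) / total_seed for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = graph.get(n, {})
            weight_sum = sum(out.values())
            if weight_sum == 0:
                # Dangling node: return its mass to the teleport distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * teleport[m]
            else:
                for m, w in out.items():
                    nxt[m] += damping * rank[n] * w / weight_sum
        rank = nxt
    return rank


def top_k_experts(rank, k):
    """Return the k highest-ranked accounts, best first."""
    return sorted(rank, key=rank.get, reverse=True)[:k]
```

For example, calling `personalized_pagerank(build_endorsement_graph(lists), seeds)` with the on-topic curators as `seeds` biases the ranking toward accounts that those curators endorse, and `top_k_experts` then selects the top-k results for the topic.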