Abstract
Scholarly community detection has important applications in various fields. Current studies rely heavily on structured scholar networks, which are computationally expensive to process and difficult to construct in practice. We propose a novel approach that detects disjoint and overlapping scholarly communities directly from large textual corpora. To the best of our knowledge, this is the first study to detect communities directly from unstructured text. Academic articles tend to mention related work and researchers, and researchers who are more closely related to each other tend to be mentioned closer together in the text. Based on this correlation, we propose an intuitive method that measures the mutual relatedness of researchers through their textual distance. First, we extract and disambiguate researcher names from academic articles. Then, we embed each researcher as an implicit vector and measure the relatedness of researchers by the distance between their vectors. Finally, we identify communities as clusters of vectors. We develop and evaluate our method on several real-world datasets. The experimental results demonstrate that our method achieves performance comparable to that of several state-of-the-art methods.
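The three steps described in the abstract (name extraction, vector embedding, clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the embedding model or the clustering algorithm, so the sketch substitutes distance-weighted co-mention vectors for the learned embeddings and connected components over a cosine-similarity threshold for the clustering. It also assumes that name extraction and disambiguation have already been done, so each researcher appears as a single token. The function names, the `window` and `threshold` parameters, and the toy corpus are all hypothetical.

```python
import math
from itertools import combinations


def proximity_vectors(docs, names, window=5):
    """Build one vector per researcher from distance-weighted co-mentions.

    Assumption: `docs` is a list of token lists in which researcher names
    have already been extracted and disambiguated into single tokens.
    """
    index = {name: i for i, name in enumerate(names)}
    vecs = {name: [0.0] * len(names) for name in names}
    for tokens in docs:
        mentions = [(pos, tok) for pos, tok in enumerate(tokens) if tok in index]
        for (i, a), (j, b) in combinations(mentions, 2):
            gap = j - i  # textual distance in tokens, always >= 1
            if a != b and gap <= window:
                weight = 1.0 / gap  # closer mentions contribute more
                vecs[a][index[b]] += weight
                vecs[b][index[a]] += weight
    # Illustrative choice: give each researcher a self-weight equal to the
    # strongest link, so that two researchers who mention only each other
    # still end up with similar vectors.
    for name, row in vecs.items():
        row[index[name]] = max(row)
    return vecs


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def detect_communities(vecs, threshold=0.5):
    """Group researchers whose vectors are similar (disjoint communities only).

    Uses union-find to take connected components of the graph whose edges
    are pairs with cosine similarity >= threshold.
    """
    parent = {name: name for name in vecs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(vecs, 2):
        if cosine(vecs[a], vecs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in vecs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy corpus (hypothetical): two groups that never co-occur.
docs = [
    ["smith", "and", "jones", "extend", "lee"],
    ["jones", "and", "smith", "study", "graphs"],
    ["kim", "and", "park", "propose", "clustering"],
    ["park", "kim", "evaluate"],
]
names = ["smith", "jones", "lee", "kim", "park"]

vecs = proximity_vectors(docs, names)
communities = detect_communities(vecs)
print(communities)
```

This sketch yields disjoint communities only. Overlapping communities, which the abstract also covers, would need a soft clustering step in place of the connected-components grouping.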