Computational approaches to detect experts in distributed online communities: a case study on Reddit

Sofia Strukova, José A Ruipérez-Valiente, Félix Gómez Mármol

doi:10.1007/s10586-023-04076-w

Open Access

https://doi.org/10.1007/s10586-023-04076-w

Journal: Cluster Computing	Publication Date: Jun 23, 2023
Citations: 3	License type: CC BY 4.0

Affiliation: University of Murcia

Abstract

The key to the success of Question & Answer (Q&A) platforms is their users, who provide high-quality answers to the challenging questions posted across various topics of interest. For more than a decade, the expert finding problem has attracted much attention in information retrieval research. Motivated by gaps in expert identification across several Q&A portals, we inspect the feasibility of identifying data science experts on Reddit. Our method is based on manual coding results in which two data science experts labelled not only expert and non-expert comments but also out-of-scope comments. This is a novel contribution to the literature that enables the identification of more groups of comments across web portals. We present a semi-supervised approach that combines 1113 labelled comments with 100,226 unlabelled comments during training. We show that it is possible to develop models that can identify expert, non-expert and out-of-scope comments, with the AUC score peaking at 0.93, accuracy at 0.83, MAE at 0.15 degrees and R² score at 0.69. The proposed model uses the activity behaviour of every user, drawing on Natural Language Processing (NLP), crowdsourced and user feature sets. We conclude that the NLP and user feature sets contribute the most to the identification of these three classes, which suggests that the method can generalise well within the domain. Finally, we make a novel contribution by presenting different types of users on Reddit, which opens many future research directions.
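The idea of training on a small labelled set together with a large unlabelled pool can be illustrated with a minimal self-training sketch. It is not the authors' model: the nearest-centroid classifier, the two-dimensional toy feature vectors, the confidence threshold and every function name below are assumptions chosen purely for illustration, and the abstract does not say which semi-supervised algorithm was used.

```python
from math import dist, exp


def centroids(X, y):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_proba(model, features):
    """Turn distances to the class centroids into normalised scores."""
    scores = {label: exp(-dist(features, c)) for label, c in model.items()}
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


def predict(model, features):
    """Return the class with the highest score."""
    proba = predict_proba(model, features)
    return max(proba, key=proba.get)


def self_train(X_lab, y_lab, X_unlab, threshold=0.6, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabelled samples."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = centroids(X, y)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for features in pool:
            proba = predict_proba(model, features)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                # Confident prediction: treat it as a label in the next round.
                X.append(features)
                y.append(label)
                added += 1
            else:
                remaining.append(features)
        if added == 0:
            break  # nothing new was learned, so stop early
        pool = remaining
        model = centroids(X, y)
    return model


# Hypothetical toy data: one labelled comment per class plus an unlabelled pool.
X_lab = [[5.0, 5.0], [0.0, 0.0], [5.0, 0.0]]
y_lab = ["expert", "non-expert", "out-of-scope"]
X_unlab = [[4.8, 5.1], [0.2, -0.1], [5.1, 0.3], [2.5, 2.5]]

model = self_train(X_lab, y_lab, X_unlab)
print(predict(model, [4.9, 4.9]))
```

In this sketch, an unlabelled sample that is equally far from all three centroids never reaches the threshold and therefore stays unlabelled, which is the usual safeguard against reinforcing uncertain guesses.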
