Abstract
Understanding what groups stand for is integral to a diverse array of social processes, ranging from understanding political conflicts to organisational behaviour to promoting public health behaviours. Traditionally, researchers rely on self-report methods such as interviews and surveys to assess groups’ collective self-understandings. Here, we demonstrate the value of using naturally occurring online textual data to map the similarities and differences between real-world groups’ collective self-understandings. We use machine learning algorithms to assess similarities between 15 diverse online groups’ linguistic style, and then use multidimensional scaling to map the groups in two-dimensonal space (N=1,779,098 Reddit comments). We then use agglomerative and k-means clustering techniques to assess how the 15 groups cluster, finding there are four behaviourally distinct group types – vocational, collective action (comprising political and ethnic/religious identities), relational and stigmatised groups, with stigmatised groups having a less distinctive behavioural profile than the other group types. Study 2 is a secondary data analysis where we find strong relationships between the coordinates of each group in multidimensional space and the groups’ values. In Study 3, we demonstrate how this approach can be used to track the development of groups’ collective self-understandings over time. Using transgender Reddit data (N= 1,095,620 comments) as a proof-of-concept, we track the gradual politicisation of the transgender group over the past decade. The automaticity of this methodology renders it advantageous for monitoring multiple online groups simultaneously. This approach has implications for both governmental agencies and social researchers more generally. Future research avenues and applications are discussed.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.