Discovering Representations of Democracy in Big Data: Purposive Semantic Sample Selection for Qualitative and Mixed-Methods Research

Hubert Plisiecki, Agnieszka Kwiatkowska

doi:10.18778/1733-8069.20.4.02

Abstract

The increasing volume of large, multi-thematic text corpora in the social sciences presents a challenge in selecting relevant documents for qualitative and mixed-methods research. Traditional sample selection methods require extensive manual coding or prior knowledge of the dataset, while unsupervised methods can yield results that are inconsistent with theory-driven coding. To address this, we propose purposive semantic sampling – a Natural Language Processing (NLP) approach that uses document-level embeddings created by averaging word vectors weighted by term frequency-inverse document frequency (tf-idf). We demonstrate its effectiveness using the example of democracy, a complex topic that is difficult to retrieve from parliamentary corpora. The method applies to any multi-thematic research area within big data, offering a reliable and efficient way to select samples of social research texts. Our contribution includes validating this NLP approach for the social sciences and humanities and providing a robust tool for researchers, facilitating deeper qualitative analysis and exploration of big data corpora within the computational grounded theory framework.
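The embedding step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional word vectors, the smoothed idf formula, and the ranking of documents by cosine similarity to a query embedding are all assumptions made for illustration, and the function names are hypothetical. The abstract only states that document embeddings are tf-idf-weighted averages of word vectors.

```python
import math
from collections import Counter

# Hypothetical toy word vectors (assumption); a real study would load
# pretrained embeddings for the corpus language.
WORD_VECTORS = {
    "democracy": [1.0, 0.0],
    "election": [0.9, 0.1],
    "vote": [0.8, 0.2],
    "budget": [0.0, 1.0],
    "tax": [0.1, 0.9],
    "the": [0.5, 0.5],
}


def idf_weights(docs):
    """Smoothed inverse document frequency per token (one common variant, assumed here)."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0
        for tok, df in doc_freq.items()
    }


def embed(tokens, idf):
    """Document embedding as the tf-idf-weighted average of its word vectors."""
    dim = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, tf in Counter(tokens).items():
        vec = WORD_VECTORS.get(tok)
        if vec is None:
            continue  # skip tokens without a word vector
        weight = tf * idf.get(tok, 1.0)
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known tokens: zero vector
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_sample(docs, query_tokens, k):
    """Indices of the k documents most similar to the query (assumed selection rule)."""
    idf = idf_weights(docs)
    query_vec = embed(query_tokens, idf)
    scored = [(cosine(embed(doc, idf), query_vec), i) for i, doc in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Toy tokenized corpus (assumption): two speeches about voting, one about finance.
DOCS = [
    ["the", "vote", "election"],
    ["the", "budget", "tax"],
    ["the", "democracy", "vote"],
]
print(select_sample(DOCS, ["democracy"], k=2))
```

In this sketch the query is a single seed word; a researcher could equally embed a short list of theory-derived terms and pass the top-ranked documents on to qualitative coding.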

Journal: Przegląd Socjologii Jakościowej, VOL. 20
Publication Date: Nov 30, 2024
License type: CC BY-NC-ND 4.0
