Abstract

Recently, the analysis of communication has gained attention in experimental research. One important question is whether certain types of communication affect decisions differently than others. In this regard, Houser & Xiao (2011) present an approach for the classification of natural language messages. The main drawback of their approach is its limited applicability to large message datasets. Therefore, Penczynski (2019) extends the methodological toolkit by applying a machine learning classifier to experimental communication data. This, however, runs into the problem that many experimenters lack machine learning knowledge. Hence, this paper presents an approach that employs a publicly available machine learning text analysis application. This makes it possible to analyze larger datasets based on small training datasets classified beforehand by human evaluators. As a first step, I use primary communication data reported by Charness and Dufwenberg (2006) to generate both training and test datasets. Following this approach, I am able to substantially replicate the original classification results obtained by Charness and Dufwenberg. The second step again uses messages from Charness and Dufwenberg as training data, while messages from a related trust game published by Deck et al. (2013) serve as the test dataset. Promisingly, I am also able to replicate the classification results obtained by the external evaluators, as reported by Deck et al. The findings suggest that machine learning can be used to analyze large message datasets, both when the classifier is trained with data from the very same experiment and when it is trained with message data from a comparable experiment.

Highlights

  • Experimental literature from economics and the social sciences in general provides rich data on the importance of natural language communication for decision-making in economic environments (see e.g. Isaac and Walker (1988))

  • I argue that human classification applied to a share of the messages from a laboratory experiment can be used as a training set for a machine learning approach

  • In keeping with machine learning terminology, the classification of messages belongs to the category of supervised machine learning, because the classification is supervised by the knowledge and intuition of human evaluators (Sebastiani (2002))


Summary

Introduction

Experimental literature from economics and the social sciences in general provides rich data on the importance of natural language communication for decision-making in economic environments (see e.g. Isaac and Walker (1988)). I argue that human classification applied to a share of the messages from a laboratory experiment can be used as a training set for a machine learning approach. This is in line with the original intention of the ESP-Game: building a training dataset to label unlabeled pictures using a machine learning approach. Using existing data allows me to compare the results generated by the machine learning classification algorithm to the original human coding results reported by Charness and Dufwenberg (C&D) as well as Deck et al. To ensure the robustness of the approach, an estimate of the relative number of training set messages is needed. As a recommendation for use, this study provides rules of thumb regarding both the number of training messages necessary to generate good results and the distribution of messages per category.
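The workflow described above (train on a human-coded share of the messages, classify the remaining messages, then compare machine codes with human codes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the paper uses a publicly available text analysis application, whereas this sketch swaps in a small hand-written multinomial naive Bayes classifier; the category names `promise` and `empty`, the toy messages, and the names `NaiveBayesCoder` and `agreement_rate` are hypothetical and not taken from the original data or method.

```python
import math
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a message and split it into alphabetic word tokens."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in message.lower())
    return cleaned.split()


class NaiveBayesCoder:
    """Multinomial naive Bayes with add-one smoothing.

    Illustrative stand-in only; not the application used in the paper.
    """

    def fit(self, messages, labels):
        # Count how often each category and each word per category occurs.
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for message, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(message))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, message):
        # Words never seen in training carry no information and are dropped.
        tokens = [t for t in tokenize(message) if t in self.vocabulary]
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.label_counts[label] / total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def agreement_rate(predicted, human):
    """Share of messages on which machine and human codes coincide."""
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


# Hypothetical human-coded training messages (not from the original datasets).
train_messages = [
    "I promise I will choose roll",
    "I will roll, you have my word",
    "Trust me, I promise to roll the die",
    "Good luck with the experiment",
    "Hello, hope you are having a nice day",
    "This is a fun experiment, good luck",
]
train_labels = ["promise", "promise", "promise", "empty", "empty", "empty"]

# Hypothetical held-out messages with their human codes for comparison.
test_messages = ["I promise to roll", "Have a nice day and good luck"]
test_human_labels = ["promise", "empty"]

coder = NaiveBayesCoder().fit(train_messages, train_labels)
predicted = [coder.predict(m) for m in test_messages]
print(predicted, agreement_rate(predicted, test_human_labels))
```

With real data, the training list would hold the human-coded share of the messages and the test list the remaining messages. Re-running the sketch with training shares of different sizes is one simple way to explore how many training messages are needed, although the rules of thumb reported in the study come from the original analysis, not from this sketch.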

