Abstract

The mental lexicon plays a central role in human language competence and inspires the creation of new lexical resources. The traditional laboratory-based linguistic experiments used to explore the mental lexicon have several practical disadvantages. Crowdsourcing has become a promising alternative that enables us to conduct linguistic experiments, and thus to explore the mental lexicon, in an efficient and economical way. We focus on the feasibility and quality control issues of conducting Chinese linguistic experiments on the international crowdsourcing platforms Amazon Mechanical Turk and Crowdflower to collect Chinese word segmentation and semantic transparency data. Based on this work, we propose a framework for crowdsourcing linguistic experiments.

Highlights

  • The mental lexicon as a theoretical construct has two important implications

  • We explore the possibility of conducting lexical-access-related experiments through crowdsourcing

  • We have two objectives in this study: (1) to check whether it is feasible to conduct Chinese language experiments on international crowdsourcing platforms, such as Amazon Mechanical Turk (MTurk) and Crowdflower, to collect Chinese word segmentation and semantic transparency (Libben, 1998) data that can be used to explore the mental lexicon of Chinese speakers; (2) to identify and solve quality control and experimental design issues in order to obtain high-quality data and to establish a preliminary framework for crowdsourcing linguistic experiments

Summary

Introduction

The mental lexicon as a theoretical construct has two important implications: for an individual, it is where all grammatical and world knowledge is stored and organized to enable speech; beyond the individual, it inspires the creation of new lexical resources. We ask specific questions about the shared strategies speakers use to determine lexical units and to judge semantic transparency, two issues that have direct implications for how individuals access their mental lexicon. We have two objectives in this study: (1) to check whether it is feasible to conduct Chinese language experiments on international crowdsourcing platforms, such as Amazon Mechanical Turk (MTurk) and Crowdflower, to collect Chinese word segmentation and semantic transparency (Libben, 1998) data that can be used to explore the mental lexicon of Chinese speakers; (2) to identify and solve quality control and experimental design issues in order to obtain high-quality data and to establish a preliminary framework for crowdsourcing linguistic experiments. The iterative procedure works as follows: a test is started; once a problem is identified, the test is paused or stopped; after a proper solution is found, a modified version of the test is resumed, or a new test is designed and started.
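The following is a minimal sketch of this start-detect-revise loop. All names (TestDesign, run_test, find_problems, revise) are hypothetical placeholders for illustration only; they are not the authors' actual tooling or any platform API, and the quality checks are stubs standing in for whatever criteria a given test uses.

```python
from dataclasses import dataclass, field

@dataclass
class TestDesign:
    """Hypothetical description of one crowdsourced test."""
    name: str
    instructions: str
    gold_items: list = field(default_factory=list)  # catch trials for quality control

def run_test(design: TestDesign) -> dict:
    """Stand-in for launching a task batch on a platform and collecting responses."""
    return {"design": design.name, "responses": []}

def find_problems(results: dict) -> list:
    """Stand-in for inspecting the collected data for quality or design issues,
    e.g. low agreement on gold items, ambiguous instructions, or spam workers."""
    return []

def revise(design: TestDesign, problems: list) -> TestDesign:
    """Produce a modified design intended to address the identified problems."""
    return TestDesign(name=design.name + "_revised",
                      instructions=design.instructions,
                      gold_items=design.gold_items)

def iterate(design: TestDesign, max_rounds: int = 4) -> dict:
    """Start a test; if problems appear, pause, revise the design, and rerun."""
    results = {}
    for _ in range(max_rounds):
        results = run_test(design)
        problems = find_problems(results)
        if not problems:
            return results                 # data quality acceptable: stop here
        design = revise(design, problems)  # otherwise refine and start again
    return results

if __name__ == "__main__":
    seg_test = TestDesign(name="word_segmentation_test_1",
                          instructions="Mark the word boundaries in each sentence.")
    iterate(seg_test)
```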

The remaining sections of the paper are organized as follows:

  • Parameters
  • Test 1
  • Test 2
  • Test 3
  • Test 4
  • Summary
  • Experimental Design
  • Results and Evaluation
  • Chinese Word Segmentation Data Example
  • Semantic Similarity Rating Data Example
  • Conclusion
