Corpus Design Research Articles

The integration of emotions into human-computer interaction applications promises a more natural dialog between users and the technical systems they operate. To build such systems, continuous measurement of the user's affective state becomes essential. While basic research aimed at capturing and classifying affective signals has progressed, many issues remain that hinder the easy integration of affective signals into human-computer interaction. In this paper, we identify and investigate pitfalls in three steps of the workflow of affective classification studies. The first step is the collection of affective data for the purpose of training suitable classifiers. Emotional data in which the target emotions are present has to be created, so human participants have to be stimulated suitably. We discuss the nature of these stimuli, their relevance to human-computer interaction, and the repeatability of the data recording setting. Second, we investigate aspects of annotation procedures, including the variance among individual raters, annotation delay, the impact of the annotation tool used, and how individual ratings are combined into a unified label. Finally, we examine the evaluation protocol, which includes, among other things, the impact of the performance measure on the accuracy of a classification model. Here we focus especially on the evaluation of classifier outputs against continuously annotated dimensions. Alongside the problems and pitfalls discussed and the ways in which they affect the outcome, we provide solutions and alternatives to overcome these issues. In the final part of the paper, we sketch a recording scenario and a set of supporting technologies that can help to solve many of the issues mentioned above.
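To make the point about performance measures concrete, the following is a minimal sketch, not taken from the paper. It assumes three commonly used measures for continuous dimensions (Pearson correlation, root mean squared error, and the concordance correlation coefficient); the abstract does not say which measures the authors examine. The annotation trace and the classifier output are synthetic, hypothetical data: the output tracks the label perfectly in shape but is compressed and offset.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def pearson(x, y):
    """Pearson correlation: sensitive to shape only, not to scale or offset."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root mean squared error: penalizes any absolute deviation."""
    return math.sqrt(mean([(a - b) ** 2 for a, b in zip(x, y)]))


def ccc(x, y):
    """Concordance correlation coefficient: penalizes scale and offset errors."""
    mx, my = mean(x), mean(y)
    vx = mean([(a - mx) ** 2 for a in x])
    vy = mean([(b - my) ** 2 for b in y])
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * cov / (vx + vy + (mx - my) ** 2)


# Hypothetical continuous annotation (e.g. one affective dimension over time).
label = [math.sin(0.1 * t) for t in range(200)]
# Hypothetical classifier output: same shape, but compressed and shifted.
pred = [0.5 * v + 0.3 for v in label]

scores = {
    "pearson": pearson(label, pred),
    "rmse": rmse(label, pred),
    "ccc": ccc(label, pred),
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the same output looks essentially perfect by Pearson correlation and clearly imperfect by the other two measures, which is one way the choice of measure can change the conclusion drawn from an evaluation.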

Corpus-Based Studies of Translational Chinese in English–Chinese Translation (2015). Richard Xiao and Xianyao Hu. Berlin, Heidelberg: Springer-Verlag. ISBN: 978-3-642-41362-9.

Translation universals (TUs) have been one of the research foci in corpus translation studies ever since Baker initiated this line of research in 1993. One key research effort has been to test the TU hypotheses by bringing different languages and cultures into the picture (Laviosa, 2011). However, the Anglo-centric bias in the pursuit of TUs is obvious, and the evidence is, by and large, limited to European languages, English in particular. This book by Xiao and Hu (2015) provides a much-needed perspective and directs its efforts towards investigating the linguistic features of translational Chinese by probing into two genetically different languages, Chinese and English. Readers should be delighted to find that, on the one hand, ‘translated texts in Chinese share a number of common properties’ (ibid. p. 172) of the TUs reported in previous studies (e.g. Baker, 1996), and, on the other hand, the previously defined and discussed TUs may be too over-generalized or simplified to reveal the true features of translational language, as a result of which we may need more refined TUs. In their book, Xiao and Hu mainly seek to answer the question of to what extent the features of translational Chinese are in line with previous studies of TU hypotheses, such as explicitation, simplification, normalization, under-representation, convergence, and source language (SL) shining through. For that purpose, a detailed analysis was made of the typical linguistic features of translational Chinese, from the macro-statistical level to specific lexical and grammatical levels, using a comparable corpus design involving the Lancaster Corpus of Mandarin Chinese (LCMC) and …

suwenchao0617{at}126.com

Articles published on Corpus Design

Crossroads Corpus creation: Design and case study

Corpus-Based Training to Build Translation Competences and Translators’ Self-Reliance

‘Keep out of reach of children!’ Introducing the Corpus of Product Information (CoPI) and its potential for corpus-based genre teaching

An Analysis of Email Errors in a Genre-Based English Learner Corpus: Focusing on Undergraduate General English Education

Methodology for Compiling a Corpus-Based Bilingual and Bidirectional (English-Spanish/Spanish-English) Glossary for the Translation of Television Instruction Manuals

Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis

The Influence of Annotation, Corpus Design, and Evaluation on the Outcome of Automatic Classification of Human Emotions

Compiling computer-mediated spoken language corpora

Corpus-Based Studies of Translational Chinese in English–Chinese Translation (2015). Richard Xiao and Xianyao Hu.

NUWT: Jawi-Specific Buckwalter Corpus for Malay Word Tokenization

An electromagnetic articulometer study of tongue and lip troughs

The GUM corpus: creating multilayer resources in the classroom

What Do Economic Translators Working Between German-Spanish and Spanish-German Translate? A Survey-Based Study

Corpus Design for Audiovisual Translation Education

A Structure for Annotation and Ground-truthing of Urdu Handwritten Text Image Corpus

Design and Implementation of an Online Corpus of Presentation Transcripts of TED Talks

Turning the Corpus into a Functional Component of the Dictionary: The Case of the Oenolex Wine Dictionary

Statistical Parametric Evaluation on New Corpus Design for Malay Speech Articulation Disorder Early Diagnosis

Cross-Linguistic and Cross-Disciplinary Studies of Academic Language: Challenges for Researchers in Choosing Materials and Methods

Evaluating reliability in quantitative vocabulary studies
