Abstract

This paper focuses on the experience of compiling spoken corpora and discusses the relevance of prosody to this endeavor and to the study of spoken language more broadly. Through the voices of scholars associated with four projects (CorpAfroAs, Mohawk Corpus, LABLITA, C-ORAL-BRASIL), it presents the steps considered most relevant to both the compilation and the research potential of spoken corpora, and points out future perspectives for the field.

Highlights

  • This paper presents the results of a roundtable organized during the ABRALIN ao Vivo series and partially maintains the structure of the original roundtable format

  • The general ideas that lead and inspire the LABLITA collections are that spoken language is governed by pragmatic principles (Speech Act production and Information Patterning), and that prosody is the main means of expression of such principles

  • In Language into Act Theory (L-AcT), parentheticals are defined as information units occurring inside an utterance, introducing information with a metalinguistic value and a specific modality; they are prosodically characterized by a jump to a lower f0 and intensity level (MONEGLIA; RASO, 2014)


Summary

INTRODUCTION

This paper presents the results of a roundtable organized during the ABRALIN ao Vivo series and partially maintains the structure of the original roundtable format. A corpus is a language resource consisting of a structured and usually large set of texts (although small corpora are also built for specific studies). Annotation and access features listed for the corpora discussed include PoS tagging, syntactic parsing, information structure annotation for a sample of the corpus, queries over metadata, PoS, prosodic boundaries and information tags, and online searchability.
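The features named above (PoS tagging, prosodic boundaries, tag queries) can be pictured with a minimal sketch. It is not the data model of any of the four projects: the `Token` class, the tag labels, and the use of `/` and `//` for non-terminal and terminal prosodic breaks are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str           # orthographic transcription of the word
    pos: str            # part-of-speech tag (hypothetical tagset)
    boundary: str = ""  # "" = no break, "/" = non-terminal, "//" = terminal (assumed notation)


def split_utterances(tokens):
    """Group tokens into utterances, closing one at each terminal break."""
    utterances, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok.boundary == "//":
            utterances.append(current)
            current = []
    if current:  # keep trailing material that lacks a terminal break
        utterances.append(current)
    return utterances


def query_pos(tokens, pos):
    """Return the forms of all tokens carrying the given PoS tag."""
    return [tok.form for tok in tokens if tok.pos == pos]


# Invented sample data, not taken from any corpus.
sample = [
    Token("well", "ADV", "/"),
    Token("she", "PRON"),
    Token("left", "VERB", "//"),
    Token("yesterday", "ADV", "//"),
]

utterances = split_utterances(sample)
adverbs = query_pos(sample, "ADV")
```

Storing the boundary on the token lets the same record serve both prosodic segmentation and tag queries, which is one simple way to combine the two kinds of annotation.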

THE CORPAFROAS CORPUS OF SPOKEN AFROASIATIC LANGUAGES
CORPUS DESIGN OF CORPAFROAS
PROSODIC SEGMENTATION IN CORPAFROAS
PROSODIC RESEARCH IN CORPAFROAS
PERSPECTIVES
THE CORPUS
THE PURPOSE IN BUILDING THE CORPUS
THE ROLE OF PROSODY
PRINCIPLES UNDERLYING SEGMENTATION
PROSODIC SEGMENTATION AS A THEORETICAL CHOICE
PRACTICAL IMPLEMENTATION
QUESTIONS AND FINDINGS
LESSER DESCRIBED LANGUAGES
LOOKING TO THE FUTURE
THE CORPUS OF ITALIAN AND OTHER RESOURCES
THE PURPOSE OF THE LABLITA CORPORA
THE ROLE PROSODY PLAYED IN CORPUS DESIGN
TRANSCRIPTION AND SEGMENTATION
TEXT-TO-SPEECH ALIGNMENT
MAJOR FINDINGS AND THEIR RELATION WITH THE CORPUS ARCHITECTURE
THE CORPORA
WHAT WAS THE PURPOSE IN BUILDING THE RESOURCES?
PROSODY AS A THEORETICAL CHOICE
TECHNICAL PROCEDURES FOR SEGMENTATION
THE MAJOR FINDINGS AND THEIR RELATIONSHIP TO THE CORPUS ARCHITECTURE
TRANSCRIPTION CRITERIA
IMPORTANCE OF SOUND-TO-TEXT ALIGNMENT
BUILDING CORPORA OF BETTER-KNOWN LANGUAGES
FUTURE PERSPECTIVES
CONCLUSION