Although corpus linguistics has been enriched in recent years by data collected from new electronic communication channels, databases of spoken material of telematic origin remain scarce. As in many other research areas, however, the health crisis caused by COVID-19 has made it necessary to accelerate the incorporation of this type of material into sociolinguistic research. Within the framework of the PRESEEA-Málaga Project, we therefore consider that collecting voice messages sent through the WhatsApp application can be a fast, efficient and low-cost way to build new oral linguistic corpora. This paper presents a theoretical discussion of the advantages and disadvantages of this type of corpus. It also details the methodology used so far at the University of Málaga for the collection, storage and organization of the materials, with special attention to the coding schemes created and the classification strategies used. Finally, examples of the stored materials are presented to illustrate the analytical potential of this new type of corpus.