Abstract

Purpose: This paper provides context, from a sociolinguistic perspective, for Brazilian Portuguese language documentation and data collection aimed at establishing linguistic repositories.

Design/methodology/approach: The main sociolinguistic projects that have generated collections of Brazilian Portuguese language data are presented.

Findings: A comparison with another kind of repository (seed vaults) and with the accounting concept of assets is evoked to map the challenges to be overcome in proposing a standardized, professional language repository to host the collections of linguistic data arising from the reported projects and others, in accordance with the principles of the open science movement.

Originality/value: With a view to the sustainability of projects that build linguistic documentation repositories, partnerships with the information technology area, or even with private companies, could minimize problems of obsolescence and safeguarding of data by promoting the circulation of data and the automation of analysis through natural language processing algorithms. Such planning may help to promote the longevity of the linguistic documentation repositories of Brazilian sociolinguistic research.

