Abstract

A number of recent studies have pointed to the lack of both standard documentation and discipline-based codes of ethics to explain how the practices of data science have resulted in allocational and representational harms to the user communities of language technologies. Yet relatively few researchers have studied these practices in depth and in context. The need to document the practices of data scientists empirically remains pressing, as digital language technologies will only continue to grow in importance for the expression of rights and identity, both online and offline. This poster presents work in progress toward an ethnographic study of data scientists working on Indigenous language technologies in Mexico. A background literature review covers data science for language technologies, interdisciplinarity in data science, ethical codes in computing, and the documentation practices of data scientists. This work is viewed through the lens of Huvila’s (2009) ecology of information model, which supports exploration of the situated and contextual affordances and constraints of information infrastructure and information work, both of which influence the possibilities for making knowledge claims. The results of the literature review suggest both structure and content for an interview protocol, which will lay the groundwork for further in-depth ethnographic work.
