Abstract
Sign languages are used by the deaf and mute community of the world. They are gesture-based languages in which signers use hands and facial expressions to perform different gestures. There are hundreds of different sign languages in the world and, as with natural languages, many of them have several dialects. To facilitate the deaf community, repositories of gesture videos are available for many of these sign languages. However, these video-based repositories do not support the development of automated language translation systems. This research investigates the idea of engaging the deaf community in the development and validation of a parallel corpus for a sign language and its dialects. As its principal contribution, it presents a framework for building a parallel corpus for sign languages that combines crowdsourcing with an editorial manager, thereby engaging a diverse set of stakeholders in building and validating a repository in a quality-controlled manner. It further presents a process for developing a word-level parallel corpus for the different dialects of a sign language, and a process for developing a sentence-level translation corpus comprising source and translated sentences. The proposed framework has been implemented and has involved different stakeholders in building the corpus. As a result, a word-level parallel corpus comprising the gestures of almost 700 words of Pakistan Sign Language (PSL) has been developed, along with a sentence-level translation corpus for PSL comprising more than 8000 sentences in different tenses. The sentence-level corpus can be used to develop and evaluate machine translation models for natural-to-sign language translation and vice versa, while the machine-readable word-level parallel corpus will help in generating avatar-based videos of the translated sentences in different dialects of a sign language.
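As an illustration only, the crowdsourcing-plus-editorial-review idea described above can be sketched as a small data structure. Everything in this sketch is an assumption made for illustration and is not taken from the paper: the class names (`Status`, `GestureEntry`, `WordCorpus`), the field names, the dialect labels, and the use of a plain string as a placeholder for a machine-readable gesture encoding.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    """Review state of a crowdsourced submission (hypothetical states)."""
    PENDING = "pending"    # submitted by a contributor, not yet reviewed
    APPROVED = "approved"  # accepted by an editor
    REJECTED = "rejected"  # declined by an editor


@dataclass
class GestureEntry:
    """One contributed gesture for one word in one dialect."""
    word: str         # natural-language word, e.g. "father"
    dialect: str      # hypothetical dialect label
    gesture: str      # placeholder for a machine-readable gesture encoding
    contributor: str  # who submitted the entry
    status: Status = Status.PENDING


@dataclass
class WordCorpus:
    """Word-level parallel corpus: one word maps to gestures in several dialects."""
    entries: List[GestureEntry] = field(default_factory=list)

    def submit(self, entry: GestureEntry) -> GestureEntry:
        # Crowdsourced contributions enter the corpus as PENDING.
        self.entries.append(entry)
        return entry

    def review(self, entry: GestureEntry, approve: bool) -> None:
        # An editor validates or rejects a contribution.
        entry.status = Status.APPROVED if approve else Status.REJECTED

    def lookup(self, word: str) -> Dict[str, str]:
        # Only editor-approved gestures are exposed, keyed by dialect.
        return {
            e.dialect: e.gesture
            for e in self.entries
            if e.word == word and e.status is Status.APPROVED
        }
```

In this sketch, a gesture becomes visible through `lookup` only after an editor approves it, which is one simple way to model quality control over crowdsourced input.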
Highlights
Sign languages are gesture-based languages that are used by the deaf community of the world
Evaluation of the effectiveness of the crowdsourcing-based parallel corpus for sign language translation: this section discusses the effectiveness of using a machine-readable corpus for sign language translation systems
Comprehensibility refers to how richly the avatar conveys a gesture so that it can be understood, while usability gauges the general applicability of the avatar-based translation system, rated on a scale of 10
Summary
Sign languages are gesture-based languages that are used by the deaf community of the world. Like written languages, sign languages also have dialects; i.e., in large countries, different regions may use different gestures for the same word [43], [49]. The same word is also signed with different gestures in Pakistan, British, and American sign languages. A gesture may be static or dynamic; a dynamic gesture involves certain movements of the hands. The picture-based representation of static gesture for the word father is