Abstract

"Multilingual Machine Comprehension" is a QA sub-task that comprises citing an answer to a question from a context, even if that answer written in a separate language from the excerpt itself. A lot of models have been trained to answer the question from a given short context which is a limitation of MRC, few models are considering this problem and adapting to handle the large input context to make the MRC more accessible and applicable to open domain scenarios. In this study, we examine Multilingual Representations for Indian Languages (MuRIL), rebalanced multilingual BERT (RemBERT), and XLM-RoBERTa, which are all BERT-based deep learning models. We trained these models to work on multilingual MRC particularly for two of the most used Indian languages Hindi and Tamil The datasets utilized in this study are freely available. The results of our research reveal that RemBERT outperformed other BERT-based deep learning models. For the dataset employed, the model received an F1 score of 84.58, an Exact Match of 74.05, and a Jaccard Index of 0.81.
