Abstract

Question Answering (QA) over Tamil and Hindi passages is the task of predicting the answer to a question by identifying the answer span within the given context, which lets us extract the information in a passage that is relevant to a query. In this paper, four different approaches are used to solve the QA task for Tamil and Hindi passages. Zero-shot transfer of QA models is tested on the QA datasets available for Hindi and Tamil. A comparative analysis is carried out between multilingual models fine-tuned on Tamil and Hindi QA datasets. Performance is also evaluated after increasing the number of Hindi data points. Because very few QA datasets are available for Indic languages, especially Tamil, we also test one possible way around this shortage: creating a Tamil QA dataset with Indic language translation models. Multilingual models are then fine-tuned on this translated Tamil QA dataset, and the results are discussed. Predicted answers are evaluated using Jaccard similarity, Exact Match (EM), and F1 scores.
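The three evaluation metrics named above can be illustrated with a minimal sketch. It assumes whitespace tokenization and SQuAD-style token-overlap F1; the abstract does not specify the tokenization or normalization actually used, and the function names (`tokens`, `exact_match`, `jaccard`, `f1_score`) are hypothetical helpers chosen for illustration.

```python
from collections import Counter


def tokens(text):
    # Assumption: simple whitespace tokenization, which also works for
    # Tamil and Hindi text; the paper may normalize answers differently.
    return text.strip().lower().split()


def exact_match(pred, gold):
    # 1.0 only when the token sequences are identical.
    return float(tokens(pred) == tokens(gold))


def jaccard(pred, gold):
    # Size of the token-set intersection divided by the size of the union.
    a, b = set(tokens(pred)), set(tokens(gold))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def f1_score(pred, gold):
    # Harmonic mean of token-level precision and recall.
    p, g = tokens(pred), tokens(gold)
    if not p or not g:
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pred, gold = "नई दिल्ली", "दिल्ली"
    print(exact_match(pred, gold), jaccard(pred, gold), f1_score(pred, gold))
```

A predicted span that contains the gold answer plus one extra token scores zero on Exact Match but receives partial credit from Jaccard similarity and F1, which is why the three metrics are usually reported together.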
