Abstract

Many institutions, organizations, and government bodies handle large volumes of financial documents, both structured and unstructured. To reduce labor-intensive manual processing, we propose a question answering system for the finance domain that makes it easier for financial advisors to reach decisions, creating a profitable and competitive advantage for organizations. Pre-trained language models have proven highly effective at extractive question answering, yet generalizability remains a challenge for most of them. In this work, we trained and fine-tuned a RoBERTa model on question answering datasets of varying difficulty to determine which models generalize most thoroughly across datasets. We further propose a new methodology for handling long-form answers by modifying the BERT and RoBERTa architectures: we add dynamic masking (instead of static masking) and perform a stride shift (similar to a kernel shift in computer vision), and we compare the result against other pre-trained language models to assess whether these changes improve performance. We use MRR (Mean Reciprocal Rank), NDCG (Normalized Discounted Cumulative Gain), and Precision@1 to evaluate our model on the FiQA dataset, and F1 score and Exact Match to set a benchmark on the review-based SubjQA dataset. We find that combining RoBERTa with dynamic masking and stride shift, together with a Dense Passage Retriever for extracting relevant passages, performs best on both SubjQA and the Financial Question Answering (FiQA) dataset [1, 2], outperforming the baseline BERT model. The results show an improvement on every metric relative to the other models compared.
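The stride shift mentioned above can be illustrated as a sliding window over a long tokenized passage, so that an answer span falling near a window boundary is still covered by the next, overlapping window. This is a minimal sketch; the window size and overlap below are illustrative assumptions, not the hyperparameters used in the paper.

```python
def stride_chunks(tokens, max_len=384, stride=128):
    """Split a long token sequence into overlapping windows.

    Each window shifts by (max_len - stride) tokens, so `stride`
    tokens of overlap are kept between consecutive windows --
    analogous to a convolution kernel sliding with a stride.
    NOTE: max_len/stride values here are illustrative assumptions.
    """
    step = max_len - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already covers the end of the sequence
    return chunks
```

Each chunk is then scored independently by the reader model, and the highest-scoring answer span across chunks is returned.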
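Dynamic masking, as opposed to the static masking of the original BERT pre-processing, re-samples which tokens are masked every time a sequence is seen, so the model never memorizes a fixed mask pattern. A simplified pure-Python sketch (the 15% masking probability follows the usual MLM convention; the always-replace-with-[MASK] scheme is a simplification of the standard 80/10/10 split):

```python
import random

def dynamic_mask(token_ids, mask_id, prob=0.15, rng=None):
    """Re-sample masked-LM positions on every call (e.g., per epoch),
    rather than fixing them once during preprocessing (static masking).

    Returns (masked_ids, labels): labels hold the original token at
    masked positions and -100 (the conventional ignore index) elsewhere.
    Simplification: masked tokens are always replaced with [MASK].
    """
    rng = rng or random.Random()
    masked = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < prob:
            labels[i] = tok
            masked[i] = mask_id
    return masked, labels
```

Because the mask positions are drawn afresh on each call, every epoch presents the model with a different masking of the same sequence.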
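The retrieval metrics used for FiQA can be stated concretely. The following is a minimal reference implementation of MRR, NDCG@k, and Precision@1 over per-query relevance lists (binary relevance in ranked order); it is a sketch for clarity, not the paper's evaluation code.

```python
import math

def mrr(ranked_relevance):
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit per query."""
    total = 0.0
    for rels in ranked_relevance:
        for rank, rel in enumerate(rels, start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)

def ndcg_at_k(rels, k=10):
    """NDCG@k for one query: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 1) for i, r in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg > 0 else 0.0

def precision_at_1(ranked_relevance):
    """Fraction of queries whose top-ranked result is relevant."""
    return sum(1 for rels in ranked_relevance if rels and rels[0]) / len(ranked_relevance)
```

For example, a query whose first relevant passage appears at rank 2 contributes a reciprocal rank of 0.5 to MRR and 0 to Precision@1.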
