Abstract
This study presents an approach to enhancing question-answering (QA) systems that combines a RoBERTa-based architecture with complexity-enriched input features. The work is divided into four primary parts: data preprocessing, feature engineering, model building, and training methodology. We propose a Python function that uses readability measures and natural language processing techniques to calculate linguistic difficulty metrics for input sentences. The Stanford Question Answering Dataset (SQuAD) is loaded and preprocessed with TensorFlow Datasets (TFDS) to enable efficient training. Word embeddings from pretrained GloVe vectors are combined with the complexity metrics to produce input features that add contextual information to the input representation. Using these enriched features, a question-answering model based on the RoBERTa architecture is trained with the AdamW optimizer and CrossEntropyLoss. Training runs over multiple epochs to optimize the model's parameters and minimize the loss function. The model's performance is evaluated on an independent validation dataset, and the results indicate that the suggested method improves the accuracy and robustness of the QA system. Overall, this work offers an organized strategy for improving QA systems by combining a pretrained transformer architecture with complexity-enriched input features.
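To make the idea of linguistic difficulty metrics concrete, the following is a minimal Python sketch of the kind of function the abstract describes. It is not the authors' implementation: the function names `count_syllables` and `complexity_metrics`, the choice of the Flesch Reading Ease score as the readability measure, and the vowel-group syllable heuristic are all assumptions made for illustration, since the abstract does not specify which measures are used.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a vowel-group heuristic (an assumption,
    not a dictionary-based count)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def complexity_metrics(text: str) -> dict:
    """Return simple difficulty metrics for a passage (hypothetical helper)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        # Nothing to measure: return neutral zeros instead of dividing by zero.
        return {
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "avg_syllables_per_word": 0.0,
            "flesch_reading_ease": 0.0,
        }
    avg_sentence_length = len(words) / len(sentences)
    avg_syllables = sum(count_syllables(w) for w in words) / len(words)
    return {
        "avg_sentence_length": avg_sentence_length,
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "avg_syllables_per_word": avg_syllables,
        # Standard Flesch Reading Ease formula; higher means easier to read.
        "flesch_reading_ease": 206.835
        - 1.015 * avg_sentence_length
        - 84.6 * avg_syllables,
    }


if __name__ == "__main__":
    print(complexity_metrics("The cat sat. The dog ran."))
```

In a pipeline like the one outlined above, the resulting values could be appended to each example's embedding-based features, although how the authors actually merge them with the GloVe vectors is not stated in the abstract.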
More From: International Journal For Multidisciplinary Research