Abstract

Question Classification (QC) is of primary importance in question answering systems, since it enables extraction of the correct answer type. State-of-the-art solutions for short text classification have obtained remarkable results with Convolutional Neural Networks (CNNs). However, implementing such models requires choices that are usually based on subjective experience or on the few works comparing different settings for general text classification, whereas specific solutions should be identified for the QC task, depending on language and dataset size. Therefore, this work aims to suggest best practices for QC using CNNs. Different datasets were employed: (i) a multilingual set of labelled questions, to evaluate the dependence of optimal settings on language; (ii) a large, widely used dataset, for validation and comparison. Numerous experiments were executed to perform a multivariate analysis, evaluating the statistical significance and the influence on QC performance of all the factors (regarding text representation, architectural characteristics, and learning hyperparameters) and of some of their interactions, and finding the most appropriate strategies for QC. Results show the influence of CNN settings on performance. Language-dependent optimal settings were found. Tests on different data validated the optimization performed and confirmed the transferability of the best settings. Comparisons with configurations suggested by previous works show that those optimized here achieve the best classification accuracy. These findings can suggest the best choices to configure a CNN for QC.

Highlights

  • Intelligent systems able to interact with users in natural language are being developed. Due to the difficulties associated with natural language understanding by computer systems, this is still a field of research of increasing interest [1,2,3]. In particular, question answering systems should be able to automatically answer questions posed in natural language

  • This paper presented a study performed to analyze the settings of Convolutional Neural Networks for Question Classification, in terms of word representation, network architecture and learning procedure. Both English and Italian were considered, since they have different morphological richness, and training sets with different numbers of questions were tested


Summary

Introduction

Intelligent systems able to interact with users in natural language are being developed. Due to the difficulties associated with natural language understanding by computer systems, this is still a field of research of increasing interest [1,2,3]. In particular, question answering systems should be able to automatically answer questions posed in natural language. Accomplishing this task requires a number of operations: translating spoken input into written text where needed, processing natural language (tokenization, part-of-speech tagging, dependency parsing), analyzing the question (entity extraction, question classification, query formulation), and consulting the information corpora (information retrieval and answer extraction).
