Classifier combination approach for question classification for Bengali question answering system

Somnath Banerjee,Paolo Rosso,Sudip Kumar Naskar,Sivaji Bndyopadhyay

doi:10.1007/s12046-019-1224-8

Somnath Banerjee, Paolo Rosso + Show 2 more

Open Access

https://doi.org/10.1007/s12046-019-1224-8

Copy DOI

Abstract

Question classification (QC) is a prime constituent of an automated question answering system. The work presented here demonstrates that a combination of multiple models achieves better classification performance than those obtained with existing individual models for the QC task in Bengali. We have exploited state-of-the-art multiple model combination techniques, i.e., ensemble, stacking and voting, to increase QC accuracy. Lexical, syntactic and semantic features of Bengali questions are used for four well-known classifiers, namely Naïve Bayes, kernel Naïve Bayes, Rule Induction and Decision Tree, which serve as our base learners. Single-layer question-class taxonomy with 8 coarse-grained classes is extended to two-layer taxonomy by adding 69 fine-grained classes. We carried out the experiments both on single-layer and two-layer taxonomies. Experimental results confirmed that classifier combination approaches outperform single-classifier classification approaches by 4.02% for coarse-grained question classes. Overall, the stacking approach produces the best results for fine-grained classification and achieves 87.79% of accuracy. The approach presented here could be used in other Indo-Aryan or Indic languages to develop a question answering system.

Full Text