Abstract

Question classification is a crucial task for answer selection. It helps define the structure of question sentences by extracting features from a sentence, such as who, when, where, and how. In this paper, we propose a methodology to improve question classification from texts using feature selection and word-embedding techniques. We conducted several experiments to evaluate the performance of the proposed methodology on two datasets (the TREC-6 dataset and a Thai sentence dataset), using term frequency and term frequency-inverse document frequency over Unigram, Unigram+Bigram, and Unigram+Trigram features. Machine learning models based on both traditional and deep learning classifiers were used. The traditional classification models were Multinomial Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM). The deep learning models were Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Networks (CNN), and a hybrid model combining CNN and BiLSTM. The experimental results showed that our methodology based on Part-of-Speech (POS) tagging was the most effective at improving question classification accuracy. Question classification achieved an average micro F1-score of 0.98 when the SVM model was applied with all POS tags added on the TREC-6 dataset. On the Thai sentence dataset, the highest average micro F1-score was 0.80, achieved by the CNN model with GloVe embeddings and focusing tags added.
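To make the feature-extraction step concrete, the following is a minimal sketch of term-frequency Unigram+Bigram features with POS tags appended to tokens, as described above. The `POS` lookup table and the tag names (`WP`, `V`, `Det`, `N`) are hypothetical stand-ins for the output of a real POS tagger, which the actual pipeline would use for English or Thai:

```python
from collections import Counter

# Toy POS lookup standing in for a real tagger (hypothetical values;
# the paper's pipeline would use an actual POS tagger).
POS = {"who": "WP", "wrote": "V", "the": "Det", "play": "N", "hamlet": "N"}

def add_pos_tags(tokens):
    """Append each token's POS tag, e.g. 'wrote' -> 'wrote/V'."""
    return [f"{t}/{POS.get(t, 'X')}" for t in tokens]

def ngram_tf(tokens, n_max=2):
    """Term-frequency features over unigrams up to n_max-grams."""
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

tokens = add_pos_tags("who wrote the play hamlet".split())
feats = ngram_tf(tokens, n_max=2)  # e.g. feats["who/WP wrote/V"] == 1
```

The resulting sparse count vectors would then be fed to a classifier such as SVM, or reweighted by inverse document frequency for the TF-IDF variants.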

Highlights

  • In recent years, question answering applications have been required to retrieve answers from large amounts of information

  • The results showed that the Support Vector Machine (SVM), with focusing POS tags (N+V+Adj+Det) added to the input, achieved the highest average micro F1-score of 0.7654 and macro F1-score of 0.7685

  • We evaluate the performance of question classification by comparing F1-scores grouped by question class across the different inputs produced by our proposed data preprocessing tasks on the TREC-6 dataset and the Thai sentence dataset

Summary

Introduction

Question answering applications are required to retrieve answers from large amounts of information. Questioning is the key to gaining more information and is very useful in many applications. We use questions to ask for information or to seek answers, while readers seeking an answer must engage much more deeply with the problem of extracting the meaning of a text in a rich sense. Readers always seek answers based on the type of question encountered: because a question and its corresponding answer are related through the question type, readers can answer a question based on its keywords. Words with the same meaning appearing in different questions make it complicated to train a text model to understand language the way humans do.
