Abstract

Opinion mining from reviews is an ever-growing field of research. It spans mining opinions at the document level, the sentence level and even the aspect level. While aspects mentioned explicitly in user-generated text have been widely researched, very little work has been done on gathering opinions on aspects that are implied rather than explicitly mentioned. Previous work on identifying implicit aspects and opinions was limited to syntactic-based classifiers or other machine learning methods trained on restaurant data. This paper presents a novel study on extracting and analysing implicit aspects and opinions from airline reviews in English. The study devises and develops an airline domain-specific aspect-based annotated corpus and a novel two-way technique that first augments pre-trained word embeddings for sequential entity extraction with stochastic gradient descent optimized conditional random fields (CRF), and second uses machine and ensemble learning algorithms to classify the implied aspects. This two-way technique resolves the double-implicit problem that previous work in implicit aspect and opinion text mining has most often encountered. Experiments with a hold-out test set on the first level, i.e., entity extraction by the optimized CRF, yield a ROC-AUC score of 96% and an F1 score of 94%, outperforming a few baseline systems. Further experiments with a range of machine and ensemble learning classifier algorithms to classify implied aspects and opinions for each entity yield ROC-AUC scores ranging from 71% to 94.8% across all implied entities. This two-level technique for implicit aspect extraction and classification outperforms many baseline systems in this domain.
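For readers less familiar with the two metrics reported above, the following is a minimal sketch of how ROC-AUC and F1 are computed for a single binary class. It is not the authors' evaluation code: the function names `roc_auc` and `f1_score`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen positive
    receives a higher score than a randomly chosen negative
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 as the harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced toy data (2 positives, 6 negatives); not from the paper.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"ROC-AUC: {roc_auc(toy_labels, toy_scores):.3f}")
print(f"F1:      {f1_score(toy_labels, toy_preds):.3f}")
```

ROC-AUC is computed from the ranking of scores and does not depend on a threshold, whereas F1 depends on the hard predictions at the chosen threshold. On imbalanced classes the two can therefore differ noticeably, which is one reason to report both.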

Highlights

  • Travel and tourism are well-liked terms amongst all generations of people

  • There was a high imbalance amongst implicit aspect classes of almost all entities

  • Put together, this empowers the ensemble learning classification algorithms to provide better classification results, as observed through the ROC-AUC and F-statistic scores

Summary

Introduction

Travel and tourism are well-liked terms amongst all generations of people. The airline industry is a key facilitator in this domain.
