Abstract
Online product reviews underpin nearly all e-shopping activities. The high volume of data, together with the varying quality of online reviews, puts growing pressure on automated approaches to prioritizing informative content. Despite a substantial body of literature on review helpfulness prediction, the rationale behind specific feature selection is largely under-studied. Moreover, current work tends to concentrate on domain- and/or platform-dependent feature curation and therefore generalizes poorly. Results are also hard to compare and reproduce because data and source code are frequently unavailable. This study addresses these gaps through the most comprehensive feature identification, evaluation, and selection. To this end, the 30 most frequently used content-based features are first identified from 149 relevant research papers and grouped into five coherent categories. The features are then selected to perform helpfulness prediction on six domains of the largest publicly available Amazon 5-core dataset. Three scenarios for feature selection are considered: (i) individual features, (ii) features within each category, and (iii) all features. Empirical results demonstrate that semantics plays a dominant role in predicting informative reviews, followed by sentiment and then other features. Finally, feature combination patterns and selection guidelines across domains are summarized to enhance customer experience in today’s prevalent e-commerce environment. The computational framework for helpfulness prediction used in the study has been released to facilitate result comparability and reproducibility.
Highlights
Customer product reviews play a significant role in today’s e-commerce world, greatly assisting in online shopping activities
To address the aforementioned gaps, this study comprehensively identifies, evaluates, and selects representative features for helpfulness prediction
The research questions investigated can be formulated as follows:
RQ1: What is the effect of individual features on review helpfulness prediction across domains?
RQ2: What are the optimal combinations of features within a category for review helpfulness prediction across domains?
RQ3: What are the optimal combinations of all features for review helpfulness prediction across domains?
RQ4: Are there any patterns of features/feature combinations for review helpfulness prediction that perform well in general?
RQ1, RQ2, and RQ3 are each answered in a separate subsection
Summary
Customer product reviews play a significant role in today’s e-commerce world, greatly assisting in online shopping activities. Online reviews enhance the customer purchasing experience by providing valuable feedback, and they facilitate future product development by giving a better understanding of customer needs.