Online product reviews play important roles in the word-of-mouth marketing of e-commerce enterprises, but only helpful reviews actually influence customers’ purchase decisions. Current research focuses on how to predict the helpfulness of a review but lacks a thorough analysis of why it is helpful. In this paper, feature sets covering review text and context cues are firstly proposed to represent review helpfulness. Then, a set of gradient boosted trees (GBT) models is introduced, and the optimal one, which as implemented in eXtreme Gradient Boosting (XGBoost), is chosen to predict and explain review helpfulness. Specially, by including the SHAP (Shapley) values method to quantify feature contribution, this paper presents an integrated framework to better interpret why a review is helpful at both the macro and micro levels. Based on real data from Amazon.cn, this paper reveals that the number of words contributes the most to the helpfulness of reviews on headsets and is interactively influenced by features like the number of sentences or feature frequency, while feature frequency contributes the most to the helpfulness of facial cleanser reviews and is interactively influenced by the number of adjectives used in the review or the review’s entropy. Both datasets show that individual feature contributions vary from review to review, and individual joint contributions gradually decrease with the increase of feature values.
Read full abstract