Online retail platforms are increasingly challenged by the proliferation of low-quality products, which may damage their reputation and sales. To address this problem, we propose a system architecture that proactively identifies products likely to go "out of favor". Our approach uses historical data to extract useful information from customer ratings and textual reviews. The available data are fed into a state-of-the-art deep learning sequence model to forecast future ratings. We then analyze rating trends, extracting trend parameters that a binary classifier uses to label products as "out of favor" or not. We tested this system on an Amazon dataset comprising nearly 800,000 observations across 2,826 electronics products. Our results show that the Long Short-Term Memory (LSTM) model outperforms the benchmark models in forecasting future product ratings. Ablation analysis shows that sentiment-related features significantly improve rating forecasts, by up to 40%, while review topics add 10% and other review characteristics 4%. Counterintuitively, topic extraction from reviews does not provide substantial benefits, despite the heavy computational resources it requires. Finally, the two-stage classification process, which leverages time-series data and rating trends, offers more stable and robust performance than conventional single-stage methods. Robustness checks confirm the system's resilience to stressors and inform our considerations for developing the system architecture. Our experiments indicate that rating trends can change in subtle ways over time, turning a promising "star" product into a liability ("dog"). E-commerce platforms can use the proposed system architecture to proactively identify and remove potentially dubious products instead of waiting to take reactive action.
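To make the second stage concrete, the following is a minimal sketch of how trend parameters might be extracted from a forecast rating series and passed to a binary decision. It is an illustration under stated assumptions, not the implementation evaluated here: the function names, the choice of a least-squares slope as the trend parameter, and both thresholds are hypothetical, and a simple rule stands in for the trained classifier. The forecast series would come from the sequence model in the first stage.

```python
import numpy as np


def trend_features(ratings):
    """Summarize a rating series by its least-squares slope, last value, and mean.

    Hypothetical feature set chosen for illustration only.
    """
    y = np.asarray(ratings, dtype=float)
    x = np.arange(len(y))
    slope, _intercept = np.polyfit(x, y, 1)  # degree-1 fit: slope first, then intercept
    return slope, y[-1], y.mean()


def is_out_of_favor(forecast, slope_threshold=-0.05, rating_floor=3.5):
    """Rule-based stand-in for the binary classifier (thresholds are assumptions)."""
    slope, last, _mean = trend_features(forecast)
    return bool(slope < slope_threshold and last < rating_floor)


# Illustrative forecast series, not data from the study.
declining = [4.4, 4.2, 3.9, 3.6, 3.3, 3.0]
stable = [4.3, 4.4, 4.3, 4.4, 4.3, 4.4]

print(is_out_of_favor(declining))
print(is_out_of_favor(stable))
```

In practice, the rule in `is_out_of_favor` would be replaced by a classifier trained on labeled products, with the trend parameters as its input features.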