Abstract

Online shopping for fashion products is a challenging process for consumers. Although customer reviews can facilitate purchasing decisions, review content and helpfulness voting systems can be unreliable. This study applies linguistic approaches to term recognition to identify and extract frequent terms in fashion reviews and to predict review helpfulness. Three feature sets are constructed: topics from the latent Dirichlet allocation (LDA) model, bi-grams weighted by the term frequency-inverse document frequency (TF-IDF) vectorizer, and topics plus bi-grams weighted by the TF-IDF vectorizer. Each feature set is then used to train four supervised algorithms on an imbalanced dataset to compare model performance. The models are validated on a dataset of 828,700 customer reviews collected from the Amazon Fashion platform. The experimental results show that a random forest classifier trained on LDA topics plus n-grams from the TF-IDF vectorizer outperforms the other models, with an accuracy of 0.81 and an F1-score of 0.78. Furthermore, the study indicates that reviews describing fabric quality, trend and fashion aesthetics, size details, price, and return experience are more helpful. These results can help customers narrow their search terms and help retailers optimize their review systems, especially on the first page of a product's description.
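To make the bi-gram TF-IDF feature set concrete, the following is a minimal sketch using only the Python standard library. It is not the study's implementation: the tokenization (lowercasing and whitespace splitting), the smoothed IDF formula, the L2 normalization, the function names `bigrams` and `tfidf_bigrams`, and the three sample reviews are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_bigrams(docs):
    """Return one {bigram: weight} dict per document.

    Assumption: smoothed IDF, idf = ln((1 + N) / (1 + df)) + 1, followed by
    L2 normalization of each document vector.
    """
    counts = [Counter(bigrams(d)) for d in docs]
    n_docs = len(docs)

    # Document frequency: number of documents containing each bi-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in df}

    vectors = []
    for c in counts:
        raw = {t: tf * idf[t] for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vectors.append({t: w / norm for t, w in raw.items()})
    return vectors


# Invented sample reviews, not taken from the study's dataset.
reviews = [
    "soft fabric and true to size",
    "true to size but thin fabric",
    "easy return and fair price",
]
vectors = tfidf_bigrams(reviews)
```

In this toy corpus, the bi-gram "true to" appears in two reviews and so receives a lower weight in the first review than "soft fabric", which appears in only one. In a pipeline like the one described, such weights would be combined with LDA topic features before being passed to a classifier.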
