Abstract

In the age of information overload, customers are overwhelmed by the number of products available for sale. Search engines try to overcome this issue by filtering items relevant to users’ queries. Traditional search engines rely on exact matches between terms in the query and the product metadata. Recently, deep learning-based approaches have drawn more attention by outperforming traditional methods in many circumstances. In this work, we leverage embeddings to address the challenging task of optimizing product search engines in e-commerce. We propose an e-commerce product search engine based on a similarity metric computed over query and product embeddings. Two pre-trained word embedding models were tested: the first representing a category of models that generate fixed embeddings, and the second representing a newer category of models that generate context-aware embeddings. Furthermore, a re-ranking step was performed by feeding a list of quality indicators that reflect the utility of a product to the customer into well-known ranking methods. To demonstrate the reliability of the approach, the Amazon reviews dataset was used for experimentation. The results demonstrate the effectiveness of context-aware embeddings in retrieving relevant products and of the quality indicators in ranking high-quality products.
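The retrieval step described in the abstract can be illustrated with a minimal sketch: given a query embedding and pre-computed product embeddings, candidate products are ranked by cosine similarity. The function names, array shapes, and random data below are illustrative only and are not taken from the paper.

```python
# Minimal sketch of embedding-based retrieval (names and shapes are illustrative):
# rank products by cosine similarity between a query embedding and pre-computed
# product embeddings.
import numpy as np

def cosine_similarity(query_vec: np.ndarray, product_matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of product vectors."""
    query_norm = query_vec / np.linalg.norm(query_vec)
    product_norms = product_matrix / np.linalg.norm(product_matrix, axis=1, keepdims=True)
    return product_norms @ query_norm

def retrieve_top_k(query_vec: np.ndarray, product_matrix: np.ndarray, k: int = 10) -> np.ndarray:
    """Return indices of the k products most similar to the query."""
    scores = cosine_similarity(query_vec, product_matrix)
    return np.argsort(-scores)[:k]

# Random embeddings stand in for real query/product vectors.
rng = np.random.default_rng(0)
products = rng.normal(size=(1000, 768))   # e.g. 1000 products, 768-dim embeddings
query = rng.normal(size=768)
print(retrieve_top_k(query, products, k=5))
```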

Highlights

  • In the past decade, e-commerce has changed the way people buy and sell goods. As one of the most important innovations in trading, e-commerce provides “any-time, anywhere, any-device” commerce [1]

  • E-commerce search is considered a particular area of information retrieval (IR), and the particularity of e-commerce search functionality comes from the fact that users are not just searching for products that match their queries, but are seeking to find good products

  • One of the recent models that can generate different word embeddings depending on the context of a word is BERT [6]. This contrasts with the majority of previous methods, which are based on unidirectional language models, meaning that every token is represented using only its left or its right context (see the sketch following this list)
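As a concrete illustration of the context-aware embeddings mentioned in the last highlight, the sketch below obtains BERT representations via the Hugging Face transformers library. Mean pooling over token states is only one possible choice; the paper's exact pooling strategy is not assumed here.

```python
# Sketch of obtaining context-aware embeddings from a pre-trained BERT model
# (one possible realization, not the paper's exact setup).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single sentence embedding."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # (768,)

# The token "apple" receives different contextual representations in the two
# sentences below, unlike fixed embeddings such as word2vec or GloVe.
fruit = embed("fresh apple juice in a glass bottle")
phone = embed("apple iphone 12 case with screen protector")
print(torch.cosine_similarity(fruit, phone, dim=0))
```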


Summary

Introduction

E-commerce has changed the way people buy and sell goods. As one of the most important innovations in trading, e-commerce provides “any-time, anywhere, any-device” commerce [1]. Recent studies have shown that the utility of a product to the customer is multidimensional and affected by many attributes; for example, popularity, price, and durability were shown to influence the final decision of customers in online stores [3,4]. Another interesting particularity of e-commerce search is that users’ queries are usually short, not very clear, and can be specified in multiple languages and from different cultural contexts [5], posing limitations to conventional hard text-matching approaches. Later approaches were aware of these issues and tried to overcome them by projecting products and queries into latent embedding spaces, either by learning them from scratch using appropriate datasets or by using pre-trained ones. These methods work quite well compared to previous word-matching approaches.
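Since product utility depends on attributes beyond textual relevance, the re-ranking step combines the similarity score with quality indicators (review sentiment, popularity, availability of information, price) as features for a learning-to-rank model. The sketch below uses LightGBM's LGBMRanker (a LambdaMART-style method) purely as one example of a well-known ranking method; the feature layout and synthetic data are hypothetical.

```python
# Illustrative re-ranking sketch: quality indicators plus the similarity score
# form the feature vector for a learning-to-rank model (synthetic data only).
import numpy as np
from lightgbm import LGBMRanker

rng = np.random.default_rng(0)
n_queries, n_candidates = 50, 20

# One row per (query, candidate product):
# [similarity, sentiment, popularity, info_availability, price]
X = rng.random((n_queries * n_candidates, 5))
y = rng.integers(0, 3, size=n_queries * n_candidates)   # graded relevance 0..2
group = [n_candidates] * n_queries                       # candidates per query

ranker = LGBMRanker(objective="lambdarank", n_estimators=100)
ranker.fit(X, y, group=group)

# Re-rank one query's candidates by the learned score.
candidate_features = rng.random((n_candidates, 5))
order = np.argsort(-ranker.predict(candidate_features))
print(order)
```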

Product Search
Word Embeddings
Learning to Rank for e-Commerce Search
Preprocessing Step
Product Search Using Similarity Measure
Product Ranking Using Quality Indicators
Reviews Sentiment
Popularity
Availability of Information
Product Price
LTR Model
Baseline Methods for Product Ranking
Experiments
Dataset
Query Extraction
Evaluation Metrics
Results and Discussions
Conclusions and Future Work