Abstract
Information technology plays a central role in making online retail more efficient and effective, particularly in product matching. This research examines the challenges of product matching in e-commerce: product titles are diverse and ambiguous, and new products enter the market rapidly. To address these challenges, we implement a neural network-based approach. The main contribution is the implementation and validation of the Doc2Vec method for matching e-commerce products. The study also identifies the best-performing parameter combinations for Hierarchical Clustering, tested and validated on 4,000 product titles. The data for learning and evaluation come from an online retail platform and include 34,000 product names from various sectors. We compare two Doc2Vec architectures for extracting features from product titles and combine each with Hierarchical Clustering to group similar products. The results indicate that the Doc2Vec model with the DBOW (Distributed Bag of Words) architecture yields a higher average NMI (Normalized Mutual Information) score than the DM (Distributed Memory) architecture.
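Because the comparison between the two architectures rests on the NMI score, a minimal sketch of how such a score can be computed from two labelings may help. The abstract does not say which NMI normalization was used; the version below assumes the arithmetic-mean form, 2·I(U;V) / (H(U) + H(V)). The function name and the toy labels are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def normalized_mutual_info(labels_true, labels_pred):
    """NMI between two labelings of the same items.

    Assumption: arithmetic-mean normalization, 2 * I(U;V) / (H(U) + H(V)).
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        # Shannon entropy (natural log) of a partition given its cluster sizes.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_true = entropy(count_true)
    h_pred = entropy(count_pred)

    # Mutual information summed over the non-empty cells of the contingency table.
    mi = sum(
        (c / n) * math.log(c * n / (count_true[u] * count_pred[v]))
        for (u, v), c in count_joint.items()
    )

    denom = h_true + h_pred
    if denom == 0.0:
        # Both labelings put every item in a single cluster: treat as identical.
        return 1.0
    return 2.0 * mi / denom


# Hypothetical example: reference product groups vs. cluster ids from some model.
reference_groups = ["phone", "phone", "case", "case", "cable", "cable"]
cluster_ids = [0, 0, 1, 1, 1, 2]
score = normalized_mutual_info(reference_groups, cluster_ids)
print(f"NMI = {score:.3f}")
```

A score of 1 means the clusters reproduce the reference groups exactly (up to renaming of cluster ids), and a score of 0 means the two labelings share no information. Averaging this score over several runs or parameter settings gives an average NMI of the kind reported above.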