Abstract

Identifying the values of product attributes is essential for many e-commerce functions, such as product search and product recommendation. Therefore, identifying attribute values from unstructured product descriptions is a critical undertaking for any e-commerce retailer. What makes this problem challenging is the diversity of product types and their attributes and values. Existing methods have typically employed multiple types of machine learning models, each of which handles specific product types or attribute classes. This has limited their scalability and generalization for large-scale, real-world e-commerce applications. Previous approaches have formulated attribute value extraction as a Named Entity Recognition (NER) task or a Question Answering (QA) task. In this paper, we present a generative approach to the attribute value extraction problem using language models. We leverage the large-scale pretraining of GPT-2 and the T5 text-to-text transformer to create fine-tuned models that can effectively perform this task. We show that a single general model is very effective for this task over a broad set of product attribute values under the open-world assumption. Our approach achieves state-of-the-art performance for different attribute classes, which previously required a diverse set of models.

Highlights

  • Product attributes and their values play an important role in e-commerce platforms

  • Every day, many new products are added to the product catalogue, often with new attribute types and values

  • Given the wide diversity of products and the constant emergence of new ones, it is important that attribute value extraction works under the open-world assumption, i.e., that it can handle attribute values not seen before


Summary

Introduction

There are hundreds of thousands of products sold online, and each type of product has a different set of attributes. These attributes help customers search for products, compare relevant items, and purchase the product of their choice. Every day, many new products are added to the product catalogue, often with new attribute types and values.

[Figure: example product. Title: ammoon Electric Guitar 6 String Solid Wood Brims 23 Frets Basswood Body Dual-coil Pickup Tremolo & Rhythm Control with Pickguard. Product profile: Brand Name: ammoon; Type: Electric Guitar; Tone Position: 23; Fingerboard Material: NULL; Body Material: Basswood.]

The product profile contains attribute values for Brand Name, Type, etc., but some attributes that appear in the title are missing, such as “Dual-coil” for Pickup Type and “6” for Strings. Given the wide diversity of products and the constant emergence of new ones, it is important that attribute value extraction works under the open-world assumption, i.e., that it can handle attribute values not seen before.
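To make the generative framing concrete, the guitar listing discussed above can be cast as text-to-text pairs, one per attribute, where the model reads the title plus an attribute name and generates the value. The following is only a minimal sketch: the prompt template (`title: ... attribute: ...`) and the `build_examples` helper are hypothetical illustrations, not the format used in the paper, and the `NULL` target simply mirrors the placeholder shown in the product profile.

```python
from typing import Dict, List, Optional, Tuple

NULL_TOKEN = "NULL"  # placeholder for a missing value, mirroring the product profile


def build_examples(
    title: str, attributes: Dict[str, Optional[str]]
) -> List[Tuple[str, str]]:
    """Turn one product into (input text, target text) pairs, one per attribute.

    The prompt template below is an assumption made for illustration only.
    """
    pairs = []
    for name, value in attributes.items():
        source = f"title: {title} attribute: {name}"
        target = value if value is not None else NULL_TOKEN
        pairs.append((source, target))
    return pairs


title = (
    "ammoon Electric Guitar 6 String Solid Wood Brims 23 Frets Basswood Body "
    "Dual-coil Pickup Tremolo & Rhythm Control with Pickguard"
)
# Values taken from the example profile, plus the two values the text says are missing.
profile = {
    "Brand Name": "ammoon",
    "Type": "Electric Guitar",
    "Tone Position": "23",
    "Fingerboard Material": None,  # not stated in the title
    "Body Material": "Basswood",
    "Pickup Type": "Dual-coil",
    "Strings": "6",
}

examples = build_examples(title, profile)
for source, target in examples:
    print(f"{source} -> {target}")
```

Because the target is free text rather than a span or a label from a fixed set, pairs like these could in principle be used to fine-tune a single sequence-to-sequence model across all product types and attributes, which is the property the open-world setting relies on.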

