Abstract

Classification problems arise in many different domains, and supervised learning algorithms have shown great promise in these areas. The classification of goods in international trade in Brazil is a real challenge because of the complexity involved in assigning the correct category code to a good, especially considering the tax penalties and legal implications of a misclassification. This work focuses on the training process of a classifier based on bidirectional encoder representations from transformers (BERT) for the tax classification of goods with MCN codes, which are the official classification system for import and export products in Brazil. In particular, this article presents results from a BERT model pretrained specifically on Portuguese, as well as results from a multilingual pretrained BERT model. Experimental results show that the Portuguese model performed slightly better than the multilingual model, achieving a Matthews correlation coefficient (MCC) of 0.8491, and confirm that the classifiers could be used to improve specialists’ performance in the classification of goods.

Highlights

  • The Mercosur Common Nomenclature (MCN or NCM) is a system used by the South American trade bloc Mercosur to categorize goods in international trade and to facilitate customs control [1]

  • [11] illustrates the fine-tuning process on a bidirectional encoder representations from transformers (BERT) model for the classification of MCN codes from product descriptions, which is the focus of this work

  • Of the tokens that will change, 80% will be replaced by the mask token, 10% of the time they will be replaced by a completely random token, and the last 10% will have an ...

  • Both the multilingual BERT and the Portuguese BERT experiments were carried out with a grid search over all 18 scenarios that encompass the combinations of batch size, number of epochs, and learning rate suggested in [3]
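
The grid described in the last highlight can be enumerated in a few lines. The sketch below is only an illustration: the text does not list the values, so the ranges are assumed to be the ones commonly recommended for BERT fine-tuning in the original BERT paper (batch sizes 16 and 32; 2, 3, and 4 epochs; learning rates 5e-5, 3e-5, and 2e-5), which give 2 × 3 × 3 = 18 combinations. The function name `build_grid` is hypothetical and is not taken from the article.

```python
from itertools import product

# ASSUMPTION: these ranges are the fine-tuning values recommended in the
# original BERT paper; the article only states that there are 18 scenarios.
BATCH_SIZES = (16, 32)
EPOCHS = (2, 3, 4)
LEARNING_RATES = (5e-5, 3e-5, 2e-5)


def build_grid(batch_sizes=BATCH_SIZES, epochs=EPOCHS, learning_rates=LEARNING_RATES):
    """Return one dict per (batch size, epochs, learning rate) combination."""
    return [
        {"batch_size": b, "epochs": e, "learning_rate": lr}
        for b, e, lr in product(batch_sizes, epochs, learning_rates)
    ]


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid))  # number of scenarios to train and evaluate
    for scenario in grid[:3]:
        print(scenario)
```

Each dictionary would then drive one fine-tuning run per model (multilingual and Portuguese), with the scenarios compared on a held-out set.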

Summary

Introduction

The Mercosur Common Nomenclature (MCN or NCM) is a system used by the South American trade bloc Mercosur to categorize goods in international trade and to facilitate customs control [1]. The MCN is divided into 96 parts called “chapters”, which together contain more than 10,000 unique MCN codes. An MCN code is an eight-digit numeric code that represents the goods and is required when importing products into Brazil. Classifying goods can be a real challenge because of the complexity involved in assigning the right code to each imported good, given the substantial number of codes and the technical details involved in their specification. One of the first documents required by Brazil is the Import Declaration, in which the MCN code must be assigned to the product. If a document is missing or the MCN code is misclassified, the fines can be significant, which makes classification a key challenge.
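
As a small illustration of the code format described above, the sketch below checks that a string is a well-formed eight-digit MCN code and extracts its chapter. It makes two assumptions that are not stated in the text: that codes may be written with dots (for example `8471.30.12`), and that the chapter corresponds to the first two digits, as in the Harmonized System on which the nomenclature is based. The function name `parse_mcn` and the sample codes are hypothetical.

```python
import re


def parse_mcn(code: str) -> dict:
    """Normalize an MCN code and split off its chapter.

    ASSUMPTIONS: dots are optional separators, and the chapter is the
    first two digits of the eight-digit code.
    """
    digits = code.strip().replace(".", "")
    if not re.fullmatch(r"\d{8}", digits):
        raise ValueError(f"not an eight-digit MCN code: {code!r}")
    return {"code": digits, "chapter": digits[:2]}


if __name__ == "__main__":
    # Sample value chosen only to show the format.
    print(parse_mcn("8471.30.12"))
```

A check like this only validates the shape of a code; it says nothing about whether the code exists in the nomenclature or fits the product description, which is the part the classifier addresses.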

