Abstract

Recent progress in natural language processing has led to Transformer architectures becoming the predominant model used for natural language tasks. However, many real-world datasets include additional modalities that the Transformer does not directly leverage. We present Multimodal-Toolkit, an open-source Python package for incorporating text and tabular (categorical and numerical) data with Transformers for downstream applications. Our toolkit integrates well with Hugging Face's existing APIs, such as tokenization and the model hub, which allow easy download of different pre-trained models.

Highlights

  • In recent years, Transformers (Vaswani et al., 2017) have become popular for model pre-training (Howard and Ruder, 2018; Peters et al., 2018; Devlin et al., 2019) and have yielded state-of-the-art results on many natural language processing (NLP) tasks

  • Real-world datasets often contain tabular data alongside text; given the advances of Transformers for natural language tasks and the maturity of existing Transformer libraries, we introduce Multimodal-Toolkit, a lightweight Python package built on top of Hugging Face Transformers

  • This paper presents Multimodal-Toolkit, an open-source Python library powered by Hugging Face

Summary

Introduction

Transformers (Vaswani et al., 2017) have become popular for model pre-training (Howard and Ruder, 2018; Peters et al., 2018; Devlin et al., 2019) and have yielded state-of-the-art results on many natural language processing (NLP) tasks. Real-world datasets, however, often contain tabular data alongside text. Given the advances of Transformers for natural language tasks and the maturity of existing Transformer libraries, we introduce Multimodal-Toolkit, a lightweight Python package built on top of Hugging Face Transformers. Our package extends existing Transformers in the Hugging Face Transformers library to seamlessly handle structured tabular data while keeping the existing tokenization (including subword segmentation), experimental pipeline, and pre-trained model hub functionalities of Hugging Face Transformers. We show the effectiveness of our toolkit on three real-world datasets.
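The fusion the summary describes, combining a Transformer's text representation with categorical and numerical features before a downstream prediction head, can be sketched in plain NumPy. This is an illustrative sketch only; the array names, dimensions, and the simple concatenation-plus-linear-head strategy are assumptions for exposition, not the toolkit's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins: in practice a pre-trained Transformer would
# produce text_embedding (e.g. a pooled [CLS] vector of dimension 768).
text_embedding = rng.normal(size=(1, 768))      # pooled text representation
numerical_feats = np.array([[3.2, 0.7]])        # e.g. two numerical columns
category_onehot = np.array([[0.0, 1.0, 0.0]])   # one-hot categorical column

# Fuse modalities by concatenation, the simplest combining strategy.
combined = np.concatenate(
    [text_embedding, numerical_feats, category_onehot], axis=1
)

# A linear classification head over the fused features.
num_labels = 2
W = rng.normal(size=(combined.shape[1], num_labels)) * 0.01
b = np.zeros(num_labels)
logits = combined @ W + b

# Softmax turns the logits into per-class probabilities.
shifted = np.exp(logits - logits.max())
probs = shifted / shifted.sum()
print(combined.shape, probs.shape)
```

In a trained model the text encoder and the head would be learned jointly; the point here is only that the tabular columns enter the prediction alongside the text representation rather than being discarded.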


