Abstract
With the notable success of pretrained language models, the pretraining-fine-tuning paradigm has become a dominant solution for natural language understanding (NLU) tasks. Typically, the training instances of a target NLU task are introduced in a completely random order and treated equally at the fine-tuning stage. However, these instances can vary greatly in difficulty, and similar to human learning procedures, language models can benefit from an easy-to-difficult curriculum. Based on this concept, we propose a curriculum learning (CL) framework. Our framework consists of two stages, Review and Arrange, targeting the two main challenges in curriculum learning, i.e., how to define the difficulty of instances and how to arrange a curriculum based on the difficulty, respectively. In the first stage, we devise a cross-review (CR) method to train several teacher models first and then review the training set in a crossed manner to distinguish easy instances from difficult instances. In the second stage, two sampling algorithms, a coarse-grained arrangement (CGA) and a fine-grained arrangement (FGA), are proposed to arrange a curriculum for language models in which the learning materials start from the easiest instances, and more difficult instances are gradually added into the training procedure. Compared to previous heuristic CL methods, our framework can avoid the errors caused by a gap in difficulty between humans and machines and has strong generalization ability. We conduct comprehensive experiments, and the results show that our curriculum learning framework, without any manual model architecture design or use of external data, obtains significant and universal performance improvements on a wide range of NLU tasks in different languages.
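The two stages can be illustrated with a minimal sketch. The abstract does not give the scoring rule or the CGA and FGA sampling algorithms, so everything here is an assumption made for illustration: the round-robin split of the training set, the hypothetical `score_fn(teacher_id, item_idx)` callable standing in for a trained teacher's error on an instance, the summation of scores, and the bucket-based staging, which only follows the spirit of a coarse-grained arrangement.

```python
from typing import Callable, List


def cross_review(n_items: int, n_teachers: int,
                 score_fn: Callable[[int, int], float]) -> List[float]:
    """Assign a difficulty score to every training instance.

    Assumption: teacher k was trained on fold k only, and score_fn(t, i)
    returns teacher t's error on instance i (higher means harder).
    """
    if n_teachers < 2:
        raise ValueError("cross-review needs at least two teachers")
    # Hypothetical round-robin split: instance i belongs to fold i % n_teachers.
    difficulty = [0.0] * n_items
    for i in range(n_items):
        own_fold = i % n_teachers
        # An instance is reviewed only by teachers that never trained on it.
        difficulty[i] = sum(score_fn(t, i)
                            for t in range(n_teachers) if t != own_fold)
    return difficulty


def arrange_stages(difficulty: List[float], n_buckets: int) -> List[List[int]]:
    """Build easy-to-difficult stages; each stage adds the next-harder bucket."""
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    size = -(-len(order) // n_buckets)  # ceiling division
    stages: List[List[int]] = []
    pool: List[int] = []
    for b in range(n_buckets):
        bucket = order[b * size:(b + 1) * size]
        if not bucket:
            continue  # skip empty buckets when items do not divide evenly
        pool = pool + bucket
        stages.append(list(pool))
    return stages


# Toy usage: pretend every teacher's error grows with the instance index.
toy_difficulty = cross_review(6, 3, lambda t, i: i / 10)
toy_stages = arrange_stages(toy_difficulty, 3)
```

In this toy run the first stage holds only the two easiest instances, and each later stage keeps the earlier material while adding the next bucket, mirroring the "start from the easiest instances and gradually add more difficult ones" description above.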
More From: IEEE/ACM Transactions on Audio, Speech, and Language Processing