On the Effectiveness of Pre-Trained Language Models for Legal Natural Language Processing: An Empirical Study

Dezhao Song,Frank Schilder,Baosheng He,Sally Gao

doi:10.1109/access.2022.3190408

Dezhao Song, Frank Schilder + Show 2 more

Open Access

https://doi.org/10.1109/access.2022.3190408

Copy DOI

Journal: IEEE Access	Publication Date: Jan 1, 2022
Citations: 5	License type: CC BY 4.0

Affiliation: Thomson Reuters (United States)

Abstract

We present the first comprehensive empirical evaluation of pre-trained language models (PLMs) for legal natural language processing (NLP) in order to examine their effectiveness in this domain. Our study covers eight representative and challenging legal datasets, ranging from 900 to 57K samples, across five NLP tasks: binary classification, multi-label classification, multiple choice question answering, summarization and information retrieval. We first run unsupervised, classical machine learning and/or non-PLM based deep learning methods on these datasets, and show that baseline systems’ performance can be 4%~35% lower than that of PLM-based methods. Next, we compare general-domain PLMs and those specifically pre-trained for the legal domain, and find that domain-specific PLMs demonstrate 1%~5% higher performance than general-domain models, but only when the datasets are extremely close to the pre-training corpora. Finally, we evaluate six general-domain state-of-the-art systems, and show that they have limited generalizability to legal data, with performance gains from 0.1% to 1.2% over other PLM-based methods. Our experiments suggest that both general-domain and domain-specific PLM-based methods generally achieve better results than simpler methods on most tasks, with the exception of the retrieval task, where the best-performing baseline outperformed all PLM-based methods by at least 5%. Our findings can help legal NLP practitioners choose the appropriate methods for different tasks, and also shed light on potential future directions for legal NLP research.

Full Text