Legal Information Retrieval and Entailment Based on BM25, Transformer and Semantic Thesaurus Methods

Mi-Young Kim, Juliano Rabelo, Randy Goebel, Kingsley Okeke

doi:10.1007/s12626-022-00103-1

Open Access

https://doi.org/10.1007/s12626-022-00103-1

Journal: The Review of Socionetwork Strategies
Publication Date: Feb 7, 2022
Citations: 11
License type: open-access

Affiliation: University of Alberta

Abstract

We describe the techniques applied by the University of Alberta (UA) team in the most recent Competition on Legal Information Extraction and Entailment (COLIEE 2021). We participated in retrieval and entailment tasks for both case law and statute law; we applied a transformer-based approach for the case law entailment task, an information retrieval technique based on BM25 for legal information retrieval, and a natural language inference mechanism using semantic knowledge applied to statute law texts. This competition included 25 teams from 14 countries; our case law entailment approach was ranked no. 4 in Task 2, the BM25 technique for legal information retrieval was ranked no. 3 in Task 3, and the natural language inference technique incorporating semantic information was ranked no. 4 in Task 4. The combination of the latter two techniques on Task 5 was ranked no. 2. We also performed error analysis of our system in Task 4, which provides some insight into current state-of-the-art and research priorities for future directions.
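As a rough illustration of the BM25 ranking mentioned in the abstract, the following is a minimal sketch of the standard Okapi BM25 scoring function. It is not the UA team's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, whitespace tokenization, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; k1 and b are common defaults,
    not values reported in the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus invented for illustration (not COLIEE data).
articles = [
    "a contract is formed when an offer is accepted".split(),
    "a minor may rescind a contract made without consent".split(),
    "the owner may demand the return of the property".split(),
]
query = "minor contract consent".split()
scores = bm25_scores(query, articles)
# Indices of articles sorted from most to least relevant.
ranking = sorted(range(len(articles)), key=lambda i: scores[i], reverse=True)
```

In this toy example the second article matches all three query terms and ranks first, while the third article shares no terms with the query and scores zero.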

Highlights

The Competition on Legal Information Extraction and Entailment (COLIEE) was created to build a research community focused on four specific challenge problems in the legal domain: case law retrieval, case law entailment, statute law retrieval and statute law entailment
Our method for the case law entailment task is based on adapting our methods from past editions [1, 2], with an increased focus on transformer methods and a heuristic post-processing technique based on a priori probabilities
This approach faces two main problems: the lack of sufficient training data to make the models converge and generalize, and the computational cost of training, which increases exponentially with the size of the dataset
They proposed two association rule models: (1) the basic association rule model, which considers only the similarity between the source document and the target document, and (2) the co-occurrence association rule model, which uses a relevance dictionary in addition to the basic model
Another technique [20] worth mentioning approached the task as a binary classification problem, and built feature vectors composed of the measures of similarity between the candidate paragraph and (1) the entailed fragment of the base case, (2) the base case summary and (3) the base case paragraphs
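The feature-vector idea described for technique [20] can be sketched as follows. The highlights do not say which similarity measures were used or how similarities to multiple base case paragraphs were combined, so this example assumes bag-of-words cosine similarity and takes the maximum over paragraphs; the names `cosine` and `feature_vector` and the sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings (assumed measure)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def feature_vector(candidate, fragment, summary, paragraphs):
    """Three similarity features for one candidate paragraph.

    Aggregating over base case paragraphs with max() is an assumption
    made for this sketch.
    """
    return [
        cosine(candidate, fragment),   # (1) entailed fragment of the base case
        cosine(candidate, summary),    # (2) base case summary
        max((cosine(candidate, p) for p in paragraphs), default=0.0),  # (3)
    ]


# Invented example texts, not taken from the COLIEE dataset.
features = feature_vector(
    candidate="the applicant failed to show irreparable harm",
    fragment="the applicant must show irreparable harm",
    summary="motion for a stay dismissed",
    paragraphs=["the test for a stay has three parts",
                "irreparable harm was not shown by the applicant"],
)
```

Vectors like `features` would then be passed, together with binary entailment labels, to any standard binary classifier.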

Summary

Introduction

Tools to help legal professionals manage the increasing volume of legal documents are essential. The current state of the art, especially for problems which have access to enough labeled data, relies on deep learning-based approaches (most notably those based on transformer methods), which have shown very good results in a wide range of textual processing benchmarks, including benchmarks specific to entailment tasks. Our method for the case law entailment task is based on adapting our methods from past editions [1, 2], with an increased focus on transformer methods and a heuristic post-processing technique based on a priori probabilities. This year, we decided to drop similarity calculations, because our previous results showed that they did not contribute significantly to improved performance.

Related Work

Open‐Domain Textual Entailment

Case Law Textual Entailment

Statute Law Textual Entailment

COLIEE 2021—Approaches and Results

Task Definition

Approach

Tasks Definition

Error analysis in Statute Law Entailment

Conclusion

Similar Papers

COLIEE 2020: Legal Information Retrieval and Entailment with Legal Embeddings and Boosting
Houda Alberts ... Akin Ipek
01 Jan 2020

Question Answering of Bar Exams by Paraphrasing and Legal Text Analysis
Mi-Young Kim ... Yao Lu
01 Jan 2017

Conceptual Legal Information Retrieval for Cognitive Computing
Kevin D Ashley
01 Jul 2017

Legal Information Retrieval and Entailment Using Transformer-based Approaches
Mi-Young Kim ... Juliano Rabelo
The Review of Socionetwork Strategies | Vol. 18
11 Jan 2024
