Abstract

Predicting the outcome of a case from a set of factual data is a common goal in legal knowledge discovery. In practice, this task is often difficult because labeled datasets are scarce. Additionally, processing long documents often leads to sparse data, which adds another layer of complexity. This paper presents a study of the French-language decisions of the European Court of Human Rights (ECtHR), for which we build several classification tasks. The first set of tasks consists of predicting whether an article of the Convention has been violated, using extracted facts. A multiclass problem is also created, with the objective of determining whether an article is relevant to plead given some circumstances. We solve these tasks by comparing simple linear models to an attention-based neural network. We also integrate a modified partial least squares algorithm into these models; it handles classification problems effectively and scales to the sparse inputs that arise in natural language tasks.

Highlights

  • With the emergence of deep learning algorithms, significant work has been done on the legal domain and predictive justice

  • We focus on two main tasks; the first, a binary classification task, concerns the violation of an article given a set of facts or circumstances

  • Apart from a significant difference on Article 8, which is difficult to explain for the circumstances, the results on French documents are similar to those reported in the literature, at least for linear models

Summary

Introduction

With the emergence of deep learning algorithms (neural networks), significant work has been done on the legal domain and predictive justice. Because the literature focuses on English decisions, some performance comparisons have already been made between standard classification algorithms [9,10] and deep learning approaches using state-of-the-art models [11,12], which have a high computational cost compared to linear classifiers. We compare several algorithms on French ECtHR decisions across different binary tasks where the target is the outcome (violation of a given article). Another task is built as a multiclass problem and aims to find which article has potentially been violated. The choice of language has two motivations: a large model can be pretrained on French legal documents, and state-of-the-art English embeddings cannot process long sequences without very high memory consumption.
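To make the two task formulations concrete, here is a minimal sketch of how labeled samples for each task might be assembled. The `Decision` record, its field names, and the toy data are hypothetical and only serve the illustration. The rule that keeps only decisions with exactly one violated article for the multiclass task is an assumption, not something stated in the paper.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical record for one decision; field names are illustrative."""
    facts: str
    # Maps an article label to True (violation found) or False (no violation).
    outcomes: dict


def binary_task(decisions, article):
    """Keep decisions that examine `article`; label 1 = violation, 0 = no violation."""
    return [
        (d.facts, int(d.outcomes[article]))
        for d in decisions
        if article in d.outcomes
    ]


def multiclass_task(decisions):
    """Assumed rule: keep decisions with exactly one violated article as the label."""
    samples = []
    for d in decisions:
        violated = [a for a, v in d.outcomes.items() if v]
        if len(violated) == 1:
            samples.append((d.facts, violated[0]))
    return samples


# Toy data, invented for the example.
decisions = [
    Decision("facts A", {"3": True}),
    Decision("facts B", {"3": False, "6": True}),
    Decision("facts C", {"8": False}),
]

article_3_samples = binary_task(decisions, "3")
article_samples = multiclass_task(decisions)
```

In this framing, each article yields its own binary dataset, while the multiclass dataset pools decisions across articles and uses the article itself as the target.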

Structure
THE LAW
Extraction
Datasets
Word Embeddings
Attention Mechanism
The Reduction Algorithm
Point Biserial Covariance
Binary Algorithm
General Algorithm
Scalability
Experiments and Results
Binary Input
TF-IDF Input
Neural Approach
General Observations
Conclusions