Natural Language Processing Applications in Case-Law Text Publishing

Francesco Tarasconi, Gianpiero Sportelli, Matteo Caserio, Milad Botros, Giuseppe Giacalone, Carlotta Uttini, Fabrizio Zanetta, Luca Vignati

doi:10.3233/faia200859

Abstract

Processing case-law content for electronic publishing is a time-consuming activity that encompasses several sub-tasks and usually involves adding annotations to the original text. Meanwhile, recent advances in Artificial Intelligence and Natural Language Processing enable the automatic and efficient analysis of large volumes of text. In this paper we present our Machine Learning solutions to three specific business problems that a real-world Italian publisher regularly meets in its day-to-day work: recognition of legal references in text spans, ranking of new content by relevance, and text classification according to a given tree of topics. We experimented with different approaches based on the BERT language model, together with alternatives typically based on Bag-of-Words. The optimal solution, deployed in a controlled production environment, was based on fine-tuned BERT in two of the three cases (extraction of legal references and text classification), while for relevance ranking a Random Forest model with hand-crafted features was preferred. We conclude by discussing the concrete impact of the developed prototypes, as perceived by the publisher.
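To make the Bag-of-Words alternative mentioned in the abstract more concrete, the following is a minimal sketch of a bag-of-words topic classifier. It is not the authors' implementation: the nearest-centroid rule, the function names (`bag_of_words`, `train_centroids`, `classify`), the two flat topic labels and the English toy sentences are all assumptions made for illustration, whereas the paper works on Italian case law and a tree of topics.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into word tokens and count them."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled_docs):
    """Sum the token counts of all training documents for each topic."""
    centroids = {}
    for text, topic in labelled_docs:
        centroids.setdefault(topic, Counter()).update(bag_of_words(text))
    return centroids


def classify(text, centroids):
    """Return the topic whose centroid is most similar to the text."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda topic: cosine(vec, centroids[topic]))


# Hypothetical toy training set; real data would be annotated case-law texts.
TRAIN = [
    ("contract breach damages compensation", "civil"),
    ("lease contract tenant eviction", "civil"),
    ("theft sentence imprisonment defendant", "criminal"),
    ("fraud conviction criminal sentence", "criminal"),
]
CENTROIDS = train_centroids(TRAIN)

if __name__ == "__main__":
    print(classify("the tenant breached the lease contract", CENTROIDS))
```

A baseline of this kind ignores word order and context, which is one plausible reason a fine-tuned BERT model could be preferred for classification, as the abstract reports.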

Publication Date: Dec 1, 2020
Citations: 1	License type: CC BY-NC 4.0

Similar Papers

Data Science in Healthcare: Implications for Early Career Investigators.
Sanjeev P Bhavnani ... Daniel Muñoz
Circulation: Cardiovascular Quality and Outcomes | VOL. 9
01 Nov 2016

Legal Governance of Brain Data Derived from Artificial Intelligence
Mahika Ahluwalia
Voices in Bioethics | VOL. 7
02 Jun 2021

Improved Multi-label Medical Text Classification Using Features Cooperation
Rim Chaib ... Didier Schwab
01 Jan 2020

Application of Machine Learning on the Diagnosis of 18 Common Pediatric Disease in Central African Republic
George Wu ... Bin Li
01 Jan 2021
