Abstract

Legal information extraction requires identifying and classifying legal elements in specific legal documents. Because information extraction is generally regarded as the first step in natural language understanding, the quality of legal information extraction results strongly affects the performance of downstream legal artificial intelligence (AI) tasks. However, Chinese judicial information extraction datasets are scarce owing to the particular nature of legal documents. To address this gap, we constructed a dataset for Challenge of AI in Law - Information Extraction V1.0 (CAILIE 1.0). Two features of CAILIE 1.0 are worth highlighting: 1) the entity definitions focus on fine-grained information in theft-related documents, providing more interpretability for downstream legal AI; and 2) we define entity labels with judicial attributes on top of natural attribute labels to meet the needs of Chinese judicial practice. We implement several classic models on this dataset. The experimental results show that legal information extraction remains challenging and that further research is required to solve this task.
