A Classification Retrieval Approach for English Legal Texts

Zhonghao Li

doi:10.1109/icitbs.2019.00059

Abstract

Automatically finding the legal provisions that correspond to a written description of events in English legal texts is a practical but complicated problem. To address it, this paper designs a classification method for legal texts based on feature words. First, legal judgment documents are used as the training corpus, because the relevant legal provisions can be extracted accurately from them. The feature words of each document are then computed with TF-IDF, which makes it straightforward to establish the correspondence between legal provisions and feature words. The chi-square statistic (CHI) and the position of feature words in the text are introduced as correction factors and integrated into the traditional TF-IDF weight formula. This addresses two weaknesses of plain TF-IDF: it ignores how feature words are distributed across classes, and it gives too little weight to keywords. The experiments show that the algorithm can extract feature words from a variety of legal texts and classify the texts into the corresponding legal terms, which demonstrates good classification performance.
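The weighting idea described above can be illustrated with a minimal Python sketch. The abstract does not state the actual formula, so everything specific here is an assumption: the multiplicative combination, the `log(1 + CHI)` damping, the smoothed IDF, the `lead_fraction` and `boost` parameters of the hypothetical `position_factor`, and the toy corpus. It shows one plausible way to combine the three signals, not the paper's method.

```python
import math

def tf_idf(term, tokens, docs):
    """Plain TF-IDF with a smoothed IDF (smoothing is an assumption)."""
    tf = tokens.count(term) / len(tokens)
    df = sum(term in doc_tokens for doc_tokens, _ in docs)
    idf = math.log((1 + len(docs)) / (1 + df)) + 1
    return tf * idf

def chi_square(term, label, docs):
    """Standard chi-square statistic between a term and a class label."""
    a = b = c = d = 0
    for doc_tokens, doc_label in docs:
        has_term = term in doc_tokens
        in_class = doc_label == label
        if has_term and in_class:
            a += 1          # in class, contains term
        elif has_term:
            b += 1          # other class, contains term
        elif in_class:
            c += 1          # in class, lacks term
        else:
            d += 1          # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def position_factor(term, tokens, lead_fraction=0.2, boost=0.5):
    """Hypothetical position correction: boost terms that first appear
    in the leading part of the document."""
    lead_end = max(1, int(len(tokens) * lead_fraction))
    return 1.0 + boost if tokens.index(term) < lead_end else 1.0

def corrected_weight(term, tokens, label, docs):
    """Assumed combination: TF-IDF * log(1 + CHI) * position factor."""
    if term not in tokens:
        return 0.0
    chi = chi_square(term, label, docs)
    return tf_idf(term, tokens, docs) * math.log(1 + chi) * position_factor(term, tokens)

# Toy corpus (invented for illustration): (tokens, provision label).
docs = [
    (["theft", "property", "stolen", "court"], "theft"),
    (["theft", "stolen", "goods", "court"], "theft"),
    (["contract", "breach", "damages", "court"], "contract"),
    (["contract", "payment", "breach", "court"], "contract"),
]
tokens, label = docs[0]
w_theft = corrected_weight("theft", tokens, label, docs)
w_court = corrected_weight("court", tokens, label, docs)
```

In this toy corpus, "court" appears in every document of both classes, so its chi-square value is zero and the corrected weight suppresses it, while "theft" is concentrated in one class and keeps a positive weight.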

Full Text