Abstract This paper defines a methodological path that merges judicial judgments with official statistical data in order to organize complete, objective, and reliable data in a database, thus simplifying the analysis of illegal social phenomena. Judicial judgments are a new data source: they deal with illegal events that describe social phenomena (even if only the "illegal" ones) and contain objective and reliable data and information. Because judgments are also texts, the first step applies statistical textual analysis and text mining techniques to discover information and organize it in a statistical database. The final database is obtained by integrating numerical data from other information sources, and it therefore has statistical properties such as reliability, completeness, and timeliness. Subsequent statistical analyses or modelling can then be based on the entire data set or on subsets suitably extracted from the statistical database. We present some results obtained from judgments about corruption to demonstrate the advantages of linking textual databases (textual analyses of judgments) with numerical databases (from ISTAT). The proposed methodology can benefit different stakeholders, such as researchers, policymakers, and other enforcement actors. It is independent of the specific software used and remains valid when applied to other illegal activities (e.g., organized crime, tax crime, and money laundering). Furthermore, the results may be even more effective if the institutional actors involved have access to judgments at all levels, thereby overcoming potential privacy concerns. The methodology could also be used to support evidence-based policy in the fight against crime and illegal activities.