Abstract

For many criminal law applications, both standard ones and emerging Web 2.0 ones such as mashups and dataspace systems, privacy-preserving data integration is of crucial importance. In many organizations, different databases contain different kinds of data concerning the same entity, and there may be several good reasons for this. However, obtaining an integral and unified view of an entity requires data reconciliation. In this paper, we present an approach to data reconciliation that is based on the available schemata of the data sources and on the content of those sources. The schemata are used to determine which parts of them pertain to the same entity type. The content of the sources is used to determine the associations between attributes stored in different sources. In establishing the relationships between attributes, we have also exploited the knowledge of domain experts. On the basis of the collected information, we identify a set of attributes common to the data sources. A similarity function is associated with each attribute; it takes a record from each data source as input and computes a similarity value expressing how similar the records are. Depending on the similarity value, we decide whether or not to reconcile two entities. We illustrate the effectiveness of our approach by means of a real-life case in the field of police and justice. Our approach can be applied to support the development of a wide variety of criminal law applications, such as data warehouses, mashups, and dataspace systems.
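The similarity-based reconciliation decision described above can be illustrated with a minimal sketch. It is not the paper's implementation: the attribute names, the weights, the weighted-sum aggregation, and the threshold of 0.85 are all assumptions chosen for illustration, and the string comparison uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever similarity functions the authors actually used.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after trimming whitespace and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def exact_similarity(a, b) -> float:
    """1.0 when the two values are equal, otherwise 0.0."""
    return 1.0 if a == b else 0.0


# Hypothetical common attributes shared by two sources, each with an
# assumed similarity function and an assumed weight (weights sum to 1).
COMMON_ATTRIBUTES = {
    "name": (string_similarity, 0.5),
    "birth_date": (exact_similarity, 0.3),
    "city": (string_similarity, 0.2),
}


def record_similarity(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-attribute similarities (aggregation is an assumption)."""
    total = 0.0
    for attr, (similarity, weight) in COMMON_ATTRIBUTES.items():
        total += weight * similarity(rec_a[attr], rec_b[attr])
    return total


def should_reconcile(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Reconcile two records when their similarity reaches the assumed threshold."""
    return record_similarity(rec_a, rec_b) >= threshold


# One record from each of two hypothetical sources describing the same person.
police_record = {"name": "Jan de Vries ", "birth_date": "1980-04-12", "city": "Den Haag"}
court_record = {"name": "jan de vries", "birth_date": "1980-04-12", "city": "den haag"}
print(should_reconcile(police_record, court_record))
```

In practice the weights and the threshold would have to be tuned, for instance with input from domain experts, since a threshold that is too low merges distinct persons and one that is too high leaves records of the same person unreconciled.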
