Abstract

Background: Population-based cancer registries constitute an important information source in cancer epidemiology. Studies collating and comparing data across regional and national boundaries have proved important for deploying and evaluating effective cancer-control strategies. A critical aspect in correctly comparing cancer indicators across regional and national boundaries lies in ensuring a good and harmonised level of data quality, which is a primary motivator for a centralised collection of pseudonymised data. The recent introduction of the European Union’s General Data Protection Regulation (GDPR) imposes stricter conditions on the collection, processing, and sharing of personal data. It also considers pseudonymised data as personal data. The new regulation motivates the need to find solutions that allow a continuation of the smooth processes leading to harmonised European cancer-registry data. One element in this regard would be the availability of a data-validation software tool based on a formalised depiction of the harmonised data-validation rules, allowing an eventual devolution of the data-validation process to the local level.

Results: A semantic data model was derived from the data-validation rules for harmonising cancer-data variables at European level. The data model was encapsulated in an ontology developed using the Web Ontology Language (OWL), with the data-model entities forming the main OWL classes. The data-validation rules were added as axioms in the ontology. The reasoning function of the resulting ontology demonstrated its ability to trap registry-coding errors and, in some instances, to correct them.

Conclusions: Describing the European cancer-registry core data set in terms of an OWL ontology affords a tool based on a formalised set of axioms for validating a cancer registry’s data set according to harmonised, supra-national rules. Because the data checks are inherently linked to the data model, maintenance overheads would be lower and versioning could be synchronised automatically, which is important for distributed data-quality checking processes.

Highlights

  • Population-based cancer registries constitute an important information source in cancer epidemiology

  • The ontology developed in this work will be used to validate the whole set of European Network of Cancer Registries (ENCR) data, at which stage it could possibly be proposed as the standard ENCR data-validation tool. It would thereby benefit from regular maintenance and continual improvements on the basis of recommendations from cancer registries (CRs), which may help position it as an eventual contender for such a unified ontology. It has been shown how the implementation of ontology-based data-validation tools can benefit the processes underlying the compilation of European population-based cancer indicators

  • The benefits can be appreciated from the following considerations: Firstly, ontologies provide the means of expressing the data-validation rules in a formal sense, thereby removing ambiguities and the potential for consequent misinterpretation, as well as helping to identify unnecessary data-coupling relationships
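
As a rough illustration of what expressing validation rules formally can look like, the sketch below declares two checks as named, machine-readable rules and applies them to a record. It is plain Python, not OWL, and it does not reproduce the ENCR rule set: the `Case` fields, the sex coding (1 = male, 2 = female), and both rules are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    """Hypothetical, simplified cancer-registry record."""
    sex: int           # assumed coding: 1 = male, 2 = female
    birth: date        # date of birth
    incidence: date    # date of incidence
    topography: str    # ICD-O-3 topography code, e.g. "C61" (prostate gland)


@dataclass(frozen=True)
class Rule:
    """A named validation rule: `holds` returns True when the case is valid."""
    name: str
    holds: Callable[[Case], bool]


# Illustrative rules only; a real rule set would come from the harmonised
# data-validation specification, not from this sketch.
RULES: List[Rule] = [
    Rule("incidence_not_before_birth", lambda c: c.incidence >= c.birth),
    Rule("prostate_requires_male", lambda c: c.topography != "C61" or c.sex == 1),
]


def violations(case: Case, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules the case violates, in rule order."""
    return [rule.name for rule in rules if not rule.holds(case)]
```

Because each rule is a single declarative entry, adding or versioning a check means changing the rule list rather than the checking logic, which loosely mirrors the maintenance argument made for the ontology-based approach.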


Summary

Introduction

Population-based cancer registries constitute an important information source in cancer epidemiology. Owing to the monitoring capability of population-based registries, the European Commission has been proactively supporting cancer registration in response to calls from the European Parliament and the European Council to address the rising cancer burden [2,3,4]. One of these initiatives is the harmonisation of a core set of cancer-registration data from which the key indicators for monitoring the burden of cancer can be derived. The European Commission, via its Directorate-General Joint Research Centre (JRC), works in close collaboration with the European Network of Cancer Registries (ENCR) and other stakeholders, such as the International Agency for Research on Cancer (IARC) and the EUROCARE1 project, to ensure accurate sets of cancer indicators that can be compared across national boundaries.
